Package obitools :: Module fasta

Module fasta

The fasta module provides functions to read and write sequences in FASTA format.

Functions
 
parseFastaDescription(ds, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'))

_fastaJoinSeq(seqarray)

fastaParser(seq, bioseqfactory, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'), joinseq=_fastaJoinSeq)
Parse a FASTA record.

fastaNucParser(seq, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'), joinseq=_fastaJoinSeq)

fastaAAParser(seq, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'), joinseq=_fastaJoinSeq)

fastaIterator(file, bioseqfactory=bioSeqGenerator, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'), joinseq=_fastaJoinSeq)
Iterate through a FASTA file sequence by sequence.

fastaNucIterator(file, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'))
Iterate through a FASTA file sequence by sequence.

fastaAAIterator(file, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'))
Iterate through a FASTA file sequence by sequence.

formatFasta(data, gbmode=False)
Convert a sequence or a set of sequences into a string in FASTA format. Returns a str.

Variables
  _parseFastaTag = re.compile(r'([a-zA-Z]\w*) *= *([^;]+);')
  __package__ = 'obitools'

Imports: genericEntryIteratorGenerator, bioSeqGenerator, BioSequence, AASequence, NucSequence, alignmentReader, universalOpen, re, fastaEntryIterator
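
The default tagparser pattern shown above can be tried on its own with the standard re module. The following is a minimal sketch, not code taken from obitools: the title line is invented for illustration, and only the regular expression itself comes from this page.

```python
import re

# Same pattern as the documented default value of `tagparser`.
tag_pattern = re.compile(r'([a-zA-Z]\w*) *= *([^;]+);')

# Hypothetical title line, invented for this example.
title = "seq1 count=12; taxid=9606; a sample definition"

# Each match is a (key, value) pair: a key that starts with a letter,
# an equals sign, and a value that runs up to the next semicolon.
tags = dict(tag_pattern.findall(title))
print(tags)
```

Note that the value group keeps any whitespace that precedes the closing semicolon, so a caller may want to strip the values.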


Function Details

fastaParser(seq, bioseqfactory, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'), joinseq=_fastaJoinSeq)

Parse a FASTA record.

Parameters:
  • seq (list or tuple of str) - a sequence object containing all the lines that make up one FASTA record
  • bioseqfactory (a callable object) - a callable object that returns a BioSequence instance.
  • tagparser (regex instance) - a compiled regular expression used to identify key/value pairs in the title line.
Returns:
a BioSequence instance

Attention: this function is intended for internal use.
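
To make the roles of seq, tagparser and joinseq concrete, here is a standalone sketch of parsing one record. It is not the obitools implementation: parse_fasta_record and join_lines are hypothetical names, the result is a plain dict rather than a BioSequence, and the way the title line is split into identifier, tags and definition is an assumption.

```python
import re

TAG_PATTERN = re.compile(r'([a-zA-Z]\w*) *= *([^;]+);')


def join_lines(seqarray):
    # Hypothetical stand-in for the `joinseq` callable:
    # concatenate the sequence lines into one string.
    return ''.join(line.strip() for line in seqarray)


def parse_fasta_record(lines, tagparser=TAG_PATTERN, joinseq=join_lines):
    """Split one FASTA record (title line first) into its parts."""
    title = lines[0].strip()
    if not title.startswith('>'):
        raise ValueError("a FASTA record must start with '>'")
    # Assumption: the identifier is the first word after '>',
    # and everything after it is the description.
    parts = title[1:].split(None, 1)
    identifier = parts[0] if parts else ''
    description = parts[1] if len(parts) > 1 else ''
    # Collect the key=value; pairs, then drop them from the free text.
    tags = {key: value.strip() for key, value in tagparser.findall(description)}
    definition = tagparser.sub('', description).strip()
    return {
        'id': identifier,
        'definition': definition,
        'tags': tags,
        'sequence': joinseq(lines[1:]),
    }


record = parse_fasta_record([">seq1 count=12; taxid=9606; sample read", "ACGT", "TTGA"])
print(record)
```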

fastaIterator(file, bioseqfactory=bioSeqGenerator, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'), joinseq=_fastaJoinSeq)

Iterate through a FASTA file sequence by sequence. The sequences returned by this iterator are BioSequence instances.

Parameters:
  • file (an iterable object or str) - a line iterator containing FASTA data, or a filename
  • bioseqfactory (a callable object) - a callable object that returns a BioSequence instance.
  • tagparser (regex instance) - a compiled regular expression used to identify key/value pairs in the title line.
Returns:
an iterator over BioSequence instances
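
Iterating "sequence by sequence" amounts to grouping the lines of the input into one block per title line. The generator below is a sketch of that idea only. iter_fasta_entries is a hypothetical name, it yields plain lists of lines instead of BioSequence instances, and it accepts only an iterable of lines (the filename case is not handled).

```python
def iter_fasta_entries(lines):
    """Yield each FASTA record as a list of its lines, title line first."""
    entry = []
    for line in lines:
        line = line.rstrip('\n')
        if not line.strip():
            continue  # skip blank lines
        # A new title line closes the record collected so far.
        if line.startswith('>') and entry:
            yield entry
            entry = []
        entry.append(line)
    # Emit the last record, which has no following title line.
    if entry:
        yield entry


# Invented sample data; an open text file could be passed the same way.
sample = [">seq1 count=2;\n", "ACGT\n", "TT\n", "\n", ">seq2\n", "GGCC\n"]
entries = list(iter_fasta_entries(sample))
print(entries)
```

Because the function is a generator, records are produced one at a time and the whole file never has to be held in memory.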

fastaNucIterator(file, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'))

Iterate through a FASTA file sequence by sequence. The sequences returned by this iterator are NucSequence instances.

Parameters:
  • file (an iterable object) - a line iterator containing FASTA data
  • tagparser (regex instance) - a compiled regular expression used to identify key/value pairs in the title line.
Returns:
an iterator over NucSequence instances

fastaAAIterator(file, tagparser=re.compile(r'([a-zA-Z]\w*) *= *([^;]+);'))

Iterate through a FASTA file sequence by sequence. The sequences returned by this iterator are AASequence instances.

Parameters:
  • file (an iterable object) - a line iterator containing FASTA data
  • tagparser (regex instance) - a compiled regular expression used to identify key/value pairs in the title line.
Returns:
an iterator over AASequence instances

formatFasta(data, gbmode=False)

Convert a sequence or a set of sequences into a string in FASTA format.

Parameters:
  • data (BioSequence instance or an iterable object on BioSequence instances) - a sequence or a set of sequences
  • gbmode (bool) - if set to True, the identifier part of the title line follows the NCBI recommendations so that sequences can be indexed with the BLAST formatdb command.
Returns: str
a FASTA-formatted string
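
As a rough illustration of the output side, the sketch below builds a FASTA string from the parts of one record. It does not reproduce formatFasta: format_fasta_record is a hypothetical helper, the default line width of 60 characters is an assumption, and the gbmode behaviour is not modelled.

```python
def format_fasta_record(identifier, definition, sequence, width=60):
    """Return one record as FASTA text; `width` is an assumed line length."""
    title = '>' + identifier
    if definition:
        title += ' ' + definition
    # Wrap the sequence into chunks of at most `width` characters.
    body = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return '\n'.join([title] + body)


# Invented record; a small width makes the wrapping visible.
text = format_fasta_record("seq1", "count=12; sample read", "ACGTACGTAC", width=4)
print(text)
```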