Package obitools :: Package format :: Package genericparser :: Class GenericParser

Class GenericParser



Instance Methods
 
__init__(self, startEntry=None, endEntry=None, head=False, tail=False, strip=False, **parseAction)
Initialize the parser with optional entry delimiters and parse actions.
 
addParseAction(self, name, dataMatcher, dataCleaner=None, cleanSub='')
Add a parse action to the generic parser.
 
_buildREParser(self, dataMatcher, dataCleaner, cleanSub)
 
__call__(self, file)

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Properties

Inherited from object: __class__

Method Details

__init__(self, startEntry=None, endEntry=None, head=False, tail=False, strip=False, **parseAction)
(Constructor)


Initialize the parser with optional entry delimiters and parse actions.

Parameters:
  • startEntry (str or None) - a regular expression pattern matching the beginning of an entry
  • endEntry (str or None) - a regular expression pattern matching the end of an entry
  • head (bool) - indicates whether a header is present before the first entry (as in many original GenBank files)
  • tail (bool) - indicates whether extra information is present after the last entry
  • parseAction
Overrides: object.__init__
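
As an illustration of how startEntry, endEntry and head could drive the splitting of a file into entries, here is a minimal sketch built on the standard re module. The helper name iter_entries and the sample text are hypothetical, and the exact matching rules used by GenericParser may differ; the sketch also ignores tail and strip.

```python
import re


def iter_entries(lines, start_entry=None, end_entry=None, head=False):
    """Yield entries as strings (hypothetical helper, not the obitools code)."""
    start = re.compile(start_entry) if start_entry else None
    end = re.compile(end_entry) if end_entry else None
    entry = []
    skipping_head = head  # assumption: head means "skip until the first start match"
    for line in lines:
        if skipping_head:
            if start is not None and start.match(line):
                skipping_head = False
            else:
                continue
        # A start match closes any entry that is still open.
        if start is not None and start.match(line) and entry:
            yield "".join(entry)
            entry = []
        entry.append(line)
        # An end match closes the current entry, delimiter line included.
        if end is not None and end.match(line):
            yield "".join(entry)
            entry = []
    if entry:
        yield "".join(entry)


# Made-up sample text with a one-line header before the first entry.
sample = "header\nLOCUS A\nfoo\n//\nLOCUS B\nbar\n//\n"
entries = list(iter_entries(sample.splitlines(True),
                            start_entry=r"LOCUS", end_entry=r"//", head=True))
# entries == ["LOCUS A\nfoo\n//\n", "LOCUS B\nbar\n//\n"]
```

In this sketch, head only has an effect when a start pattern is given, since the header is skipped by waiting for the first line that matches it.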

addParseAction(self, name, dataMatcher, dataCleaner=None, cleanSub='')


Add a parse action to the generic parser. A parse action extracts one piece of information from an entry. It is defined by a name and a method for extracting that information from the full text of the entry.

A parse action can be defined in two ways:

  • via regular expression patterns
  • via dedicated function.

In the first case, you must provide at least the dataMatcher regular expression pattern. This pattern should match exactly the data you want to retrieve. If extra characters need to be cleaned out of the matched data, a second pattern, dataCleaner, can be used to specify them.
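
The pattern-based case can be sketched with the standard re module as follows. This is only an illustration under stated assumptions: the helper name build_re_action, the use of findall and sub, and the sample entry are not taken from the obitools source.

```python
import re


def build_re_action(data_matcher, data_cleaner=None, clean_sub=""):
    """Return a function extracting cleaned matches from an entry (hypothetical)."""
    matcher = re.compile(data_matcher)
    cleaner = re.compile(data_cleaner) if data_cleaner is not None else None

    def action(entry):
        # Collect every match of the data pattern in the entry text.
        found = matcher.findall(entry)
        if cleaner is not None:
            # Replace every cleaner match inside each piece of data.
            found = [cleaner.sub(clean_sub, item) for item in found]
        return found

    return action


# Made-up entry whose DEFINITION field continues on an indented line.
entry = ("DEFINITION  Homo sapiens\n"
         + " " * 12 + "mitochondrion.\n"
         + "ACCESSION   X1\n")

# Match the text after the field name, including indented continuation lines,
# then collapse each line break plus indentation into a single space.
definition = build_re_action(r"(?<=DEFINITION  ).+(?:\n +.+)*", r"\n +", " ")
result = definition(entry)
# result == ["Homo sapiens mitochondrion."]
```

Keeping the matcher and the cleaner separate lets the matcher stay simple: it grabs the raw field, and the cleaner removes layout characters afterwards.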

In the second case, you must provide a callable object (function) that extracts and cleans data from the text entry. This function should return a list containing all the data retrieved, even if no data or only one item is retrieved.
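
A callable matcher following this contract might look like the sketch below. The function name extract_accessions and the sample accession numbers are made up for illustration; the only point is that the return value is always a list.

```python
def extract_accessions(entry):
    """Return all identifiers on the ACCESSION line as a list (hypothetical)."""
    for line in entry.splitlines():
        if line.startswith("ACCESSION"):
            # Drop the field name and keep the identifiers that follow it.
            return line.split()[1:]
    return []  # still a list when nothing is found


several = extract_accessions("LOCUS X\nACCESSION   AB000001 AB000002\n")
single = extract_accessions("ACCESSION   AB000001\n")
nothing = extract_accessions("LOCUS X\n")
# several == ["AB000001", "AB000002"], single == ["AB000001"], nothing == []
```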

Parameters:
  • name (str) - name of the data extracted
  • dataMatcher (str, SRE_Pattern instance, or callable object) - a regular expression pattern matching the data, or a callable object that parses the entry and returns a list of matched data
  • dataCleaner (str or SRE_Pattern instance or None) - a regular expression pattern matching the parts of the data to remove
  • cleanSub (str) - string used to replace dataCleaner matches. Default is an empty string