parser API reference
-
class taxadb.parser.Accession2TaxidParser(acc_file=None, chunk=500, fast=False, **kwargs)
Main parser class for nucl_xxx_accession2taxid files
This class is used to parse accession2taxid files.
Parameters:
- acc_file (str) – File to parse
- chunk (int) – Chunk insert size. Default 500
- fast (bool) – Load accessions directly into the database without checking whether they already exist.
-
accession2taxid(acc2taxid=None, chunk=None)
Parses the accession2taxid files
This method parses the accession2taxid file, builds a dictionary for each entry, stores the dictionaries in a list, and yields that list for insertion into the database. Each dictionary has the form:
{ 'accession': accession_id_from_file, 'taxid': associated_taxonomic_id }
Parameters:
- acc2taxid (str) – Path to acc2taxid input file (gzipped)
- chunk (int) – Chunk size of entries to gather before yielding. Default 500 (set at object construction)
Yields: list – The entries read for one chunk
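The chunked-yield pattern described above can be sketched without taxadb itself. This is a minimal stand-in, not the library's implementation: `iter_accession_chunks` is a hypothetical name, and the column layout (one header line, accession in the first column, taxid in the third) is an assumption based on the NCBI accession2taxid file format.

```python
import gzip


def iter_accession_chunks(path, chunk=500):
    """Yield lists of {'accession', 'taxid'} dicts read from a gzipped file.

    Hypothetical helper for illustration only; not part of taxadb.
    """
    with gzip.open(path, "rt") as handle:
        next(handle, None)  # skip the header line (assumed to be present)
        entries = []
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3:
                continue  # ignore malformed or empty rows
            # Assumed layout: accession, accession.version, taxid, gi
            entries.append({"accession": fields[0], "taxid": fields[2]})
            if len(entries) == chunk:
                yield entries
                entries = []
        if entries:
            # Final, possibly shorter, chunk
            yield entries
```

Yielding fixed-size lists lets a caller issue one bulk insert per chunk rather than one insert per row, which is the usual reason for making the chunk size configurable.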
-
class taxadb.parser.TaxaDumpParser(nodes_file=None, names_file=None, **kwargs)
Main parser class for ncbi taxdump files
This class is used to parse the NCBI taxonomy files found in the taxdump.gz archive.
Parameters:
- nodes_file (str) – Path to nodes.dmp file
- names_file (str) – Path to names.dmp file
-
set_names_file(names_file)
Set names_file
Set the names file to use
Parameters: names_file (str) – Names file to be set
Returns: True
Raises: SystemExit – If names_file is None or not a file (check_file)
-
set_nodes_file(nodes_file)
Set nodes_file
Set the nodes file to use
Parameters: nodes_file (str) – Nodes file to be set
Returns: True
Raises: SystemExit – If nodes_file is None or not a file (check_file)
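Because both setters raise SystemExit rather than an ordinary exception, a caller that wants to recover has to catch it explicitly. The sketch below illustrates that with a hypothetical stand-in for the file check; the `check_file` body and the `try_set_file` wrapper are assumptions written for this example, not taxadb code.

```python
import os
import sys


def check_file(path):
    """Hypothetical stand-in: exit unless path names an existing file."""
    if path is None:
        sys.exit("Please provide an input file")
    if not os.path.isfile(path):
        sys.exit("File %s does not exist" % path)
    return True


def try_set_file(path):
    """Return True if the file is usable, False instead of exiting."""
    try:
        return check_file(path)
    except SystemExit as err:
        # SystemExit derives from BaseException, not Exception,
        # so a plain `except Exception` would not catch it.
        print("rejected: %s" % err)
        return False
```

In a command-line tool, letting SystemExit propagate is usually fine; the wrapper is only useful when the parser is driven from a longer-running program.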
-
taxdump(nodes_file=None, names_file=None)
Parse .dmp files
Parse the nodes.dmp and names.dmp files (from taxdump.tgz) and insert taxa into the Taxa table.
Parameters:
- nodes_file (str) – Path to nodes.dmp file
- names_file (str) – Path to names.dmp file
Returns: Zipped data from both files
Return type: list
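To make the idea of combining the two .dmp files concrete, here is a minimal sketch that reads both files and joins each node with its scientific name. It relies on the NCBI .dmp layout (fields separated by tab, pipe, tab, with each row ending in tab, pipe). The function names and the keys of the returned dictionaries are hypothetical, and the result is not claimed to match what taxdump() actually returns.

```python
def read_dmp(path):
    """Yield the list of fields for each row of an NCBI .dmp file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.endswith("\t|"):
                line = line[:-2]  # drop the row terminator
            yield line.split("\t|\t")


def merge_taxdump(nodes_path, names_path):
    """Return one dict per node, joined with its scientific name.

    Hypothetical helper for illustration only; not part of taxadb.
    """
    names = {}
    for fields in read_dmp(names_path):
        # names.dmp columns: tax_id, name_txt, unique name, name class
        if fields[3] == "scientific name":
            names[fields[0]] = fields[1]
    taxa = []
    for fields in read_dmp(nodes_path):
        # nodes.dmp starts with: tax_id, parent tax_id, rank
        taxa.append({
            "taxid": fields[0],
            "parent": fields[1],
            "rank": fields[2],
            "name": names.get(fields[0]),  # None if no scientific name
        })
    return taxa
```

Only rows whose name class is "scientific name" are kept here, since names.dmp lists several names (synonyms, common names) per taxon; a different selection rule would change the join.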
-
class taxadb.parser.TaxaParser(verbose=False)
Base parser class for taxonomic files
-
__weakref__
list of weak references to the object (if defined)