Error getting the taxid column in ecopcr
I have NCBI reference sequence data in FASTA format that was downloaded using nsdpy. I was able to import my reference file (sequences.fasta) and the taxdump file as specified in the wolf tutorial. But when I run obi ecopcr, I get "Error getting the taxid column". See below for my code and the error. I do not know how to verify the "my_tax" files: because of the obi format, I cannot view their contents.
Code:
obi import /ref_dbs/nsdpy/NSDPY_results/2023-08-11_12-06-02/fasta/sequences.fasta iDNAtest/nsdpy_refs
wget https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz
obi import --taxdump taxdump.tar.gz iDNAtest/taxonomy/my_tax
obi ecopcr -e 3 -l 50 -L 160 -F CGGTTGGGGTGACCTCGGA -R GCTGTTATCCCTAGGGTAACT --taxonomy iDNAtest/taxonomy/my_tax iDNAtest/nsdpy_refs iDNAtest/16SMam_refs
The error message:
[ecopcr : INFO ] obi ecopcr
DEBUG /tmp/pip-install-q9f2v4ku/obitools3_4d00853c37a9478b8d299eb6e1ee07f1/src/obi_ecopcr.c:816:obi_ecopcr, obi_errno = 0, errno = 2 : Error getting the taxid column
Traceback (most recent call last):
  File "/tools/OBITools3/obi3-env/bin/obi", line 62, in <module>
    config[root_config_name]['module'].run(config)
  File "python/obitools3/commands/ecopcr.pyx", line 217, in obitools3.commands.ecopcr.run
Exception: Error running ecopcr
Format of the nsdpy fasta file:
>OK183856.1 Cercopithecus erythrotis voucher T1768 16S large subunit ribosomal RNA gene, partial sequence; mitochondrial
CTGCCTGCCCAGTGACACACGTTTAACGGCCGCGGTACCCTGACCGTGCAAAGGTAGCATAATCACTTGT
TCTTTAAATAGGGACTCGTATGAATGGCATCACGAGGGTTTAACTGTCTCTTACTTTCAACCAGTGAAAT
TGACCTGTCCGTGAAGAGACGGACATGAAACAATAAGACGAGAAGACCCTGTGGAGCTTCAATTTATTAG
TACAACTAAAAACAACACAAACCAACAGGCCCTAAACCCCTACATCTGTGCTAAAAATTTTGGTTGGGGC
GACCTCGGAGCACAACCAAACCTCCGAATAATCCACGCTAAGACTACACAAGTCAAAGCAAACTAACACC
TACAATTGACCCAATAATTTGATCAACGGAACAAGTTACCCCAGGGATAACAGCGCAATTCTATTCTAGA
GTCCATATCAACAATAGAGTTTACGACCTCGATGTTGGATCAGGATATCCTAATGGTGCAGCAGCTATCA
AG
It looks like this differs from the format of the reference sequences in the wolf tutorial, which come from EMBL. But I do not know how to get my reference database FASTA into the OBITools3 format, and I don't see any information on how to get NCBI data into OBITools3. Any help in resolving this is much appreciated!
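For anyone who wants to check the same thing on their own file, here is a rough Python sketch that scans FASTA headers and reports which ones carry no taxid annotation. It rests on assumptions that are not confirmed anywhere in this post: that annotations would appear in the header as "key=value;" pairs, and that the key is spelled "taxid" (matched case-insensitively here). The function names are hypothetical, and this is only a diagnostic, not a fix for the ecopcr error.

```python
import re

# Assumption: a taxid annotation would look like "TAXID=12345;" in the header.
# The key spelling is a guess, so it is matched case-insensitively.
TAXID_PATTERN = re.compile(r"\btaxid\s*=\s*(\d+)\s*;", re.IGNORECASE)


def header_taxids(fasta_lines):
    """Yield (sequence_id, taxid_or_None) for each FASTA header line."""
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        header = line[1:].strip()
        # The sequence identifier is the first whitespace-delimited token.
        seq_id = header.split(None, 1)[0] if header else ""
        match = TAXID_PATTERN.search(header)
        yield seq_id, (int(match.group(1)) if match else None)


def count_missing(fasta_lines):
    """Return (number_of_headers, list_of_ids_without_a_taxid)."""
    results = list(header_taxids(fasta_lines))
    missing = [seq_id for seq_id, taxid in results if taxid is None]
    return len(results), missing


if __name__ == "__main__":
    # "sequences.fasta" is the file name used in the post; adjust the path as needed.
    with open("sequences.fasta") as handle:
        total, missing = count_missing(handle)
    print(f"{total} headers, {len(missing)} without a taxid annotation")
```

Run on the header shown above (OK183856.1), this would list the record as missing a taxid, since that header contains only the accession and a free-text description.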