Closed
Open
Opened Aug 18, 2023 by Tiff DeGroot (@tldegroo)

Error getting the taxid column in ecopcr

I have NCBI reference sequence data in FASTA format that was downloaded using nsdpy. I was able to import my reference file (sequences.fasta) and the taxdump file as specified in the wolf tutorial. But when I run obi ecopcr, I get "Error getting the taxid column". See below for my code and the error. I do not know how to verify the "my_tax" files: because of the obi format, I cannot view their contents.

Code:

obi import /ref_dbs/nsdpy/NSDPY_results/2023-08-11_12-06-02/fasta/sequences.fasta iDNAtest/nsdpy_refs

wget https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz

obi import --taxdump taxdump.tar.gz iDNAtest/taxonomy/my_tax

obi ecopcr -e 3 -l 50 -L 160 -F CGGTTGGGGTGACCTCGGA -R GCTGTTATCCCTAGGGTAACT --taxonomy iDNAtest/taxonomy/my_tax iDNAtest/nsdpy_refs iDNAtest/16SMam_refs

The error message:

[ecopcr : INFO ] obi ecopcr
DEBUG /tmp/pip-install-q9f2v4ku/obitools3_4d00853c37a9478b8d299eb6e1ee07f1/src/obi_ecopcr.c:816:obi_ecopcr, obi_errno = 0, errno = 2 : Error getting the taxid column
Traceback (most recent call last):
  File "/tools/OBITools3/obi3-env/bin/obi", line 62, in <module>
    config[root_config_name]['module'].run(config)
  File "python/obitools3/commands/ecopcr.pyx", line 217, in obitools3.commands.ecopcr.run
Exception: Error running ecopcr

Format of the nsdpy FASTA file:

>OK183856.1 Cercopithecus erythrotis voucher T1768 16S large subunit ribosomal RNA gene, partial sequence; mitochondrial
CTGCCTGCCCAGTGACACACGTTTAACGGCCGCGGTACCCTGACCGTGCAAAGGTAGCATAATCACTTGT
TCTTTAAATAGGGACTCGTATGAATGGCATCACGAGGGTTTAACTGTCTCTTACTTTCAACCAGTGAAAT
TGACCTGTCCGTGAAGAGACGGACATGAAACAATAAGACGAGAAGACCCTGTGGAGCTTCAATTTATTAG
TACAACTAAAAACAACACAAACCAACAGGCCCTAAACCCCTACATCTGTGCTAAAAATTTTGGTTGGGGC
GACCTCGGAGCACAACCAAACCTCCGAATAATCCACGCTAAGACTACACAAGTCAAAGCAAACTAACACC
TACAATTGACCCAATAATTTGATCAACGGAACAAGTTACCCCAGGGATAACAGCGCAATTCTATTCTAGA
GTCCATATCAACAATAGAGTTTACGACCTCGATGTTGGATCAGGATATCCTAATGGTGCAGCAGCTATCA
AG

This looks different from the format of the EMBL reference sequences in the wolf tutorial. But I do not know how to get my reference database FASTA into the OBITools3 format, and I don't see any info on how to download NCBI data for OBITools3. Any help in resolving this is much appreciated!
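
To make the header difference concrete, here is a minimal Python sketch that checks whether a FASTA header carries a taxid annotation. It assumes OBITools-style headers store annotations as "key=value;" pairs (for example "taxid=9606;"). The regular expression and the function names header_taxid and count_missing_taxids are hypothetical helpers for illustration, not part of OBITools3 or nsdpy.

```python
import re

# Assumption: OBITools-style FASTA headers carry annotations as "key=value;"
# pairs, e.g. ">seq1 taxid=9606; some description". This pattern is illustrative.
TAXID_PATTERN = re.compile(r"\btaxid=(\d+);", re.IGNORECASE)


def header_taxid(header):
    """Return the taxid found in a FASTA header line as an int, or None if absent."""
    match = TAXID_PATTERN.search(header)
    return int(match.group(1)) if match else None


def count_missing_taxids(lines):
    """Count FASTA records (lines starting with '>') that have no taxid annotation."""
    return sum(
        1 for line in lines
        if line.startswith(">") and header_taxid(line) is None
    )


# The header from the nsdpy file shown above.
nsdpy_header = (
    ">OK183856.1 Cercopithecus erythrotis voucher T1768 16S large subunit "
    "ribosomal RNA gene, partial sequence; mitochondrial"
)
print(header_taxid(nsdpy_header))  # None: no taxid annotation in this header
```

To scan a whole file, an open file handle can be passed straight to count_missing_taxids, since it only iterates over lines. Under the stated assumption, the nsdpy header above holds an accession and a free-text description but no taxid field.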

Edited Aug 18, 2023 by Tiff DeGroot
Reference: obitools/obitools3#132