Closed
Open
Opened Jul 21, 2015 by Tobias Frøslev (@tobias-froeslev)

ngsfilter produces assigned-file with "forward_tag=None" or "reverse_tag=None"

My colleague discovered that running ngsfilter results in an "assigned" file in which many reads are assigned to samples but carry the attribute “forward_tag=None” or “reverse_tag=None” in the header.

I have now checked one of my recent “assigned.fastq” files (the product of running ngsfilter), and it also contains many of these “pseudo-assigned” reads (approx. 4% of the total!).

This indicates that ngsfilter assigns reads on the basis of only one matching tag.

Here you can see what my ngsfilter file looks like:

Lichen F017R008 TACGACT:ATCGCGA GTGARTCATCGARTCTTTG TCCTCCGCTTATTGATATGC F at-sign

Lichen F103R064 CTTCCTT:GCATGGA GTGARTCATCGARTCTTTG TCCTCCGCTTATTGATATGC F at-sign

Lichen F020R009 ATCAGTC:CGCTCTC GTGARTCATCGARTCTTTG TCCTCCGCTTATTGATATGC F at-sign

Lichen F027R014 CTCTGCT:GCGTCAG GTGARTCATCGARTCTTTG TCCTCCGCTTATTGATATGC F at-sign

And here are the statistics on the tags:

$ obistat -c forward_tag -c reverse_tag lichen.assigned.fastq

lichen.assigned.fastq 100.0 % |#################################################| ] remain : 00:00:00

forward_tag  reverse_tag   count   total
tacgact      atcgcga       82157   82157
None         gcgtcag        8303    8303
None         atcgcga        1101    1101
None         cgctctc       12055   12055
atcagtc      cgctctc      667541  667541
cttcctt      gcatgga      240579  240579
cttcctt      None           4504    4504
None         gcatgga        3343    3343
atcagtc      None           9706    9706
tacgact      None            945     945
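For anyone who wants to check the "approx. 4%" figure, here is a small Python sketch that recomputes it from the obistat counts above. The counts are copied from the table; the function name `single_tag_share` is just something I made up for this illustration.

```python
# Counts copied from the obistat table above: (forward_tag, reverse_tag) -> reads.
# None stands for a tag that obistat reports as "None".
counts = {
    ("tacgact", "atcgcga"): 82157,
    (None, "gcgtcag"): 8303,
    (None, "atcgcga"): 1101,
    (None, "cgctctc"): 12055,
    ("atcagtc", "cgctctc"): 667541,
    ("cttcctt", "gcatgga"): 240579,
    ("cttcctt", None): 4504,
    (None, "gcatgga"): 3343,
    ("atcagtc", None): 9706,
    ("tacgact", None): 945,
}


def single_tag_share(table):
    """Return (reads with one missing tag, all reads, fraction)."""
    total = sum(table.values())
    partial = sum(n for (fwd, rev), n in table.items() if fwd is None or rev is None)
    return partial, total, partial / total


partial, total, share = single_tag_share(counts)
print(f"{partial} of {total} reads ({share:.1%}) carry only one tag")
# prints: 39957 of 1030234 reads (3.9%) carry only one tag
```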

PS: I also tested it on the wolf tutorial. The same effect shows up there, although only in a single read:

$ obistat -c forward_tag -c reverse_tag wolf.ali.assigned.fastq

wolf.ali.assigned.fastq 99.3 % |#################################################- ] remain : 00:00:00

forward_tag  reverse_tag  count  total
gcctcct      gcctcct       9851   9851
gaatatc      gaatatc      14700  14700
aattaac      aattaac       9056   9056
gcctcct      None             1      1
gaagtag      gaagtag       9724   9724

Looking at the corresponding raw reads, it is evident that the tag reported as “None” is indeed absent from the raw read, so the sequence should not have been counted as assigned.
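As a possible workaround until this is clarified, the single-tag reads could be separated out after ngsfilter has run. The following is only a rough Python sketch, not part of OBITools. It assumes that the annotations appear in the FASTQ header as "key=value;" pairs and that every record spans exactly four lines; the helper names `has_both_tags` and `split_fastq` and the two demo reads are invented for illustration.

```python
import re

# Assumption: annotations are written in the header as "key=value;" pairs.
TAG_RE = re.compile(r"\b(forward_tag|reverse_tag)=([^;\s]+)")


def has_both_tags(header):
    """True only if both tags are present in the header and neither is 'None'."""
    tags = dict(TAG_RE.findall(header))
    # A tag that is missing from the header is treated like 'None'.
    return all(tags.get(key, "None") != "None" for key in ("forward_tag", "reverse_tag"))


def split_fastq(lines):
    """Split four-line FASTQ records into (both_tags, single_tag) lists."""
    both, single = [], []
    for i in range(0, len(lines) - 3, 4):
        record = lines[i:i + 4]
        (both if has_both_tags(record[0]) else single).append(record)
    return both, single


# Two made-up reads, not taken from my data.
demo = [
    "@read1 forward_tag=tacgact; reverse_tag=atcgcga;", "ACGT", "+", "IIII",
    "@read2 forward_tag=None; reverse_tag=gcgtcag;", "TTGA", "+", "IIII",
]
both, single = split_fastq(demo)
print(len(both), len(single))  # prints: 1 1
```

For a real file, the lines would be read with something like `open("lichen.assigned.fastq").read().splitlines()`, which loads everything into memory and so only suits files of moderate size.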

Are you aware of this potential problem?

(edit: replaced the at-signs in the ngsfilter file shown in this post with "at-sign", as they interfered with the layout, and added some double newlines to ease reading)

Reference: obitools/obitools#12