OBITools

Closed
Open
Opened Jun 20, 2018 by Dani Díaz de Quijano (@daniquijano)

Why isn't 'obiclean_headcount > 0' the same as 'obiclean_head == True'?

I ran obiclean on my lake water protist community data set without the -H option.

(I do the filtering manually because, after obiclean, I want to split the dataset into abundant and rare unique sequences. For the abundant community I will keep only heads, but for the rare protist community I plan to keep singletons as well, i.e. remove only internals, using obigrep -p 'obiclean_internalcount == 0'.)

As I understand it, the -H option selects only sequences with the head status in at least one sample, the same as using obigrep -p 'obiclean_head == True'. I got the same results both ways. My concern is that either I have misunderstood the obiclean_head annotation or there is a mistake. I expected, and confirmed, that obiclean_head=True whenever at least one sample in obiclean_status is labelled 'h', and accordingly obiclean_headcount > 0. However, when I checked my FASTA file, I found sequences such as:

obiclean_status={'Nov1B': 'i', 'Sep2B': 's'} obiclean_internalcount=1 obiclean_singletoncount=1 obiclean_headcount=0

and yet, surprisingly, obiclean_head=True.
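The puzzling record quoted above can be checked by recounting the labels in obiclean_status and comparing two candidate rules. This is only a sketch: the function names are hypothetical, and the "loose" rule (singleton status also counts as head) is an untested assumption that happens to match this one record, not a confirmed description of what obiclean does.

```python
import ast


def status_counts(status):
    """Count the 'h', 'i' and 's' labels in an obiclean_status dict."""
    labels = list(status.values())
    return {
        "head": labels.count("h"),
        "internal": labels.count("i"),
        "singleton": labels.count("s"),
    }


def head_by_strict_rule(status):
    """Head only if at least one sample is labelled 'h' (headcount > 0)."""
    return status_counts(status)["head"] > 0


def head_by_loose_rule(status):
    """ASSUMPTION: head if at least one sample is labelled 'h' or 's'."""
    counts = status_counts(status)
    return counts["head"] + counts["singleton"] > 0


# The annotation value copied from the record in the issue.
status = ast.literal_eval("{'Nov1B': 'i', 'Sep2B': 's'}")
counts = status_counts(status)

print(counts)                       # head=0, internal=1, singleton=1
print(head_by_strict_rule(status))  # False: disagrees with obiclean_head=True
print(head_by_loose_rule(status))   # True: agrees with obiclean_head=True
```

Running both rules over every record in the file would show whether the loose rule explains all of the discrepancies or only some of them.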

This is not an isolated case. In my abundant sequences I get 1883 unique sequences and 1226761 reads with obigrep -p 'obiclean_head == True', but 1264 unique sequences and 1210278 reads with obigrep -p 'obiclean_headcount > 0'.

In my rare sequences I get 6937 unique sequences and 24307 reads with obigrep -p 'obiclean_head == True', but only 700 unique sequences and 3749 reads with obigrep -p 'obiclean_headcount > 0'.
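To put a size on the mismatch, the figures reported above can be subtracted directly. The sketch below assumes that every sequence passing 'obiclean_headcount > 0' also passes 'obiclean_head == True', which the issue suggests but does not state outright; the dictionary layout and function name are illustrative only.

```python
# (unique sequences, reads) reported in the issue for each filter.
results = {
    "abundant": {"head_true": (1883, 1226761), "headcount_gt0": (1264, 1210278)},
    "rare": {"head_true": (6937, 24307), "headcount_gt0": (700, 3749)},
}


def discrepancy(subset):
    """Sequences and reads kept by head == True but not by headcount > 0."""
    seqs_a, reads_a = results[subset]["head_true"]
    seqs_b, reads_b = results[subset]["headcount_gt0"]
    return seqs_a - seqs_b, reads_a - reads_b


for name in results:
    extra_seqs, extra_reads = discrepancy(name)
    print(f"{name}: {extra_seqs} extra unique sequences, {extra_reads} extra reads")
```

Under that assumption, 619 abundant and 6237 rare unique sequences carry obiclean_head=True with obiclean_headcount=0.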

Why isn't 'obiclean_headcount > 0' the same as 'obiclean_head == True'? Is it safe to use 'obiclean_head == True', or the obiclean -H option, to select only heads?

Thank you for your attention!

Reference: obitools/obitools#34