Why isn't 'obiclean_headcount > 0' the same as 'obiclean_head == True'? (#34) · Issues · OBITools / OBITools

I ran obiclean on my lake-water protist community data set without the -H option.

(I filter manually because, after obiclean, I would like to split the dataset into abundant and rare unique sequences. For the abundant community I will keep only heads, but for the rare protist community I plan to keep singletons as well, i.e. remove only the internals, using obigrep -p 'obiclean_internalcount == 0'.)

As I understand it, the -H option selects only sequences that have the head status in at least one sample, the same as using obigrep -p 'obiclean_head == True'. I got the same results both ways. My concern is that either I have misunderstood the obiclean_head annotation or there is a mistake.

I expected, and confirmed, that obiclean_head=True when at least one of the samples in obiclean_status is labelled 'h', and accordingly obiclean_headcount > 0. However, when I checked my FASTA file I found some sequences annotated with, for example:

obiclean_status={'Nov1B': 'i', 'Sep2B': 's'} obiclean_internalcount=1 obiclean_singletoncount=1 obiclean_headcount=0

and yet, surprisingly, obiclean_head=True.
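To make the check concrete, here is a minimal Python sketch of how such records could be flagged. It is not part of OBITools. It assumes the header stores annotations as `key=value;` pairs whose values are Python literals, and the helper names `parse_annotations` and `is_inconsistent`, as well as the sequence id `seq1`, are made up for illustration.

```python
import ast
import re

# Assumption: annotations appear in the header as "key=value;" pairs.
ANNOTATION = re.compile(r"(\w+)=([^;]*);")


def parse_annotations(header):
    """Return the key=value; annotations of one header line as a dict."""
    annotations = {}
    for key, raw in ANNOTATION.findall(header):
        raw = raw.strip()
        try:
            # Values such as 0, True or {'Nov1B': 'i'} are Python literals.
            annotations[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            annotations[key] = raw  # keep anything else as plain text
    return annotations


def is_inconsistent(annotations):
    """True if obiclean_head is set although no sample has status 'h'."""
    status = annotations.get("obiclean_status", {})
    has_head_sample = "h" in status.values()
    return (
        annotations.get("obiclean_head") is True
        and annotations.get("obiclean_headcount", 0) == 0
        and not has_head_sample
    )


# Hypothetical header built from the annotation values quoted above.
example = (
    ">seq1 obiclean_status={'Nov1B': 'i', 'Sep2B': 's'}; "
    "obiclean_internalcount=1; obiclean_singletoncount=1; "
    "obiclean_headcount=0; obiclean_head=True; "
)

print(is_inconsistent(parse_annotations(example)))
```

Running the same two functions over every header line of the FASTA file would list all records where the two annotations disagree.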

This is not an isolated case. In my abundant sequences I get 1883 unique sequences and 1226761 reads with obigrep -p 'obiclean_head == True', but 1264 unique sequences and 1210278 reads with obigrep -p 'obiclean_headcount > 0'.

In my rare sequences I get 6937 unique sequences and 24307 reads with obigrep -p 'obiclean_head == True', but only 700 unique sequences and 3749 reads with obigrep -p 'obiclean_headcount > 0'.
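To show the size of the disagreement, the short Python sketch below subtracts the counts given above. The numbers are the ones I reported; the `counts` layout and the `discrepancy` helper are only illustrative names.

```python
# (unique sequences, reads) reported above for each obigrep predicate
counts = {
    "abundant": {
        "obiclean_head == True": (1883, 1226761),
        "obiclean_headcount > 0": (1264, 1210278),
    },
    "rare": {
        "obiclean_head == True": (6937, 24307),
        "obiclean_headcount > 0": (700, 3749),
    },
}


def discrepancy(subset):
    """Sequences and reads kept by obiclean_head but not by obiclean_headcount."""
    by_flag = counts[subset]["obiclean_head == True"]
    by_count = counts[subset]["obiclean_headcount > 0"]
    return (by_flag[0] - by_count[0], by_flag[1] - by_count[1])


for subset in counts:
    seqs, reads = discrepancy(subset)
    print(f"{subset}: {seqs} unique sequences ({reads} reads) differ")
```

That is 619 unique sequences (16483 reads) in the abundant set and 6237 unique sequences (20558 reads) in the rare set for which obiclean_head is True while obiclean_headcount is 0.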

Why isn't 'obiclean_headcount > 0' the same as 'obiclean_head == True'? Is it safe to use 'obiclean_head == True', or the obiclean -H option, to select only heads?

Thank you for your attention!