Release 4.0.3
May 2nd, 2023. Release 4.0.3
New features
- Added the function `contains` to the expression language to test whether a map contains a key. It can be used from `obigrep` to select only the sequences occurring in a given sample:

      obigrep -p 'contains(annotations.merged_sample,"15a_F730814")' wolf_new_tag.fasta
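When the same selection has to be repeated for several samples, the predicate can be built from a variable. The following is a minimal sketch, not part of OBITools: the helper names `sample_predicate` and `select_samples` and the `only_<sample>.fasta` output naming are hypothetical, and it assumes that `obigrep` writes the selected sequences to standard output.

```shell
# Build the obigrep predicate selecting sequences seen in one sample.
# (Hypothetical helper; the predicate itself is the one shown above.)
sample_predicate() {
    printf 'contains(annotations.merged_sample,"%s")' "$1"
}

# select_samples INPUT SAMPLE...
# Write one file per sample, named only_<sample>.fasta (assumed naming).
select_samples() {
    local input="$1" sample
    shift
    for sample in "$@"; do
        obigrep -p "$(sample_predicate "$sample")" "$input" > "only_${sample}.fasta"
    done
}
```

For example, `select_samples wolf_new_tag.fasta 15a_F730814` would write `only_15a_F730814.fasta`; further sample names can be appended to the same call.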
- Added a new command, `obitagpcr`. It tags raw Illumina reads with the identifier of their corresponding sample. The tags added are the same as those added by `obimultiplex`. The forward and reverse files it produces can then be split into separate files using the `obidistribute` command.

      obitagpcr -F library_R1.fastq \
                -R library_R2.fastq \
                -t sample_ngsfilter.txt \
                --out tagged_library.fastq \
                --unidentified not_assigned.fastq

  The command produces four files: `tagged_library_R1.fastq` and `tagged_library_R2.fastq`, containing the assigned reads, and `not_assigned_R1.fastq` and `not_assigned_R2.fastq`, containing the unassignable reads.

  The tagged library files can then be split using `obidistribute`:

      mkdir pcr_reads
      obidistribute --pattern "pcr_reads/sample_%s_R1.fastq" -c sample tagged_library_R1.fastq
      obidistribute --pattern "pcr_reads/sample_%s_R2.fastq" -c sample tagged_library_R2.fastq
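Because the forward and reverse files are distributed by two separate `obidistribute` calls, it can be worth checking that every sample received the same number of reads in both files. The sketch below uses only standard shell tools. The function names `count_fastq_reads` and `check_pairs` are hypothetical, and it assumes uncompressed FASTQ files with exactly four lines per record.

```shell
# Count the records of a FASTQ file (assumes 4 lines per record).
count_fastq_reads() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# check_pairs DIR
# Compare each DIR/sample_*_R1.fastq with its _R2 counterpart.
# Returns non-zero if a file is missing or the read counts differ.
check_pairs() {
    local dir="$1" status=0 r1 r2 n1 n2
    for r1 in "$dir"/sample_*_R1.fastq; do
        [ -e "$r1" ] || continue          # no match: the glob stays literal
        r2="${r1%_R1.fastq}_R2.fastq"
        if [ ! -e "$r2" ]; then
            echo "missing: $r2"
            status=1
            continue
        fi
        n1=$(count_fastq_reads "$r1")
        n2=$(count_fastq_reads "$r2")
        if [ "$n1" -ne "$n2" ]; then
            echo "mismatch: $r1 ($n1 reads) vs $r2 ($n2 reads)"
            status=1
        fi
    done
    return "$status"
}
```

After the two `obidistribute` calls, `check_pairs pcr_reads` prints nothing and returns 0 when all samples are consistent.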
- Added two options, `--add-lca-in` and `--lca-error`, to `obiannotate`. These options help when building a reference database with `obipcr`. `obiuniq` is commonly run on the `obipcr` output. To merge identical sequences annotated with different taxids, the following strategy can now be used:

      obiuniq -m taxid myrefdb.obipcr.fasta \
        | obiannotate -t taxdump --lca-error 0.05 --add-lca-in taxid \
        > myrefdb.obipcr.unique.fasta

  The `obiuniq` call merges identical sequences, keeping track of the diversity of their taxonomic annotations in the `merged_taxid` slot, while `obiannotate` loads an NCBI taxdump and computes the lowest common ancestor (LCA) of the taxids represented in `merged_taxid`. Specifying `--lca-error 0.05` allows at most 5% of the taxids to disagree with the computed LCA. The computed LCA is stored in the slot given as the parameter of `--add-lca-in`. The scientific name and the actual error rate corresponding to the estimated LCA are also stored in the sequence annotation.
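The effect of an error tolerance on an LCA can be illustrated on a toy taxonomy. The sketch below is not the `obiannotate` implementation. The function name `lca_with_error`, the two-column `taxid parent` file format, and the rule used are assumptions chosen for illustration. The rule picks the deepest node whose subtree leaves out at most the tolerated fraction of the observations, counting each input line as one observation. It is only meaningful for tolerances below 0.5, and how `obiannotate` itself weights the taxids is not stated in these notes.

```shell
# lca_with_error TAXONOMY OBSERVED MAX_ERROR
#   TAXONOMY: lines of "taxid parent" (the root is its own parent)
#   OBSERVED: one observed taxid per line (repeats allowed)
#   Prints "<lca_taxid> <actual_error_rate>".
lca_with_error() {
    awk -v max_err="$3" '
        # Number of edges between a node and the root.
        function depth(node,    d) {
            d = 0
            while (parent[node] != node) { node = parent[node]; d++ }
            return d
        }
        NR == FNR { parent[$1] = $2; next }   # first file: the taxonomy
        NF > 0    { observed[++n] = $1 }      # second file: the observations
        END {
            if (n == 0) { exit 1 }
            # Count every observation under each of its ancestors.
            for (i = 1; i <= n; i++) {
                node = observed[i]
                while (1) {
                    cover[node]++
                    if (parent[node] == node) break
                    node = parent[node]
                }
            }
            # Keep the deepest node leaving out at most max_err * n observations.
            best = ""; best_depth = -1
            for (node in cover) {
                if (n - cover[node] <= max_err * n) {
                    d = depth(node)
                    if (d > best_depth) { best = node; best_depth = d }
                }
            }
            printf "%s %.3f\n", best, 1 - cover[best] / n
        }
    ' "$1" "$2"
}
```

Take a toy taxonomy where taxids 3 and 4 share the parent 2, taxid 6 has the parent 5, and both 2 and 5 hang from the root 1. With ten observations (five of taxid 3, four of taxid 4, one of taxid 6), a tolerance of 0.2 yields `2 0.100`, because the single observation of taxid 6 is tolerated. A tolerance of 0.05 yields `1 0.000`, because the LCA has to climb to the root to cover every observation.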
Enhancement
- Renamed the `forward_mismatches` and `reverse_mismatches` tags set by `obimultiplex` to `forward_error` and `reverse_error`, for consistency with the tags set by `obipcr`.
Corrected bugs
- Correction of a bug in memory management and Slice recycling.
- Correction of the --fragmented option help and logging information.
- Correction of a bug in `obiconsensus` that led to the deletion of a base close to the beginning of the consensus sequence.