Commit 713d621d by Eric Coissac

--no commit message

parent ad0b4e17
......@@ -17,13 +17,13 @@ How to analyze DNA metabarcoding data produced on Illumina sequencers using:
| use : |
| |
| - :doc:`obicount <scripts/obicount>` to count the number |
| of sequence records in a file |
| of sequence records in a file |
| - :doc:`obihead <scripts/obihead>` and |
| :doc:`obitail <scripts/obitail>` to view the first |
| or last sequence records of a file |
| - :doc:`obistatobistat` to get some basic statistics (count,|
| mean, standard deviation) on the attributes |
| (key=value couples) in the fasta header of each |
| :doc:`obitail <scripts/obitail>` to view the first |
| or last sequence records of a file |
| - :doc:`obistat <scripts/obistat>` to get some basic |
| statistics (count, mean, standard deviation) on the |
| attributes (key=value couples) in the fasta header of each|
| sequence record (see The `extended OBITools fasta format` |
| in the :doc:`fasta format <fasta>` description) |
| - any *Unix* command such as ``less``, ``awk``, ``sort``, |
......@@ -47,12 +47,12 @@ The data needed to run the tutorial are the following:
- the file describing the primers and tags used for all samples sequenced:
* ``wolf_ngsfilter.txt``
The tags correspond to short and specific sequences added on the 5' end of each primer to distinguish the different samples
The tags correspond to short and specific sequences added on the 5' end of each primer to distinguish the different samples
- the file containing the reference database in fasta format:
* ``db_v05_r117.fasta``
This reference database has been extracted from the release 117 of EMBL using :doc:`ecoPCR <scripts/ecoPCR>`
This reference database has been extracted from the release 117 of EMBL using :doc:`ecoPCR <scripts/ecoPCR>`
- the NCBI taxonomy formatted in the :doc:`ecoPCR <scripts/ecoPCR>` format (see the :doc:`obiconvert <scripts/obiconvert>` utility for details) :
......@@ -191,7 +191,7 @@ it is convenient to work with uniq *sequences* instead of *reads*. To *dereplica
| 2. group strictly identical reads together |
| 3. output the sequence for each group and its count in the |
| original dataset (in this way, all duplicated reads are |
| removed) |
| removed) |
| |
| Definition adapted from [#]_ |
+-------------------------------------------------------------+
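The grouping and counting steps listed in the box above can be illustrated with a minimal Python sketch. This is not how :doc:`obiuniq <scripts/obiuniq>` is implemented; the function name ``dereplicate`` and the toy reads are hypothetical and serve only to show the idea on plain strings.

```python
from collections import Counter


def dereplicate(reads):
    """Group strictly identical reads and return (sequence, count) pairs.

    Hypothetical helper for illustration only; sequences are kept in
    first-seen order because Counter preserves insertion order.
    """
    counts = Counter(reads)
    return list(counts.items())


# Toy dataset (made up): five reads, three distinct sequences
reads = ["acgt", "ttga", "acgt", "acgt", "ccat"]
unique = dereplicate(reads)

for sequence, count in unique:
    print(f"{sequence} count={count}")
```

Each distinct sequence is output once, together with the number of reads it represents in the original dataset.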
......@@ -225,6 +225,7 @@ The first sequence record of ``wolf.ali.ngs.uniq.fasta`` is:
Running :doc:`obiuniq <scripts/obiuniq>` has added two ``key=value`` entries to the header of the fasta sequence:
- :py:mod:`merged_sample={'29a_F260619': 1}` : this sequence has been found once in a single sample
- :py:mod:`count=1` : the total number of counts for this sequence is 1
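As a rough sketch of how such ``key=value`` entries could be read back in Python, the snippet below splits a header into an identifier and a dictionary of attributes. It assumes that attributes follow the identifier and are separated by semicolons, and that values contain no semicolon themselves; the identifier ``seq1`` and the function ``parse_attributes`` are hypothetical, and this is not the parser used by the OBITools.

```python
import ast


def parse_attributes(header):
    """Split 'id key=value; key=value;' into an identifier and a dict.

    Hypothetical helper: assumes ';' separates attributes and never
    appears inside a value.
    """
    identifier, _, rest = header.partition(" ")
    attributes = {}
    for field in rest.split(";"):
        key, sep, value = field.strip().partition("=")
        if not sep:
            continue  # skip empty fields or free text without '='
        try:
            # Interpret numbers, dicts, etc. written as Python literals
            attributes[key] = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            attributes[key] = value  # keep the raw text otherwise
    return identifier, attributes


# Header line without the leading '>' (identifier is made up)
header = "seq1 merged_sample={'29a_F260619': 1}; count=1;"
identifier, attributes = parse_attributes(header)
print(identifier, attributes)
```
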
To keep only these two ``key=value`` attributes, we can use the :doc:`obiannotate <scripts/obiannotate>` command:
......