Commit f4a123cd authored by Celine Mercier

updated the documentation with the special values, and the idea of column directories and column group directories.
parent 32fd7b5a
......@@ -2,19 +2,19 @@
The OBItools3 Data Management System (OBIDMS)
*********************************************
A complete DNA Metabarcoding experiment rely on several kinds of data.
A complete DNA metabarcoding experiment relies on several kinds of data.
- The sequence data resulting of the PCR products sequencing,
- The sequence data resulting from the sequencing of the PCR products,
- The description of the samples including all their metadata,
- One or several refence database used for the taxonomical annotation
- One or several taxonomies.
- One or several reference databases used for the taxonomic annotation,
- One or several taxonomy databases.
Up to now each of these categories of data were stored in separate
files an nothing obliged to keep them together.
Up to now, each of these categories of data was stored in separate
files, and nothing made it mandatory to keep them together.
The `Data Management System` (DMS) of OBITools3 can be considered
as a basic database system.
The `Data Management System` (DMS) of OBITools3 can be regarded as a basic
database system.
OBIDMS UML
......@@ -25,11 +25,30 @@ OBIDMS UML
:download:`html version of the OBIDMS UML file <UML/ObiDMS_UML.class.violet.html>`
An OBIDMS directory consists of :
* OBIDMS column files
* OBIDMS release files
* OBIDMS dictionary files
* one OBIDMS history file
An OBIDMS directory contains :
* one `OBIDMS history file <#obidms-history-files>`_
* Two different kinds of directories :
* OBIDMS column directories
* OBIDMS column group directories containing OBIDMS column directories
OBIDMS column directories
=========================
OBIDMS column directories contain :
* all the different versions of one OBIDMS column, in the form of separate files (`OBIDMS column files <#obidms-column-files>`_)
* one `OBIDMS release file <#obidms-release-files>`_
The directory name is the column attribute, or sub-attribute if the column directory is in a column group directory.
OBIDMS column group directories
===============================
OBIDMS column group directories contain OBIDMS column directories. They are used to store dictionary-like data, where
each key corresponds to an OBIDMS column.
The directory name is the dictionary attribute. Each key is considered a sub-attribute and is associated with its column.
OBIDMS column files
......@@ -38,7 +57,7 @@ OBIDMS column files
Each OBIDMS column file contains :
* a header of a size equal to a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
on most systems) containing metadata
* one column of data with the same OBIType
* one column of data with the same `OBIType <types.html#obitypes>`_
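The header size rule above (a multiple of PAGESIZE) can be illustrated with a minimal C sketch. The function names ``obi_header_size`` and ``obi_page_size`` are hypothetical and do not come from the OBITools3 sources; the sketch only assumes a POSIX system where ``sysconf(_SC_PAGESIZE)`` is available.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not from the OBITools3 sources): round a raw
 * header size up to the next multiple of the page size. */
static size_t obi_header_size(size_t raw_size, size_t page_size)
{
    assert(page_size > 0);
    size_t pages = (raw_size + page_size - 1) / page_size;
    if (pages == 0)
        pages = 1;              /* assumption: a header uses at least one page */
    return pages * page_size;
}

/* Ask the system for its page size rather than hard-coding 4096;
 * fall back to 4096 if the query fails. */
static size_t obi_page_size(void)
{
    long p = sysconf(_SC_PAGESIZE);
    return p > 0 ? (size_t)p : (size_t)4096;
}
```

With a page size of 4096 bytes, a 100-byte header would occupy 4096 bytes and a 5000-byte header would occupy 8192 bytes.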
Header
......@@ -48,27 +67,26 @@ The header of an OBIDMS column contains :
* Endian byte order
* Header size (PAGESIZE multiple)
*
* File status : Open/Closed
* Owner : PID of the process that created the file and is the only one allowed to modify it if it is open
* Number of lines (total or without the header?)
* OBIType
* Date of creation
* Version of the file
* Number of lines of data
* Number of lines of data used
* `OBIType <types.html#obitypes>`_ (type of the data)
* Date of creation of the file
* Version of the OBIDMS column
* The column name
* Optional comments
Data
----
A column of data with the same OBIType.
A column of data with the same `OBIType <types.html#obitypes>`_.
Mandatory columns
-----------------
Some columns must exist in an OBIDMS directory :
* sequence identifiers column (type *OBIStr_t*)
* sequence identifiers column (type ``OBIStr_t``)
File name
......@@ -83,8 +101,7 @@ Example : ``count@3.odc``
Modifications
-------------
An OBIDMS column file can only be modified by the process that created it, if its status is set to Open. Those informations are
contained in the `header <#header>`_.
An OBIDMS column file can only be modified by the process that created it, and while its status is set to Open.
When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
......@@ -94,6 +111,8 @@ again.
That means that one column is stored in one file (if there is only one version)
or more (if there are several versions), and that there is one file per version.
All the versions of one column are stored in one directory.
Versioning
----------
......@@ -101,13 +120,13 @@ Versioning
The first version of a column file is numbered 0, and each new version increments that
number by 1.
The number of the latest version of an OBIDMS column is stored in an `OBIDMS release file <formats.html#obidms-release-files>`_.
The number of the latest version of an OBIDMS column is stored in the `OBIDMS release file <formats.html#obidms-release-files>`_ of its directory.
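As an illustration of the versioning scheme, the following C sketch builds a column file name and computes the version number a clone would receive. It assumes the naming pattern ``<column name>@<version>.odc``, inferred only from the example ``count@3.odc``; the helper names ``obi_column_file_name`` and ``obi_next_version`` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "<column>@<version>.odc" into buf.
 * The pattern is an assumption based on the example "count@3.odc".
 * Returns 0 on success, -1 if buf is too small. */
static int obi_column_file_name(char *buf, size_t size,
                                const char *column, int version)
{
    assert(buf != NULL && column != NULL);
    assert(strlen(column) > 0 && version >= 0);
    int n = snprintf(buf, size, "%s@%d.odc", column, version);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Versions start at 0 and each new version increments the number by 1,
 * so a clone of the latest version gets latest + 1. */
static int obi_next_version(int latest_version)
{
    assert(latest_version >= 0);
    return latest_version + 1;
}
```

For the column ``count`` whose release file records version 3, a clone would be written under the name produced for version ``obi_next_version(3)``, and the release file would then be updated to that number.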
OBIDMS release files
====================
Each OBIDMS column is associated with an OBIDMS release file that contains the number of the latest
Each OBIDMS column is associated with an OBIDMS release file in its directory, that contains the number of the latest
version of the column.
File name
......@@ -139,20 +158,3 @@ operations ever done in the OBIDMS directory and the views in between them :
.. image:: ./images/history.png
:width: 150 px
:align: center
OBIType header file
========================
.. doxygenfile:: obitypes.h
OBIIntColumn header file
========================
.. doxygenfile:: obiintcolumn.h
OBIColumn header file
=====================
.. doxygenfile:: obicolumn.h
doc/source/UML/OBIDMS_UML.png (binary image updated: 62.8 KB before, 65.9 KB after)
==============
Special values
==============
NA values
=========
All OBITypes have an associated NA (Not Available) value.
NA values are implemented by specifying an explicit NA value for each type, corresponding to the R standards:
* For the types ``OBIInt_t``, ``OBIBool_t``, ``OBIIdx_t`` and ``OBITaxid_t``, the NA value is ``INT_MIN``.
* For the type ``OBIChar_t``: the NA value is ``\0`` (?).
* For the type ``OBIStr_t`` : the NA value is a tab followed by a space.
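* For the type ``OBIFloat_t``::

    typedef union
    {
        double value;
        unsigned int word[2];
    } ieee_double;

    /* Indices of the high and low 32-bit words of a double. They are not
       defined in the original snippet; they are assumed here to depend on
       the byte order, as in the R sources. */
    #ifdef WORDS_BIGENDIAN
    static const int hw = 0;
    static const int lw = 1;
    #else
    static const int hw = 1;
    static const int lw = 0;
    #endif

    static double NA_value(void)
    {
        volatile ieee_double x;
        x.word[hw] = 0x7ff00000;  /* exponent bits all set: NaN range */
        x.word[lw] = 1954;        /* payload that marks this NaN as NA */
        return x.value;
    }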
Minimum and maximum values for ``OBIInt_t``
===========================================
* Maximum value : ``INT_MAX``
* Minimum value : ``INT_MIN(-1?)``
Infinity values for the type ``OBIFloat_t``
===========================================
* Positive infinity : ``INFINITY`` (should be defined in ``<math.h>``)
* Negative infinity : ``-INFINITY``
NaN value for the type ``OBIFloat_t``
=====================================
* NaN (Not a Number) value : ``NAN`` (should be defined in ``<math.h>`` but probably needs to be tested)
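Since the ``OBIFloat_t`` NA value is itself a NaN bit pattern, code reading a column has to tell NA apart from an ordinary NaN. The C sketch below shows one possible way to do so, by checking the low 32-bit word for the payload 1954 as R does. The names ``OBI_INT_NA``, ``obi_float_na``, ``obi_float_is_na`` and ``obi_int_is_na`` are hypothetical, and the sketch assumes IEEE 754 64-bit doubles whose byte order matches that of 64-bit integers.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

#define OBI_INT_NA INT_MIN   /* hypothetical macro name for the integer NA */

/* Build the NA double (high word 0x7ff00000, low word 1954) from a
 * 64-bit integer, so no hw/lw word indices are needed. */
static double obi_float_na(void)
{
    uint64_t bits = ((uint64_t)0x7ff00000u << 32) | (uint64_t)1954u;
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* NA is a NaN whose low 32-bit word holds 1954; any other NaN is a
 * regular "Not a Number" value. */
static int obi_float_is_na(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return isnan(x) && (uint32_t)(bits & 0xffffffffu) == 1954u;
}

/* Integer-like types use INT_MIN as their NA value. */
static int obi_int_is_na(int x)
{
    return x == OBI_INT_NA;
}
```

With these helpers, ``obi_float_is_na(obi_float_na())`` is true, while ``NAN``, ``INFINITY`` and ordinary numbers are not reported as NA.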
......@@ -6,20 +6,12 @@ OBITypes
.. image:: ./UML/OBITypes_UML.png
:download:`html version of the OBITypes UML file <UML/OBITypes_UML.class.violet.html>`
.. note::
All OBITypes have an associated NA (Not Available) value.
We have currently two ideas for implementing NA values:
- By specifying an explicit NA value for each type
- By adding to each column of an OBIDMS a bit vector
indicating if the value is defined or not.
.. toctree::
:maxdepth: 2
The elementary types <elementary>
The containers <containers>
Special values <specialvalues>