Commit a339905e authored by Eric Coissac

restructuration of the documentation

parents b0c9b295 1d39dcc5
......@@ -768,7 +768,7 @@ WARN_LOGFILE =
# spaces.
# Note: If this tag is empty the current directory is searched.
INPUT = /Users/celinemercier/Documents/workspace/OBITools3/src
INPUT = ../src
# This tag can be used to specify the character encoding of the source files
# that doxygen parses. Internally doxygen uses the UTF-8 encoding. Doxygen uses
......@@ -1085,7 +1085,7 @@ IGNORE_PREFIX =
# If the GENERATE_HTML tag is set to YES, doxygen will generate HTML output
# The default value is: YES.
GENERATE_HTML = NO
GENERATE_HTML = YES
# The HTML_OUTPUT tag is used to specify where the HTML docs will be put. If a
# relative path is entered the value of OUTPUT_DIRECTORY will be put in front of
......
*********************************************
The OBItools3 Data Management System (OBIDMS)
*********************************************
A complete DNA metabarcoding experiment relies on several kinds of data:
- The sequence data resulting from the sequencing of the PCR products,
- The description of the samples, including all their metadata,
- One or several reference databases used for the taxonomic annotation,
- One or several taxonomies.
Up to now, each of these categories of data was stored in separate
files, and nothing required them to be kept together.
The `Data Management System` (DMS) of OBITools3 can be considered
as a basic database system.
OBIDMS UML
==========
.. image:: ./images/OBIDMS_UML.png
:download:`html version of the OBIDMS UML file <UML/ObiDMS_UML.class.violet.html>`
An OBIDMS directory consists of :
* OBIDMS column files
* OBIDMS release files
* OBIDMS dictionary files
* one OBIDMS history file
OBIDMS column files
===================
Each OBIDMS column file contains :
* a header of a size equal to a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
on most systems) containing metadata
* one column of data with the same OBIType
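The header size rule above can be sketched as follows. This is only an illustration: the function name ``obi_header_size`` is hypothetical, and the page size is passed as a parameter (on POSIX systems it can be queried with ``sysconf(_SC_PAGESIZE)``).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the smallest multiple of page_size that can
 * hold metadata_size bytes. A header always occupies at least one page. */
size_t obi_header_size(size_t metadata_size, size_t page_size)
{
    assert(page_size > 0);
    /* Round up to a whole number of pages. */
    size_t pages = (metadata_size + page_size - 1) / page_size;
    if (pages == 0)
        pages = 1;
    return pages * page_size;
}
```

Keeping the header on a page boundary means the data that follows starts at an offset suitable for memory mapping.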
Header
------
The header of an OBIDMS column contains :
* Endian byte order
* Header size (PAGESIZE multiple)
*
* File status : Open/Closed
* Owner : PID of the process that created the file and is the only one allowed to modify it if it is open
* Number of lines (total or without the header?)
* OBIType
* Date of creation
* Version of the file
* Optional comments
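One reason to record the byte order in the header is that a reader on a machine with the opposite byte order can convert the data. A minimal sketch, assuming 32-bit values; all function names here are hypothetical and not part of the OBITools3 sources:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when the running machine is little endian. */
bool obi_host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Reverses the byte order of one 32-bit value. */
uint32_t obi_swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Converts n values in place to host byte order, given the byte order
 * recorded in the column header. Does nothing when both orders match. */
void obi_column_to_host(uint32_t *values, size_t n, bool file_is_little_endian)
{
    assert(values != NULL || n == 0);
    if (file_is_little_endian == obi_host_is_little_endian())
        return;
    for (size_t i = 0; i < n; i++)
        values[i] = obi_swap32(values[i]);
}
```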
Data
----
A column of data with the same OBIType.
Mandatory columns
-----------------
Some columns must exist in an OBIDMS directory :
* sequence identifiers column (type *OBIStr_t*)
File name
---------
Each file is named with the attribute associated to the data it contains, and the number of
its version, separated by an ``@``, and with the extension ``.odc``.
Example : ``count@3.odc``
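The naming rule can be sketched as below. The function name ``obi_column_file_name`` is hypothetical; only the ``attribute@version.odc`` pattern comes from the text above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "<attribute>@<version>.odc" into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
int obi_column_file_name(char *buf, size_t size,
                         const char *attribute, unsigned version)
{
    assert(buf != NULL && attribute != NULL);
    int n = snprintf(buf, size, "%s@%u.odc", attribute, version);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Called with the attribute ``count`` and the version ``3``, this writes ``count@3.odc`` into the buffer.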
Modifications
-------------
An OBIDMS column file can only be modified by the process that created it, and only while its status is set to Open. This information is
stored in the `header <#header>`_.
When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
has finished writing the new version of the column file, it sets the column file's status to Closed, and the file can never be modified
again.
That means that one column is stored in one file (if there is only one version)
or more (if there are several versions), and that there is one file per version.
Versioning
----------
The first version of a column file is numbered 1, and each new version increments that
number by 1.
The number of the latest version of an OBIDMS column is stored in an `OBIDMS release file <formats.html#obidms-release-files>`_.
OBIDMS release files
====================
Each OBIDMS column is associated with an OBIDMS release file that contains the number of the latest
version of the column.
File name
---------
OBIDMS release files are named with the attribute associated to the data contained in the column, and
have the extension ``.odr``.
Example : ``count.odr``
OBIDMS views
============
An OBIDMS view consists of a list of OBIDMS columns and lines. A view includes one version
of each mandatory column. Only one version of each column is included. All the columns of
one view contain the same number of lines in the same order.
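Two of the view constraints (a single version per column, and the same number of lines in every column) can be checked as in the sketch below. The structure and function names are hypothetical, and the presence of the mandatory columns is not checked here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one column referenced by a view. */
typedef struct {
    const char *attribute;  /* e.g. "count" */
    int         version;    /* version of the column included in the view */
    size_t      line_count; /* number of lines in that column */
} obi_view_column_t;

/* Returns true when no attribute appears twice and all columns have the
 * same number of lines. */
bool obi_view_is_consistent(const obi_view_column_t *columns, size_t count)
{
    assert(columns != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (columns[i].line_count != columns[0].line_count)
            return false;
        for (size_t j = i + 1; j < count; j++)
            if (strcmp(columns[i].attribute, columns[j].attribute) == 0)
                return false;
    }
    return true;
}
```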
OBIDMS history file
===================
An OBIDMS history file consists of an ordered list of views and commands, those commands leading
from one view to the next one.
This history can be represented in the form of a --- showing all the
operations ever done in the OBIDMS directory and the views in between them :
.. image:: ./images/history.png
:width: 150 px
:align: center
OBIIntColumn header file
========================
.. doxygenfile:: obiintcolumn.h
OBIColumn header file
=====================
.. doxygenfile:: obicolumn.h
===============
Container types
===============
Containers manage collections of values of a homogeneous type.
Three container types exist.
A container becomes immutable once it has been locked.
Consequently, only insert procedures are needed.
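The locking behaviour could look like the following sketch for a list of integers. Everything here is an assumption made for illustration (the structure, the names and the growth strategy); the actual container implementation is not described in this documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical list container: grows by insertion until it is locked. */
typedef struct {
    int32_t *values;
    size_t   length;
    size_t   capacity;
    bool     locked;
} obi_int_list_t;

/* Appends a value. Returns false if the list is locked or if memory
 * cannot be allocated; the list is left unchanged in both cases. */
bool obi_int_list_insert(obi_int_list_t *list, int32_t value)
{
    assert(list != NULL);
    if (list->locked)
        return false;
    if (list->length == list->capacity) {
        size_t new_capacity = list->capacity ? list->capacity * 2 : 8;
        int32_t *grown = realloc(list->values, new_capacity * sizeof *grown);
        if (grown == NULL)
            return false;
        list->values = grown;
        list->capacity = new_capacity;
    }
    list->values[list->length++] = value;
    return true;
}

/* Locks the list: every later insertion is refused. */
void obi_int_list_lock(obi_int_list_t *list)
{
    assert(list != NULL);
    list->locked = true;
}
```

Because no update or delete procedure exists, a locked container can be shared between readers without further coordination.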
Lists
-----
A list is an ordered collection of values belonging to an elementary type.
At its creation
Sets
----
A set is an unordered collection of values belonging to an elementary type.
Dictionaries
------------
Dictionaries associate a `key` with a `value`. A value can be retrieved through its associated key.
Values must belong to an elementary type and keys must be of type *OBIStr_t*.
#################
Data in OBITools3
#################
OBITools3 introduces a new way to manage DNA metabarcoding data.
It relies on a `Data Management System` (DMS) that can be considered as
a simplified database system.
.. toctree::
:maxdepth: 2
The data management system <DMS>
The data types <types>
================
Elementary types
================
They correspond to simple values.
Atomic types
------------
========= ========= ============ ==============================
Type      C type    OBIType      Definition
========= ========= ============ ==============================
integer   int32_t   OBIInt_t     a signed integer value
float     double    OBIFloat_t   a floating-point value
boolean   ?         OBIBool_t    a boolean true/false value
char      char      OBIChar_t    a character
index     size_t    OBIIdx_t     an index in a data structure
========= ========= ============ ==============================
The composite types
-------------------
Character string type
.....................
================ ====== ======== ==================
Type             C type OBIType  Definition
================ ====== ======== ==================
Character string ?      OBIStr_t a character string
================ ====== ======== ==================
The taxid type
..............
==================== ====== ========== ======================
Type                 C type OBIType    Definition
==================== ====== ========== ======================
Taxonomic identifier size_t OBITaxid_t a taxonomic identifier
==================== ====== ========== ======================
#######
Formats
#######
*********************************************
The OBItools3 Data Management System (OBIDMS)
*********************************************
An OBIDMS directory consists of :
* OBIDMS column files
* OBIDMS view descriptions
* an OBIDMS history file
OBIDMS column files
===================
Each OBIDMS column file contains :
* a header of a size equal to a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
on most systems) containing metadata
* one column of data of the same type
OBIDMS column files are read-only.
File name
---------
Each file is named with the attribute associated to the data it contains, and the number of
its version, separated by an underscore.
Example : ``count_0003``
.. todo::
Filename extension?
Header
------
The header of an OBIDMS column contains :
* Endian byte order
* PAGESIZE value / Size of the header
* Number of lines (total or without the header?)
* Data type (int, str...)
* Date of creation
* Version of the file
* Eventual comments
Data
----
A column of data of the same type.
Versioning
----------
Since OBIDMS column files are read-only, any modification leads to the creation of a new version
of the column file.
The first version of a column file is numbered 0001, and each new version increments that
number by 1.
Mandatory columns
-----------------
Some columns must exist in an OBIDMS directory :
* sequence identifiers column
OBIDMS history file
===================
An OBIDMS history file consists of data that can be represented in the form of a directed acyclic
graph presenting the history of all the operations ever done in the OBIDMS directory.
OBIDMS views
============
An OBIDMS view corresponds to a list of OBIDMS columns and lines. A view includes one version
of each mandatory column. Only one version of each column is included. All the columns of
one view contain the same number of lines in the same order.
OBIDMS ULM
==========
.. image:: ./images/OBIDMS_ULM.png
:download:`html version of the OBIDMS ULM file </ObiDMS_ULM.class.violet.html>`
OBIIntColumn header file
========================
.. doxygenfile:: obiintcolumn.h
......@@ -100,12 +100,12 @@ Naming conventions
******************
.. todo::
Look for usual naming conventions
Look for common naming conventions
*****************
Programming rules
*****************
* The *int* type should never be used
*
......@@ -8,9 +8,10 @@ OBITools3 documentation
.. toctree::
:maxdepth: 2
Programming guidelines <guidelines>
Formats <formats>
Data structures <data>
Ideas to explore <pistes>
Indices and tables
......@@ -19,4 +20,3 @@ Indices and tables
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
################
Ideas to explore
################
*****************************
What we want to be able to do
*****************************
* Handle missing values
* Modify a column while it is being written (mmap)
* Append values to the end of a column while it is being written (mmap)
*************
Miscellaneous
*************
* If the order of a column is changed, the column is rewritten (no index).
* Some way to restrict read access to one program at a time...
\ No newline at end of file
********
OBITypes
********
.. image:: ./images/OBITypes_UML.png
:download:`html version of the OBITypes UML file <UML/OBITypes_UML.class.violet.html>`
.. note::
All OBITypes have an associated NA (Not Available) value.
We have currently two ideas for implementing NA values:
- By specifying an explicit NA value for each type
- By adding to each column of an OBIDMS a bit vector
indicating if the value is defined or not.
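The two ideas listed in the note can be compared with a short sketch. Both parts are assumptions made for illustration: ``INT32_MIN`` is only one possible choice of explicit NA value for integers, and the bit vector layout (one bit per line, set when the value is defined) is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Idea 1: an explicit NA value per type. INT32_MIN is an assumed choice. */
#define OBI_INT_NA INT32_MIN

bool obi_int_is_na(int32_t value)
{
    return value == OBI_INT_NA;
}

/* Idea 2: a bit vector attached to the column, one bit per line.
 * A set bit means the value on that line is defined. */
void obi_mask_set_defined(uint8_t *mask, size_t line)
{
    assert(mask != NULL);
    mask[line / CHAR_BIT] |= (uint8_t)(1u << (line % CHAR_BIT));
}

bool obi_mask_is_defined(const uint8_t *mask, size_t line)
{
    assert(mask != NULL);
    return (mask[line / CHAR_BIT] >> (line % CHAR_BIT)) & 1u;
}
```

The first approach costs no extra storage but removes one value from the range of each type; the second keeps the full range at the cost of one extra bit per line.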
.. toctree::
:maxdepth: 2
The elementary types <elementary>
The containers <containers>
/****************************************************************************
* OBIColumn header file *
****************************************************************************/
/**
* @file obicolumn.h
* @author Celine Mercier
* @date 12 May 2015
* @brief Header file for the shared elements of all the OBIColumn structures.
*/
#ifndef OBICOLUMN_H_
#define OBICOLUMN_H_
#include <stdbool.h>
#include <stdio.h>
/**
* @brief enum OBIDataType for the data type of the OBIColumns.
*/
typedef enum OBIDataType {
OBI_VOID = 0, /**< data type not specified */
OBI_INT32, /**< type int32_t */
OBI_INT64, /**< type int64_t */
OBI_UINT32, /**< type uint32_t */
OBI_UINT64, /**< type uint64_t */
OBI_STRING /**< type char* */
} OBIDataType_t, *OBIDataType_p;
/**
* @brief OBIColumnHeader structure.
*/
typedef struct OBIColumnHeader {
bool little_endian_order; /**< endian byte order :
- True : little endian
- False: big endian
*/
int header_size_value; /**< size of the header: a multiple of PAGESIZE */
int line_count; /**< number of lines of data */
OBIDataType_t data_type; /**< type of the data */
char* creation_date; /**< date of creation of the file */
int version; /**< version of the OBIColumn */
char* comments; /**< comments */
} OBIColumnHeader_t, *OBIColumnHeader_p;
#endif /* OBICOLUMN_H_ */
/****************************************************************************
* OBIIntColumn prototype file *
* OBIIntColumn header file *
****************************************************************************/
/**
......@@ -13,35 +13,69 @@
#define OBIINTCOLUMN_H_
#include <stdio.h>
#include <stdlib.h>
#include "obicolumn.h"
/**
* @brief enum for the OBIDataType.
* @brief OBIIntColumn structure.
*/
typedef enum OBIDataType {
OBI_VOID = 0, /**< data type not specified */
OBI_INT32, /**< type int32_t */
OBI_INT64, /**< type int64_t */
OBI_UINT32, /**< type uint32_t */
OBI_UNIT64, /**< type uint64_t */
OBI_STRING /**< type str */
} OBIDataType_t, *OBIDataType_p;
typedef struct OBIIntcolumn {
} OBIIntColumn_t, *OBIIntColumn_p;
/**
* @brief Bried description of the function.
* @brief Reads a memory block from an OBIIntColumn file.
*
* Longer description of the function. This part may refer to the parameters
* of the function, like @p parameter.
* @param parameter Description of the first parameter of the function.
* @return Describe what the function returns.
* @see http://website/
* @note Something to note.
* @warning Warning.
* @param OBIIntColumn The OBIColumn that should be read.
* @return A pointer to the first element of the block.
*/
int* readMemoryBlock(int* OBIIntcolumn, int startingPosition, int memoryBlockSize);
/**
* @brief Writes a memory block line by line at the end of a file.
*/
void writeMemoryBlock(int* OBIIntcolumn, int memoryBlockSize, char* fileName);
/**
* @brief Returns one line of an OBIColumn.
*/
int readLine(int line);
/**
* @brief Creates a new file for a new version of the OBIColumn.
*/
void newColumn(OBIIntColumn_t* column);
/**
* @brief malloc for OBIIntColumn.
*/
int* OBIIntMalloc();
/**
* @brief realloc for OBIIntColumn.
*/
int* OBIIntRealloc();
/**
* @brief free for OBIIntColumn.
*/
void test(int parameter);
void OBIIntFree();
#endif /* OBIINTCOLUMN_H_ */