Opened Apr 23, 2020 by Celine Mercier (@mercier)

Dictionary efficiency issue

Handling huge dictionaries is inefficient. These typically hold merged information, such as merged taxids in reference databases with hundreds of thousands of taxids, or merged samples in datasets with thousands of samples. Storing them creates big files that are not mapping-friendly and that occupy a lot of disk space.

A solution is already half implemented: dictionaries stored as character strings. However, the C API to parse them is not implemented, so this storage is rarely or never used. Finishing it would be the fastest fix, but a better solution could eventually be developed (e.g. using hash tables implemented in a way that makes the most of the mapping behaviour).
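
To make the string-based option more concrete, here is a minimal sketch of what a C parsing routine could look like. The serialized layout (`key:value;key:value;...` in one NUL-terminated string) and the function names `strdict_find` and `strdict_count` are assumptions made for illustration; they are not the actual OBITools3 format or API. The lookup reads the string in place and returns a pointer into it, so it would work directly on a memory-mapped buffer without copying or allocating.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical serialized form (an assumption, not the real OBITools3
 * layout): "key:value;key:value;..." in one NUL-terminated string. */

/* Look up `key` in `dict` without copying or allocating.
 * On success, returns a pointer to the first character of the value
 * (inside `dict`) and stores the value length in *value_len.
 * Returns NULL if the key is absent. */
const char *strdict_find(const char *dict, const char *key, size_t *value_len)
{
    size_t key_len = strlen(key);
    const char *p = dict;

    while (*p != '\0') {
        /* The entry spans [p, end): up to the next ';' or the terminator. */
        const char *end = strchr(p, ';');
        if (end == NULL)
            end = p + strlen(p);

        /* The key spans [p, sep), where sep is the first ':' in the entry. */
        const char *sep = memchr(p, ':', (size_t)(end - p));
        if (sep != NULL && (size_t)(sep - p) == key_len
                && memcmp(p, key, key_len) == 0) {
            *value_len = (size_t)(end - sep - 1);
            return sep + 1;
        }

        /* Move past the separator, or stop at the terminator. */
        p = (*end == ';') ? end + 1 : end;
    }
    return NULL;
}

/* Count the non-empty entries in `dict`. */
size_t strdict_count(const char *dict)
{
    size_t n = 0;
    const char *p = dict;

    while (*p != '\0') {
        const char *end = strchr(p, ';');
        if (end == NULL)
            end = p + strlen(p);
        if (end > p)
            n++;
        p = (*end == ';') ? end + 1 : end;
    }
    return n;
}
```

With a dictionary such as `"9606:12;10090:3;7227:1"`, calling `strdict_find(dict, "10090", &len)` would return a pointer to the `3` with `len` set to 1. The lookup is a linear scan, so for dictionaries with hundreds of thousands of keys it would likely stay slow, which is where the hash table idea mentioned above would come in.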

Labels: critical, enhancement
Reference: obitools/obitools3#79