Abstract: Method and apparatus for the coding and selective access of compressed genomic sequence data produced by genomic sequencing machines. The coding process is based on aligning sequence reads with respect to pre-existing or constructed reference sequences, on classifying and coding the sequence reads by means of sets of descriptors, and on further partitioning the descriptor sets into access units of different types. Efficient selective access to specific genomic regions, with the guarantee of retrieving all sequence reads mapped to those regions, is provided by: signaling the type of data mapping configuration used to store or transmit the descriptor sets; determining the minimum number of access units that need to be retrieved and decoded to access a genomic region; and providing a master index table that contains all information for optimizing the data access process.
Type:
Grant
Filed:
July 11, 2017
Date of Patent:
September 19, 2023
Assignee:
GenomSys SA
Inventors:
Mohamed Khoso Baluch, Claudio Alberti, Giorgio Zoia, Daniele Renzi
Abstract: Method and apparatus for the indexing of genome sequence data produced by genome sequencing machines. The proposed method can be applied both to raw sequence data produced by sequencing machines and to those sequence reads that cannot be mapped to any reference sequence according to specific matching criteria. This invention describes a method to partition and index unaligned sequence reads to enable browsing and efficient selective access.
Type:
Grant
Filed:
July 11, 2017
Date of Patent:
August 2, 2022
Assignee:
GenomSys SA
Inventors:
Claudio Alberti, Giorgio Zoia, Daniele Renzi, Mohamed Khoso Baluch
Abstract: Method and system for storing and accessing genomic data. Genomic sequencing data are partitioned into access units of different types based on the predictability of the contained data. Access units are classified into different types, and this structuring enables selective access and selective processing of genomic data.
Abstract: The storage or transmission of genomic data is realized by employing a structured compressed genomic dataset in a file or in a stream of genomic data. Selective access to the data, or subsets of the data, corresponding to specific genomic regions is achieved by employing user-defined labels based on data classification and a specific indexing mechanism.
Type:
Application
Filed:
February 14, 2017
Publication date:
February 6, 2020
Applicant:
GENOMSYS SA
Inventors:
Mohamed Khoso Baluch, Giorgio Zoia, Daniele Renzi
Abstract: Method and apparatus for the coding and selective access of compressed genomic sequence data produced by genomic sequencing machines. The coding process is based on aligning sequence reads with respect to pre-existing or constructed reference sequences, on classifying and coding the sequence reads by means of sets of descriptors, and on further partitioning the descriptor sets into access units of different types. Efficient selective access to specific genomic regions, with the guarantee of retrieving all sequence reads mapped to those regions, is provided by: signaling the type of data mapping configuration used to store or transmit the descriptor sets; determining the minimum number of access units that need to be retrieved and decoded to access a genomic region; and providing a master index table that contains all information for optimizing the data access process.
Type:
Application
Filed:
July 11, 2017
Publication date:
February 6, 2020
Applicant:
GENOMSYS SA
Inventors:
Mohamed Khoso Baluch, Claudio Alberti, Giorgio Zoia, Daniele Renzi