Patents by Inventor Guillaume Alexandre Pascal Rizk
Guillaume Alexandre Pascal Rizk has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240062853
Abstract: Methods, systems, and computer programs for compressing nucleic acid sequence data. A method can include obtaining nucleic acid sequence data representing: (i) a read sequence, and (ii) a plurality of quality scores, determining whether the read sequence includes at least one “N” base, based on a determination that the read sequence includes at least one “N” base, generating, by one or more computers, a first encoding data set by using a first encoding process to encode each set of four quality scores of the read sequence into a single byte of memory, and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
Type: Application
Filed: August 23, 2023
Publication date: February 22, 2024
Inventor: Guillaume Alexandre Pascal Rizk
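The abstract above describes packing each set of four quality scores into a single byte and then applying a second encoding. A minimal sketch of that idea follows. It assumes each quality score has already been mapped to a 2-bit index (0 to 3), which the abstract does not state, and it uses `zlib` only as a stand-in for the unspecified second encoding process. All function names are hypothetical.

```python
import zlib


def pack_four_per_byte(indices):
    """Pack 2-bit values (0..3) four to a byte; a short final group is zero-padded."""
    out = bytearray()
    for i in range(0, len(indices), 4):
        byte = 0
        for pos, value in enumerate(indices[i:i + 4]):
            if not 0 <= value <= 3:
                raise ValueError("each index must fit in 2 bits")
            # Assumption: the first score of a group occupies the lowest two bits.
            byte |= value << (2 * pos)
        out.append(byte)
    return bytes(out)


def unpack_four_per_byte(packed, count):
    """Recover the first `count` 2-bit values from the packed bytes."""
    values = []
    for byte in packed:
        for pos in range(4):
            values.append((byte >> (2 * pos)) & 0b11)
    return values[:count]


def compress_quality_indices(indices):
    """First encoding (bit packing) followed by a stand-in second encoding (zlib)."""
    return zlib.compress(pack_four_per_byte(indices))
```

The original number of scores has to be stored alongside the packed bytes, because the zero padding in the last byte is otherwise indistinguishable from real data.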
-
Patent number: 11776663
Abstract: Methods, systems, and computer programs for compressing nucleic acid sequence data. A method can include obtaining nucleic acid sequence data representing: (i) a read sequence, and (ii) a plurality of quality scores, determining whether the read sequence includes at least one “N” base, based on a determination that the read sequence includes at least one “N” base, generating, by one or more computers, a first encoding data set by using a first encoding process to encode each set of four quality scores of the read sequence into a single byte of memory, and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
Type: Grant
Filed: October 27, 2022
Date of Patent: October 3, 2023
Assignee: Illumina, Inc.
Inventor: Guillaume Alexandre Pascal Rizk
-
Publication number: 20230290443
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for software-accelerated genomic data read mapping. In some implementations, software-accelerated genomic data read mapping includes obtaining a first k-mer seed from a genomic data read; generating a genomic signature based on a first k-mer seed; determining a reference sequence location using a hash data structure based on the genomic signature; determining a number of mismatches; based on determining the number of mismatches includes one or more mismatches, obtaining, by the one or more computers, a set of k-mer seeds from the genomic data read; and based on the set of k-mer seeds from the genomic data read, selecting, by the one or more computers, an actual alignment for the genomic data read.
Type: Application
Filed: March 3, 2023
Publication date: September 14, 2023
Inventor: Guillaume Alexandre Pascal Rizk
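The two-stage flow in the abstract above (try the first seed, count mismatches, fall back to more seeds only when needed) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: a plain Python dictionary replaces the hash data structure and genomic signature, mismatches are counted by position-wise comparison without gaps, and the fallback seeds are taken at non-overlapping offsets. All names are hypothetical.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer of the reference to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def count_mismatches(read, reference, start):
    """Position-wise mismatch count, or None if the read would not fit at `start`."""
    if start < 0 or start + len(read) > len(reference):
        return None
    return sum(a != b for a, b in zip(read, reference[start:start + len(read)]))


def map_read(read, reference, index, k):
    """Return (start, mismatches) for the chosen placement, or None if no seed hits."""
    # Fast path: use only the first seed and accept its first hit if it is exact.
    first_hits = index.get(read[:k], [])
    if first_hits:
        start = first_hits[0]
        if count_mismatches(read, reference, start) == 0:
            return start, 0
    # Slow path: one or more mismatches (or no hit), so try a set of seeds
    # and keep the candidate placement with the fewest mismatches.
    best = None
    for offset in range(0, len(read) - k + 1, k):
        for hit in index.get(read[offset:offset + k], []):
            start = hit - offset
            mm = count_mismatches(read, reference, start)
            if mm is not None and (best is None or mm < best[1]):
                best = (start, mm)
    return best
```

A read that matches the reference exactly is resolved by a single lookup, so the extra seeds are only paid for on reads that actually contain differences.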
-
Publication number: 20230084414
Abstract: Methods, systems, apparatus, and computer programs are disclosed for software-accelerated genomic data read mapping. In one aspect, the methods can include actions of obtaining a k-mer seed from a genomic data read, generating a genomic signature based on the obtained k-mer seed, determining a reference sequence location that match at least a portion of the k-mer seed using a hash data structure, wherein the hash data structure comprises N data cells comprising a first portion storing a predetermined genomic signature and a second portion storing a value that corresponds to a first occurrence of a reference sequence location that match at least a portion of the k-mer seed from which the predetermined genomic signature was derived, and selecting the determined reference sequence location as an actual alignment for the obtained k-mer seed based on one or more alignment scores.
Type: Application
Filed: November 15, 2022
Publication date: March 16, 2023
Inventor: Guillaume Alexandre Pascal Rizk
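The hash data structure described above, with N cells that each hold a signature and the first reference location for that signature, can be illustrated with a small open-addressing table. This is a sketch under assumptions the abstract does not make: CRC-32 stands in for the genomic signature, collisions are resolved by linear probing, and the alignment-scoring step is omitted. All names are hypothetical.

```python
import zlib


def signature(kmer):
    """Stand-in genomic signature: CRC-32 of the k-mer text."""
    return zlib.crc32(kmer.encode("ascii"))


def build_table(reference, k, n_cells):
    """Build n_cells cells of (signature, first position) using linear probing."""
    table = [None] * n_cells
    for pos in range(len(reference) - k + 1):
        sig = signature(reference[pos:pos + k])
        slot = sig % n_cells
        for _ in range(n_cells):
            if table[slot] is None:
                table[slot] = (sig, pos)  # first occurrence of this signature
                break
            if table[slot][0] == sig:
                break  # already recorded; keep the earlier position
            slot = (slot + 1) % n_cells
        else:
            raise ValueError("table is full; choose a larger n_cells")
    return table


def lookup(table, kmer):
    """Return the first reference position stored for the k-mer's signature, or None."""
    n_cells = len(table)
    sig = signature(kmer)
    slot = sig % n_cells
    for _ in range(n_cells):
        cell = table[slot]
        if cell is None:
            return None
        if cell[0] == sig:
            return cell[1]
        slot = (slot + 1) % n_cells
    return None
```

Because only the signature is stored, two different k-mers that share a signature would return the same location, so a real mapper would still verify the candidate against the reference before accepting it.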
-
Publication number: 20230040143
Abstract: Methods, systems, and computer programs for compressing nucleic acid sequence data. A method can include obtaining nucleic acid sequence data representing: (i) a read sequence, and (ii) a plurality of quality scores, determining whether the read sequence includes at least one “N” base, based on a determination that the read sequence includes at least one “N” base, generating, by one or more computers, a first encoding data set by using a first encoding process to encode each set of four quality scores of the read sequence into a single byte of memory, and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
Type: Application
Filed: October 27, 2022
Publication date: February 9, 2023
Inventor: Guillaume Alexandre Pascal Rizk
-
Publication number: 20220415441
Abstract: The invention relates to a reference-based method for the compression of genome sequence data produced by a sequencing machine. The sequences of nucleotides or bases, that have been previously aligned to a reference sequence, are determined to be perfectly mapped, imperfectly mapped or unmapped with the reference sequence; and then coded according to said determination. The determining step comprises comparing, for each imperfectly mapped sequence, the number of mismatches between said sequence and the reference sequence with a reference threshold value, and encoding the imperfectly mapped sequences according to distinct encoding processes, depending on the result of said comparison.
Type: Application
Filed: September 11, 2020
Publication date: December 29, 2022
Inventor: Guillaume Alexandre Pascal Rizk
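The classification step in the abstract above can be sketched as a function that sorts each aligned read into a category, which would then select an encoder. The category labels, the use of `None` for an unaligned read, and the choice of "less than or equal to" for the threshold comparison are assumptions made for illustration; the abstract does not specify them or the encoders themselves.

```python
def classify_read(read, reference, position, threshold):
    """Label a read by how it maps to the reference at `position`.

    `position` is None for a read that the aligner could not place.
    """
    if position is None:
        return "unmapped"
    window = reference[position:position + len(read)]
    if len(window) != len(read):
        return "unmapped"  # the read runs past the end of the reference
    mismatches = sum(a != b for a, b in zip(read, window))
    if mismatches == 0:
        return "perfect"
    # Imperfectly mapped reads are split by the threshold so that each
    # group can be handed to a distinct encoding process.
    if mismatches <= threshold:
        return "imperfect_few_mismatches"
    return "imperfect_many_mismatches"
```

For example, with a threshold of 2, a read with one mismatch could be stored as a position plus a single substitution, while a read with five mismatches might be cheaper to store another way.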
-
Patent number: 11527307
Abstract: Methods, systems, and computer programs for compressing nucleic acid sequence data. A method can include obtaining nucleic acid sequence data representing: (i) a read sequence, and (ii) a plurality of quality scores, determining whether the read sequence includes at least one “N” base, based on a determination that the read sequence does not include at least one “N” base, generating a first encoded data set by using a first encoding process to encode each of the quality scores of the read sequence using a base-(x minus 1) number, where x is an integer representing a number of different quality scores used by the nucleic acid sequencing device, and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
Type: Grant
Filed: November 5, 2021
Date of Patent: December 13, 2022
Assignee: Illumina, Inc.
Inventor: Guillaume Alexandre Pascal Rizk
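One possible reading of the base-(x minus 1) encoding above: when a read has no “N” base, only x - 1 distinct quality values can occur, so each score is a digit in base x - 1 and several digits fit in one byte. The sketch below assumes the scores are already mapped to digits 0 to x - 2 and packs as many digits per byte as the base allows. For x = 4 the base is 3 and five scores fit in a byte, since 3^5 = 243. The function names and the grouping into bytes are assumptions, not details taken from the patent.

```python
def digits_per_byte(base):
    """Largest number of base-`base` digits whose combined value fits in one byte."""
    if not 2 <= base <= 256:
        raise ValueError("base must be between 2 and 256")
    count = 0
    while base ** (count + 1) <= 256:
        count += 1
    return count


def encode_base(digits, base):
    """Pack digits (each in 0..base-1) into bytes, least significant digit first."""
    group = digits_per_byte(base)
    out = bytearray()
    for i in range(0, len(digits), group):
        value = 0
        for pos, digit in enumerate(digits[i:i + group]):
            if not 0 <= digit < base:
                raise ValueError("digit out of range for this base")
            value += digit * base ** pos
        out.append(value)
    return bytes(out)


def decode_base(packed, base, count):
    """Recover the first `count` digits from the packed bytes."""
    group = digits_per_byte(base)
    digits = []
    for value in packed:
        for _ in range(group):
            digits.append(value % base)
            value //= base
    return digits[:count]
```

Compared with the four-scores-per-byte packing used when an “N” base is present, dropping one possible value lets five scores share a byte in the x = 4 case, before any second-stage compression is applied.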
-
Patent number: 11521707
Abstract: Methods, systems, apparatus, and computer programs are disclosed for software-accelerated genomic data read mapping. In one aspect, the methods can include actions of obtaining a k-mer seed from a genomic data read, generating a genomic signature based on the obtained k-mer seed, determining a reference sequence location that match at least a portion of the k-mer seed using a hash data structure, wherein the hash data structure comprises N data cells comprising a first portion storing a predetermined genomic signature and a second portion storing a value that corresponds to a first occurrence of a reference sequence location that match at least a portion of the k-mer seed from which the predetermined genomic signature was derived, and selecting the determined reference sequence location as an actual alignment for the obtained k-mer seed based on one or more alignment scores.
Type: Grant
Filed: September 15, 2021
Date of Patent: December 6, 2022
Assignee: Illumina, Inc.
Inventor: Guillaume Alexandre Pascal Rizk
-
Publication number: 20220139502
Abstract: Methods, systems, and computer programs for compressing nucleic acid sequence data. A method can include obtaining nucleic acid sequence data representing: (i) a read sequence, and (ii) a plurality of quality scores, determining whether the read sequence includes at least one “N” base, based on a determination that the read sequence does not include at least one “N” base, generating a first encoded data set by using a first encoding process to encode each of the quality scores of the read sequence using a base-(x minus 1) number, where x is an integer representing a number of different quality scores used by the nucleic acid sequencing device, and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
Type: Application
Filed: November 5, 2021
Publication date: May 5, 2022
Inventor: Guillaume Alexandre Pascal Rizk
-
Publication number: 20220084625
Abstract: Methods, systems, apparatus, and computer programs are disclosed for software-accelerated genomic data read mapping. In one aspect, the methods can include actions of obtaining a k-mer seed from a genomic data read, generating a genomic signature based on the obtained k-mer seed, determining a reference sequence location that match at least a portion of the k-mer seed using a hash data structure, wherein the hash data structure comprises N data cells comprising a first portion storing a predetermined genomic signature and a second portion storing a value that corresponds to a first occurrence of a reference sequence location that match at least a portion of the k-mer seed from which the predetermined genomic signature was derived, and selecting the determined reference sequence location as an actual alignment for the obtained k-mer seed based on one or more alignment scores.
Type: Application
Filed: September 15, 2021
Publication date: March 17, 2022
Inventor: Guillaume Alexandre Pascal Rizk