Patents by Inventor Guillaume Alexandre Pascal Rizk

Guillaume Alexandre Pascal Rizk has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240062853
    Abstract: Methods, systems, and computer programs for compressing nucleic acid sequence data. A method can include obtaining nucleic acid sequence data representing: (i) a read sequence, and (ii) a plurality of quality scores; determining whether the read sequence includes at least one “N” base; based on a determination that the read sequence includes at least one “N” base, generating, by one or more computers, a first encoded data set by using a first encoding process to encode each set of four quality scores of the read sequence into a single byte of memory; and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
    Type: Application
    Filed: August 23, 2023
    Publication date: February 22, 2024
    Inventor: Guillaume Alexandre Pascal Rizk
  • Patent number: 11776663
    Abstract: Methods, systems, and computer programs for compressing nucleic acid sequence data. A method can include obtaining nucleic acid sequence data representing: (i) a read sequence, and (ii) a plurality of quality scores; determining whether the read sequence includes at least one “N” base; based on a determination that the read sequence includes at least one “N” base, generating, by one or more computers, a first encoded data set by using a first encoding process to encode each set of four quality scores of the read sequence into a single byte of memory; and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
    Type: Grant
    Filed: October 27, 2022
    Date of Patent: October 3, 2023
    Assignee: Illumina, Inc.
    Inventor: Guillaume Alexandre Pascal Rizk
  • Publication number: 20230290443
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for software-accelerated genomic data read mapping. In some implementations, software-accelerated genomic data read mapping includes obtaining a first k-mer seed from a genomic data read; generating a genomic signature based on the first k-mer seed; determining a reference sequence location using a hash data structure based on the genomic signature; determining a number of mismatches; based on determining that the number of mismatches includes one or more mismatches, obtaining, by the one or more computers, a set of k-mer seeds from the genomic data read; and based on the set of k-mer seeds from the genomic data read, selecting, by the one or more computers, an actual alignment for the genomic data read.
    Type: Application
    Filed: March 3, 2023
    Publication date: September 14, 2023
    Inventor: Guillaume Alexandre Pascal Rizk
  • Publication number: 20230084414
    Abstract: Methods, systems, apparatus, and computer programs are disclosed for software-accelerated genomic data read mapping. In one aspect, the methods can include actions of obtaining a k-mer seed from a genomic data read, generating a genomic signature based on the obtained k-mer seed, determining a reference sequence location that matches at least a portion of the k-mer seed using a hash data structure, wherein the hash data structure comprises N data cells comprising a first portion storing a predetermined genomic signature and a second portion storing a value that corresponds to a first occurrence of a reference sequence location that matches at least a portion of the k-mer seed from which the predetermined genomic signature was derived, and selecting the determined reference sequence location as an actual alignment for the obtained k-mer seed based on one or more alignment scores.
    Type: Application
    Filed: November 15, 2022
    Publication date: March 16, 2023
    Inventor: Guillaume Alexandre Pascal Rizk
  • Publication number: 20230040143
    Abstract: Methods, systems, and computer programs for compressing nucleic acid sequence data. A method can include obtaining nucleic acid sequence data representing: (i) a read sequence, and (ii) a plurality of quality scores; determining whether the read sequence includes at least one “N” base; based on a determination that the read sequence includes at least one “N” base, generating, by one or more computers, a first encoded data set by using a first encoding process to encode each set of four quality scores of the read sequence into a single byte of memory; and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
    Type: Application
    Filed: October 27, 2022
    Publication date: February 9, 2023
    Inventor: Guillaume Alexandre Pascal Rizk
  • Publication number: 20220415441
    Abstract: The invention relates to a reference-based method for the compression of genome sequence data produced by a sequencing machine. The sequences of nucleotides or bases that have been previously aligned to a reference sequence are determined to be perfectly mapped, imperfectly mapped, or unmapped with the reference sequence, and are then coded according to said determination. The determining step comprises comparing, for each imperfectly mapped sequence, the number of mismatches between said sequence and the reference sequence with a reference threshold value, and encoding the imperfectly mapped sequences according to distinct encoding processes, depending on the result of said comparison.
    Type: Application
    Filed: September 11, 2020
    Publication date: December 29, 2022
    Inventor: Guillaume Alexandre Pascal Rizk
  • Patent number: 11527307
    Abstract: Methods, systems, and computer programs for compressing nucleic acid sequence data. A method can include obtaining nucleic acid sequence data representing: (i) a read sequence, and (ii) a plurality of quality scores; determining whether the read sequence includes at least one “N” base; based on a determination that the read sequence does not include at least one “N” base, generating a first encoded data set by using a first encoding process to encode each of the quality scores of the read sequence using a base-(x minus 1) number, where x is an integer representing a number of different quality scores used by the nucleic acid sequencing device; and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
    Type: Grant
    Filed: November 5, 2021
    Date of Patent: December 13, 2022
    Assignee: Illumina, Inc.
    Inventor: Guillaume Alexandre Pascal Rizk
  • Patent number: 11521707
    Abstract: Methods, systems, apparatus, and computer programs are disclosed for software-accelerated genomic data read mapping. In one aspect, the methods can include actions of obtaining a k-mer seed from a genomic data read, generating a genomic signature based on the obtained k-mer seed, determining a reference sequence location that matches at least a portion of the k-mer seed using a hash data structure, wherein the hash data structure comprises N data cells comprising a first portion storing a predetermined genomic signature and a second portion storing a value that corresponds to a first occurrence of a reference sequence location that matches at least a portion of the k-mer seed from which the predetermined genomic signature was derived, and selecting the determined reference sequence location as an actual alignment for the obtained k-mer seed based on one or more alignment scores.
    Type: Grant
    Filed: September 15, 2021
    Date of Patent: December 6, 2022
    Assignee: Illumina, Inc.
    Inventor: Guillaume Alexandre Pascal Rizk
  • Publication number: 20220139502
    Abstract: Methods, systems, and computer programs for compressing nucleic acid sequence data. A method can include obtaining nucleic acid sequence data representing: (i) a read sequence, and (ii) a plurality of quality scores; determining whether the read sequence includes at least one “N” base; based on a determination that the read sequence does not include at least one “N” base, generating a first encoded data set by using a first encoding process to encode each of the quality scores of the read sequence using a base-(x minus 1) number, where x is an integer representing a number of different quality scores used by the nucleic acid sequencing device; and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
    Type: Application
    Filed: November 5, 2021
    Publication date: May 5, 2022
    Inventor: Guillaume Alexandre Pascal Rizk
  • Publication number: 20220084625
    Abstract: Methods, systems, apparatus, and computer programs are disclosed for software-accelerated genomic data read mapping. In one aspect, the methods can include actions of obtaining a k-mer seed from a genomic data read, generating a genomic signature based on the obtained k-mer seed, determining a reference sequence location that matches at least a portion of the k-mer seed using a hash data structure, wherein the hash data structure comprises N data cells comprising a first portion storing a predetermined genomic signature and a second portion storing a value that corresponds to a first occurrence of a reference sequence location that matches at least a portion of the k-mer seed from which the predetermined genomic signature was derived, and selecting the determined reference sequence location as an actual alignment for the obtained k-mer seed based on one or more alignment scores.
    Type: Application
    Filed: September 15, 2021
    Publication date: March 17, 2022
    Inventor: Guillaume Alexandre Pascal Rizk
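
Several of the compression abstracts above (for example, publication 20240062853 and patent 11776663) describe a two-stage scheme: a first encoding that places each set of four quality scores into a single byte, followed by a second encoding of that result. The Python sketch below is only an illustration of that general idea, not the method claimed in the patents. It assumes that quality scores are binned to four distinct values so that each fits in 2 bits, and it uses zlib as a stand-in for the unspecified second encoding process. The names QUALITY_LEVELS, pack_quality_scores, and the other helpers are hypothetical.

```python
import zlib

# Assumption: the sequencer reports only four distinct (binned) quality
# scores, so each score can be stored as a 2-bit index. These particular
# values are hypothetical and not taken from the abstracts.
QUALITY_LEVELS = [2, 12, 23, 37]
LEVEL_INDEX = {score: index for index, score in enumerate(QUALITY_LEVELS)}


def pack_quality_scores(scores):
    """First stage: pack each group of four scores into one byte."""
    packed = bytearray()
    for start in range(0, len(scores), 4):
        byte = 0
        # The score at offset 0 goes in the lowest 2 bits, offset 1 in the
        # next 2 bits, and so on. A short final group leaves high bits zero.
        for position, score in enumerate(scores[start:start + 4]):
            byte |= LEVEL_INDEX[score] << (2 * position)
        packed.append(byte)
    return bytes(packed)


def unpack_quality_scores(packed, count):
    """Reverse the first stage; count is the original number of scores."""
    scores = []
    for byte in packed:
        for position in range(4):
            if len(scores) == count:
                break
            scores.append(QUALITY_LEVELS[(byte >> (2 * position)) & 0b11])
    return scores


def compress_quality_scores(scores):
    """Both stages: 2-bit packing, then zlib as an assumed second encoder."""
    return zlib.compress(pack_quality_scores(scores))


def decompress_quality_scores(blob, count):
    """Undo the second stage, then the first."""
    return unpack_quality_scores(zlib.decompress(blob), count)
```

Because the packed form does not record how many scores the last byte holds, this sketch passes the original count to the decoder; a real format would need to store the read length or an equivalent field somewhere.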