Patents by Inventor Deniz Yorukoglu

Deniz Yorukoglu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Quality score compression for improving downstream genotyping accuracy

Publication number: 20240004838

Abstract: This disclosure provides for a highly-efficient and scalable compression tool that compresses quality scores, preferably by capitalizing on sequence redundancy. In one embodiment, compression is achieved by smoothing a large fraction of quality score values based on k-mer neighborhood of their corresponding positions in read sequences. The approach exploits the intuition that any divergent base in a k-mer likely corresponds to either a single-nucleotide polymorphism (SNP) or sequencing error; thus, a preferred approach is to only preserve quality scores for probable variant locations and compress quality scores of concordant bases, preferably by resetting them to a default value. By viewing individual read datasets through the lens of k-mer frequencies in a corpus of reads, the approach herein ensures that compression “lossiness” does not affect accuracy in a deleterious way.
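
The smoothing idea in the abstract above can be illustrated with a minimal Python sketch. It is a simplification made for illustration, not the patented method: a base counts as concordant when at least one k-mer covering it is frequent in the read corpus, and its quality score is then reset to a default value. The function names, the `min_count` threshold and the `default_q` value are all hypothetical, and the k-mer neighborhood comparison the abstract mentions is not reproduced here.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer that occurs in a corpus of read sequences."""
    counts = Counter()
    for seq in reads:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def smooth_qualities(seq, quals, counts, k, min_count=2, default_q=40):
    """Reset quality scores of bases covered by a frequent k-mer.

    Assumption for this sketch: "frequent" means the k-mer occurs at
    least min_count times in the corpus. Bases covered only by rare
    k-mers keep their original quality scores.
    """
    concordant = [False] * len(seq)
    for i in range(len(seq) - k + 1):
        if counts[seq[i:i + k]] >= min_count:
            for j in range(i, i + k):
                concordant[j] = True
    return [default_q if c else q for q, c in zip(quals, concordant)]


# Toy corpus: three identical reads and one read with a divergent base.
corpus = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTTCGT"]
counts = kmer_counts(corpus, 4)
smoothed = smooth_qualities("ACGTTCGT", [30, 31, 32, 33, 12, 34, 35, 36], counts, 4)
print(smoothed)  # [40, 40, 40, 40, 12, 34, 35, 36]
```

In this toy run the first four bases are covered by the frequent k-mer ACGT and are reset to 40, while the divergent base and the bases after it are covered only by rare k-mers and keep their original scores.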

Type: Application

Filed: September 19, 2023

Publication date: January 4, 2024

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Yun William Yu, Jian Peng

Quality score compression apparatus and method for improving downstream accuracy

Patent number: 11762813

Abstract: This disclosure provides for a highly-efficient and scalable compression tool that compresses quality scores, preferably by capitalizing on sequence redundancy. In one embodiment, compression is achieved by smoothing a large fraction of quality score values based on k-mer neighborhood of their corresponding positions in read sequences. The approach exploits the intuition that any divergent base in a k-mer likely corresponds to either a single-nucleotide polymorphism (SNP) or sequencing error; thus, a preferred approach is to only preserve quality scores for probable variant locations and compress quality scores of concordant bases, preferably by resetting them to a default value. By viewing individual read datasets through the lens of k-mer frequencies in a corpus of reads, the approach herein ensures that compression “lossiness” does not affect accuracy in a deleterious way.

Type: Grant

Filed: February 5, 2019

Date of Patent: September 19, 2023

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Yun William Yu, Jian Peng

Compressively-accelerated read mapping framework for next-generation sequencing

Patent number: 11632125

Abstract: A method of compressive read mapping. A high-resolution homology table is created for the reference genomic sequence, preferably by mapping the reference to itself. Once the homology table is created, the reads are compressed to eliminate full or partial redundancies across reads in the dataset. Preferably, compression is achieved through self-mapping of the read dataset. Next, a coarse mapping from the compressed read data to the reference is performed. Each read link generated represents a cluster of substrings from one or more reads in the dataset and stores their differences from a locus in the reference. Preferably, read links are further expanded to obtain final mapping results through traversal of the homology table, and final mapping results are reported. As compared to prior techniques, substantial speed-up gains are achieved through the compressive read mapping technique due to efficient utilization of redundancy within read sequences as well as the reference.
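
The four steps named in the abstract above (homology table, read compression, coarse mapping, expansion) can be sketched in a few lines of Python. This is a toy under strong assumptions, not the patented framework: every read is exactly `k` bases long, only exact matches are considered, "homology" means an identical k-mer elsewhere in the reference, and compression only merges identical reads. All function names are hypothetical.

```python
from collections import defaultdict


def build_homology_table(reference, k):
    """Map each reference locus to the other loci holding the same k-mer.

    This is the reference mapped to itself, restricted to exact matches.
    """
    loci_by_kmer = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        loci_by_kmer[reference[pos:pos + k]].append(pos)
    table = {}
    for loci in loci_by_kmer.values():
        for pos in loci:
            table[pos] = [p for p in loci if p != pos]
    return table


def compress_reads(reads):
    """Cluster identical reads so that each sequence is mapped only once."""
    clusters = defaultdict(list)
    for read_id, seq in enumerate(reads):
        clusters[seq].append(read_id)
    return clusters


def map_reads(reference, reads, k):
    """Coarse-map one representative per cluster, then expand the hit."""
    table = build_homology_table(reference, k)
    results = {}
    for seq, read_ids in compress_reads(reads).items():
        anchor = reference.find(seq)  # coarse mapping: a single locus
        # Expansion: follow the homology table to every equivalent locus.
        loci = [] if anchor < 0 else sorted([anchor] + table[anchor])
        for read_id in read_ids:
            results[read_id] = loci
    return results


mappings = map_reads("ACGTTTACGTGG", ["ACGT", "GTGG", "ACGT", "AAAA"], 4)
print(mappings)  # {0: [0, 6], 2: [0, 6], 1: [8], 3: []}
```

The two identical ACGT reads are searched against the reference once, and the second locus for them comes from the homology table rather than from another search, which is the kind of redundancy the abstract describes exploiting.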

Type: Grant

Filed: June 8, 2021

Date of Patent: April 18, 2023

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Jian Peng

Compressively-accelerated read mapping framework for next-generation sequencing

Publication number: 20210297090

Abstract: A method of compressive read mapping. A high-resolution homology table is created for the reference genomic sequence, preferably by mapping the reference to itself. Once the homology table is created, the reads are compressed to eliminate full or partial redundancies across reads in the dataset. Preferably, compression is achieved through self-mapping of the read dataset. Next, a coarse mapping from the compressed read data to the reference is performed. Each read link generated represents a cluster of substrings from one or more reads in the dataset and stores their differences from a locus in the reference. Preferably, read links are further expanded to obtain final mapping results through traversal of the homology table, and final mapping results are reported. As compared to prior techniques, substantial speed-up gains are achieved through the compressive read mapping technique due to efficient utilization of redundancy within read sequences as well as the reference.

Type: Application

Filed: June 8, 2021

Publication date: September 23, 2021

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Jian Peng

Compressively-accelerated read mapping framework for next-generation sequencing

Patent number: 11031950

Abstract: A method of compressive read mapping. A high-resolution homology table is created for the reference genomic sequence, preferably by mapping the reference to itself. Once the homology table is created, the reads are compressed to eliminate full or partial redundancies across reads in the dataset. Preferably, compression is achieved through self-mapping of the read dataset. Next, a coarse mapping from the compressed read data to the reference is performed. Each read link generated represents a cluster of substrings from one or more reads in the dataset and stores their differences from a locus in the reference. Preferably, read links are further expanded to obtain final mapping results through traversal of the homology table, and final mapping results are reported. As compared to prior techniques, substantial speed-up gains are achieved through the compressive read mapping technique due to efficient utilization of redundancy within read sequences as well as the reference.

Type: Grant

Filed: March 12, 2019

Date of Patent: June 8, 2021

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Jian Peng

Compressively-accelerated read mapping framework for next-generation sequencing

Publication number: 20190348998

Abstract: A method of compressive read mapping. A high-resolution homology table is created for the reference genomic sequence, preferably by mapping the reference to itself. Once the homology table is created, the reads are compressed to eliminate full or partial redundancies across reads in the dataset. Preferably, compression is achieved through self-mapping of the read dataset. Next, a coarse mapping from the compressed read data to the reference is performed. Each read link generated represents a cluster of substrings from one or more reads in the dataset and stores their differences from a locus in the reference. Preferably, read links are further expanded to obtain final mapping results through traversal of the homology table, and final mapping results are reported. As compared to prior techniques, substantial speed-up gains are achieved through the compressive read mapping technique due to efficient utilization of redundancy within read sequences as well as the reference.

Type: Application

Filed: March 12, 2019

Publication date: November 14, 2019

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Jian Peng

Quality score compression for improving downstream genotyping accuracy

Publication number: 20190171625

Abstract: This disclosure provides for a highly-efficient and scalable compression tool that compresses quality scores, preferably by capitalizing on sequence redundancy. In one embodiment, compression is achieved by smoothing a large fraction of quality score values based on k-mer neighborhood of their corresponding positions in read sequences. The approach exploits the intuition that any divergent base in a k-mer likely corresponds to either a single-nucleotide polymorphism (SNP) or sequencing error; thus, a preferred approach is to only preserve quality scores for probable variant locations and compress quality scores of concordant bases, preferably by resetting them to a default value. By viewing individual read datasets through the lens of k-mer frequencies in a corpus of reads, the approach herein ensures that compression “lossiness” does not affect accuracy in a deleterious way.

Type: Application

Filed: February 5, 2019

Publication date: June 6, 2019

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Yun William Yu, Jian Peng

Compressively-accelerated read mapping framework for next-generation sequencing

Patent number: 10230390

Abstract: A method of compressive read mapping. A high-resolution homology table is created for the reference genomic sequence, preferably by mapping the reference to itself. Once the homology table is created, the reads are compressed to eliminate full or partial redundancies across reads in the dataset. Preferably, compression is achieved through self-mapping of the read dataset. Next, a coarse mapping from the compressed read data to the reference is performed. Each read link generated represents a cluster of substrings from one or more reads in the dataset and stores their differences from a locus in the reference. Preferably, read links are further expanded to obtain final mapping results through traversal of the homology table, and final mapping results are reported. As compared to prior techniques, substantial speed-up gains are achieved through the compressive read mapping technique due to efficient utilization of redundancy within read sequences as well as the reference.

Type: Grant

Filed: August 27, 2015

Date of Patent: March 12, 2019

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Jian Peng

Quality score compression for improving downstream genotyping accuracy

Patent number: 10198454

Abstract: This disclosure provides for a highly-efficient and scalable compression tool that compresses quality scores, preferably by capitalizing on sequence redundancy. In one embodiment, compression is achieved by smoothing a large fraction of quality score values based on k-mer neighborhood of their corresponding positions in read sequences. The approach exploits the intuition that any divergent base in a k-mer likely corresponds to either a single-nucleotide polymorphism (SNP) or sequencing error; thus, a preferred approach is to only preserve quality scores for probable variant locations and compress quality scores of concordant bases, preferably by resetting them to a default value. By viewing individual read datasets through the lens of k-mer frequencies in a corpus of reads, the approach herein ensures that compression “lossiness” does not affect accuracy in a deleterious way.

Type: Grant

Filed: April 27, 2015

Date of Patent: February 5, 2019

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Yun William Yu, Jian Peng

Quality score compression for improving downstream genotyping accuracy

Publication number: 20170147597

Abstract: This disclosure provides for a highly-efficient and scalable compression tool that compresses quality scores, preferably by capitalizing on sequence redundancy. In one embodiment, compression is achieved by smoothing a large fraction of quality score values based on k-mer neighborhood of their corresponding positions in read sequences. The approach exploits the intuition that any divergent base in a k-mer likely corresponds to either a single-nucleotide polymorphism (SNP) or sequencing error; thus, a preferred approach is to only preserve quality scores for probable variant locations and compress quality scores of concordant bases, preferably by resetting them to a default value. By viewing individual read datasets through the lens of k-mer frequencies in a corpus of reads, the approach herein ensures that compression “lossiness” does not affect accuracy in a deleterious way.

Type: Application

Filed: April 27, 2015

Publication date: May 25, 2017

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Y. William Yu, Jian Peng

Compressively-accelerated read mapping framework for next-generation sequencing

Publication number: 20160191076

Abstract: A method of compressive read mapping. A high-resolution homology table is created for the reference genomic sequence, preferably by mapping the reference to itself. Once the homology table is created, the reads are compressed to eliminate full or partial redundancies across reads in the dataset. Preferably, compression is achieved through self-mapping of the read dataset. Next, a coarse mapping from the compressed read data to the reference is performed. Each read link generated represents a cluster of substrings from one or more reads in the dataset and stores their differences from a locus in the reference. Preferably, read links are further expanded to obtain final mapping results through traversal of the homology table, and final mapping results are reported. As compared to prior techniques, substantial speed-up gains are achieved through the compressive read mapping technique due to efficient utilization of redundancy within read sequences as well as the reference.

Type: Application

Filed: August 27, 2015

Publication date: June 30, 2016

Inventors: Bonnie Berger Leighton, Deniz Yorukoglu, Jian Peng

Compressively-accelerated read mapping

Publication number: 20140108323

Abstract: A genomic read dataset is mapped from multiple individuals to a reference genome in a time- and storage-efficient manner. The approach begins by building a set of data structures that collectively represents a knowledge base of similarity information. The knowledge base comprises a set of data structures that, when combined, intrinsically represent all reads to whole-reference match (similarity) information for a reference genome. After this knowledge base is generated, it is then accessed and used in a mapping decision layer. The mapping layer taps into the similarity knowledge within the set of data structures to decide on the mappings and report them, thereby avoiding redundant and unnecessary computations that would otherwise be necessary to find matches and report mappings for each read individually. The approach exploits the redundancy in the read datasets to enable significant speed-up of the sequence matching layer, which preferably is performed collectively for all reads.

Type: Application

Filed: October 14, 2013

Publication date: April 17, 2014

Inventors: Bonnie Berger Leighton, Deniz Yörükoglu