Patents by Inventor Keith D. Noto

Keith D. Noto has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Determining data inheritance of data segments

Patent number: 12367221

Abstract: A computing device may receive a target data instance. The computing device may identify a plurality of matched segments that match to the target data instance for at least a threshold length. The computing device may define, based on overlapping of the matched segments, the target data instance as a plurality of data string ranges, wherein each divided data string is matched to a set of overlapping matched segments. The computing device may apply an iterative clustering algorithm to group the plurality of data string ranges based on values of a similarity metric among data string ranges that are assigned to a given group. The computing device may attribute a first set of data string ranges that are assigned to a first group to a first inheritance.
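The range-division step in this abstract can be illustrated with a minimal sketch. It is not the patented method: the `Segment` type, its field names, and the `min_length` parameter are hypothetical, and the subsequent clustering step is omitted. The sketch only shows how overlapping matched segments could cut a target into ranges, each tied to one fixed set of overlapping segments.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    match_id: str  # hypothetical identifier of the matching data instance
    start: int     # inclusive start position on the target
    end: int       # exclusive end position on the target

def split_into_ranges(segments, min_length=0):
    """Cut the target at every segment boundary so that each resulting
    range is covered by one fixed set of matched segments."""
    # Keep only segments that match for at least the threshold length.
    kept = [s for s in segments if s.end - s.start >= min_length]
    boundaries = sorted({p for s in kept for p in (s.start, s.end)})
    ranges = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        covering = frozenset(
            s.match_id for s in kept if s.start <= lo and s.end >= hi
        )
        if covering:  # skip gaps that no segment covers
            ranges.append((lo, hi, covering))
    return ranges

segments = [Segment("A", 0, 10), Segment("B", 5, 15), Segment("C", 20, 30)]
ranges = split_into_ranges(segments, min_length=5)
for lo, hi, ids in ranges:
    print(lo, hi, sorted(ids))
```

Each range carries the set of segments that overlap it, which is the input a similarity-based clustering step would then group.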

Type: Grant

Filed: June 25, 2024

Date of Patent: July 22, 2025

Assignee: Ancestry.com DNA, LLC

Inventor: Keith D. Noto

Haplotype phasing models

Patent number: 12334191

Abstract: Novel haplotype cluster Markov models are used to phase genomic samples. After the models are built, they rapidly and accurately phase new samples without requiring that the new samples be used to re-build the models. The models set transition probabilities such that the probability for an appearance of any allele within any haplotype is a non-zero number. Furthermore, the most unlikely pairs of haplotypes are discarded from each model at each level until ε of the likelihood mass at each level is discarded. The models are also constructed such that contributing windows of SNPs partially overlap so that phasing decisions near one of the extreme ends of any model are not significantly determinative of the phase. Additionally, the models are configured such that two or more nodes can be merged during the building/updating procedure to consolidate haplotype clusters having similar distributions.

Type: Grant

Filed: April 29, 2020

Date of Patent: June 17, 2025

Assignee: Ancestry.com DNA, LLC

Inventors: Catherine Ann Ball, Keith D. Noto, Kenneth G. Chahine, Mathew J. Barber, Yong Wang

DETERMINING DATA INHERITANCE OF DATA SEGMENTS

Publication number: 20240411781

Abstract: A computing device may receive a target data instance. The computing device may identify a plurality of matched segments that match to the target data instance for at least a threshold length. The computing device may define, based on overlapping of the matched segments, the target data instance as a plurality of data string ranges, wherein each divided data string is matched to a set of overlapping matched segments. The computing device may apply an iterative clustering algorithm to group the plurality of data string ranges based on values of a similarity metric among data string ranges that are assigned to a given group. The computing device may attribute a first set of data string ranges that are assigned to a first group to a first inheritance.

Type: Application

Filed: June 25, 2024

Publication date: December 12, 2024

Inventor: Keith D. Noto

Ancestral human genomes

Patent number: 12148507

Abstract: Described are computational methods to reconstruct the chromosomes (and genomes) of ancestors given genetic data, IBD information, and full or partial pedigree information of some number of their descendants.

Type: Grant

Filed: December 3, 2019

Date of Patent: November 19, 2024

Assignee: Ancestry.com DNA, LLC

Inventors: Julie M. Granka, Keith D. Noto

Local genetic ethnicity determination system

Patent number: 12086735

Abstract: An input sample SNP genotype is divided into a plurality of windows, each including a sequence of SNPs. For each window, a diploid hidden Markov Model (HMM) is built from a haplotype Markov Model (MM). The diploid HMM for a window is used to determine the probability that the window corresponds to a pair of labels (e.g., ethnicity labels). An inter-window HMM, with a set of states for each window, is built based on the diploid HMMs for each window. Labels are assigned to the input sample genotype based on the inter-window HMM.

Type: Grant

Filed: January 8, 2020

Date of Patent: September 10, 2024

Assignee: Ancestry.com DNA, LLC

Inventors: Keith D. Noto, Yong Wang

Determining data inheritance of data segments

Patent number: 12050629

Abstract: A computing device may receive a target data instance. The computing device may identify a plurality of matched segments that match to the target data instance for at least a threshold length. The computing device may define, based on overlapping of the matched segments, the target data instance as a plurality of data string ranges, wherein each divided data string is matched to a set of overlapping matched segments. The computing device may apply an iterative clustering algorithm to group the plurality of data string ranges based on values of a similarity metric among data string ranges that are assigned to a given group. The computing device may attribute a first set of data string ranges that are assigned to a first group to a first inheritance.

Type: Grant

Filed: October 6, 2023

Date of Patent: July 30, 2024

Assignee: Ancestry.com DNA, LLC

Inventor: Keith D. Noto

Scoring method for matches based on age probability

Patent number: 12045219

Abstract: Disclosed herein is a method that improves the accuracy of producing family trees. The DNA of a target individual is processed to find a matching individual. Using the known family tree of the matching individual, multiple candidate family trees are generated with multiple proposed placements for the target individual. For each candidate family tree, a genetic likelihood is determined for a proposed relationship and the other DNA test takers in the family tree. A birth-year probability is determined by identifying a most recent common ancestor (MRCA). The birth-year probability is based on the number of years between the target individual and the matching individual and a normal distribution of ages for parent-child age differences in a population. The genetic likelihood is converted to a genetic probability so that it can be compared with or added to the birth-year probability. Based on the two probabilities, the candidate family trees are sorted.
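As a rough illustration of the birth-year idea in this abstract, the sketch below scores a candidate placement with a normal density over the birth-year gap. It is a simplified model under stated assumptions, not the patented scoring: the mean and standard deviation of the parent-child age gap (28 and 6 years), the independence of the gaps, the function name, and the genetic probabilities in the example are all hypothetical.

```python
import math

def birth_year_density(year_gap, gens_target, gens_match,
                       mean_gap=28.0, sd_gap=6.0):
    """Density of the birth-year gap (target minus match) when each
    parent-child age difference is an independent normal variable.

    gens_target / gens_match: generations from each person up to the MRCA.
    mean_gap / sd_gap: assumed population parameters (hypothetical values).
    """
    mean = (gens_target - gens_match) * mean_gap
    var = (gens_target + gens_match) * sd_gap ** 2
    return math.exp(-(year_gap - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Target born 1990, match born 1960: a gap of 30 years.
gap = 1990 - 1960
candidates = {
    # name: (generations target->MRCA, generations match->MRCA, genetic probability)
    "match is parent": (1, 0, 0.5),
    "first cousins": (2, 2, 0.5),
}

# The abstract says the two probabilities may be compared or added;
# this sketch adds them and sorts candidates by the combined score.
scores = {
    name: genetic + birth_year_density(gap, gt, gm)
    for name, (gt, gm, genetic) in candidates.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With equal genetic probabilities, the 30-year gap favors the parent placement, since a gap near zero is what the first-cousin placement would predict.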

Type: Grant

Filed: November 23, 2022

Date of Patent: July 23, 2024

Assignee: Ancestry.com DNA, LLC

Inventors: Jingwen Pei, Keith D. Noto

Global ancestry determination system

Patent number: 12040054

Abstract: An input genotype is divided into a plurality of windows, each including a sequence of SNPs. For each window, a diploid HMM is computed based on genotypes and/or phased haplotypes to determine a probability of a haplotype sequence being associated with a particular label. For example, the diploid HMM for a window is used to determine the emission probability that the window corresponds to a set of labels. An inter-window HMM, with a set of states for each window, is computed. Labels are assigned to the input genotype based on the inter-window HMM. Upper and lower bounds are estimated to produce a range of likely percentage values an input can be assigned to a given label. Confidence values are determined indicating a likelihood that an individual inherits DNA from a certain population. Maps are generated with polygons representing regions where a measure of ethnicity of population falls within specific ranges.

Type: Grant

Filed: May 13, 2020

Date of Patent: July 16, 2024

Assignee: Ancestry.com DNA, LLC

Inventors: Shiya Song, Keith D. Noto, Yong Wang

CATALOG-BASED DATA INHERITANCE DETERMINATION

Publication number: 20240061886

Abstract: A computing server may generate a catalog of overrepresented data strings from a database that stores a plurality of data instances. An overrepresented data string is a data string that matches to a number of data instances and the number exceeds a number threshold. The computing server may receive a target data instance that is to be compared to a related data instance. The computing server may determine one or more matched data strings that match between the target data instance and the related data instance. The computing server may compare the matched data strings to the catalog to exclude a subset of matched data strings that are matched to the overrepresented data strings. The computing server may determine a total length of the matched data strings excluding the subset of matched data strings that are matched to the overrepresented data strings.
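The catalog filter described in this abstract can be sketched in a few lines. This is an illustrative simplification, not the patented system: each data instance is modeled as a plain set of strings, and the function names, the toy database, and the threshold value are assumptions.

```python
from collections import Counter

def build_catalog(database, threshold):
    """Return the data strings that appear in more than `threshold` instances."""
    counts = Counter(s for strings in database.values() for s in set(strings))
    return {s for s, n in counts.items() if n > threshold}

def matched_length(target, related, catalog):
    """Total length of the strings shared by two instances,
    ignoring strings listed in the overrepresented catalog."""
    shared = set(target) & set(related)
    return sum(len(s) for s in shared if s not in catalog)

# Toy database: instance id -> set of data strings (all values hypothetical).
database = {
    "p1": {"AAAA", "CCGT"},
    "p2": {"AAAA", "GGTA"},
    "p3": {"AAAA", "CCGT"},
    "p4": {"AAAA"},
}
catalog = build_catalog(database, threshold=2)
total = matched_length(database["p1"], database["p3"], catalog)
print(catalog, total)
```

Here "AAAA" appears in all four instances and is excluded, so only the shared "CCGT" contributes to the total matched length.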

Type: Application

Filed: August 18, 2023

Publication date: February 22, 2024

Inventor: Keith D. Noto

DETECTING IBD EFFICIENTLY USING A DISTRIBUTED SYSTEM

Publication number: 20230317300

Abstract: Disclosed herein is a method that uses the RAM of multiple servers to increase the efficiency of identifying segments of a target dataset that match segments of other datasets in a database. An encoding system may encode large genetic datasets to produce pairs of bitmap sequence pairs that correspond to an encoding scheme. The servers each store portions of the database in their hard drives based on a shared characteristic of the genetic datasets in the database, such as ethnicity or location of birth. The servers encode data from their hard drives and sustain the encoded data in their RAM. A target, or query, individual is input for matching. The servers match the encoded data of the target individual with encoded data in their flash drives and can determine a relationship. The servers sustain the encoded data in RAM to compare against subsequent target individuals.

Type: Application

Filed: November 21, 2022

Publication date: October 5, 2023

Inventor: Keith D. Noto

IDENTIFICATION OF MATCHED SEGMENTS IN PAIRED DATASETS

Publication number: 20220382730

Abstract: Disclosed herein are processes that identify segments of a target dataset that match segments of other datasets in a database. A computing server may encode the target dataset to generate a pair of encoded target bitmap sequences based on an encoding scheme. The encoding scheme defines encoding values based on homogeneity between the pair of data value sequences. The computing server may compare the pair of encoded target bitmap sequences with other pairs of encoded bitmap sequences to identify homogeneous mismatched locations. A homogeneous mismatched location may be a location where the target dataset and the other dataset in comparison are both homogeneous but have different types of homogeneity at the location. The computing server may identify a matched segment between the target dataset and one of the other datasets based on the homogeneous mismatched locations identified. The matched segment is contained within two homogeneous mismatched locations.
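To make the bitmap comparison in this abstract concrete, the following sketch uses one possible encoding; the abstract does not specify the scheme, so everything here is an assumption. Each dataset is taken to be a list of values 0, 1, or 2, where 0 and 2 stand for the two homogeneous types and 1 for a mixed value, and Python integers serve as bitmaps. All function names are hypothetical.

```python
def encode(values):
    """Encode a sequence of 0/1/2 values as two bitmaps:
    one marking positions of type 0, one marking positions of type 2."""
    hom_a = hom_b = 0
    for i, v in enumerate(values):
        if v == 0:
            hom_a |= 1 << i
        elif v == 2:
            hom_b |= 1 << i
    return hom_a, hom_b

def mismatch_positions(x, y, n):
    """Positions where both datasets are homogeneous but of different types."""
    bits = (x[0] & y[1]) | (x[1] & y[0])
    return [i for i in range(n) if (bits >> i) & 1]

def longest_matched_segment(x, y, n):
    """Longest run of positions lying strictly between two mismatches
    (sequence ends are treated as boundaries). Returns a half-open range."""
    bounds = [-1] + mismatch_positions(x, y, n) + [n]
    lo, hi = max(zip(bounds, bounds[1:]), key=lambda p: p[1] - p[0])
    return lo + 1, hi

target = [0, 1, 2, 2, 0, 1, 0, 2]
other = [2, 1, 2, 1, 0, 0, 0, 0]
n = len(target)
mismatches = mismatch_positions(encode(target), encode(other), n)
segment = longest_matched_segment(encode(target), encode(other), n)
print(mismatches, segment)
```

The two bitwise ANDs find every conflicting position in one pass over machine words, which is the practical appeal of encoding the data as bitmaps before comparison.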

Type: Application

Filed: May 26, 2022

Publication date: December 1, 2022

Inventor: Keith D. Noto

Identifying ancestral relationships using a continuous stream of input

Patent number: 11335435

Abstract: Identification of inheritance-by-descent haplotype matches between individuals is described. A set of tables including word match, haplotypes and segment match tables are populated. DNA samples are received and stored. A word identification module extracts haplotype values from each sample. The word match table is indexed according to the unique combination of position and haplotype. Each column represents a different sample, and each cell indicates whether that sample includes that haplotype at that position. The haplotypes table includes the raw haplotype data for each sample. The segment match table is indexed by sample identifier, and columns represent other samples. Each cell is populated to indicate for each identified sample pair which position range(s) include matching haplotypes for both samples. The tables are persistently stored in databases of the matching system. As new sample data is received, each table is updated to include the newly received samples, and additional matching takes place.
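The word match table in this abstract is essentially an index keyed by position and haplotype value. The sketch below shows that idea with an in-memory dictionary; the class name, method name, and sample data are hypothetical, and persistence and the segment match table are left out.

```python
from collections import defaultdict

class WordMatchTable:
    """Index from (position, haplotype word) to the samples containing it."""

    def __init__(self):
        self._index = defaultdict(set)

    def add_sample(self, sample_id, words):
        """Add a sample given one haplotype word per position.

        Returns a dict mapping each previously stored sample to the
        positions where it shares a word with the new sample.
        """
        matches = defaultdict(list)
        for pos, word in enumerate(words):
            key = (pos, word)
            for other in self._index[key]:
                matches[other].append(pos)
            # Register the new sample only after collecting existing matches.
            self._index[key].add(sample_id)
        return dict(matches)

table = WordMatchTable()
first = table.add_sample("s1", ["ACG", "TTA", "GGC"])
second = table.add_sample("s2", ["ACG", "TTT", "GGC"])
print(first, second)
```

Because each new sample is matched against the index and then added to it, samples can arrive as a continuous stream without rebuilding the table.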

Type: Grant

Filed: October 4, 2018

Date of Patent: May 17, 2022

Assignee: Ancestry.com DNA, LLC

Inventors: Jake Kelly Byrnes, Aaron Ling, Keith D. Noto, Jeremy Pollack, Catherine Ann Ball, Kenneth Gregory Chahine

GENETIC AND GENEALOGICAL ANALYSIS FOR IDENTIFICATION OF BIRTH LOCATION AND SURNAME INFORMATION

Publication number: 20210183474

Abstract: A system identifies ancestral birth locations or surnames estimated to be associated with an individual's ancestors using an individual's genetic sample. The system identifies users who are genetic matches to the individual and determines whether and how often a birth location or surname appears in the pedigrees of those users. Birth locations or surnames that appear frequently throughout the pedigrees of genetically matching users may represent birth locations or surnames that are affiliated with the individual's ancestors. The system determines whether the frequency of appearance of a birth location or surname is statistically significant to eliminate biases for certain birth locations or surnames that appear more frequently than others. The birth location or surname may be provided to the individual based on an also-determined enrichment score.

Type: Application

Filed: February 24, 2021

Publication date: June 17, 2021

Inventors: Amir R. Kermany, Julie M. Granka, Keith D. Noto

Genetic and genealogical analysis for identification of birth location and surname information

Patent number: 10957422

Abstract: A system identifies ancestral birth locations or surnames estimated to be associated with an individual's ancestors using an individual's genetic sample. The system identifies users who are genetic matches to the individual and determines whether and how often a birth location or surname appears in the pedigrees of those users. Birth locations or surnames that appear frequently throughout the pedigrees of genetically matching users may represent birth locations or surnames that are affiliated with the individual's ancestors. The system determines whether the frequency of appearance of a birth location or surname is statistically significant to eliminate biases for certain birth locations or surnames that appear more frequently than others. The birth location or surname may be provided to the individual based on an also-determined enrichment score.

Type: Grant

Filed: July 6, 2016

Date of Patent: March 23, 2021

Assignee: Ancestry.com DNA, LLC

Inventors: Amir R. Kermany, Julie M. Granka, Keith D. Noto

CLUSTERING OF MATCHED SEGMENTS TO DETERMINE LINKAGE OF DATASET IN A DATABASE

Publication number: 20210034647

Abstract: A computer-implemented method for linking individuals' datasets in a database may include receiving a target individual dataset of a target individual and a plurality of additional individual datasets. A computing server may generate a plurality of sub-cluster pairs of first parental groups and second parental groups. At least one of sub-cluster pairs includes a first parental group of matched segments and a second parental group of matched segments. A computing server may link the first parental groups and the second parental groups across the plurality of sub-cluster pairs to generate at least one super-cluster of a parental side. A computing server may assign metadata to one or more additional individual datasets of the plurality of additional individual datasets. The metadata may specify that the one or more additional individual datasets are connected to the target individual dataset by the parental side of the super-cluster.

Type: Application

Filed: July 23, 2020

Publication date: February 4, 2021

Inventors: Thi Hong Luong Nguyen, Jingwen Pei, Harendra Guturu, Keith D. Noto

HAPLOTYPE PHASING MODELS

Publication number: 20200303035

Abstract: Novel haplotype cluster Markov models are used to phase genomic samples. After the models are built, they rapidly and accurately phase new samples without requiring that the new samples be used to re-build the models. The models set transition probabilities such that the probability for an appearance of any allele within any haplotype is a non-zero number. Furthermore, the most unlikely pairs of haplotypes are discarded from each model at each level until ε of the likelihood mass at each level is discarded. The models are also constructed such that contributing windows of SNPs partially overlap so that phasing decisions near one of the extreme ends of any model are not significantly determinative of the phase. Additionally, the models are configured such that two or more nodes can be merged during the building/updating procedure to consolidate haplotype clusters having similar distributions.

Type: Application

Filed: April 29, 2020

Publication date: September 24, 2020

Inventors: Catherine Ann Ball, Keith D. Noto, Kenneth G. Chahine, Mathew J. Barber, Yong Wang

REDUCING ERROR IN PREDICTED GENETIC RELATIONSHIPS

Publication number: 20200286591

Abstract: Systems, computer program products, and methods are disclosed for estimating a degree of ancestral relatedness between two individuals. The haplotype data for a population of individuals is divided into segment windows based on genetic markers, and matched segments for the haplotype data are generated. Each matched segment having a first cM width that exceeds a threshold cM width is included in counting the matched segments in each segment window. A weight associated with each segment window is estimated based on the count of matched segments in the associated segment window. A weighted sum of per-window cM widths for each matched segment is calculated based on the first cM width and the weights associated with the segment windows of the matched segment. The weighted sum of per-window cM widths is used to estimate a degree of ancestral relatedness between two individuals.
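The windowed weighting in this abstract can be sketched as follows. The abstract does not give the weight formula, so the one used here, `min(1, baseline / count)`, is purely an assumption, as are the function names, the window layout, and the example segments. Segments and windows are (start, end) pairs in cM coordinates.

```python
def window_counts(segments, windows, min_cm):
    """Count, per window, the matched segments wider than `min_cm`
    that overlap the window."""
    counts = [0] * len(windows)
    for start, end in segments:
        if end - start > min_cm:
            for i, (w_start, w_end) in enumerate(windows):
                if start < w_end and end > w_start:
                    counts[i] += 1
    return counts

def window_weights(counts, baseline):
    """Down-weight windows with many matches (hypothetical formula)."""
    return [min(1.0, baseline / c) if c else 1.0 for c in counts]

def weighted_width(segment, windows, weights):
    """Weighted sum of the segment's per-window cM widths."""
    start, end = segment
    total = 0.0
    for (w_start, w_end), weight in zip(windows, weights):
        overlap = min(end, w_end) - max(start, w_start)
        if overlap > 0:
            total += weight * overlap
    return total

windows = [(0, 10), (10, 20), (20, 30)]
segments = [(0, 12), (2, 9), (3, 10), (4, 10), (5, 25), (18, 21)]
counts = window_counts(segments, windows, min_cm=5)
weights = window_weights(counts, baseline=2)
width = weighted_width((5, 25), windows, weights)
print(counts, weights, width)
```

The segment from 5 to 25 cM has a raw width of 20 cM, but the part that falls in the crowded first window is discounted, so its weighted width is smaller.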

Type: Application

Filed: May 27, 2020

Publication date: September 10, 2020

Inventors: Mathew J. Barber, Yong Wang, Keith D. Noto, Kenneth G. Chahine, Catherine Ann Ball

GLOBAL ANCESTRY DETERMINATION SYSTEM

Publication number: 20200286579

Abstract: An input genotype is divided into a plurality of windows, each including a sequence of SNPs. For each window, a diploid HMM is computed based on genotypes and/or phased haplotypes to determine a probability of a haplotype sequence being associated with a particular label. For example, the diploid HMM for a window is used to determine the emission probability that the window corresponds to a set of labels. An inter-window HMM, with a set of states for each window, is computed. Labels are assigned to the input genotype based on the inter-window HMM. Upper and lower bounds are estimated to produce a range of likely percentage values an input can be assigned to a given label. Confidence values are determined indicating a likelihood that an individual inherits DNA from a certain population. Maps are generated with polygons representing regions where a measure of ethnicity of population falls within specific ranges.

Type: Application

Filed: May 13, 2020

Publication date: September 10, 2020

Inventors: Shiya Song, Keith D. Noto, Yong Wang

Reducing error in predicted genetic relationships

Patent number: 10720229

Abstract: Systems, computer program products, and methods are disclosed for estimating a degree of ancestral relatedness between two individuals. The haplotype data for a population of individuals is divided into segment windows based on genetic markers, and matched segments for the haplotype data are generated. Each matched segment having a first cM width that exceeds a threshold cM width is included in counting the matched segments in each segment window. A weight associated with each segment window is estimated based on the count of matched segments in the associated segment window. A weighted sum of per-window cM widths for each matched segment is calculated based on the first cM width and the weights associated with the segment windows of the matched segment. The weighted sum of per-window cM widths is used to estimate a degree of ancestral relatedness between two individuals.

Type: Grant

Filed: October 14, 2015

Date of Patent: July 21, 2020

Assignee: Ancestry.com DNA, LLC

Inventors: Mathew J. Barber, Yong Wang, Keith D. Noto, Kenneth G. Chahine, Catherine Ann Ball

Global ancestry determination system

Patent number: 10692587

Abstract: An input genotype is divided into a plurality of windows, each including a sequence of SNPs. For each window, a diploid HMM is computed based on genotypes and/or phased haplotypes to determine a probability of a haplotype sequence being associated with a particular label. For example, the diploid HMM for a window is used to determine the emission probability that the window corresponds to a set of labels. An inter-window HMM, with a set of states for each window, is computed. Labels are assigned to the input genotype based on the inter-window HMM. Upper and lower bounds are estimated to produce a range of likely percentage values an input can be assigned to a given label. Confidence values are determined indicating a likelihood that an individual inherits DNA from a certain population. Maps are generated with polygons representing regions where a measure of ethnicity of population falls within specific ranges.

Type: Grant

Filed: September 11, 2019

Date of Patent: June 23, 2020

Assignee: Ancestry.com DNA, LLC

Inventors: Shiya Song, Keith D. Noto, Yong Wang