Patents by Inventor Keith D. Noto

Keith D. Noto has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240061886
    Abstract: A computing server may generate a catalog of overrepresented data strings from a database that stores a plurality of data instances. An overrepresented data string is a data string that matches to a number of data instances and the number exceeds a number threshold. The computing server may receive a target data instance that is to be compared to a related data instance. The computing server may determine one or more matched data strings that match between the target data instance and the related data instance. The computing server may compare the matched data strings to the catalog to exclude a subset of matched data strings that are matched to the overrepresented data strings. The computing server may determine a total length of the matched data strings excluding the subset of matched data strings that are matched to the overrepresented data strings.
    Type: Application
    Filed: August 18, 2023
    Publication date: February 22, 2024
    Inventor: Keith D. Noto
  • Publication number: 20230317300
    Abstract: Disclosed herein is a method that uses the RAM of multiple servers to increase the efficiency of identifying segments of a target dataset that match segments of other datasets in a database. An encoding system may encode large genetic datasets to produce pairs of bitmap sequence pairs that correspond to an encoding scheme. The servers each store portions of the database in their hard drives based on a shared characteristic of the genetic datasets in the database, such as ethnicity or location of birth. The servers encode data from their hard drives and sustain the encoded data in their RAM. A target, or query, individual is input for matching. The servers match the encoded data of the target individual with encoded data in their flash drives and can determine a relationship. The servers sustain the encoded data in RAM to compare against subsequent target individuals.
    Type: Application
    Filed: November 21, 2022
    Publication date: October 5, 2023
    Inventor: Keith D. Noto
  • Publication number: 20220382730
    Abstract: Disclosed herein are processes that identify segments of a target dataset that match segments of other datasets in a database. A computing server may encode the target dataset to generate a pair of encoded target bitmap sequences based on an encoding scheme. The encoding scheme defines encoding values based on homogeneity between the pair of data value sequences. The computing server may compare the pair of encoded target bitmap sequences with other pairs of encoded bitmap sequences to identify homogeneous mismatched locations. A homogeneous mismatched location may be a location where the target dataset and the other dataset in comparison are both homogeneous but have different types of homogeneity at the location. The computing server may identify a matched segment between the target dataset and one of the other datasets based on the homogeneous mismatched locations identified. The matched segment is contained within two homogeneous mismatched locations.
    Type: Application
    Filed: May 26, 2022
    Publication date: December 1, 2022
    Inventor: Keith D. Noto
  • Patent number: 11335435
    Abstract: Identification of inheritance-by-descent haplotype matches between individuals is described. A set of tables including word match, haplotypes and segment match tables are populated. DNA samples are received and stored. A word identification module extracts haplotype values from each sample. The word match table is indexed according to the unique combination of position and haplotype. Each column represents a different sample, and each cell indicates whether that sample includes that haplotype at that position. The haplotypes table includes the raw haplotype data for each sample. The segment match table is indexed by sample identifier, and columns represent other samples. Each cell is populated to indicate for each identified sample pair which position range(s) include matching haplotypes for both samples. The tables are persistently stored in databases of the matching system. As new sample data is received, each table is updated to include the newly received samples, and additional matching takes place.
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: May 17, 2022
    Assignee: Ancestry.com DNA, LLC
    Inventors: Jake Kelly Byrnes, Aaron Ling, Keith D. Noto, Jeremy Pollack, Catherine Ann Ball, Kenneth Gregory Chahine
  • Publication number: 20210183474
    Abstract: A system identifies ancestral birth locations or surnames estimated to be associated with an individual's ancestors using an individual's genetic sample. The system identifies users who are genetic matches to the individual and determines whether and how often a birth location or surname appears in the pedigrees of those users. Birth locations or surnames that appear frequently throughout the pedigrees of genetically matching users may represent birth locations or surnames that are affiliated with the individual's ancestors. The system determines whether the frequency of appearance of a birth location or surname is statistically significant to eliminate biases for certain birth locations or surnames that appear more frequently than others. The birth location or surname may be provided to the individual based on an also-determined enrichment score.
    Type: Application
    Filed: February 24, 2021
    Publication date: June 17, 2021
    Inventors: Amir R. Kermany, Julie M. Granka, Keith D. Noto
  • Patent number: 10957422
    Abstract: A system identifies ancestral birth locations or surnames estimated to be associated with an individual's ancestors using an individual's genetic sample. The system identifies users who are genetic matches to the individual and determines whether and how often a birth location or surname appears in the pedigrees of those users. Birth locations or surnames that appear frequently throughout the pedigrees of genetically matching users may represent birth locations or surnames that are affiliated with the individual's ancestors. The system determines whether the frequency of appearance of a birth location or surname is statistically significant to eliminate biases for certain birth locations or surnames that appear more frequently than others. The birth location or surname may be provided to the individual based on an also-determined enrichment score.
    Type: Grant
    Filed: July 6, 2016
    Date of Patent: March 23, 2021
    Assignee: Ancestry.com DNA, LLC
    Inventors: Amir R. Kermany, Julie M. Granka, Keith D. Noto
  • Publication number: 20210034647
    Abstract: A computer-implemented method for linking individuals' datasets in a database may include receiving a target individual dataset of a target individual and a plurality of additional individual datasets. A computing server may generate a plurality of sub-cluster pairs of first parental groups and second parental groups. At least one of the sub-cluster pairs includes a first parental group of matched segments and a second parental group of matched segments. A computing server may link the first parental groups and the second parental groups across the plurality of sub-cluster pairs to generate at least one super-cluster of a parental side. A computing server may assign metadata to one or more additional individual datasets of the plurality of additional individual datasets. The metadata may specify that the one or more additional individual datasets are connected to the target individual dataset by the parental side of the super-cluster.
    Type: Application
    Filed: July 23, 2020
    Publication date: February 4, 2021
    Inventors: Thi Hong Luong Nguyen, Jingwen Pei, Harendra Guturu, Keith D. Noto
  • Publication number: 20200303035
    Abstract: Novel haplotype cluster Markov models are used to phase genomic samples. After the models are built, they rapidly and accurately phase new samples without requiring that the new samples be used to re-build the models. The models set transition probabilities such that the probability for an appearance of any allele within any haplotype is a non-zero number. Furthermore, the most unlikely pairs of haplotypes are discarded from each model at each level until ε of the likelihood mass at each level is discarded. The models are also constructed such that contributing windows of SNPs partially overlap so that phasing decisions near one of the extreme ends of any model are not significantly determinative of the phase. Additionally, the models are configured such that two or more nodes can be merged during the building/updating procedure to consolidate haplotype clusters having similar distributions.
    Type: Application
    Filed: April 29, 2020
    Publication date: September 24, 2020
    Inventors: Catherine Ann Ball, Keith D. Noto, Kenneth G. Chahine, Mathew J. Barber, Yong Wang
  • Publication number: 20200286579
    Abstract: An input genotype is divided into a plurality of windows, each including a sequence of SNPs. For each window, a diploid HMM is computed based on genotypes and/or phased haplotypes to determine a probability of a haplotype sequence being associated with a particular label. For example, the diploid HMM for a window is used to determine the emission probability that the window corresponds to a set of labels. An inter-window HMM, with a set of states for each window, is computed. Labels are assigned to the input genotype based on the inter-window HMM. Upper and lower bounds are estimated to produce a range of likely percentage values an input can be assigned to a given label. Confidence values are determined indicating a likelihood that an individual inherits DNA from a certain population. Maps are generated with polygons representing regions where a measure of ethnicity of population falls within specific ranges.
    Type: Application
    Filed: May 13, 2020
    Publication date: September 10, 2020
    Inventors: Shiya Song, Keith D. Noto, Yong Wang
  • Publication number: 20200286591
    Abstract: Systems, computer program products, and methods are disclosed for estimating a degree of ancestral relatedness between two individuals. The haplotype data for a population of individuals is divided into segment windows based on genetic markers, and matched segments for the haplotype data are generated. Each matched segment having a first cM width that exceeds a threshold cM width is included in counting the matched segments in each segment window. A weight associated with each segment window is estimated based on the count of matched segments in the associated segment window. A weighted sum of per-window cM widths for each matched segment is calculated based on the first cM width and the weights associated with the segment windows of the matched segment. The weighted sum of per-window cM widths is used to estimate a degree of ancestral relatedness between two individuals.
    Type: Application
    Filed: May 27, 2020
    Publication date: September 10, 2020
    Inventors: Mathew J. Barber, Yong Wang, Keith D. Noto, Kenneth G. Chahine, Catherine Ann Ball
  • Patent number: 10720229
    Abstract: Systems, computer program products, and methods are disclosed for estimating a degree of ancestral relatedness between two individuals. The haplotype data for a population of individuals is divided into segment windows based on genetic markers, and matched segments for the haplotype data are generated. Each matched segment having a first cM width that exceeds a threshold cM width is included in counting the matched segments in each segment window. A weight associated with each segment window is estimated based on the count of matched segments in the associated segment window. A weighted sum of per-window cM widths for each matched segment is calculated based on the first cM width and the weights associated with the segment windows of the matched segment. The weighted sum of per-window cM widths is used to estimate a degree of ancestral relatedness between two individuals.
    Type: Grant
    Filed: October 14, 2015
    Date of Patent: July 21, 2020
    Assignee: Ancestry.com DNA, LLC
    Inventors: Mathew J. Barber, Yong Wang, Keith D. Noto, Kenneth G. Chahine, Catherine Ann Ball
  • Patent number: 10692587
    Abstract: An input genotype is divided into a plurality of windows, each including a sequence of SNPs. For each window, a diploid HMM is computed based on genotypes and/or phased haplotypes to determine a probability of a haplotype sequence being associated with a particular label. For example, the diploid HMM for a window is used to determine the emission probability that the window corresponds to a set of labels. An inter-window HMM, with a set of states for each window, is computed. Labels are assigned to the input genotype based on the inter-window HMM. Upper and lower bounds are estimated to produce a range of likely percentage values an input can be assigned to a given label. Confidence values are determined indicating a likelihood that an individual inherits DNA from a certain population. Maps are generated with polygons representing regions where a measure of ethnicity of population falls within specific ranges.
    Type: Grant
    Filed: September 11, 2019
    Date of Patent: June 23, 2020
    Assignee: Ancestry.com DNA, LLC
    Inventors: Shiya Song, Keith D. Noto, Yong Wang
  • Patent number: 10679729
    Abstract: Novel haplotype cluster Markov models are used to phase genomic samples. After the models are built, they rapidly and accurately phase new samples without requiring that the new samples be used to re-build the models. The models set transition probabilities such that the probability for an appearance of any allele within any haplotype is a non-zero number. Furthermore, the most unlikely pairs of haplotypes are discarded from each model at each level until ε of the likelihood mass at each level is discarded. The models are also constructed such that contributing windows of SNPs partially overlap so that phasing decisions near one of the extreme ends of any model are not significantly determinative of the phase. Additionally, the models are configured such that two or more nodes can be merged during the building/updating procedure to consolidate haplotype clusters having similar distributions.
    Type: Grant
    Filed: October 19, 2015
    Date of Patent: June 9, 2020
    Assignee: Ancestry.com DNA, LLC
    Inventors: Catherine Ann Ball, Keith D. Noto, Kenneth G. Chahine, Mathew J. Barber, Yong Wang
  • Publication number: 20200160202
    Abstract: An input sample SNP genotype is divided into a plurality of windows, each including a sequence of SNPs. For each window, a diploid hidden Markov Model (HMM) is built from a haplotype Markov Model (MM). The diploid HMM for a window is used to determine the probability that the window corresponds to a pair of labels (e.g., ethnicity labels). An inter-window HMM, with a set of states for each window, is built based on the diploid HMMs for each window. Labels are assigned to the input sample genotype based on the inter-window HMM.
    Type: Application
    Filed: January 8, 2020
    Publication date: May 21, 2020
    Inventors: Keith D. Noto, Yong Wang
  • Publication number: 20200098445
    Abstract: Described are computational methods to reconstruct the chromosomes (and genomes) of ancestors given genetic data, IBD information, and full or partial pedigree information of some number of their descendants.
    Type: Application
    Filed: December 3, 2019
    Publication date: March 26, 2020
    Inventors: Julie M. Granka, Keith D. Noto
  • Publication number: 20200082903
    Abstract: An input genotype is divided into a plurality of windows, each including a sequence of SNPs. For each window, a diploid HMM is computed based on genotypes and/or phased haplotypes to determine a probability of a haplotype sequence being associated with a particular label. For example, the diploid HMM for a window is used to determine the emission probability that the window corresponds to a set of labels. An inter-window HMM, with a set of states for each window, is computed. Labels are assigned to the input genotype based on the inter-window HMM. Upper and lower bounds are estimated to produce a range of likely percentage values an input can be assigned to a given label. Confidence values are determined indicating a likelihood that an individual inherits DNA from a certain population. Maps are generated with polygons representing regions where a measure of ethnicity of population falls within specific ranges.
    Type: Application
    Filed: September 11, 2019
    Publication date: March 12, 2020
    Inventors: Shiya Song, Keith D. Noto, Yong Wang
  • Patent number: 10558930
    Abstract: An input sample SNP genotype is divided into a plurality of windows, each including a sequence of SNPs. For each window, a diploid hidden Markov Model (HMM) is built from a haplotype Markov Model (MM). The diploid HMM for a window is used to determine the probability that the window corresponds to a pair of labels (e.g., ethnicity labels). An inter-window HMM, with a set of states for each window, is built based on the diploid HMMs for each window. Labels are assigned to the input sample genotype based on the inter-window HMM.
    Type: Grant
    Filed: July 13, 2016
    Date of Patent: February 11, 2020
    Assignee: Ancestry.com DNA, LLC
    Inventors: Keith D. Noto, Yong Wang
  • Patent number: 10504611
    Abstract: Described are computational methods to reconstruct the chromosomes (and genomes) of ancestors given genetic data, IBD information, and full or partial pedigree information of some number of their descendants.
    Type: Grant
    Filed: October 19, 2015
    Date of Patent: December 10, 2019
    Assignee: Ancestry.com DNA, LLC
    Inventors: Julie M. Granka, Keith D. Noto
  • Publication number: 20190139624
    Abstract: Identification of inheritance-by-descent haplotype matches between individuals is described. A set of tables including word match, haplotypes and segment match tables are populated. DNA samples are received and stored. A word identification module extracts haplotype values from each sample. The word match table is indexed according to the unique combination of position and haplotype. Each column represents a different sample, and each cell indicates whether that sample includes that haplotype at that position. The haplotypes table includes the raw haplotype data for each sample. The segment match table is indexed by sample identifier, and columns represent other samples. Each cell is populated to indicate for each identified sample pair which position range(s) include matching haplotypes for both samples. The tables are persistently stored in databases of the matching system. As new sample data is received, each table is updated to include the newly received samples, and additional matching takes place.
    Type: Application
    Filed: October 4, 2018
    Publication date: May 9, 2019
    Inventors: Jake Kelly Byrnes, Aaron Ling, Keith D. Noto, Jeremy Pollack, Catherine Ann Ball, Kenneth Gregory Chahine
  • Patent number: 10114922
    Abstract: Identification of inheritance-by-descent haplotype matches between individuals is described. A set of tables including word match, haplotypes and segment match tables are populated. DNA samples are received and stored. A word identification module extracts haplotype values from each sample. The word match table is indexed according to the unique combination of position and haplotype. Each column represents a different sample, and each cell indicates whether that sample includes that haplotype at that position. The haplotypes table includes the raw haplotype data for each sample. The segment match table is indexed by sample identifier, and columns represent other samples. Each cell is populated to indicate for each identified sample pair which position range(s) include matching haplotypes for both samples. The tables are persistently stored in databases of the matching system. As new sample data is received, each table is updated to include the newly received samples, and additional matching takes place.
    Type: Grant
    Filed: September 17, 2013
    Date of Patent: October 30, 2018
    Assignee: Ancestry.com DNA, LLC
    Inventors: Jake Kelly Byrnes, Aaron Ling, Keith D. Noto, Jeremy Pollack, Catherine Ann Ball, Kenneth Gregory Chahine
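
The word match and segment match tables described in the abstract of patent 10114922 can be illustrated with a small sketch. This is not the patented implementation: the function names, the in-memory dictionaries, the fixed-width "words", and the toy sample data are all assumptions made for illustration, and real systems would work on phased haplotypes stored in persistent databases.

```python
from collections import defaultdict
from itertools import combinations


def build_word_match_table(samples):
    """Map (position, haplotype word) -> set of sample ids carrying that word.

    `samples` is a hypothetical dict: sample id -> list of haplotype words,
    one word per position.
    """
    table = defaultdict(set)
    for sample_id, words in samples.items():
        for position, word in enumerate(words):
            table[(position, word)].add(sample_id)
    return table


def build_segment_match_table(word_table):
    """Map (sample_a, sample_b) -> list of inclusive (start, end) position
    ranges over which both samples carry the same word."""
    positions_by_pair = defaultdict(set)
    for (position, _word), sample_ids in word_table.items():
        for pair in combinations(sorted(sample_ids), 2):
            positions_by_pair[pair].add(position)

    segments = {}
    for pair, positions in positions_by_pair.items():
        ranges = []
        for p in sorted(positions):
            if ranges and p == ranges[-1][1] + 1:
                # Extend the current run of consecutive matching positions.
                ranges[-1] = (ranges[-1][0], p)
            else:
                # Start a new run.
                ranges.append((p, p))
        segments[pair] = ranges
    return segments


# Toy data (assumed): three samples, four word positions each.
samples = {
    "s1": ["ACGT", "TTGA", "CCAG", "GATC"],
    "s2": ["ACGT", "TTGA", "AAAA", "GATC"],
    "s3": ["GGGG", "TTGA", "CCAG", "GATC"],
}
word_table = build_word_match_table(samples)
segment_table = build_segment_match_table(word_table)
print(segment_table[("s1", "s3")])
```

Keying the first table by position and word means that samples sharing a word at a position are found by a single lookup, which is the property the abstract relies on when newly received samples are added and matched against existing ones.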