Patents by Inventor Kai-How FARH
Kai-How FARH has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230207052
Abstract: A computer-implemented method of quantifying a strength of association of genes associated with a phenotype and a contribution of rare variants to a phenotype response by calculating a weighted burden score for a plurality of associated genes with a specified phenotype, wherein the burden score identifies consequential, non-random association in a cohort between carrier status of each of the associated genes and a phenotype response to presence in the associated genes of one or more rare pathogenic variants. Respective effective strength scores are determined for the consequential, non-random association for genes selected from the associated genes based on respective burden scores at per-gene resolution.
Type: Application
Filed: October 18, 2022
Publication date: June 29, 2023
Applicant: ILLUMINA, INC.
Inventors: Petko Plamenov FIZIEV, Jeremy Francis MCRAE, Kai-How FARH
-
Publication number: 20230207047
Abstract: The technology disclosed relates to generating species-differentiable evolutionary profiles using a weighting logic. In particular, the technology disclosed relates to determining a weighted summary statistic for a given residue category at a given position in a multiple sequence alignment based on one or more weights of one or more sequences in the multiple sequence alignment that have a residue of the given residue category at the given position.
Type: Application
Filed: October 27, 2022
Publication date: June 29, 2023
Applicants: Illumina, Inc., Illumina Cambridge Limited
Inventors: Tobias HAMP, Kai-How FARH
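Illustration (not part of the patent record): the abstract above does not say which summary statistic is used or how sequence weights are derived. A minimal sketch, assuming the statistic is a weighted frequency and that per-sequence weights are supplied by the caller, might look like the following. The function name `weighted_residue_frequency` and the toy alignment are hypothetical.

```python
def weighted_residue_frequency(msa, weights, position, residue):
    """Weighted share of aligned sequences carrying `residue` at `position`.

    msa      -- list of equal-length aligned sequences (strings)
    weights  -- one non-negative weight per sequence (assumed given)
    """
    if len(msa) != len(weights):
        raise ValueError("need exactly one weight per sequence")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not all be zero")
    # Sum the weights of only those sequences that match at this column.
    matched = sum(w for seq, w in zip(msa, weights) if seq[position] == residue)
    return matched / total


# Toy alignment: three sequences, the first weighted twice as heavily.
toy_msa = ["MKV", "MRV", "MKL"]
toy_weights = [1.0, 0.5, 0.5]
k_frequency = weighted_residue_frequency(toy_msa, toy_weights, 1, "K")
```

With uniform weights this reduces to the plain column frequency; unequal weights let some sequences (for example, those from particular species) count for more or less.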
-
Publication number: 20230207051
Abstract: A first reference genome is segmented into a plurality of bins and high-quality sequenced reads are mapped on a bin-by-bin basis to the plurality of bins in the first reference genome, and a second reference genome is segmented into a plurality of bins and high-quality sequenced reads are mapped on a bin-by-bin basis to the plurality of bins in the second reference genome. A best-mapped bin is identified in the second reference genome based on the greatest degree of match between the best-mapped bin in the second reference genome and a corresponding bin in the first reference genome.
Type: Application
Filed: September 23, 2022
Publication date: June 29, 2023
Applicants: Illumina, Inc., Illumina Cambridge Limited
Inventors: Hong GAO, Tobias HAMP, Joshua Goodwin Jon MCMASTER-SCHRAIBER, Laksshman SUNDARAM, Kai-How FARH
-
Publication number: 20230207054
Abstract: The technology disclosed relates to a deep learning network system for evolutionary conservation prediction. In one implementation, the system includes a first model for processing a first multiple sequence alignment that aligns a query sequence with a masked base at a target position to N non-query sequences and predicting a first identity of the masked base at the target position. The system also includes a second model for processing a second multiple sequence alignment that aligns the query sequence to M non-query sequences, where M>N, and predicting a second identity of the masked base at the target position. The system further includes an evolutionary conservation determination logic configured to measure an evolutionary conservation of the masked base at the target position based on the first and second identities of the masked base.
Type: Application
Filed: September 16, 2022
Publication date: June 29, 2023
Applicant: Illumina, Inc.
Inventors: Sabrina RASHID, Kai-How FARH
-
Publication number: 20230207057
Abstract: The technology disclosed relates to determining feasibility of using a reference genome of a non-target species for variant calling a sample of a target species. In particular, the technology disclosed relates to mapping sequenced reads of a sample of a target species to a reference genome of a non-target species to detect a first set of variants in the sequenced reads of the sample of the target species, and mapping the sequenced reads of the sample of the target species to a reference genome of a pseudo-target species to detect a second set of variants in the sequenced reads of the sample of the target species.
Type: Application
Filed: September 23, 2022
Publication date: June 29, 2023
Applicants: Illumina, Inc., Illumina Cambridge Limited
Inventors: Hong GAO, Tobias HAMP, Joshua Goodwin Jon MCMASTER-SCHRAIBER, Laksshman SUNDARAM, Kai-How FARH
-
Publication number: 20230207067
Abstract: A computer-implemented method of performing an optimized burden test for a particular gene, in which an optimal combination of a maximum allele count and a minimum pathogenicity score threshold that maximizes significance of burden testing for rare deleterious variants is determined using a grid search protocol. Each combination of maximum allele count and minimum pathogenicity score threshold is tested with a t-test to obtain effect size and p-value. The combination of allele count and pathogenicity score threshold with the most significant p-value is selected as the optimal parameters for the rare deleterious variant burden test for a particular gene.
Type: Application
Filed: October 18, 2022
Publication date: June 29, 2023
Applicant: ILLUMINA, INC.
Inventors: Petko Plamenov FIZIEV, Jeremy Francis MCRAE, Kai-How FARH
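Illustration (not part of the patent record): the grid search described in the abstract above can be sketched roughly as follows. This is a simplified sketch under stated assumptions, not the patented method. Carriers are assumed to be individuals holding at least one variant that passes both thresholds, and the p-value uses a Welch t statistic with a normal approximation so that the example needs only the standard library. The names `welch_t` and `grid_search_burden` and the toy cohort are hypothetical.

```python
import math
from itertools import product
from statistics import mean, variance


def welch_t(a, b):
    """Return (effect size, approximate two-sided p-value) for two samples.

    Assumption: the p-value comes from a normal approximation to the
    Welch t statistic, not from an exact t distribution.
    """
    effect = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return effect, 1.0  # degenerate samples: treat as uninformative
    return effect, math.erfc(abs(effect / se) / math.sqrt(2))


def grid_search_burden(individuals, max_allele_counts, min_scores):
    """Pick the threshold pair whose carrier/non-carrier test is most significant.

    individuals -- list of (phenotype, [(allele_count, pathogenicity_score), ...])
    """
    best = None
    for max_ac, min_score in product(max_allele_counts, min_scores):
        carriers, others = [], []
        for phenotype, variants in individuals:
            qualifies = any(ac <= max_ac and score >= min_score
                            for ac, score in variants)
            (carriers if qualifies else others).append(phenotype)
        if len(carriers) < 2 or len(others) < 2:
            continue  # too few samples to estimate a variance
        effect, p_value = welch_t(carriers, others)
        if best is None or p_value < best["p_value"]:
            best = {"max_allele_count": max_ac, "min_score": min_score,
                    "effect": effect, "p_value": p_value}
    return best


# Toy cohort: rare, high-scoring variants go with a raised phenotype.
cohort = [
    (5.0, [(1, 0.9)]), (5.2, [(2, 0.8)]), (4.8, [(1, 0.95)]),
    (1.0, [(50, 0.9)]), (1.1, [(1, 0.1)]), (0.9, []),
    (1.2, [(40, 0.2)]), (1.0, []),
]
best = grid_search_burden(cohort, max_allele_counts=[5, 100], min_scores=[0.0, 0.5])
```

Because many threshold pairs are tested per gene, the selected p-value is optimistic; the abstract does not say how that is handled, so this sketch does not attempt any correction.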
-
Publication number: 20230207060
Abstract: The technology disclosed relates to accessing a multiple sequence alignment that aligns a query residue sequence to a plurality of non-query residue sequences, applying a set of periodically-spaced masks to a first set of residues at a first set of positions in the multiple sequence alignment, and cropping a portion of the multiple sequence alignment that includes the set of periodically-spaced masks at the first set of positions, and a second set of residues at a second set of positions in the multiple sequence alignment to which the set of periodically-spaced masks is not applied. The first set of residues includes a residue-of-interest at a position-of-interest in the query residue sequence.
Type: Application
Filed: October 27, 2022
Publication date: June 29, 2023
Applicants: Illumina, Inc., Illumina Cambridge Limited
Inventors: Tobias HAMP, Anastasia Susanna Dagmar DIETRICH, Yibing WU, Jeffrey Mark EDE, Kai-How FARH
-
Publication number: 20230207064
Abstract: The technology disclosed relates to a system for inter-model prediction score recalibration. The system includes a first model that generates, based on evolutionary conservation summary statistics of amino acids in a reference protein sequence, a first set of pathogenicity scores with rankings for variants that mutate the reference sequence to alternate protein sequences. The system further includes a second model that generates, based on epistasis expressed by amino acid patterns spanning a multiple sequence alignment aligning the reference sequence to non-target sequences, a second set of pathogenicity scores with rankings for the variants. The system further includes a rank loss determination logic that determines a rank loss parameter by comparing the two sets of rankings, a loss function reconfiguration logic that reconfigures a loss function based on the rank loss parameter, and a training logic that uses the reconfigured loss function to train the first model.
Type: Application
Filed: September 16, 2022
Publication date: June 29, 2023
Applicants: Illumina, Inc., Illumina Cambridge Limited
Inventors: Tobias HAMP, Kai-How FARH
-
Publication number: 20230207061
Abstract: A system comprises chunking logic that chunks (or splits) a multiple sequence alignment (MSA) into chunks, first attention logic that attends to a representation of the chunks and produces a first attention output, first aggregation logic that produces a first aggregated output that contains those features in the first attention output that correspond to masked residues in the plurality of masked residues, mask revelation logic that produces an informed output based on the first aggregated output and a Boolean mask, second attention logic that attends to the informed output and produces a second attention output based on masked residues revealed by the Boolean mask, second aggregation logic that produces a second aggregated output that contains those features in the second attention output that correspond to masked residues concealed by the Boolean mask, and output logic that produces identifications of the masked residues based on the second aggregated output.
Type: Application
Filed: October 27, 2022
Publication date: June 29, 2023
Applicants: Illumina, Inc., Illumina Cambridge Limited
Inventors: Tobias HAMP, Anastasia Susanna Dagmar DIETRICH, Yibing WU, Jeffrey Mark EDE, Kai-How FARH
-
Publication number: 20230207132
Abstract: A computer-implemented method of predicting phenotypic shift in response to usage of a plurality of drugs on a plurality of phenotypes of a cohort of individuals with a plurality of confounders. The cohort of individuals has associated phenotype measurements, covariate measurements, and drug usage patterns for two separate time points. The phenotype measurements for the first and second time points are covariate-corrected and drug-usage corrected through the use of biostatistics.
Type: Application
Filed: October 18, 2022
Publication date: June 29, 2023
Applicant: ILLUMINA, INC.
Inventors: Petko Plamenov FIZIEV, Jeremy Francis MCRAE, Kai-How FARH
-
Publication number: 20230207055
Abstract: The technology disclosed relates to identifying differential selective constraint on a gene-by-gene basis between a target species and one or more non-target species. The disclosed systems and methods can use a population genetics model wherein an average selection coefficient per gene per species is estimated and further applied to estimate selective constraint. The disclosed systems and methods can use a generalized linear mixed model wherein depletion of missense variants per gene per species is estimated and further applied to estimate selective constraint. In some cases, the disclosed systems and methods can use various combinations of the components from the population genetics model or the generalized linear mixed model to identify the intersection of genes classified as having differential selective constraint by numerous approaches for validation purposes.
Type: Application
Filed: December 28, 2022
Publication date: June 29, 2023
Inventors: Hong GAO, Joshua Goodwin Jon MCMASTER-SCHRAIBER, Kai-How FARH
-
Publication number: 20230207058
Abstract: The technology disclosed relates to variant calling of sequenced reads of a sample of a target species against a reference genome of a pseudo-target species. Low-quality variants are identified as false positive variants that are present in the second set of variants but absent from the first set of variants.
Type: Application
Filed: September 23, 2022
Publication date: June 29, 2023
Applicants: Illumina, Inc., Illumina Cambridge Limited
Inventors: Hong GAO, Tobias HAMP, Joshua Goodwin Jon MCMASTER-SCHRAIBER, Laksshman SUNDARAM, Kai-How FARH
-
Publication number: 20230108368
Abstract: The technology disclosed relates to training a pathogenicity predictor.
Type: Application
Filed: September 26, 2022
Publication date: April 6, 2023
Applicants: Illumina, Inc., Illumina Cambridge Limited
Inventors: Tobias HAMP, Hong GAO, Kai-How FARH
-
Publication number: 20230108241
Abstract: The technology disclosed relates to determining pathogenicity of nucleotide variants. In particular, the technology disclosed relates to specifying a particular amino acid at a particular position in a protein as a gap amino acid, and specifying remaining amino acids at remaining positions in the protein as non-gap amino acids, generating a gapped spatial representation of the protein that includes spatial configurations of the non-gap amino acids, and excludes a spatial configuration of the gap amino acid, determining an evolutionary conservation at the particular position of respective amino acids of respective amino acid classes based at least in part on the gapped spatial representation, and based at least in part on the evolutionary conservation of the respective amino acids, determining a pathogenicity of respective nucleotide variants that respectively substitute the particular amino acid with the respective amino acids in alternate representations of the protein.
Type: Application
Filed: September 26, 2022
Publication date: April 6, 2023
Applicants: Illumina, Inc., Illumina Cambridge Limited
Inventors: Tobias HAMP, Hong GAO, Kai-How FARH
-
Publication number: 20230059877
Abstract: The technology disclosed relates to constructing a convolutional neural network-based classifier for variant classification. In particular, it relates to training a convolutional neural network-based classifier on training data using a backpropagation-based gradient update technique that progressively matches outputs of the convolutional neural network-based classifier with corresponding ground truth labels. The convolutional neural network-based classifier comprises groups of residual blocks. Each group of residual blocks is parameterized by a number of convolution filters in the residual blocks, a convolution window size of the residual blocks, and an atrous convolution rate of the residual blocks. The size of the convolution window varies between groups of residual blocks, and the atrous convolution rate varies between groups of residual blocks. The training data includes benign training examples and pathogenic training examples of translated sequence pairs generated from benign variants and pathogenic variants.
Type: Application
Filed: October 20, 2022
Publication date: February 23, 2023
Applicant: Illumina, Inc.
Inventors: Kishore JAGANATHAN, Kai-How FARH, Sofia KYRIAZOPOULOU PANAGIOTOPOULOU, Jeremy Francis MCRAE
-
Publication number: 20230047347
Abstract: The technology disclosed describes determination of which elements of a sequence are nearest to uniformly spaced cells in a grid, where the elements have element coordinates, and the cells have dimension-wise cell indices and cell coordinates. The determination includes generating an element-to-cells mapping that maps, to each of the elements, a subset of the cells. The subset of the cells mapped to a particular element in the sequence includes a nearest cell in the grid and one or more neighborhood cells in the grid, and the nearest cell is selected based on matching element coordinates of the particular element to the cell coordinates. The determination further includes generating a cell-to-elements mapping that maps, to each of the cells, a subset of the elements, and using the cell-to-elements mapping to determine, for each of the cells, a nearest element in the sequence.
Type: Application
Filed: October 26, 2022
Publication date: February 16, 2023
Applicants: Illumina, Inc., Illumina Cambridge Limited
Inventors: Tobias HAMP, Hong GAO, Kai-How FARH
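Illustration (not part of the patent record): the two-step mapping in the abstract above can be sketched as follows. This is a minimal sketch under several assumptions the abstract does not confirm: cell coordinates are taken to be index times spacing, the nearest cell is found by rounding, the neighborhood is a fixed index radius around it, and distances are Euclidean. The function name `nearest_elements` and its parameters are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def nearest_elements(elements, grid_shape, spacing=1.0, radius=1):
    """Map each reachable grid cell to the index of its nearest candidate element."""
    # Step 1: element-to-cells mapping (nearest cell plus its neighborhood).
    element_to_cells = {}
    for i, coords in enumerate(elements):
        nearest = tuple(min(max(round(c / spacing), 0), n - 1)
                        for c, n in zip(coords, grid_shape))
        cells = set()
        for offset in product(range(-radius, radius + 1), repeat=len(grid_shape)):
            cell = tuple(a + b for a, b in zip(nearest, offset))
            if all(0 <= idx < n for idx, n in zip(cell, grid_shape)):
                cells.add(cell)  # keep only cells inside the grid
        element_to_cells[i] = cells

    # Step 2: invert it into a cell-to-elements mapping.
    cell_to_elements = defaultdict(list)
    for i, cells in element_to_cells.items():
        for cell in cells:
            cell_to_elements[cell].append(i)

    # Step 3: per cell, keep the candidate closest to the cell coordinates.
    result = {}
    for cell, candidates in cell_to_elements.items():
        center = tuple(idx * spacing for idx in cell)
        result[cell] = min(candidates, key=lambda i: math.dist(elements[i], center))
    return result


# Three 2-D elements on a 4 x 4 grid with unit spacing.
points = [(0.1, 0.2), (1.2, 1.1), (2.9, 3.1)]
nearest = nearest_elements(points, grid_shape=(4, 4))
```

Cells that no element's neighborhood reaches are simply absent from the result, and the answer for a cell is only the nearest among its candidates, so the radius has to be large enough for the data at hand. The benefit is that each cell compares against a handful of elements instead of all of them.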
-
Publication number: 20230044917
Abstract: The technology disclosed relates to a variant pathogenicity prediction network. The variant pathogenicity classifier includes memory, a variant encoding sub-network, a protein contact map generation sub-network, and a pathogenicity scoring sub-network. The memory stores a reference amino acid sequence of a protein, and an alternative amino acid sequence of the protein that contains a variant amino acid caused by a variant nucleotide. The variant encoding sub-network is configured to process the alternative amino acid sequence, and generate a processed representation of the alternative amino acid sequence. The protein contact map generation sub-network is configured to process the reference amino acid sequence and the processed representation of the alternative amino acid sequence, and generate a protein contact map of the protein. The pathogenicity scoring sub-network is configured to process the protein contact map, and generate a pathogenicity indication of the variant amino acid.
Type: Application
Filed: July 28, 2022
Publication date: February 9, 2023
Applicant: ILLUMINA, INC.
Inventors: Chen CHEN, Hong GAO, Laksshman S. SUNDARAM, Kai-How FARH
-
Publication number: 20230045003
Abstract: The technology disclosed relates to a variant pathogenicity classifier. The variant pathogenicity classifier comprises memory and runtime logic. The memory stores (i) a reference amino acid sequence of a protein, (ii) an alternative amino acid sequence of the protein that contains a variant amino acid caused by a variant nucleotide, and (iii) a protein contact map of the protein. The runtime logic has access to the memory, and is configured to provide (i) the reference amino acid sequence, (ii) the alternative amino acid sequence, and (iii) the protein contact map as input to a first neural network, and to cause the first neural network to generate a pathogenicity indication of the variant amino acid as output in response to processing (i) the reference amino acid sequence, (ii) the alternative amino acid sequence, and (iii) the protein contact map.
Type: Application
Filed: July 28, 2022
Publication date: February 9, 2023
Applicant: ILLUMINA, INC.
Inventors: Chen CHEN, Hong GAO, Laksshman S. SUNDARAM, Kai-How FARH
-
Patent number: 11538555
Abstract: The technology disclosed relates to determining pathogenicity of nucleotide variants. In particular, the technology disclosed relates to specifying a particular amino acid at a particular position in a protein as a gap amino acid, and specifying remaining amino acids at remaining positions in the protein as non-gap amino acids. The technology disclosed further relates to generating a gapped spatial representation of the protein that includes spatial configurations of the non-gap amino acids, and excludes a spatial configuration of the gap amino acid, and determining a pathogenicity of a nucleotide variant based at least in part on the gapped spatial representation, and a representation of an alternate amino acid created by the nucleotide variant at the particular position.
Type: Grant
Filed: November 22, 2021
Date of Patent: December 27, 2022
Assignees: Illumina, Inc., Illumina Cambridge Limited
Inventors: Tobias Hamp, Hong Gao, Kai-How Farh
-
Publication number: 20220406411
Abstract: An artificial intelligence-based system comprises an input preparation module that accesses a sequence database and generates an input base sequence. The input base sequence comprises a target base sequence with target bases, wherein the target base sequence is flanked by a right base sequence with downstream context bases, and a left base sequence with upstream context bases. A sequence-to-sequence model processes the input base sequence and generates an alternative representation of the input base sequence. An output module processes the alternative representation of the input base sequence and produces at least one per-base output for each of the target bases in the target base sequence. The per-base output specifies, for a corresponding target base, signal levels of a plurality of epigenetic tracks.
Type: Application
Filed: September 18, 2020
Publication date: December 22, 2022
Applicant: Illumina, Inc.
Inventors: Sofia KYRIAZOPOULOU PANAGIOTOPOULOU, Kai-How FARH