Patents by Inventor Yee Him Cheung

Yee Him Cheung has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240095218
    Abstract: A method for compressing data includes obtaining a compression schema customized to a format of a delimited text file, and using the compression schema to parse the delimited text file into a plurality of data blocks, split each of the data blocks into a plurality of data units for efficient selective access, and compress the plurality of data units in the plurality of data blocks using different compression algorithms for improved compression ratio. The delimited file is split into a plurality of data blocks based on the region definitions in the schema. Each of the plurality of data blocks is split into the plurality of data units based on its respective data unit size specified in the schema. The plurality of data units in each of the plurality of data blocks are compressed using the different compression algorithms indicated by the compression instructions in the schema.
    Type: Application
    Filed: October 15, 2020
    Publication date: March 21, 2024
    Inventor: Yee Him Cheung
  • Patent number: 11916576
    Abstract: A method for controlling compression of data includes accessing genomic annotation data in one of a plurality of first file formats, extracting attributes from the genomic annotation data, dividing the genomic annotation data into multiple chunks, and processing the extracted attributes and chunks into correlated information. The method also includes selecting different compressors for the attributes and chunks identified in the correlated information and generating a file in a second file format that includes the correlated information and information indicative of the different compressors for the chunks and attributes indicated in the correlated information. The information indicative of the different compressors is processed into the second file format to allow selective decompression of the attributes and chunks indicated in correlated information.
    Type: Grant
    Filed: October 17, 2020
    Date of Patent: February 27, 2024
    Assignee: Koninklijke Philips N.V.
    Inventors: Shubham Chandak, Yee Him Cheung
  • Publication number: 20240038326
    Abstract: A method (100) for characterizing a relevance of one or more genes or pathways to a disease of an individual, comprising: (i) obtaining (110) a phenotype profile for the individual, comprising phenotypic characteristics, and differential gene and protein expression information; (ii) identifying (120) one or more database of stored phenotype profiles similar to the individual phenotype profile; (iii) determining (130) a relevance of a genetic pathway to the individual phenotype profile, based at least in part on a similarity between the genetic pathway's known disease/phenotype associations and a phenotype profile of the individual; (iv) determining (140) a relevance of a gene to the individual phenotype profile, based at least in part on a similarity between the gene's known disease/phenotype associations and a phenotype profile of the individual; and (v) reporting (150) one or more genetic pathways and/or one or more genes most relevant to the individual phenotype profile.
    Type: Application
    Filed: November 20, 2020
    Publication date: February 1, 2024
    Inventors: Yee Him Cheung, Jie Wu, Nevenka Dimitrova
  • Publication number: 20240013864
    Abstract: A method (100) for compressing and decompressing a data file, comprising: (i) receiving (120) a data file for compression comprising a plurality of different attributes; (ii) identifying (130) a first attribute of the plurality of different attributes; (iii) selecting (140) a plurality of compression types and/or configurations; (iv) compressing (150) at least some of the data from the received data file for the identified first attribute using each of the selected plurality of compression types and/or configurations; (v) determining (160) which one of the selected plurality of compression types and/or configurations is most suitable for compression; (vi) generating (170) a compression parameter data structure comprising an identification of the selected plurality of compression types and/or configurations; (vii) compressing (180) the data from the received data file for the first attribute to generate a compressed data file; and (viii) storing (190) the compression parameter data structure and the compressed
    Type: Application
    Filed: November 11, 2021
    Publication date: January 11, 2024
    Inventor: Yee Him Cheung
  • Patent number: 11854694
    Abstract: In patient cohort identification, clustering (30) of patients is performed using a patient comparison metric dependent on a set of features (24). Information is displayed on sample patients who are similar or dissimilar to a query patient according to the clustering. User inputted comparison values are received comparing the sample patients with the query patient. The set of features and/or feature weights are adjusted to generate an adjusted patient comparison metric having improved agreement with the user inputted comparison values. The clustering is repeated using the adjusted patient comparison metric. A patient cohort is identified from a cluster (34) containing the query patient produced by the last clustering repetition. The information on the sample patients may be shown by simultaneously displaying two or more graphical modality representations (70, 72, 74) each plotting the sample patients and the query patient against two or more features of the modality.
    Type: Grant
    Filed: March 8, 2017
    Date of Patent: December 26, 2023
    Assignee: Koninklijke Philips N.V.
    Inventors: Vartika Agrawal, Alexander Ryan Mankovich, Nevenka Dimitrova, Nilanjana Banerjee, Yee Him Cheung, Johanna Maria De Bont, Jozef Hieronymus Maria Raijmakers
  • Publication number: 20230377692
    Abstract: A method (100) for storing genomic data within a data structure comprising a file structure, comprising: (i) receiving (120) a genomic dataset comprising a plurality of fields or attributes of different data types; (ii) generating (130) an information metadata structure for the genomic dataset, comprising one or more of: information about an annotation table, including one or more user profiles and associated profile permission; analytics information configured to facilitate verification of data reproducibility; access history for the genomic dataset, configured to facilitate data traceability; and linkage information defining a relationship between the annotation table and one or more data objects; (iii) compressing (140) the genomic data and information metadata using a compression algorithm; and (iv) storing (150) the compressed genomic dataset and information metadata in a container data structure; wherein some or all of the annotation table is encrypted.
    Type: Application
    Filed: October 4, 2021
    Publication date: November 23, 2023
    Inventor: Yee Him Cheung
  • Publication number: 20230335224
    Abstract: A method (100) comprising: receiving (120) a genomic dataset comprising genomic data of one or more of a plurality of fields or attributes of different data; generating (130) a protection metadata structure for the genomic dataset, comprising one or more of: (i) specifications for selective encryption of one or more data components and regions of genomic data in an annotation table; (ii) specifications for selective signing of one or more data components and regions of genomic data in the annotation table; (iii) user key information; and (iv) access control policy; compressing (140) the genomic data and the protection metadata structure using one or more compression algorithms to generate a compressed genomic dataset and compressed protection metadata structure; and storing (150) the compressed genomic dataset and the compressed protection metadata structure in a container data structure in memory.
    Type: Application
    Filed: September 29, 2021
    Publication date: October 19, 2023
    Inventor: Yee Him Cheung
  • Publication number: 20230253074
    Abstract: A method and a system for decoding MPEG-G encoded data of genomic information, including: receiving MPEG-G encoded data; extracting encoding parameters; selecting an arithmetic decoding type based upon the extracted encoding parameters; selecting a predictor type specifying the method to obtain probabilities of symbols which were used for arithmetically encoding the data, based upon the extracted encoding parameters; selecting arithmetic coding contexts based upon the extracted encoding parameters; and decoding the encoded data using the selected predictor and the selected arithmetic coding contexts.
    Type: Application
    Filed: June 30, 2021
    Publication date: August 10, 2023
    Inventors: Shubham Chandak, Yee Him Cheung
  • Publication number: 20230223110
    Abstract: A method for calling variants in genetic data includes sorting nodes in a graph-based reference genome, assigning identification information to the sorted nodes, assigning depth values to respective ones of the sorted nodes, determining a reference genome path and one or more variation paths, and determining one or more variants in the graph-based reference genome based on the depth values assigned to nodes on the one or more variation paths.
    Type: Application
    Filed: September 24, 2020
    Publication date: July 13, 2023
    Inventors: Zafar Ahmad, Alex Ryan Mankovich, Yee Him Cheung
  • Publication number: 20230207069
    Abstract: A method (100) for packaging genomic data within a file structure, the method comprising: (i) receiving (110) a genomic dataset comprising genomic data; (ii) extracting (120) a plurality of attributes from the genomic dataset, wherein each of the plurality of attributes is defined within an attribute information table of the data structure; (iii) breaking (130) each attribute into a plurality of chunks of a predetermined size; (iv) indexing (140) each of the plurality of chunks in the master index of the data structure; (v) compressing (150) each of the plurality of chunks individually; and (vi) packaging (160) each compressed chunk within an allocated location as defined by the master index; wherein the data structure is configured such that each of the plurality of chunks can be decompressed individually.
    Type: Application
    Filed: March 31, 2021
    Publication date: June 29, 2023
    Inventor: Yee Him Cheung
  • Publication number: 20230061214
    Abstract: A system (400) configured to generate a variant profile and a gene expression profile from a single cell sample, comprising: variant validation data and gene expression comparison data; single cell DNA sequencing data comprising a plurality of verified variants; single cell RNA sequencing data comprising a gene expression profile for the sample; a processor (420) configured to: (i) validate the identified variants using the variant validation data by: comparing the identified variant to the validation data; and assigning a validated classification status to the variant if the variant corresponds to the validation data; (ii) compare the obtained gene expression data to the obtained expression comparison data; and (iii) generate, based on the comparison and using a projection function, a final gene expression profile for the single cell sample; and a user interface (440) configured to provide a report comprising the identified variants and the generated final gene expression profile.
    Type: Application
    Filed: January 13, 2021
    Publication date: March 2, 2023
    Inventors: Jie Wu, Yee Him Cheung
  • Publication number: 20230053844
    Abstract: A method for compressing information includes accessing a read of genomic sequencing data, aligning the read to a reference, generating alignment data based on alignment of the read, obtaining a set of contexts based on the alignment data, and compressing quality values corresponding to the alignment data based on the set of contexts. The alignment data may provide an indication of errors in the genomic sequencing data, and each of the quality values may provide an indication of a probability of error at one or more bases in the genomic sequencing data.
    Type: Application
    Filed: January 27, 2021
    Publication date: February 23, 2023
    Inventors: Shubham Chandak, Yee Him Cheung
  • Publication number: 20230011085
    Abstract: A method (100) for determining a copy number variation (CNV) profile, comprising: (i) receiving (110) sparse genome sequencing data; (ii) determining (120) an unadjusted CNV profile; (iii) normalizing (130) the unadjusted CNV profile; (iv) receiving (140) a range for possible ploidy and for a possible contamination rate; (v) determining (150) adjusted segmentation values for the CNV profile; (vi) determining (160) a plurality of adjustment scores comprising a distance between an adjusted segmentation value and a closest whole integer for a CNV call; (vii) comparing (170) the determined plurality of adjustment scores to one or more predetermined factors for selecting a CNV profile best fit; (viii) selecting (180) one of the plurality of adjustment scores as a best fit for the copy number variation profile of the tumor cells of the tumor; (ix) generating (190) an adjusted CNV profile report; and (x) reporting (192) the generated adjusted CNV profile report.
    Type: Application
    Filed: December 3, 2020
    Publication date: January 12, 2023
    Inventors: Jie Wu, Yee Him Cheung, Nevenka Dimitrova
  • Publication number: 20220406406
    Abstract: A method (100) for characterizing a functional impact of a plurality of variants, comprising: obtaining (110) information comprising at least a plurality of variants, gene expression information, copy number variation, and epigenetic effects; determining (120) a splice status for the variant; determining (130) a variant-based expression regulation status, comprising whether the variant has an effect on gene expression; determining (140) a gene-based expression regulation status, comprising an indication of whether the variant has a functional impact on a target gene; determining (150) a gene-based copy number variant (CNV) and epigenetic impact status, comprising whether one or both has an impact on expression of a gene; adjusting (160), based on the CNV and epigenetic impact status, the variant-based and/or the gene-based expression regulation status; and reporting (170) at least the adjusted variant-based and/or the adjusted gene-based expression regulation status for each of a plurality of variants and/or
    Type: Application
    Filed: November 26, 2020
    Publication date: December 22, 2022
    Inventors: Yee Him Cheung, Jie Wu, Nevenka Dimitrova
  • Publication number: 20220399079
    Abstract: A method (100) for characterizing variant expression status for variants identified from a genomic sample, comprising: (i) obtaining (110) DNA sequencing data for the genomic sample; (ii) obtaining (110) RNA sequencing data for the genomic sample, wherein the obtained RNA sequencing data further comprises expression data for each variant; (iii) merging (130) the aligned DNA and RNA sequencing data into a merged alignment; (iv) identifying (140) a plurality of variants relative to the reference genome to generate a set of variants; (v) characterizing (150) an RNA-editing and/or expression status for each of at least a plurality of variants, wherein the expression status comprises one of a plurality of allele-specific expression categorizations comprising expression information for an alternative allele of the variant and expression information for a reference allele of the variant if there is one; and (vi) generating (160) a report comprising the characterized expression status for the variants.
    Type: Application
    Filed: November 5, 2020
    Publication date: December 15, 2022
    Inventors: Yee Him Cheung, Jie Wu, Nevenka Dimitrova
  • Publication number: 20220382904
    Abstract: A method of securely accessing a database with sensitive data, such as the clinical information of patients, by a client in a privacy-preserving manner, including: communicating with the server to obtain tags for specific attribute-value pairs when the client is authorized to make a query; imposing a tag quota per client and restricting tag generation to authorized query terms with valid digital signatures from a third-party authority; storing the tags and their associated query terms in confidence for future queries; sending a combination of tags that define the terms of a conjunctive query over a secure channel to a proxy; receiving from the proxy encrypted coefficients of a polynomial whose roots are indices to the query results; decrypting the encrypted coefficients in a first protocol with the server; calculating the roots of the polynomial based upon the decrypted coefficients and discarding any superfluous roots; obtaining the encrypted records associated with the calculated roots from the proxy; and d
    Type: Application
    Filed: August 21, 2020
    Publication date: December 1, 2022
    Inventors: Yee Him Cheung, Alex Ryan Mankovich
  • Publication number: 20220368347
    Abstract: A method for controlling compression of data includes accessing genomic annotation data in one of a plurality of first file formats, extracting attributes from the genomic annotation data, dividing the genomic annotation data into multiple chunks, and processing the extracted attributes and chunks into correlated information. The method also includes selecting different compressors for the attributes and chunks identified in the correlated information and generating a file in a second file format that includes the correlated information and information indicative of the different compressors for the chunks and attributes indicated in the correlated information. The information indicative of the different compressors is processed into the second file format to allow selective decompression of the attributes and chunks indicated in correlated information.
    Type: Application
    Filed: October 17, 2020
    Publication date: November 17, 2022
    Inventors: Shubham Chandak, Yee Him Cheung
  • Publication number: 20220359038
    Abstract: A method for storing, by a processor, a genomic graph representing a plurality of individual genomes, including: storing a linear representation of a reference genome in a data storage; receiving a first genome; identifying variations in the first genome from the reference genome; generating graph edges for each variation in the first genome from the reference genome; generating for each generated graph edge: an edge identifier that uniquely identifies the current edge in the genome graph; a start edge identifier that identifies the edge from which the current edge branches out; a start position that indicates the position on the start edge that serves as an anchoring point for the current edge; an end edge identifier that identifies the edge into which the current edge joins in; an end position that indicates the position on the end edge that serves as an anchoring point for the current edge; and a sequence indicating the nucleotide sequence of the current edge; and storing the edge identifier, start edge id
    Type: Application
    Filed: September 29, 2020
    Publication date: November 10, 2022
    Inventor: Yee Him Cheung
  • Publication number: 20220310271
    Abstract: A computer-implemented method for constructing a state transition graph, wherein the method includes obtaining data that includes treatment history and clinical data of a cohort of patients; and generating, by the one or more computing devices, individual treatment pathways for individual patients of the cohort of patients using the treatment history and clinical data for the individual patients; wherein the individual treatment pathways are generated using user-defined parameters including: one or more qualifying events; one or more response states to the one or more qualifying events; and one or more reversible or collapsible events. The method additionally includes constructing a state transition graph that represents multiple aligned and merged individual treatment pathways including the one or more qualifying events, the one or more response states to the one or more qualifying events and the one or more reversible or collapsible events.
    Type: Application
    Filed: August 21, 2020
    Publication date: September 29, 2022
    Inventors: Yee Him Cheung, Alex Ryan Mankovich
  • Publication number: 20220277857
    Abstract: A computer-implemented method for constructing a state transition graph, wherein the method includes obtaining data that includes treatment history and clinical data of a cohort of patients; and generating, by the one or more computing devices, individual treatment pathways for individual patients of the cohort of patients using the treatment history and clinical data for the individual patients; wherein the individual treatment pathways are generated using user-defined parameters including: one or more qualifying events; one or more response states to the one or more qualifying events; and one or more reversible or collapsible events. The method additionally includes constructing a state transition graph that represents multiple aligned and merged individual treatment pathways including the one or more qualifying events, the one or more response states to the one or more qualifying events and the one or more reversible or collapsible events.
    Type: Application
    Filed: August 20, 2020
    Publication date: September 1, 2022
    Inventors: Yee Him Cheung, Alexander Ryan Mankovich
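
Several abstracts above (for example publication 20230207069 and patent 11916576) describe splitting each attribute into chunks, compressing the chunks individually, and recording their locations in an index so that a single chunk can be decompressed on its own. The following is a minimal sketch of that general idea, not the patented file format. The function names `package` and `read_chunk`, the in-memory index layout, and the use of zlib are all assumptions made for illustration.

```python
import zlib


def package(attributes, chunk_size):
    """Compress each attribute in fixed-size chunks and index every chunk.

    `attributes` maps an attribute name to its raw bytes. The returned
    index maps each name to a list of (offset, length) pairs, one per
    compressed chunk, pointing into the returned payload.
    """
    payload = bytearray()
    master_index = {}
    for name, data in attributes.items():
        entries = []
        for start in range(0, len(data), chunk_size):
            blob = zlib.compress(data[start:start + chunk_size])
            entries.append((len(payload), len(blob)))
            payload += blob
        master_index[name] = entries
    return bytes(payload), master_index


def read_chunk(payload, master_index, name, chunk_no):
    """Decompress one chunk without touching any other chunk."""
    offset, length = master_index[name][chunk_no]
    return zlib.decompress(payload[offset:offset + length])


# Hypothetical annotation columns, chunked every 8 bytes.
attributes = {"POS": b"100,200,300,400,500,600", "REF": b"ACGTACGTACGT"}
payload, index = package(attributes, chunk_size=8)
print(read_chunk(payload, index, "POS", 1))  # b'300,400,'
```

Because every chunk is an independent zlib stream, a reader only needs the index entry for the chunk it wants. The trade-off is a somewhat lower compression ratio than compressing each attribute as one stream.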
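
Publication 20240013864 above describes trying several compression types on data for one attribute, determining which is most suitable, and storing a parameter structure alongside the compressed data. A rough sketch of that selection step follows. It assumes that "most suitable" means "smallest output on a sample", which the abstract does not state, and the names `CANDIDATES`, `compress_attribute`, and `decompress_attribute` are hypothetical.

```python
import bz2
import lzma
import zlib

# Candidate compressors from the Python standard library (an assumption;
# the abstract does not name specific algorithms).
CANDIDATES = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}
DECOMPRESSORS = {"zlib": zlib.decompress, "bz2": bz2.decompress, "lzma": lzma.decompress}


def compress_attribute(data, sample_size=4096):
    """Pick the candidate that compresses a sample best, then compress all data."""
    sample = data[:sample_size]
    sizes = {name: len(fn(sample)) for name, fn in CANDIDATES.items()}
    selected = min(sizes, key=sizes.get)
    # Parameter structure stored next to the compressed data so that a
    # reader knows which decompressor to use.
    params = {"candidates": list(CANDIDATES), "selected": selected}
    return params, CANDIDATES[selected](data)


def decompress_attribute(params, blob):
    """Restore the attribute using the compressor recorded in `params`."""
    return DECOMPRESSORS[params["selected"]](blob)


# Hypothetical tab-delimited attribute data.
data = b"chr1\t12345\tA\tG\n" * 200
params, blob = compress_attribute(data)
assert decompress_attribute(params, blob) == data
```

Testing only a sample keeps the selection cheap for large attributes, at the risk of choosing a compressor that is not the best for the full data.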
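
Publication 20220359038 above lists the fields stored for each edge of a genome graph: an edge identifier, start and end edge identifiers, start and end positions, and a nucleotide sequence. The sketch below models one such record and shows how a single variation edge could be spelled out against a linear reference. The class and function names, the 0-based half-open position convention, and the use of identifier 0 for the reference are assumptions for illustration and are not taken from the application.

```python
from dataclasses import dataclass

REFERENCE_EDGE_ID = 0  # assumed identifier for the linear reference


@dataclass(frozen=True)
class Edge:
    edge_id: int        # uniquely identifies this edge in the graph
    start_edge_id: int  # edge from which this edge branches out
    start_pos: int      # number of bases kept on the start edge before branching
    end_edge_id: int    # edge into which this edge joins
    end_pos: int        # index of the first base used on the end edge after joining
    sequence: str       # nucleotide sequence carried by this edge


def apply_edge(reference, edge):
    """Return the sequence that follows the reference, detours along `edge`,
    and then rejoins the reference. Only reference-anchored edges are handled."""
    if edge.start_edge_id != REFERENCE_EDGE_ID or edge.end_edge_id != REFERENCE_EDGE_ID:
        raise ValueError("this sketch only handles edges anchored on the reference")
    return reference[:edge.start_pos] + edge.sequence + reference[edge.end_pos:]


reference = "ACGTACGT"
snp = Edge(edge_id=1, start_edge_id=0, start_pos=3, end_edge_id=0, end_pos=4, sequence="G")
deletion = Edge(edge_id=2, start_edge_id=0, start_pos=2, end_edge_id=0, end_pos=4, sequence="")
print(apply_edge(reference, snp))       # ACGGACGT
print(apply_edge(reference, deletion))  # ACACGT
```

Storing only the variation edges keeps the reference in linear form, so each additional genome costs space roughly in proportion to how much it differs from the reference.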