Patents by Inventor Jimeng Sun

Jimeng Sun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240105292
    Abstract: In one aspect, the present disclosure relates to a platform for creating synthetic electronic health records, the platform being configured to perform operations including receiving EHR data and encoding the received EHR data as a plurality of fixed length vectors to form a fixed-length matrix. The platform provides the fixed-length matrix to a machine learning model as input to produce a plurality of visit history representations. For one or more particular visit history representations, of the plurality of visit history representations, the platform applies code information associated with the particular visit history. One or more appended visit histories are provided to one or more masked linear layers to produce a probability matrix comprising probabilities for each code for each visit. The platform produces one or more synthetic EHRs based on repeated sequential generation of and sampling from the probability matrix.
    Type: Application
    Filed: August 29, 2023
    Publication date: March 28, 2024
    Inventors: Brandon Philip Theodorou, Jimeng Sun
  • Publication number: 20240045994
    Abstract: An example embodiment may involve obtaining text-based, ground truth electronic health records (EHRs), wherein the ground truth EHRs specify a sequence of medical visits involving a plurality of modalities, and wherein each of the medical visits specifies tokens representing at least one of the modalities; generating a training data set by perturbing the ground truth EHRs, wherein perturbing the ground truth EHRs involves deleting or shuffling some of the tokens in the ground truth EHRs; and iteratively applying a machine learning trainer application to the training data set, wherein the machine learning trainer application includes: (i) a bidirectional language model encoder that takes EHRs within the training data set and produces vector embeddings therefrom, (ii) an autoregressive language model decoder that takes the vector embeddings and infers predicted EHRs therefrom, and (iii) a loss function that compares the predicted EHRs to their corresponding ground truth EHRs.
    Type: Application
    Filed: July 5, 2023
    Publication date: February 8, 2024
    Inventors: Jimeng Sun, Zifeng Wang
  • Publication number: 20230034559
    Abstract: A system for prediction of clinical trial outcome. The system includes: a processor of a trial prediction (TP) node connected to at least one cloud server node over a network configured to host a machine learning (ML) module; a memory on which are stored machine-readable instructions that, when executed by the processor, cause the processor to: receive clinical trial (CT) data, parse the CT data to derive drug molecules data, disease information data, and trial protocols data, encode the drug molecules data, the disease information data, and the trial protocols data into corresponding embeddings, generate knowledge pre-trained embeddings using external knowledge data, and provide the knowledge pre-trained embeddings to the ML module for prediction of the CT outcome.
    Type: Application
    Filed: May 19, 2022
    Publication date: February 2, 2023
    Inventors: Tianfan Fu, Kexin Huang, Jimeng Sun
  • Patent number: 11315685
    Abstract: A method of building a machine learning pipeline for predicting the efficacy of anti-epilepsy drug treatment regimens is provided.
    Type: Grant
    Filed: January 25, 2017
    Date of Patent: April 26, 2022
    Assignee: UCB Biopharma SRL
    Inventors: Kunal Malhotra, Sungtae An, Jimeng Sun, Myung Choi, Cynthia Dilley, Chris Clark, Joseph Robertson, Edward Han-Burgess
  • Patent number: 11195133
    Abstract: Systems and methods for individual risk factor identification include identifying common risk factors for one or more risk targets from population data. Individuals are stratified into clusters based upon the common risk factors. A discriminability of each of the common risk factors is determined, using a processor, for a target cluster using individual data of the target cluster to provide re-ranked common risk factors as individual risk factors for the target cluster, such that the discriminability is a measure of how a risk factor discriminates its cluster from other clusters.
    Type: Grant
    Filed: May 9, 2018
    Date of Patent: December 7, 2021
    Assignee: International Business Machines Corporation
    Inventors: David H. Gotz, Pei-Yun S. Hsueh, Jianying Hu, Jimeng Sun
  • Patent number: 10832821
    Abstract: A system and method for providing a temporally dynamic model parameter include building a model parameter by minimizing a loss function based on patient measurements taken at a plurality of time points. Temporally related values of the model parameter are identified, using a processor, having a same type of patient measurement taken at different time points. At least one value of the model parameter and temporally related values of the at least one value are selected to provide a temporally dynamic model parameter.
    Type: Grant
    Filed: August 19, 2013
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Shahram Ebadollahi, Jianying Hu, Jimeng Sun, Fei Wang, Jiayu Zhou
  • Patent number: 10535007
    Abstract: A method for determining a similarity between a plurality of graphs includes inferring a low-rank representation of a first graph, inferring a low-rank representation of a second graph, wherein the low-rank representations of the first and second graphs are stored in memory, estimating a left interaction between the first and second graphs, estimating a middle interaction between the first and second graphs, estimating a right interaction between the first and second graphs, wherein the estimations are based on the low-rank representations of the first and second graphs stored in memory, and aggregating the left interaction, the middle interaction and the right interaction into a kernel, wherein the kernel is indicative of the similarity between the first and second graphs.
    Type: Grant
    Filed: January 17, 2013
    Date of Patent: January 14, 2020
    Assignee: International Business Machines Corporation
    Inventors: U Kang, Ravindranath Konuru, Hanghang Tong, Jimeng Sun
  • Publication number: 20180260925
    Abstract: Systems and methods for individual risk factor identification include identifying common risk factors for one or more risk targets from population data. Individuals are stratified into clusters based upon the common risk factors. A discriminability of each of the common risk factors is determined, using a processor, for a target cluster using individual data of the target cluster to provide re-ranked common risk factors as individual risk factors for the target cluster, such that the discriminability is a measure of how a risk factor discriminates its cluster from other clusters.
    Type: Application
    Filed: May 9, 2018
    Publication date: September 13, 2018
    Inventors: David H. Gotz, Pei-Yun S. Hsueh, Jianying Hu, Jimeng Sun
  • Publication number: 20180211012
    Abstract: A method of building a machine learning pipeline for predicting the efficacy of anti-epilepsy drug treatment regimens is provided.
    Type: Application
    Filed: January 25, 2017
    Publication date: July 26, 2018
    Inventors: Kunal Malhotra, Sungtae An, Jimeng Sun, Myung Choi, Cynthia Dilley, Chris Clark, Joseph Robertson, Edward Han-Burgess
  • Publication number: 20180211010
    Abstract: A method of building a machine learning pipeline for predicting refractoriness of epilepsy patients is provided. The method includes providing electronic health records data; constructing a patient cohort from the electronic health records data by selecting patients based on failure of at least one anti-epilepsy drug; constructing a set of features found in or derived from the electronic health records data; electronically processing the patient cohort to identify a subset of the features that are predictive for refractoriness for inclusion in a predictive model configured for classifying patients as refractory or non-refractory; and training the predictive computerized model to classify the patients having at least one anti-epilepsy drug failure based on likelihood of becoming refractory.
    Type: Application
    Filed: January 23, 2017
    Publication date: July 26, 2018
    Inventors: Kunal Malhotra, Sungtae An, Jimeng Sun, Myung Choi, Cynthia Dilley, Chris Clark, Joseph Robertson, Edward Han-Burgess
  • Patent number: 9996889
    Abstract: Systems and methods for individual risk factor identification include identifying common risk factors for one or more risk targets from population data. Individuals are stratified into clusters based upon the common risk factors. A discriminability of each of the common risk factors is determined, using a processor, for a target cluster using individual data of the target cluster to provide re-ranked common risk factors as individual risk factors for the target cluster, such that the discriminability is a measure of how a risk factor discriminates its cluster from other clusters.
    Type: Grant
    Filed: October 1, 2012
    Date of Patent: June 12, 2018
    Assignee: International Business Machines Corporation
    Inventors: David H. Gotz, Pei-Yun S. Hsueh, Jianying Hu, Jimeng Sun
  • Patent number: 9805307
    Abstract: A method for determining a correspondence between a first node set of a first graph and a second node set of a second graph includes building a feature representation for each of the first graph and the second graph, and inferring the correspondence between the first node set and the second node set based on the feature representations.
    Type: Grant
    Filed: January 17, 2013
    Date of Patent: October 31, 2017
    Assignee: International Business Machines Corporation
    Inventors: U Kang, Ravindranath Konuru, Jimeng Sun, Hanghang Tong
  • Patent number: 9600795
    Abstract: Common sub-process patterns in a plurality of deployed process models may be discovered, and performance measures associated with the sub-process patterns may be computed based on runtime events of the deployed process models. Positive or negative performance patterns among sub-process patterns may be identified and used for creating new process models or improving existing process models.
    Type: Grant
    Filed: April 9, 2012
    Date of Patent: March 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Steve Demuth, Aliza R. Heching, Jimeng Sun, Judah M. Diament
  • Patent number: 9396439
    Abstract: A system and method for a composite distance metric leveraging multiple expert judgments includes inputting a data distribution of multiple expert judgments stored on a computer readable storage medium. Base distance metrics are converted into neighborhoods for comparison, wherein each base distance metric represents an expert and each neighborhood represents an individual similarity measure of the expert. The neighborhoods are combined to leverage the local discriminalities of all base distance metrics by applying at least one iterative process to output a composite distance metric.
    Type: Grant
    Filed: March 10, 2015
    Date of Patent: July 19, 2016
    Assignee: International Business Machines Corporation
    Inventors: Shahram Ebadollahi, Jimeng Sun, Fei Wang
  • Patent number: 9390194
    Abstract: Methods and apparatus are provided for multi-faceted visualization of rich text corpora. A data set comprising a plurality of entities, facets and relations is visualized by generating a visualization of a plurality of the facets in the data set, wherein the visualization indicates connections along the plurality of the facets in a single view using multi-faceted edges. The entities are instances of a particular concept, the facets are classes of entities and the relations are connections between pairs of the entities. A compound node comprises a representation of a primary entity, surrounded by representations of one or more secondary entities connected by one or more external relations. The internal relations can be represented as edges connecting two facet nodes from different compound nodes and a number of crossings of the edges can be reduced by adjusting a position order of facet nodes.
    Type: Grant
    Filed: August 31, 2010
    Date of Patent: July 12, 2016
    Assignee: International Business Machines Corporation
    Inventors: Nan Cao, David H. Gotz, Jimeng Sun
  • Patent number: 9342579
    Abstract: Visualization techniques are provided for a clustered multidimensional dataset. A data set is visualized by obtaining a clustering of a multidimensional dataset comprising a plurality of entities, wherein the entities are instances of a particular concept and wherein each entity comprises a plurality of features; and generating an icon for at least one of the entities, the icon having a plurality of regions, wherein each region corresponds to one of the features of the at least one entity, and wherein a size of each region is based on a value of the corresponding feature. Each icon can convey statistical measures. A stabilized Voronoi-based icon layout algorithm is optionally employed. Icons can be embedded in a visualization of the multidimensional dataset. A hierarchical encoding scheme can be employed to encode a data cluster into the icon, such as a hierarchy of cluster, feature type and entity.
    Type: Grant
    Filed: May 31, 2011
    Date of Patent: May 17, 2016
    Assignee: International Business Machines Corporation
    Inventors: Nan Cao, David H. Gotz, Jimeng Sun
  • Patent number: 9292575
    Abstract: Dynamically aggregating data is provided. A server device receives a subscriber request for a report based on a subset of metadata contained in a data dimensions catalog. The server device analyzes data aggregation requirements from a plurality of data sources for the report based on the subset of metadata defined in the subscriber request. The server device generates a data access plan for movement of data from the plurality of data sources based on the data aggregation requirements for the report. Then, the server device executes the data access plan to fetch the data from the plurality of data sources based on the data aggregation requirements for the report.
    Type: Grant
    Filed: November 19, 2010
    Date of Patent: March 22, 2016
    Assignee: International Business Machines Corporation
    Inventors: Abhijit Bose, Mithkal M. Smadi, Jimeng Sun, Chandra Kumar Velpuri
  • Patent number: 9087117
    Abstract: The invention provides a method and system for visualization of a data set, the method comprises: dividing the data set into a plurality of information layers based on different information dimensions; and visually processing the plurality of information layers based on different information dimensions, respectively, in order to present respective views of the plurality of information layers. In the present invention, by visualizing the data set through presenting different overviews of the data set from different information dimensions, respectively, the presentation of comprehensive information of the data set to a data set analyst is ensured while distortion of presented contents as well as visual clutter are prevented.
    Type: Grant
    Filed: November 1, 2010
    Date of Patent: July 21, 2015
    Assignee: International Business Machines Corporation
    Inventors: Nan Cao, Lei Shi, Jimeng Sun, Wei Hong Qian, Shixia Liu
  • Publication number: 20150186788
    Abstract: A system and method for a composite distance metric leveraging multiple expert judgments includes inputting a data distribution of multiple expert judgments stored on a computer readable storage medium. Base distance metrics are converted into neighborhoods for comparison, wherein each base distance metric represents an expert and each neighborhood represents an individual similarity measure of the expert. The neighborhoods are combined to leverage the local discriminalities of all base distance metrics by applying at least one iterative process to output a composite distance metric.
    Type: Application
    Filed: March 10, 2015
    Publication date: July 2, 2015
    Inventors: Shahram Ebadollahi, Jimeng Sun, Fei Wang
  • Patent number: 8996443
    Abstract: A system and method for a composite distance metric leveraging multiple expert judgments includes inputting a data distribution of multiple expert judgments stored on a computer readable storage medium. Base distance metrics are converted into neighborhoods for comparison, wherein each base distance metric represents an expert. The neighborhoods are combined to leverage the local discriminalities of all base distance metrics by applying at least one iterative process to output a composite distance metric.
    Type: Grant
    Filed: September 23, 2013
    Date of Patent: March 31, 2015
    Assignee: International Business Machines Corporation
    Inventors: Shahram Ebadollahi, Jimeng Sun, Fei Wang
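
The composite distance metric abstracts above (patents 8996443 and 9396439) describe converting each expert's base distance metric into neighborhoods and then combining those neighborhoods. The sketch below is only a rough illustration of that idea, not the patented method: it substitutes a plain one-pass neighbor vote for the iterative process the abstract mentions, and every name in it (`neighborhoods`, `composite_dissimilarity`, `expert_x`, `expert_y`, the sample points, and the choice of `k`) is hypothetical. The result is a dissimilarity score that may be asymmetric, so it is not a true metric.

```python
from typing import Callable, List, Sequence, Set

Point = Sequence[float]
Metric = Callable[[Point, Point], float]


def neighborhoods(points: List[Point], metric: Metric, k: int) -> List[Set[int]]:
    """For each point, return the indices of its k nearest other points under `metric`."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Ties are broken by index so the output is deterministic.
        others.sort(key=lambda j: (metric(p, points[j]), j))
        result.append(set(others[:k]))
    return result


def composite_dissimilarity(
    points: List[Point], metrics: List[Metric], k: int
) -> List[List[float]]:
    """Combine per-expert neighborhoods by simple voting (an assumed stand-in).

    Entry [i][j] is 1 minus the fraction of experts whose neighborhood of
    point i contains point j; 0.0 means every expert agrees j is close to i.
    """
    n = len(points)
    per_expert = [neighborhoods(points, m, k) for m in metrics]
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            votes = sum(1 for hoods in per_expert if j in hoods[i])
            matrix[i][j] = 1.0 - votes / len(metrics)
    return matrix


# Two hypothetical "experts": one compares the first coordinate, one the second.
def expert_x(a: Point, b: Point) -> float:
    return abs(a[0] - b[0])


def expert_y(a: Point, b: Point) -> float:
    return abs(a[1] - b[1])


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 5.0), (10.0, 10.0)]
composite = composite_dissimilarity(points, [expert_x, expert_y], k=1)
for row in composite:
    print(row)
```

In this toy data, both experts place point 0 in the neighborhood of point 1, so `composite[1][0]` is 0.0, while neither expert considers point 3 close to point 0, so `composite[0][3]` is 1.0.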