Patents by Inventor Jimeng Sun
Jimeng Sun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240105292
Abstract: In one aspect, the present disclosure relates to a platform for creating synthetic electronic health records, the platform being configured to perform operations including receiving EHR data and encoding the received EHR data as a plurality of fixed length vectors to form a fixed-length matrix. The platform provides the fixed-length matrix to a machine learning model as input to produce a plurality of visit history representations. For one or more particular visit history representations, of the plurality of visit history representations, the platform applies code information associated with the particular visit history. One or more appended visit histories are provided to one or more masked linear layers to produce a probability matrix comprising probabilities for each code for each visit. The platform produces one or more synthetic EHRs based on repeated sequential generation of and sampling from the probability matrix.
Type: Application
Filed: August 29, 2023
Publication date: March 28, 2024
Inventors: Brandon Philip Theodorou, Jimeng Sun
-
Publication number: 20240045994
Abstract: An example embodiment may involve obtaining text-based, ground truth electronic health records (EHRs), wherein the ground truth EHRs specify a sequence of medical visits involving a plurality of modalities, and wherein each of the medical visits specifies tokens representing at least one of the modalities; generating a training data set by perturbing the ground truth EHRs, wherein perturbing the ground truth EHRs involves deleting or shuffling some of the tokens in the ground truth EHRs; and iteratively applying a machine learning trainer application to the training data set, wherein the machine learning trainer application includes: (i) a bidirectional language model encoder that takes EHRs within the training data set and produces vector embeddings therefrom, (ii) an autoregressive language model decoder that takes the vector embeddings and infers predicted EHRs therefrom, and (iii) a loss function that compares the predicted EHRs to their corresponding ground truth EHRs.
Type: Application
Filed: July 5, 2023
Publication date: February 8, 2024
Inventors: Jimeng Sun, Zifeng Wang
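The perturbation step named in the abstract above (deleting or shuffling some tokens of a ground truth record) could be sketched roughly as follows. This is an illustrative sketch only, not the method claimed in the application: the helper name `perturb_visit`, the `delete_prob` parameter, and the example tokens are all assumptions made for the illustration.

```python
import random

def perturb_visit(tokens, delete_prob=0.15, shuffle=True, rng=None):
    """Return a perturbed copy of one visit's tokens (hypothetical helper)."""
    rng = rng or random.Random()
    # Drop each token independently with probability delete_prob.
    kept = [t for t in tokens if rng.random() >= delete_prob]
    # Optionally shuffle the surviving tokens.
    if shuffle:
        rng.shuffle(kept)
    return kept

# Made-up tokens for illustration; real EHR tokens would come from the data.
visit = ["dx:code_a", "rx:code_b", "lab:code_c"]
perturbed = perturb_visit(visit, rng=random.Random(0))
```

In a setup like this, the perturbed sequence would serve as the model input and the original `visit` as the reconstruction target.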
-
Publication number: 20230034559
Abstract: A system for prediction of clinical trial outcome. The system includes: a processor of a trial prediction (TP) node connected to at least one cloud server node over a network configured to host a machine learning (ML) module; a memory on which are stored machine-readable instructions that when executed by the processor, cause the processor to: receive a clinical trial (CT) data, parse the CT data to derive drug molecules data, disease information data, and trial protocols data, encode the drug molecules data, the disease information data, and the trial protocols data into corresponding embeddings, generate knowledge pre-trained embeddings using external knowledge data, and provide the knowledge pre-trained embeddings to the ML module for prediction of the CT outcome.
Type: Application
Filed: May 19, 2022
Publication date: February 2, 2023
Inventors: Tianfan Fu, Kexin Huang, Jimeng Sun
-
Patent number: 11315685
Abstract: A method of building a machine learning pipeline for predicting the efficacy of anti-epilepsy drug treatment regimens is provided.
Type: Grant
Filed: January 25, 2017
Date of Patent: April 26, 2022
Assignee: UCB BIOPHARMA SRL
Inventors: Kunal Malhotra, Sungtae An, Jimeng Sun, Myung Choi, Cynthia Dilley, Chris Clark, Joseph Robertson, Edward Han-Burgess
-
Patent number: 11195133
Abstract: Systems and methods for individual risk factor identification include identifying common risk factors for one or more risk targets from population data. Individuals are stratified into clusters based upon the common risk factors. A discriminability of each of the common risk factors is determined, using a processor, for a target cluster using individual data of the target cluster to provide re-ranked common risk factors as individual risk factors for the target cluster, such that the discriminability is a measure of how a risk factor discriminates its cluster from other clusters.
Type: Grant
Filed: May 9, 2018
Date of Patent: December 7, 2021
Assignee: International Business Machines Corporation
Inventors: David H. Gotz, Pei-Yun S. Hsueh, Jianying Hu, Jimeng Sun
-
Patent number: 10832821
Abstract: A system and method for providing a temporally dynamic model parameter include building a model parameter by minimizing a loss function based on patient measurements taken at a plurality of time points. Temporally related values of the model parameter are identified, using a processor, having a same type of patient measurement taken at different time points. At least one value of the model parameter and temporally related values of the at least one value are selected to provide a temporally dynamic model parameter.
Type: Grant
Filed: August 19, 2013
Date of Patent: November 10, 2020
Assignee: International Business Machines Corporation
Inventors: Shahram Ebadollahi, Jianying Hu, Jimeng Sun, Fei Wang, Jiayu Zhou
-
Patent number: 10535007
Abstract: A method for determining a similarity between a plurality of graphs includes inferring a low-rank representation of a first graph, inferring a low-rank representation of a second graph, wherein the low-rank representations of the first and second graphs are stored in memory, estimating a left interaction between the first and second graphs, estimating a middle interaction between the first and second graphs, estimating a right interaction between the first and second graphs, wherein the estimations are based on the low-rank representations of the first and second graphs stored in memory, and aggregating the left interaction, the middle interaction and the right interaction into a kernel, wherein the kernel is indicative of the similarity between the first and second graphs.
Type: Grant
Filed: January 17, 2013
Date of Patent: January 14, 2020
Assignee: International Business Machines Corporation
Inventors: U Kang, Ravindranath Konuru, Hanghang Tong, Jimeng Sun
-
Publication number: 20180260925
Abstract: Systems and methods for individual risk factor identification include identifying common risk factors for one or more risk targets from population data. Individuals are stratified into clusters based upon the common risk factors. A discriminability of each of the common risk factors is determined, using a processor, for a target cluster using individual data of the target cluster to provide re-ranked common risk factors as individual risk factors for the target cluster, such that the discriminability is a measure of how a risk factor discriminates its cluster from other clusters.
Type: Application
Filed: May 9, 2018
Publication date: September 13, 2018
Inventors: DAVID H. GOTZ, PEI-YUN S. HSUEH, JIANYING HU, JIMENG SUN
-
Publication number: 20180211012
Abstract: A method of building a machine learning pipeline for predicting the efficacy of anti-epilepsy drug treatment regimens is provided.
Type: Application
Filed: January 25, 2017
Publication date: July 26, 2018
Inventors: Kunal MALHOTRA, Sungtae AN, Jimeng SUN, Myung CHOI, Cynthia DILLEY, Chris CLARK, Joseph ROBERTSON, Edward Han-Burgess
-
Publication number: 20180211010
Abstract: A method of building a machine learning pipeline for predicting refractoriness of epilepsy patients is provided. The method includes providing electronic health records data; constructing a patient cohort from the electronic health records data by selecting patients based on failure of at least one anti-epilepsy drug; constructing a set of features found in or derived from the electronic health records data; electronically processing the patient cohort to identify a subset of the features that are predictive for refractoriness for inclusion in a predictive model configured for classifying patients as refractory or non-refractory; and training the predictive computerized model to classify the patients having at least one anti-epilepsy drug failure based on likelihood of becoming refractory.
Type: Application
Filed: January 23, 2017
Publication date: July 26, 2018
Inventors: Kunal MALHOTRA, Sungtae AN, Jimeng SUN, Myung CHOI, Cynthia DILLEY, Chris CLARK, Joseph ROBERTSON, Edward HAN-BURGESS
-
Patent number: 9996889
Abstract: Systems and methods for individual risk factor identification include identifying common risk factors for one or more risk targets from population data. Individuals are stratified into clusters based upon the common risk factors. A discriminability of each of the common risk factors is determined, using a processor, for a target cluster using individual data of the target cluster to provide re-ranked common risk factors as individual risk factors for the target cluster, such that the discriminability is a measure of how a risk factor discriminates its cluster from other clusters.
Type: Grant
Filed: October 1, 2012
Date of Patent: June 12, 2018
Assignee: International Business Machines Corporation
Inventors: David H. Gotz, Pei-Yun S. Hsueh, Jianying Hu, Jimeng Sun
-
Patent number: 9805307
Abstract: A method for determining a correspondence between a first node set of a first graph and a second node set of a second graph includes building a feature representation for each of the first graph and the second graph, and inferring the correspondence between the first node set and the second node set based on the feature representations.
Type: Grant
Filed: January 17, 2013
Date of Patent: October 31, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: U Kang, Ravindranath Konuru, Jimeng Sun, Hanghang Tong
-
Patent number: 9600795
Abstract: Common sub-process patterns in a plurality of deployed process models may be discovered, and performance measures associated with the sub-process patterns may be computed based on runtime events of the deployed process models. Positive or negative performance patterns among sub-process patterns may be identified and used for creating new process models or improving existing process models.
Type: Grant
Filed: April 9, 2012
Date of Patent: March 21, 2017
Assignee: International Business Machines Corporation
Inventors: Steve Demuth, Aliza R. Heching, Jimeng Sun, Judah M. Diament
-
Patent number: 9396439
Abstract: A system and method for a composite distance metric leveraging multiple expert judgments includes inputting a data distribution of multiple expert judgments stored on a computer readable storage medium. Base distance metrics are converted into neighborhoods for comparison, wherein each base distance metric represents an expert and each neighborhood represents an individual similarity measure of the expert. The neighborhoods are combined to leverage the local discriminalities of all base distance metrics by applying at least one iterative process to output a composite distance metric.
Type: Grant
Filed: March 10, 2015
Date of Patent: July 19, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Shahram Ebadollahi, Jimeng Sun, Fei Wang
-
Patent number: 9390194
Abstract: Methods and apparatus are provided for multi-faceted visualization of rich text corpora. A data set comprising a plurality of entities, facets and relations is visualized by generating a visualization of a plurality of the facets in the data set, wherein the visualization indicates connections along the plurality of the facets in a single view using multi-faceted edges. The entities are instances of a particular concept, the facets are classes of entities and the relations are connections between pairs of the entities. A compound node comprises a representation of a primary entity, surrounded by representations of one or more secondary entities connected by one or more external relations. The internal relations can be represented as edges connecting two facet nodes from different compound nodes and a number of crossings of the edges can be reduced by adjusting a position order of facet nodes.
Type: Grant
Filed: August 31, 2010
Date of Patent: July 12, 2016
Assignee: International Business Machines Corporation
Inventors: Nan Cao, David H. Gotz, Jimeng Sun
-
Patent number: 9342579
Abstract: Visualization techniques are provided for a clustered multidimensional dataset. A data set is visualized by obtaining a clustering of a multidimensional dataset comprising a plurality of entities, wherein the entities are instances of a particular concept and wherein each entity comprises a plurality of features; and generating an icon for at least one of the entities, the icon having a plurality of regions, wherein each region corresponds to one of the features of the at least one entity, and wherein a size of each region is based on a value of the corresponding feature. Each icon can convey statistical measures. A stabilized Voronoi-based icon layout algorithm is optionally employed. Icons can be embedded in a visualization of the multidimensional dataset. A hierarchical encoding scheme can be employed to encode a data cluster into the icon, such as a hierarchy of cluster, feature type and entity.
Type: Grant
Filed: May 31, 2011
Date of Patent: May 17, 2016
Assignee: International Business Machines Corporation
Inventors: Nan Cao, David H. Gotz, Jimeng Sun
-
Patent number: 9292575
Abstract: Dynamically aggregating data is provided. A server device receives a subscriber request for a report based on a subset of metadata contained in a data dimensions catalog. The server device analyzes data aggregation requirements from a plurality of data sources for the report based on the subset of metadata defined in the subscriber request. The server device generates a data access plan for movement of data from the plurality of data sources based on the data aggregation requirements for the report. Then, the server device executes the data access plan to fetch the data from the plurality of data sources based on the data aggregation requirements for the report.
Type: Grant
Filed: November 19, 2010
Date of Patent: March 22, 2016
Assignee: International Business Machines Corporation
Inventors: Abhijit Bose, Mithkal M. Smadi, Jimeng Sun, Chandra Kumar Velpuri
-
Patent number: 9087117
Abstract: The invention provides a method and system for visualization of a data set, the method comprises: dividing the data set into a plurality of information layers based on different information dimensions; and visually processing the plurality of information layers based on different information dimensions, respectively, in order to present respective views of the plurality of information layers. In the present invention, by visualizing the data set through presenting different overviews of the data set from different information dimensions, respectively, the presentation of comprehensive information of the data set to a data set analyst is ensured while distortion of presented contents as well as visual clutter are prevented.
Type: Grant
Filed: November 1, 2010
Date of Patent: July 21, 2015
Assignee: International Business Machines Corporation
Inventors: Nan Cao, Lei Shi, Jimeng Sun, Wei Hong Qian, Shixia Liu
-
Publication number: 20150186788
Abstract: A system and method for a composite distance metric leveraging multiple expert judgments includes inputting a data distribution of multiple expert judgments stored on a computer readable storage medium. Base distance metrics are converted into neighborhoods for comparison, wherein each base distance metric represents an expert and each neighborhood represents an individual similarity measure of the expert. The neighborhoods are combined to leverage the local discriminalities of all base distance metrics by applying at least one iterative process to output a composite distance metric.
Type: Application
Filed: March 10, 2015
Publication date: July 2, 2015
Inventors: SHAHRAM EBADOLLAHI, JIMENG SUN, FEI WANG
-
Patent number: 8996443
Abstract: A system and method for a composite distance metric leveraging multiple expert judgments includes inputting a data distribution of multiple expert judgments stored on a computer readable storage medium. Base distance metrics are converted into neighborhoods for comparison, wherein each base distance metric represents an expert. The neighborhoods are combined to leverage the local discriminalities of all base distance metrics by applying at least one iterative process to output a composite distance metric.
Type: Grant
Filed: September 23, 2013
Date of Patent: March 31, 2015
Assignee: International Business Machines Corporation
Inventors: Shahram Ebadollahi, Jimeng Sun, Fei Wang