Patents by Inventor James Xu
James Xu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12293438
Abstract: In an approach for post-modeling data visualization and analysis, a processor presents a first visualization of a training dataset in a first plot. Responsive to receiving a selection of a data group of the training dataset to analyze, a processor identifies three or fewer key model features of the data group of the training dataset. A processor ascertains a representative record of each key model feature of the three or fewer key model features using a Local Interpretable Model-Agnostic Explanation technique. A processor presents a second visualization of the three or fewer key model features and the representative record of each key model feature in a second plot.
Type: Grant
Filed: December 13, 2022
Date of Patent: May 6, 2025
Assignee: International Business Machines Corporation
Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Jun Wang
-
Publication number: 20250139500
Abstract: Determining whether synthetic data is sufficient for utilization in connection with one or more machine learning models. The computing device accesses a protected batch of data associated with a machine learning model. The computing device accesses a simulated batch of data, the simulated batch of data based upon but anonymizing the protected batch of data. The computing device accesses one or more comparisons of one or more variables in the protected batch of data and the simulated batch of data to obtain a similarity value. The computing device performs a machine learning function utilizing at least in-part the simulated batch of data if the similarity value exceeds a similarity threshold.
Type: Application
Filed: October 30, 2023
Publication date: May 1, 2025
Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Jing James Xu, Jing Xu, Ji Hui Yang, Rui Wang
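The similarity check described in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the variables are compared, so the two-sample Kolmogorov-Smirnov statistic, the averaging across variables, the function names, and the 0.8 threshold below are all assumptions made for illustration only.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def similarity_value(protected, simulated):
    """Average of (1 - KS) over the variables present in both batches (assumed metric)."""
    shared = sorted(set(protected) & set(simulated))
    scores = [1.0 - ks_statistic(protected[v], simulated[v]) for v in shared]
    return sum(scores) / len(scores)


def is_sufficient(protected, simulated, threshold=0.8):
    """Use the simulated batch only if the similarity value exceeds the threshold."""
    return similarity_value(protected, simulated) > threshold
```

Each batch is passed as a dictionary mapping a variable name to its list of numeric values. A batch compared with itself scores 1.0, while two batches with non-overlapping value ranges score 0.0.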
-
Publication number: 20250131116
Abstract: An embodiment configures a plurality of parameters, the parameters being usable to generate artificial data from original data, the configuring adjusting a level of privacy in the artificial data. An embodiment fits a distribution type to a variable of the original data. An embodiment adjusts, using a desired level of privacy and the distribution type, a level of noise, wherein the level of noise corresponds to the desired level of privacy. An embodiment generates, using the distribution type and the level of noise, the artificial data, the artificial data achieving the desired level of privacy by including noise data corresponding to the level of noise.
Type: Application
Filed: October 20, 2023
Publication date: April 24, 2025
Applicant: International Business Machines Corporation
Inventors: Si Er Han, Jing Xu, Xiao Ming Ma, Jing James Xu, Jiang Bo Kang, Xue Ying Zhang, Jun Wang, Ji Hui Yang
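As a rough illustration of the fit, adjust, and generate steps in the abstract above, the sketch below fits a normal distribution to one variable and adds noise scaled by a privacy setting. The choice of a normal distribution, the rule that the noise standard deviation equals the privacy level times the fitted standard deviation, and the name `generate_artificial` are assumptions for this example; the abstract does not specify any of them.

```python
import random
import statistics


def generate_artificial(values, privacy_level, n, seed=0):
    """Draw n artificial values for one variable.

    privacy_level is in [0, 1]. Assumed mapping: the noise standard
    deviation is privacy_level times the fitted standard deviation.
    """
    if not 0.0 <= privacy_level <= 1.0:
        raise ValueError("privacy_level must be between 0 and 1")
    # Fit step: estimate the parameters of a normal distribution.
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    # Adjust step: derive the noise level from the desired privacy level.
    noise_sigma = privacy_level * sigma
    # Generate step: sample from the fitted distribution and add the noise.
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.gauss(0.0, noise_sigma) for _ in range(n)]
```

A privacy level of 0 adds no noise, so the output follows the fitted distribution alone; higher levels widen the spread of the generated values.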
-
Publication number: 20250124052
Abstract: A computer-implemented method for generating an artificial data set is provided. Aspects include obtaining an input data set, calculating an association between the plurality of categorical variables of the input data set, and creating, based on the association, a plurality of clusters of categorical variables. Aspects also include identifying a key variable for each of the plurality of clusters of categorical variables, creating a key cluster for each of the plurality of clusters, and creating a cluster contingency table for each of the clusters. Aspects further include generating, based on the cluster contingency table for each of the plurality of clusters and for the key cluster, a data set for each of the plurality of clusters and the key cluster and generating the artificial data set based on a combination of the data set for each of the plurality of clusters and the key cluster.
Type: Application
Filed: October 12, 2023
Publication date: April 17, 2025
Inventors: Si Er Han, Xiao Ming Ma, Rui Wang, Jing James Xu, Jing Xu, Xue Ying Zhang, Lei Tian, Dong Hai Yu
-
Publication number: 20250117443
Abstract: A computer-implemented method for performing data difference evaluation is provided. Aspects include obtaining a first data set and a second data set, creating a first plurality of feature vectors by inputting the first data set into each of a plurality of models, and creating a second plurality of feature vectors by inputting the second data set into each of the plurality of models. Aspects also include identifying a mapping between elements of the first plurality of vectors and elements of the second plurality of feature vectors created by a same model of the plurality of models, calculating, for each of the plurality of models based at least in part on the mapping, a model distance between the first data set and the second data set, and calculating, based at least in part on the model distances, an ensemble distance between the first data set and the second data set.
Type: Application
Filed: October 9, 2023
Publication date: April 10, 2025
Inventors: Lei Tian, Han Zhang, Jing James Xu, Xue Ying Zhang, Si Er Han
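The per-model and ensemble distances in the abstract above can be sketched as follows. This is only an illustration: treating each model as a function from a record to a feature vector, using the Euclidean distance between mean feature vectors as the model distance, and averaging the model distances into the ensemble distance are all assumptions, as are the function names.

```python
import math


def model_distance(vecs_a, vecs_b):
    """Euclidean distance between the mean feature vectors produced by one model."""
    def mean_vec(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    return math.dist(mean_vec(vecs_a), mean_vec(vecs_b))


def ensemble_distance(data_a, data_b, models):
    """Average the per-model distances between two data sets (assumed aggregation)."""
    distances = []
    for model in models:
        # Vectors are paired by the model that produced them, so each
        # distance compares outputs of the same model only.
        feats_a = [model(record) for record in data_a]
        feats_b = [model(record) for record in data_b]
        distances.append(model_distance(feats_a, feats_b))
    return sum(distances) / len(distances)
```

For example, with one model that returns a record's two fields unchanged and another that returns their sum, the data sets `[(0, 0), (2, 2)]` and `[(4, 5), (4, 5)]` give model distances of 5.0 and 7.0 and an ensemble distance of 6.0.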
-
Publication number: 20250094267
Abstract: A time series anomaly detection method, system, and computer program product that processes time series data includes absorbing profiles of the time series data and anomaly types of a model as features, optimizing biased ranks to create optimized ranks through merging initial ranks with new ranks generated by real anomalies, and auto-suggesting the optimized ranks for saving a predetermined amount of data operation.
Type: Application
Filed: September 15, 2023
Publication date: March 20, 2025
Inventors: Jun Wang, Jing Xu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Wen Pei Yu
-
Patent number: 12249012
Abstract: A method, computer system, and a computer program product are provided for post-modeling feature evaluation. In one embodiment, at least one post model visual output and associated data is obtained that at least includes an individual conditional expectation (ICE) plot and a partial dependence (PDP) plot. Using the associated data and the plots, a Feature Importance (PI) plot is provided. A plurality of features is then determined for each PI, PDP and ICE plots to calculate at least one Interesting Value for each plot. An overall score is also calculated for each plurality of features based on the associated Interesting Values for each PDP, ICE and PI plots. At least one top feature is selected based on said scores. A final plot is then generated at least reflecting the top feature. The final plot combines the PI, PDP and ICE plots together.
Type: Grant
Filed: November 17, 2022
Date of Patent: March 11, 2025
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Xue Ying Zhang, Si Er Han, Jing Xu, Jun Wang
-
Patent number: 12242367
Abstract: Disclosed are a computer-implemented method, a system and a computer program product for model exploration. Model feature importance of each model of a plurality of models can be obtained, the plurality of models can be grouped into a plurality of model clusters based on the model feature importance of each model, and the model feature importance can be presented by box-plot or confidence interval.
Type: Grant
Filed: May 15, 2022
Date of Patent: March 4, 2025
Assignee: International Business Machines Corporation
Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Jun Wang, Wen Pei Yu
-
Patent number: 12243065
Abstract: A computing system is configured to generate a predictive model during training of a machine learning program using a training data set including a personal data set of a plurality of first users. The predictive model is configured to generate a predicted assessment score with respect to a second user by correlating a personal data set of the second user to the personal data set of at least one of the first users, with the generating of the predicted assessment score occurring automatically when a data entry of the personal data set of the second user is determined to have changed by the computing system. The computing system is configured to report the automatically generated predicted assessment score to the second user via a user device of the second user.
Type: Grant
Filed: July 27, 2022
Date of Patent: March 4, 2025
Assignee: TRUIST BANK
Inventors: Dontá Lamar Wilson, Jane Moury Kane, Kenneth William Cluff, Peter Councill, Qing Li, James Xu
-
Publication number: 20250053858
Abstract: In an approach, a processor selects a top N features for a machine learning (ML) model; discretizes values of each continuous feature of the top N features; generates a set of combination values that each represent a unique combination of feature values for a data record; predicts, using the ML model, a target value for each record generating predicted target values; groups the predicted target values based on the combination value for each respective record; fits a distribution for each grouping of the predicted target values associated with a respective combination value generating a set of distributions; clusters and refits the set of distributions using a clustering algorithm resulting in a set of clusters and a refitted distribution for each cluster of the set of clusters; and outputs a visualization of the refitted distribution for each cluster as a distribution curve on a graph along with the associated records.
Type: Application
Filed: August 8, 2023
Publication date: February 13, 2025
Inventors: Si Er Han, Xiao Ming Ma, Wen Pei Yu, Xue Ying Zhang, Jing Xu, Jing James Xu, Jun Wang, Lei Tian
-
Publication number: 20240427684
Abstract: A computer-implemented method, a system and a computer program product for abnormal point simulation are disclosed. A processor analyzes a plurality of data blocks in first time series data to determine traits of respective data blocks. For the respective data blocks, a processor simulates one or more abnormal points based on the traits of the respective data blocks.
Type: Application
Filed: June 20, 2023
Publication date: December 26, 2024
Inventors: Si Er Han, Xiao Ming Ma, Jun Wang, Wen Pei Yu, Xue Ying Zhang, Jing James Xu, Jing Xu
-
Publication number: 20240411783
Abstract: A computer-implemented method for treating post-modeling data includes computing, sequentially for each category of a feature, a category importance (CI) value. The CI value is based on a model accuracy change when records of a category being examined are reassigned to a remaining set of categories of the feature according to a cumulative distribution of records among the remaining set of categories of the feature, wherein the remaining set of categories includes all categories of the feature, except for the category being examined. A post-modeling category merge is performed for each category having a CI value less than a CI value threshold.
Type: Application
Filed: June 12, 2023
Publication date: December 12, 2024
Inventors: Xue Ying Zhang, Si Er Han, Jing Xu, Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Jun Wang, Ji Hui Yang
-
Patent number: 12153953
Abstract: Mechanisms are provided for intelligently identifying an execution environment to execute a computing job. An execution time of the computing job in each execution environment of a plurality of execution environments is predicted by applying a set of existing machine learning models matching execution context information and key parameters of the computing job and execution environment information of the execution environment. The predicted execution time of the machine learning models is aggregated. The aggregated predicted execution times of the computing job are summarized for the plurality of execution environments. Responsive to a selection of an execution environment from the plurality of execution environments based on the summary of the aggregated predicted execution times of the computing job, the computing job is executed in the selected execution environment. Related data during the execution of the computing job in the selected execution environment is collected.
Type: Grant
Filed: April 8, 2021
Date of Patent: November 26, 2024
Assignee: International Business Machines Corporation
Inventors: A Peng Zhang, Lei Gao, Jin Wang, Jing James Xu, Jun Wang, Dong Hai Yu
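The predict, aggregate, and summarize steps in the abstract above can be sketched in a few lines. Everything concrete here is an assumption for illustration: models are plain functions of a job and an environment description, aggregation is a simple mean, and `pick_fastest` chooses the environment with the lowest aggregated time, whereas the abstract only says that a selection is made based on the summary.

```python
from statistics import fmean


def summarize_predictions(job, environments, models):
    """Map each environment name to the mean execution time predicted by all models."""
    summary = {}
    for env_name, env_info in environments.items():
        predictions = [model(job, env_info) for model in models]
        summary[env_name] = fmean(predictions)
    return summary


def pick_fastest(summary):
    """Assumed selection rule: the environment with the lowest aggregated prediction."""
    return min(summary, key=summary.get)
```

The summary dictionary is what would be shown to whoever, or whatever, makes the selection; `pick_fastest` is one possible automatic rule rather than the method claimed in the patent.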
-
Patent number: 12056622
Abstract: A method for identifying influential effects that contribute most to a status change of a target index for goal seeking analysis. The method includes generating a candidate list of significant changed predictors between the normal and abnormal status time periods in collected data, and building a plurality of regression models from the collected data. The method determines a first value (trend value or Pearson correlation value) for each of the significant changed predictors based on whether at least one of the significant changed predictors have a significant change trend using the regression models. The method obtains a second predictor importance value for each of the significant changed predictors from a single model built on all the collected data. The method generates a final predictor value for each of the significant changed predictors by combining the first value with the second predictor importance value for each of the significant changed predictors.
Type: Grant
Filed: February 3, 2021
Date of Patent: August 6, 2024
Assignee: International Business Machines Corporation
Inventors: Jing James Xu, Lei Gao, A Peng Zhang, Rui Wang, Si Er Han, Xiao Ming Ma
-
Publication number: 20240256637
Abstract: A computer-implemented method manages an ensemble model system to classify records. A number of processor units cluster records into groups of records based on classification predictions generated by base models in the ensemble model system for the records. The number of processor units determines sets of weights for the base models that increase a probability that the base models in the ensemble model system correctly predict the groups of records. Each set of weights in the sets of weights is associated with a group of records in the groups of records.
Type: Application
Filed: January 27, 2023
Publication date: August 1, 2024
Inventors: Si Er Han, Xue Ying Zhang, Jing Xu, Jing James Xu, Xiao Ming Ma, Wen Pei Yu, Jun Wang, Ji Hui Yang
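One simple way to picture the per-group weights in the abstract above is sketched below. It is an illustration under stated assumptions, not the claimed method: records are grouped by the exact tuple of base-model predictions, and each model's weight within a group is its share of correct predictions there, with equal weights when no model is correct. The name `group_weights` is hypothetical.

```python
from collections import defaultdict


def group_weights(predictions, labels):
    """Return one weight list per group of records.

    predictions: one tuple per record, holding each base model's predicted class.
    labels: the true class of each record.
    """
    # Group records that received the same predictions from every base model.
    groups = defaultdict(list)
    for preds, label in zip(predictions, labels):
        groups[tuple(preds)].append(label)

    weights = {}
    for preds, group_labels in groups.items():
        # Count how often each model's prediction matches the true label in this group.
        hits = [sum(1 for y in group_labels if y == p) for p in preds]
        total = sum(hits)
        n = len(preds)
        weights[preds] = [h / total for h in hits] if total else [1.0 / n] * n
    return weights
```

With this rule, a group where the first model is right twice and the second once receives weights of two thirds and one third, so the ensemble leans on the model that has been more reliable for that kind of record.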
-
Patent number: 12014026
Abstract: Using a set of menu to key process mappings, historical menu usage data for an application is aggregated into aggregated key process usage data. A set of key process association rules, each comprising a consequent key process given a particular antecedent key process, is generated. From the set of key process association rules and a set of ranked menus by frequency of usage within each key process, a set of model menu recommendations is generated. According to an application usage history, a menu frequency ratio, and a confidence value of a modelled next menu, the set of menu recommendations is scored. A scored menu recommendation having a rank below a threshold rank is pruned from a set of menu items of the application ranked according to their scores. The pruned set of scored menu recommendations is presented for selection instead of the set of menu items.
Type: Grant
Filed: April 21, 2023
Date of Patent: June 18, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Long Fan, Yang Yang, Ye Fan, Juan Wu, Qi Mao, Jing James Xu
-
Publication number: 20240193830
Abstract: In an approach for post-modeling data visualization and analysis, a processor presents a first visualization of a training dataset in a first plot. Responsive to receiving a selection of a data group of the training dataset to analyze, a processor identifies three or fewer key model features of the data group of the training dataset. A processor ascertains a representative record of each key model feature of the three or fewer key model features using a Local Interpretable Model-Agnostic Explanation technique. A processor presents a second visualization of the three or fewer key model features and the representative record of each key model feature in a second plot.
Type: Application
Filed: December 13, 2022
Publication date: June 13, 2024
Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Jun Wang
-
Publication number: 20240169614
Abstract: A method, computer system, and a computer program product are provided for post-modeling feature evaluation. In one embodiment, at least one post model visual output and associated data is obtained that at least includes an individual conditional expectation (ICE) plot and a partial dependence (PDP) plot. Using the associated data and the plots, a Feature Importance (PI) plot is provided. A plurality of features is then determined for each PI, PDP and ICE plots to calculate at least one Interesting Value for each plot. An overall score is also calculated for each plurality of features based on the associated Interesting Values for each PDP, ICE and PI plots. At least one top feature is selected based on said scores. A final plot is then generated at least reflecting the top feature. The final plot combines the PI, PDP and ICE plots together.
Type: Application
Filed: November 17, 2022
Publication date: May 23, 2024
Inventors: Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Xue Ying Zhang, Si Er Han, Jing Xu, Jun Wang
-
Patent number: 11971796
Abstract: An approach is provided in which the approach builds a combination model that includes a normal status model and an abnormal status model. The normal status model is built from a set of time-sequenced normal status records and the abnormal status model is built from a set of time-sequenced abnormal status records. The approach computes a set of time-sequenced coefficient combination values of the normal status model and the abnormal status model based on applying a set of fitting coefficient characteristics to the normal status model and the abnormal status model. The approach performs goal seek analysis on a system using the combination model and the set of time-sequenced coefficient combination values.
Type: Grant
Filed: May 18, 2021
Date of Patent: April 30, 2024
Assignee: International Business Machines Corporation
Inventors: Xiao Ming Ma, Si Er Han, Lei Gao, A Peng Zhang, Chun Lei Xu, Rui Wang, Jing James Xu
-
Patent number: 11966340
Abstract: To automate time series forecasting machine learning pipeline generation, a data allocation size of time series data may be determined based on one or more characteristics of a time series data set. The time series data may be allocated for use by candidate machine learning pipelines based on the data allocation size. Features for the time series data may be determined and cached by the candidate machine learning pipelines. Predictions of each of the candidate machine learning pipelines using at least the one or more features may be evaluated. A ranked list of machine learning pipelines may be automatically generated from the candidate machine learning pipelines for time series forecasting based upon evaluating predictions of each of the one or more candidate machine learning pipelines.
Type: Grant
Filed: March 15, 2022
Date of Patent: April 23, 2024
Assignee: International Business Machines Corporation
Inventors: Long Vu, Bei Chen, Xuan-Hong Dang, Peter Daniel Kirchner, Syed Yousaf Shah, Dhavalkumar C. Patel, Si Er Han, Ji Hui Yang, Jun Wang, Jing James Xu, Dakuo Wang, Gregory Bramble, Horst Cornelius Samulowitz, Saket K. Sathe, Wesley M. Gifford, Petros Zerfos