Patents by Inventor Xiao-Ming Ma

Xiao-Ming Ma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

VISUAL REPRESENTATION USING POST MODELING FEATURE EVALUATION

Publication number: 20240169614

Abstract: A method, computer system, and a computer program product are provided for post-modeling feature evaluation. In one embodiment, at least one post model visual output and associated data is obtained that at least includes an individual conditional expectation (ICE) plot and a partial dependence (PDP) plot. Using the associated data and the plots, a Feature Importance (PI) plot is provided. A plurality of features is then determined for each of the PI, PDP and ICE plots to calculate at least one Interesting Value for each plot. An overall score is also calculated for each plurality of features based on the associated Interesting Values for each of the PDP, ICE and PI plots. At least one top feature is selected based on said scores. A final plot is then generated at least reflecting the top feature. The final plot combines the PI, PDP and ICE plots together.

Type: Application

Filed: November 17, 2022

Publication date: May 23, 2024

Inventors: Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Xue Ying Zhang, Si Er Han, Jing Xu, Jun Wang

Goal seek analysis based on status models

Patent number: 11971796

Abstract: An approach is provided in which the approach builds a combination model that includes a normal status model and an abnormal status model. The normal status model is built from a set of time-sequenced normal status records and the abnormal status model is built from a set of time-sequenced abnormal status records. The approach computes a set of time-sequenced coefficient combination values of the normal status model and the abnormal status model based on applying a set of fitting coefficient characteristics to the normal status model and the abnormal status model. The approach performs goal seek analysis on a system using the combination model and the set of time-sequenced coefficient combination values.

Type: Grant

Filed: May 18, 2021

Date of Patent: April 30, 2024

Assignee: International Business Machines Corporation

Inventors: Xiao Ming Ma, Si Er Han, Lei Gao, A Peng Zhang, Chun Lei Xu, Rui Wang, Jing James Xu

DETECTING ANOMALOUS DATA

Publication number: 20240054211

Abstract: Detecting anomalous data by applying a plurality of models to a data set to yield detection results including anomalous data, applying evaluation methods to the detection results for each of the plurality of models, determining a combined score for the detection results according to the evaluation methods, determining a combined score threshold, and defining a set of detected anomalies according to the combined score and the combined score threshold.

Type: Application

Filed: August 10, 2022

Publication date: February 15, 2024

Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Wen Pei Yu
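
The abstract for publication 20240054211 above describes scoring a data set with several models, evaluating each model's results, and thresholding a combined score. The following is only a minimal illustrative sketch of that general idea, not the claimed method. The two detectors (z-score and median absolute deviation), the `contrast` evaluation measure, the min-max normalization, and the fixed threshold of 0.5 are all assumptions chosen for illustration.

```python
from statistics import mean, median

def zscore_scores(values):
    # Detector 1 (assumed): distance from the mean in standard deviations.
    m = mean(values)
    sd = (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5 or 1.0
    return [abs(v - m) / sd for v in values]

def mad_scores(values):
    # Detector 2 (assumed): distance from the median in units of the MAD.
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1.0
    return [abs(v - med) / mad for v in values]

def min_max(scores):
    # Rescale one detector's scores to [0, 1] so detectors are comparable.
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0
    return [(s - lo) / span for s in scores]

def contrast(scores, k=1):
    # Hypothetical evaluation measure: gap between the top-k scores and the rest.
    ranked = sorted(scores, reverse=True)
    return mean(ranked[:k]) - mean(ranked[k:])

def detect(values, threshold=0.5):
    per_model = [min_max(zscore_scores(values)), min_max(mad_scores(values))]
    weights = [contrast(s) for s in per_model]
    total = sum(weights) or 1.0
    # Combined score: evaluation-weighted average of the per-model scores.
    combined = [
        sum(w * s[i] for w, s in zip(weights, per_model)) / total
        for i in range(len(values))
    ]
    flagged = [i for i, c in enumerate(combined) if c >= threshold]
    return flagged, combined

values = [10, 11, 9, 10, 12, 50, 10, 11]
anomalies, combined = detect(values)
print(anomalies)
```

In this toy data set only the value 50 (index 5) receives a combined score above the threshold. A real system would likely derive the threshold from the score distribution rather than fix it in advance.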

Deep forest model development and training

Patent number: 11893499

Abstract: Automated development and training of deep forest models for analyzing data by growing a random forest of decision trees using data, determining Out-of-bag (OOB) predictions for the forest, appending the OOB predictions to the data set, and growing an additional forest using the data set including the appended OOB predictions, and combining the output of the additional forest, then utilizing the model to classify data outside the training data set.

Type: Grant

Filed: March 12, 2019

Date of Patent: February 6, 2024

Assignee: International Business Machines Corporation

Inventors: Jing Xu, Rui Wang, Xiao Ming Ma, Ji Hui Yang, Xue Ying Zhang, Jing James Xu, Si Er Han

Parallel chart generator

Patent number: 11893666

Abstract: An approach is provided in which the approach generates a parallel chart based on multiple records that includes a set of variable values corresponding to a set of variables. To generate the parallel chart, the approach arranges the set of variables on the parallel chart in a variable order based on at least one variable arrangement rule. The approach arranges an initial variable value order for each one of the set of variables, and computes a lucidity score based on the variable order and the initial variable value order of each of the set of variables. The lucidity score is a measurement of the clarity of the parallel chart. The approach adjusts the variable value order of at least one of the set of variables to increase the lucidity score and optimizes the clarity of the parallel chart based on the adjusted variable value order.

Type: Grant

Filed: January 19, 2022

Date of Patent: February 6, 2024

Assignee: International Business Machines Corporation

Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Jing Xu, Wen Pei Yu, Ji Hui Yang, Jing Jia

PARTIAL IMPORTANCE OF INPUT VARIABLE OF PREDICTIVE MODELS

Publication number: 20230394326

Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for predictive models. According to the method, a processor may provide a first list including at least one input variable of a predictive model and a second list including a plurality of variables of the predictive model. For each of the input variables in the second list, the processor may determine contribution of the input variable to prediction of the predictive model with respect to the at least one input variable in the first list. The processor may update the first list by moving an input variable in the second list into the first list based on the determined contribution of the plurality of input variables. The processor may render one or more of the input variables in the updated first list based on an order of the input variables in the updated first list.

Type: Application

Filed: June 1, 2022

Publication date: December 7, 2023

Inventors: Si Er Han, Xue Ying Zhang, Xiao Ming Ma, Wen Pei Yu, Jing Xu, Jing James Xu, Rui Wang

Parallelized scoring for ensemble model

Patent number: 11823077

Abstract: Provided are a computer-implemented method, a system, and a computer program product. The method comprises extracting features from a plurality of base models in an ensemble model. The plurality of base models are configured to provide respective prediction results. The ensemble model is configured to provide an overall prediction result from the prediction results of the plurality of base models. The features are associated with time performance of the base models. The method further comprises clustering the plurality of base models into a plurality of clusters based on the extracted features. The method further comprises assigning the plurality of base models to a plurality of parallel computation units based on the plurality of clusters.

Type: Grant

Filed: August 13, 2020

Date of Patent: November 21, 2023

Assignee: International Business Machines Corporation

Inventors: Dong Hai Yu, Jun Wang, Jing Xu, Ji Hui Yang, Xiao Ming Ma, Song Bo
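
The abstract for patent 11823077 above describes clustering base models by time-performance features and assigning them to parallel computation units. The sketch below is a rough illustration of that idea under stated assumptions, not the patented procedure. The single feature (average scoring time in seconds), the equal-size binning used as a stand-in for clustering, the greedy least-loaded assignment, and all model names are hypothetical.

```python
def cluster_by_time(timings, n_clusters):
    # Stand-in for clustering: sort models by scoring time (slowest first)
    # and cut the ordering into contiguous groups of similar cost.
    ordered = sorted(timings, key=timings.get, reverse=True)
    size = -(-len(ordered) // n_clusters)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

def assign_to_units(timings, clusters, n_units):
    # Walk the clusters from slowest to fastest and hand each model to the
    # unit with the smallest accumulated load, so that models of similar
    # cost end up spread across units instead of stacked on one.
    loads = [0.0] * n_units
    plan = [[] for _ in range(n_units)]
    for cluster in clusters:
        for name in cluster:
            unit = loads.index(min(loads))
            plan[unit].append(name)
            loads[unit] += timings[name]
    return plan, loads

# Hypothetical base models and their measured scoring times in seconds.
timings = {"gbm": 8.0, "forest": 6.0, "svm": 5.0,
           "knn": 3.0, "tree": 1.0, "linear": 1.0}

clusters = cluster_by_time(timings, n_clusters=3)
plan, loads = assign_to_units(timings, clusters, n_units=2)
print(plan, loads)
```

With these example timings both units end up with a total load of 12 seconds, so neither unit waits long for the other before the ensemble can combine the base predictions.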

FEATURE IMPORTANCE BASED MODEL OPTIMIZATION

Publication number: 20230367689

Abstract: Disclosed are a computer-implemented method, a system and a computer program product for model exploration. Model feature importance of each model of a plurality of models can be obtained, the plurality of models can be grouped into a plurality of model clusters based on the model feature importance of each model, and the model feature importance can be presented by box-plot or confidence interval.

Type: Application

Filed: May 15, 2022

Publication date: November 16, 2023

Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Jun Wang, Wen Pei Yu

PRIVACY PROTECTION IN A SEARCH PROCESS

Publication number: 20230359758

Abstract: The present disclosure relates to privacy protection in a search process. According to a method, a target emotion vector is extracted from a search interaction, the target emotion vector representing emotional information in the search interaction. Respective emotion distances between the target emotion vector and respective emotion vectors associated with a plurality of text clusters are determined. The plurality of text clusters is clustered from a dictionary of text elements. A first number of text clusters are selected from the plurality of text clusters based on the determined respective emotion distances. The first number of text clusters have emotion distances larger than at least one unselected text cluster among the plurality of text clusters. A plurality of confused search interactions are constructed for the search interaction based on the first number of text clusters, and the plurality of confused search interactions are performed.

Type: Application

Filed: May 3, 2022

Publication date: November 9, 2023

Inventors: Jin Wang, Lei Gao, A Peng Zhang, Kai Li, Jun Wang, Xiao Ming Ma, Xin Feng Zhu, Geng Wu Yang
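
The abstract for publication 20230359758 above describes selecting text clusters whose emotion vectors lie far from the emotion vector of the real search interaction, then building decoy ("confused") searches from them. The following sketch illustrates only the selection step and a trivial decoy builder. The two-dimensional emotion vectors, the Euclidean distance, the cluster names, and the term lists are all invented for illustration and do not come from the patent application.

```python
import math

def select_far_clusters(target, cluster_vectors, count):
    # Rank clusters by distance from the target emotion vector, farthest
    # first, and keep the requested number of clusters.
    ranked = sorted(
        cluster_vectors,
        key=lambda name: math.dist(target, cluster_vectors[name]),
        reverse=True,
    )
    return ranked[:count]

def build_decoys(selected, cluster_terms, per_cluster=1):
    # Hypothetical decoy builder: take a few terms from each selected cluster.
    return [term for name in selected for term in cluster_terms[name][:per_cluster]]

# Assumed emotion axes: (anxiety, joy). All values are made up.
target = (0.9, 0.1)
cluster_vectors = {
    "medical": (0.8, 0.2),
    "travel": (0.1, 0.9),
    "sports": (0.3, 0.6),
    "cooking": (0.2, 0.8),
}
cluster_terms = {
    "medical": ["symptom checker"],
    "travel": ["beach resorts", "cheap flights"],
    "sports": ["league table"],
    "cooking": ["pasta recipes", "baking tips"],
}

selected = select_far_clusters(target, cluster_vectors, count=2)
decoys = build_decoys(selected, cluster_terms)
print(selected, decoys)
```

Here the "travel" and "cooking" clusters are farthest from the target vector, so the decoy searches are drawn from them, while the emotionally similar "medical" cluster is left unselected.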

STABLE LOCAL INTERPRETABLE MODEL FOR PREDICTION

Publication number: 20230306312

Abstract: Examples described herein provide a computer-implemented method that includes determining a kernel width for the machine learning model. The method further includes building a local interpretable linear model using the kernel width. The method further includes computing a contribution and confidence for a feature of the local interpretable linear model. The method further includes updating the local interpretable linear model to generate a final model and computing an overall confidence for the final model.

Type: Application

Filed: March 21, 2022

Publication date: September 28, 2023

Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Wen Pei Yu, Jing Xu, Jing James Xu, Lei Gao, A Peng Zhang

BUILDING MODELS WITH EXPECTED FEATURE IMPORTANCE

Publication number: 20230297647

Abstract: A method, computer program, and computer system are provided for training a machine learning model. A feature associated with training data derived from a dataset is identified. A machine learning model is generated based on the training data. At least a portion of the training data associated with maximizing an importance value associated with the identified feature is selected. The importance value corresponds to a need associated with the machine learning model. One or more weight values is assigned to the selected portion of the training data. The machine learning model is updated based on the assigned weight values.

Type: Application

Filed: March 18, 2022

Publication date: September 21, 2023

Inventors: Xiao Ming Ma, Jin Wang, Lei Gao, A Peng Zhang, Wen Pei Yu, Xin Feng Zhu

INTERACTIVE WHAT-IF ANALYSIS SYSTEM BASED ON IMPRECISION SCORING SIMULATION TECHNIQUES

Publication number: 20230289693

Abstract: A method, computer system, and a computer program product for performing an interactive outcome analysis is provided. The present invention may include generating, by a computer, a first estimation outcome from a first plurality of input conditions. The present invention may include generating, by the computer, a parallel estimation outcome from a second plurality of input conditions, wherein at least one of said input conditions in said first plurality of input conditions is different from any of said second plurality of input conditions. The present invention may include selecting, by the computer, either said first or said parallel estimation outcome by analyzing said outcomes with one another and with a target goal outcome.

Type: Application

Filed: March 14, 2022

Publication date: September 14, 2023

Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Rui Wang, Jun Wang, Ji Hui Yang

Model-free high confidence data clustering

Patent number: 11741128

Abstract: A method of clustering data generated by an unknown model, the method including accessing the data, wherein the data includes a prediction target and a confidence, extracting a data group with high prediction confidence from the data, wherein the data group comprises a plurality of data cases, and where each of the data cases is described by a plurality of predictors, identifying high rank predictors of each of the data cases in the data group, transforming the data group into a transformed data group including only the high rank predictors for each of the data cases, wherein the high rank predictors are ranked within each of the data cases included in the transformed data group, clustering the transformed data group to generate clusters, and profiling the clusters to extract an insight about the unknown model.

Type: Grant

Filed: May 6, 2021

Date of Patent: August 29, 2023

Assignee: International Business Machines Corporation

Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Xiao Ming Ma, Ji Hui Yang

Generating personalized recommendations to address a target problem

Patent number: 11727312

Abstract: A computer-implemented method, system and computer program product for generating personalized recommendations to address a target problem. A machine learning prediction model directed to a target problem for an individual is built with historical data. After receiving data about the individual, a prediction for the individual is obtained in connection with the target problem by the built model using the received data about the individual. Key predictors (e.g., parameters) and their weight for the individual are generated using the prediction by an explanation model. Record(s) are identified from the historical data by performing similarity analysis of the historical data using the key predictors and their weight. Such records provide a population closely related to the individual with respect to the target problem. These records are then analyzed and recommendations are provided to a user to solve the target problem for the individual based on the analysis of the identified record(s).

Type: Grant

Filed: September 3, 2019

Date of Patent: August 15, 2023

Assignee: International Business Machines Corporation

Inventors: Xue Ying Zhang, Jing Xu, Xiao Ming Ma, Jing James Xu, Ying Xu, Ang Chang

PARALLEL CHART GENERATOR

Publication number: 20230252699

Abstract: An approach is provided in which the approach generates a parallel chart based on multiple records that includes a set of variable values corresponding to a set of variables. To generate the parallel chart, the approach arranges the set of variables on the parallel chart in a variable order based on at least one variable arrangement rule. The approach arranges an initial variable value order for each one of the set of variables, and computes a lucidity score based on the variable order and the initial variable value order of each of the set of variables. The lucidity score is a measurement of the clarity of the parallel chart. The approach adjusts the variable value order of at least one of the set of variables to increase the lucidity score and optimizes the clarity of the parallel chart based on the adjusted variable value order.

Type: Application

Filed: January 19, 2022

Publication date: August 10, 2023

Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Jing Xu, Wen Pei Yu, Ji Hui Yang, Jing Jia

Time Series Model Update

Publication number: 20230185879

Abstract: A computer implemented technique including: splitting data of a historical time series data set into subsets; updating a time series model by backwards data selection to obtain an interim version of the time series model; exploring pattern changes in the new data to obtain new predictors of pattern change; and updating the interim version of the time series model by applying the new predictors of pattern change to obtain an updated version of the time series model.

Type: Application

Filed: December 15, 2021

Publication date: June 15, 2023

Inventors: Si Er Han, Xue Ying Zhang, Jing Xu, Xiao Ming Ma, Ji Hui Yang

IMPUTING MACHINE LEARNING TRAINING DATA

Publication number: 20230177119

Abstract: Embodiments are disclosed for a method. The method includes determining a correlation list of missing value predictors. The method also includes generating a cluster model having multiple clusters. The cluster model is based on a target value and predictor values. The method further includes determining an imputed value for a missing value of a row of original training data based on a linear regression model for multiple non-missing value predictor values for the clusters.

Type: Application

Filed: December 5, 2021

Publication date: June 8, 2023

Inventors: A Peng Zhang, Xiao Ming Ma, Lei Gao, Jin Wang, Kai Li

CHAINING VERSION DATA BI-DIRECTIONALLY IN DATA PAGE TO AVOID ADDITIONAL VERSION DATA ACCESSES

Publication number: 20230153282

Abstract: A computer-implemented method, system and computer program product for improving performance of a distributed database. A query is received to store version data in the distributed database. Upon receiving the query to store the version data, the version data is stored in a row of a data page of a main table of a heap organized table/index organized table of the distributed database, where the row of the data page of the main table of the heap organized table/index organized table of the distributed database contains a pointer pointing to a later/previous version of the version data if the later/previous version of the version data is stored in the data page thereby chaining version data bi-directionally.

Type: Application

Filed: November 15, 2021

Publication date: May 18, 2023

Inventors: Sheng Yan Sun, Shuo Li, Xiaobo Wang, Xiao Ming Ma

INCREMENTAL MACHINE LEARNING FOR A PARAMETRIC MACHINE LEARNING MODEL

Publication number: 20230137184

Abstract: A method, system, and computer program product for incremental machine learning for a parametric machine learning model are disclosed. The method may include processing samples comprising historical samples and new samples with an existing parametric machine learning model to obtain at least one prediction residual of each of the samples, wherein the existing parametric machine learning model was trained based on the historical samples. The method may further include clustering the samples based on the at least one prediction residual of each of the samples and features of each of the samples. The method may further include sampling samples in each cluster to ensure that each cluster includes a substantially similar number of sampled samples. The method may further include updating the existing parametric machine learning model to obtain an updated parametric machine learning model based on sampled samples in each cluster.

Type: Application

Filed: November 4, 2021

Publication date: May 4, 2023

Inventors: Si Er Han, Ji Hui Yang, Xiao Ming Ma, Jing Xu, Xue Ying Zhang
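
The abstract for publication 20230137184 above outlines four steps: compute prediction residuals with the existing model, cluster the samples, draw a similar number of samples from each cluster, and update the model. The sketch below walks through those steps on toy data and should be read as an illustration under assumptions rather than the disclosed method. The one-variable linear model, the residual cutoff used in place of a real clustering algorithm (the abstract also clusters on sample features, which this sketch omits), and the least-squares refit are all simplifications chosen here.

```python
import random

def residuals(model, samples):
    # Prediction residual of each (x, y) sample under the existing model.
    w, b = model
    return [y - (w * x + b) for x, y in samples]

def cluster_by_residual(samples, res, cutoff):
    # Stand-in for clustering: split samples by residual magnitude only.
    clusters = {"well_fit": [], "poorly_fit": []}
    for sample, r in zip(samples, res):
        key = "poorly_fit" if abs(r) > cutoff else "well_fit"
        clusters[key].append(sample)
    return clusters

def balanced_sample(clusters, rng):
    # Draw the same number of samples from every non-empty cluster.
    n = min(len(members) for members in clusters.values() if members)
    picked = []
    for members in clusters.values():
        if members:
            picked.extend(rng.sample(members, n))
    return picked

def fit_line(samples):
    # Ordinary least squares for y = w * x + b.
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples)
    w = sum((x - mx) * (y - my) for x, y in samples) / sxx
    return w, my - w * mx

existing_model = (2.0, 0.0)                          # trained on historical data
historical = [(x, 2.0 * x) for x in range(8)]        # follows the old pattern
new = [(x, 2.0 * x + 5.0) for x in range(8, 12)]     # shifted pattern

samples = historical + new
res = residuals(existing_model, samples)
clusters = cluster_by_residual(samples, res, cutoff=1.0)
picked = balanced_sample(clusters, random.Random(0))
updated_model = fit_line(picked)
print(updated_model)
```

Balancing the draw keeps the eight well-fit historical samples from outweighing the four new samples, so the refit reflects both patterns with equal representation.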

GENERATING VISUALIZATIONS FOR SEMI-STRUCTURED DATA

Publication number: 20230125621

Abstract: A computer-implemented method, system and computer program product for generating visualizations for semi-structured data. Visualization data is extracted from infographics depicting semi-structured data. The visualization data that is extracted includes the traits or characteristics of the semi-structured data depicted in the infographics (e.g., dimension), the characteristics of the infographics (e.g., location of the depicted data), and the constraints or display requirements (e.g., display target value in a particular axis). A trait and constraint rule set is then generated based on the extracted visualization data. The trait and constraint rule set includes a set of rules that maps the display requirements to the particular set of traits or characteristics exhibited by the semi-structured data displayed in the infographics.

Type: Application

Filed: October 25, 2021

Publication date: April 27, 2023

Inventors: Wen Pei Yu, Ji Hui Yang, Xiao Ming Ma, Rui Wang, Jing James Xu
