Patents by Inventor Xiao-Ming Ma
Xiao-Ming Ma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240169614
Abstract: A method, computer system, and a computer program product are provided for post-modeling feature evaluation. In one embodiment, at least one post-model visual output and associated data are obtained, including at least an individual conditional expectation (ICE) plot and a partial dependence (PDP) plot. Using the associated data and the plots, a Feature Importance (PI) plot is provided. A plurality of features is then determined for each of the PI, PDP and ICE plots to calculate at least one Interesting Value for each plot. An overall score is also calculated for each of the plurality of features based on the associated Interesting Values for each of the PDP, ICE and PI plots. At least one top feature is selected based on said scores. A final plot is then generated at least reflecting the top feature. The final plot combines the PI, PDP and ICE plots together.
Type: Application
Filed: November 17, 2022
Publication date: May 23, 2024
Inventors: Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Xue Ying Zhang, Si Er Han, Jing Xu, Jun Wang
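For readers unfamiliar with the two plots named in the abstract of publication 20240169614, the following is a minimal sketch of how ICE curves and a PDP curve are conventionally computed. It is not taken from the application: the function names, the toy model, and the sample rows are all hypothetical, and the "Interesting Value" scoring is not shown because the abstract does not define it.

```python
from statistics import mean

def ice_curves(predict, rows, feature, grid):
    """Return one ICE curve per row: predictions as `feature` sweeps over `grid`."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            probe = dict(row)        # copy so the original row is left untouched
            probe[feature] = value   # vary only the feature of interest
            curve.append(predict(probe))
        curves.append(curve)
    return curves

def partial_dependence(curves):
    """The PDP curve is the pointwise average of the ICE curves."""
    return [mean(points) for points in zip(*curves)]

# Hypothetical toy model and data, used only for illustration.
def toy_model(row):
    return 2 * row["x1"] + row["x2"]

rows = [{"x1": 0, "x2": 1}, {"x1": 5, "x2": 3}]
grid = [0, 1, 2]

curves = ice_curves(toy_model, rows, "x1", grid)
pdp = partial_dependence(curves)
```

Each ICE curve shows how one record's prediction responds to the feature, while the PDP summarizes the average response, which is why the two are often drawn on the same axes.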
-
Patent number: 11971796
Abstract: An approach is provided in which the approach builds a combination model that includes a normal status model and an abnormal status model. The normal status model is built from a set of time-sequenced normal status records and the abnormal status model is built from a set of time-sequenced abnormal status records. The approach computes a set of time-sequenced coefficient combination values of the normal status model and the abnormal status model based on applying a set of fitting coefficient characteristics to the normal status model and the abnormal status model. The approach performs goal seek analysis on a system using the combination model and the set of time-sequenced coefficient combination values.
Type: Grant
Filed: May 18, 2021
Date of Patent: April 30, 2024
Assignee: International Business Machines Corporation
Inventors: Xiao Ming Ma, Si Er Han, Lei Gao, A Peng Zhang, Chun Lei Xu, Rui Wang, Jing James Xu
-
Publication number: 20240054211
Abstract: Detecting anomalous data by applying a plurality of models to a data set to yield detection results including anomalous data, applying evaluation methods to the detection results for each of the plurality of models, determining a combined score for the detection results according to the evaluation methods, determining a combined score threshold, and defining a set of detected anomalies according to the combined score and the combined score threshold.
Type: Application
Filed: August 10, 2022
Publication date: February 15, 2024
Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Wen Pei Yu
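The combine-then-threshold idea in the abstract of publication 20240054211 can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the detector names, the per-model weights (standing in for whatever the evaluation methods produce), the weighted average as the combining rule, and the fixed threshold. The application may combine scores quite differently.

```python
def combined_scores(model_scores, model_weights):
    """Weighted average of per-model anomaly scores for each data point."""
    total_weight = sum(model_weights.values())
    n_points = len(next(iter(model_scores.values())))
    return [
        sum(model_weights[name] * model_scores[name][i] for name in model_scores)
        / total_weight
        for i in range(n_points)
    ]

def detect(model_scores, model_weights, threshold):
    """Indices of points whose combined score reaches the threshold."""
    combined = combined_scores(model_scores, model_weights)
    return [i for i, score in enumerate(combined) if score >= threshold]

# Hypothetical scores from two detectors over four data points.
scores = {
    "detector_a": [0.1, 0.9, 0.2, 0.8],
    "detector_b": [0.2, 0.7, 0.1, 0.3],
}
# Hypothetical weights; equal here, but they could reflect evaluation results.
weights = {"detector_a": 1.0, "detector_b": 1.0}

anomalies = detect(scores, weights, threshold=0.6)
```

Only the second point is flagged, because it is the only one that both detectors score highly; the fourth point is rated high by one detector alone and falls below the combined threshold.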
-
Patent number: 11893499
Abstract: Automated development and training of deep forest models for analyzing data by growing a random forest of decision trees using data, determining Out-of-bag (OOB) predictions for the forest, appending the OOB predictions to the data set, and growing an additional forest using the data set including the appended OOB predictions, and combining the output of the additional forest, then utilizing the model to classify data outside the training data set.
Type: Grant
Filed: March 12, 2019
Date of Patent: February 6, 2024
Assignee: International Business Machines Corporation
Inventors: Jing Xu, Rui Wang, Xiao Ming Ma, Ji Hui Yang, Xue Ying Zhang, Jing James Xu, Si Er Han
-
Patent number: 11893666
Abstract: An approach is provided in which the approach generates a parallel chart based on multiple records that include a set of variable values corresponding to a set of variables. To generate the parallel chart, the approach arranges the set of variables on the parallel chart in a variable order based on at least one variable arrangement rule. The approach arranges an initial variable value order for each one of the set of variables, and computes a lucidity score based on the variable order and the initial variable value order of each of the set of variables. The lucidity score is a measurement of the clarity of the parallel chart. The approach adjusts the variable value order of at least one of the set of variables to increase the lucidity score and optimizes the clarity of the parallel chart based on the adjusted variable value order.
Type: Grant
Filed: January 19, 2022
Date of Patent: February 6, 2024
Assignee: International Business Machines Corporation
Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Jing Xu, Wen Pei Yu, Ji Hui Yang, Jing Jia
-
Publication number: 20230394326
Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for predictive models. According to the method, a processor may provide a first list including at least one input variable of a predictive model and a second list including a plurality of variables of the predictive model. For each of the input variables in the second list, the processor may determine the contribution of the input variable to the prediction of the predictive model with respect to the at least one input variable in the first list. The processor may update the first list by moving an input variable in the second list into the first list based on the determined contribution of the plurality of input variables. The processor may render one or more of the input variables in the updated first list based on an order of the input variables in the updated first list.
Type: Application
Filed: June 1, 2022
Publication date: December 7, 2023
Inventors: Si Er Han, Xue Ying Zhang, Xiao Ming Ma, Wen Pei Yu, Jing Xu, Jing James Xu, Rui Wang
-
Patent number: 11823077
Abstract: Provided are a computer-implemented method, a system, and a computer program product. The method comprises extracting features from a plurality of base models in an ensemble model. The plurality of base models are configured to provide respective prediction results. The ensemble model is configured to provide an overall prediction result from the prediction results of the plurality of base models. The features are associated with time performance of the base models. The method further comprises clustering the plurality of base models into a plurality of clusters based on the extracted features. The method further comprises assigning the plurality of base models to a plurality of parallel computation units based on the plurality of clusters.
Type: Grant
Filed: August 13, 2020
Date of Patent: November 21, 2023
Assignee: International Business Machines Corporation
Inventors: Dong Hai Yu, Jun Wang, Jing Xu, Ji Hui Yang, Xiao Ming Ma, Song Bo
-
Publication number: 20230367689
Abstract: Disclosed are a computer-implemented method, a system and a computer program product for model exploration. Model feature importance of each model of a plurality of models can be obtained, the plurality of models can be grouped into a plurality of model clusters based on the model feature importance of each model, and the model feature importance can be presented by box-plot or confidence interval.
Type: Application
Filed: May 15, 2022
Publication date: November 16, 2023
Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Jun Wang, Wen Pei Yu
-
Publication number: 20230359758
Abstract: The present disclosure relates to privacy protection in a search process. According to a method, a target emotion vector is extracted from a search interaction, the target emotion vector representing emotional information in the search interaction. Respective emotion distances between the target emotion vector and respective emotion vectors associated with a plurality of text clusters are determined. The plurality of text clusters is clustered from a dictionary of text elements. A first number of text clusters are selected from the plurality of text clusters based on the determined respective emotion distances. The first number of text clusters have emotion distances larger than at least one unselected text cluster among the plurality of text clusters. A plurality of confused search interactions are constructed for the search interaction based on the first number of text clusters, and the plurality of confused search interactions are performed.
Type: Application
Filed: May 3, 2022
Publication date: November 9, 2023
Inventors: Jin Wang, Lei Gao, A Peng Zhang, Kai Li, Jun Wang, Xiao Ming Ma, Xin Feng Zhu, Geng Wu Yang
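The cluster-selection step in the abstract of publication 20230359758 can be sketched as picking the text clusters whose emotion vectors lie farthest from the target. This is an illustration under stated assumptions, not the application's method: Euclidean distance, two-dimensional vectors, and the cluster names are all hypothetical, and the construction of the confused search interactions is not shown.

```python
import math

def farthest_clusters(target, cluster_vectors, k):
    """Return the names of the k clusters with the largest emotion distance."""
    distances = {
        name: math.dist(target, vector)   # Euclidean distance (an assumption)
        for name, vector in cluster_vectors.items()
    }
    ranked = sorted(distances, key=distances.get, reverse=True)
    return ranked[:k]

# Hypothetical emotion vectors for the search interaction and three clusters.
target = (0.9, 0.1)
clusters = {
    "calm": (0.1, 0.9),
    "angry": (0.8, 0.2),
    "neutral": (0.5, 0.5),
}

selected = farthest_clusters(target, clusters, k=2)
```

Taking the top k by distance guarantees that every selected cluster is at least as far from the target as any unselected one, which matches the abstract's condition that selected clusters have larger distances than at least one unselected cluster.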
-
Publication number: 20230306312
Abstract: Examples described herein provide a computer-implemented method that includes determining a kernel width for the machine learning model. The method further includes building a local interpretable linear model using the kernel width. The method further includes computing a contribution and confidence for a feature of the local interpretable linear model. The method further includes updating the local interpretable linear model to generate a final model and computing an overall confidence for the final model.
Type: Application
Filed: March 21, 2022
Publication date: September 28, 2023
Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Wen Pei Yu, Jing Xu, Jing James Xu, Lei Gao, A Peng Zhang
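To make the role of the kernel width in the abstract of publication 20230306312 concrete, here is a minimal one-feature sketch of a kernel-weighted local linear fit. The exponential kernel, the function name, and the sample data are assumptions chosen for illustration; how the application selects the kernel width or computes confidence is not stated in the abstract and is not shown.

```python
import math

def local_linear_fit(xs, ys, x0, kernel_width):
    """Weighted least-squares line through (xs, ys), weighted by closeness to x0."""
    # Exponential kernel: points near x0 get weights close to 1 (an assumption).
    weights = [math.exp(-((x - x0) ** 2) / kernel_width ** 2) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    var = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical black-box outputs: y = x^2 sampled around the point x0 = 1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [x * x for x in xs]

slope, intercept = local_linear_fit(xs, ys, x0=1.0, kernel_width=0.75)
contribution = slope * 1.0   # coefficient times the feature value at x0
```

A smaller kernel width concentrates the weights near the explained point and makes the linear model more local; a larger width pulls in distant samples and smooths the explanation.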
-
Publication number: 20230297647
Abstract: A method, computer program, and computer system are provided for training a machine learning model. A feature associated with training data derived from a dataset is identified. A machine learning model is generated based on the training data. At least a portion of the training data associated with maximizing an importance value associated with the identified feature is selected. The importance value corresponds to a need associated with the machine learning model. One or more weight values are assigned to the selected portion of the training data. The machine learning model is updated based on the assigned weight values.
Type: Application
Filed: March 18, 2022
Publication date: September 21, 2023
Inventors: Xiao Ming Ma, Jin Wang, Lei Gao, A Peng Zhang, Wen Pei Yu, Xin Feng Zhu
-
Publication number: 20230289693
Abstract: A method, computer system, and a computer program product for performing an interactive outcome analysis is provided. The present invention may include generating, by a computer, a first estimation outcome from a first plurality of input conditions. The present invention may include generating, by the computer, a parallel estimation outcome from a second plurality of input conditions, wherein at least one of said input conditions in said first plurality of input conditions is different from any of said second plurality of input conditions. The present invention may include selecting, by the computer, either said first or said parallel estimation outcome by analyzing said outcomes with one another and with a target goal outcome.
Type: Application
Filed: March 14, 2022
Publication date: September 14, 2023
Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Rui Wang, Jun Wang, Ji Hui Yang
-
Patent number: 11741128
Abstract: A method of clustering data generated by an unknown model, the method including accessing the data, wherein the data includes a prediction target and a confidence, extracting a data group with high prediction confidence from the data, wherein the data group comprises a plurality of data cases, and where each of the data cases is described by a plurality of predictors, identifying high rank predictors of each of the data cases in the data group, transforming the data group into a transformed data group including only the high rank predictors for each of the data cases, wherein the high rank predictors are ranked within each of the data cases included in the transformed data group, clustering the transformed data group to generate clusters, and profiling the clusters to extract an insight about the unknown model.
Type: Grant
Filed: May 6, 2021
Date of Patent: August 29, 2023
Assignee: International Business Machines Corporation
Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Xiao Ming Ma, Ji Hui Yang
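The filter-and-transform steps in the abstract of patent 11741128 can be sketched as follows. This is only an illustration: the per-case contribution scores used to rank predictors, the confidence cutoff, the field names, and the grouping of identical rankings as a stand-in for clustering are all assumptions, since the abstract does not say how predictors are ranked or which clustering algorithm is used.

```python
from collections import Counter

def high_confidence(cases, min_confidence):
    """Keep only the data cases predicted with high confidence."""
    return [case for case in cases if case["confidence"] >= min_confidence]

def top_ranked_predictors(case, n):
    """Names of the n predictors with the largest absolute contribution, in rank order."""
    contributions = case["contributions"]
    ranked = sorted(contributions, key=lambda name: abs(contributions[name]), reverse=True)
    return tuple(ranked[:n])

# Hypothetical data cases with per-predictor contribution scores.
cases = [
    {"confidence": 0.95, "contributions": {"age": 0.6, "income": -0.3, "tenure": 0.1}},
    {"confidence": 0.55, "contributions": {"age": 0.1, "income": 0.2, "tenure": 0.7}},
    {"confidence": 0.90, "contributions": {"age": -0.5, "income": 0.4, "tenure": 0.0}},
]

group = high_confidence(cases, min_confidence=0.8)
transformed = [top_ranked_predictors(case, n=2) for case in group]

# Stand-in for clustering: count cases that share the same predictor ranking.
profile = Counter(transformed)
```

Profiling the resulting groups then amounts to reading off which predictor rankings dominate, which hints at what the unknown model relies on for its confident predictions.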
-
Patent number: 11727312
Abstract: A computer-implemented method, system and computer program product for generating personalized recommendations to address a target problem. A machine learning prediction model directed to a target problem for an individual is built with historical data. After receiving data about the individual, a prediction for the individual is obtained in connection with the target problem by the built model using the received data about the individual. Key predictors (e.g., parameters) and their weight for the individual are generated using the prediction by an explanation model. Record(s) are identified from the historical data by performing similarity analysis of the historical data using the key predictors and their weight. Such records provide a population closely related to the individual with respect to the target problem. These records are then analyzed and recommendations are provided to a user to solve the target problem for the individual based on the analysis of the identified record(s).
Type: Grant
Filed: September 3, 2019
Date of Patent: August 15, 2023
Assignee: International Business Machines Corporation
Inventors: Xue Ying Zhang, Jing Xu, Xiao Ming Ma, Jing James Xu, Ying Xu, Ang Chang
-
Publication number: 20230252699
Abstract: An approach is provided in which the approach generates a parallel chart based on multiple records that include a set of variable values corresponding to a set of variables. To generate the parallel chart, the approach arranges the set of variables on the parallel chart in a variable order based on at least one variable arrangement rule. The approach arranges an initial variable value order for each one of the set of variables, and computes a lucidity score based on the variable order and the initial variable value order of each of the set of variables. The lucidity score is a measurement of the clarity of the parallel chart. The approach adjusts the variable value order of at least one of the set of variables to increase the lucidity score and optimizes the clarity of the parallel chart based on the adjusted variable value order.
Type: Application
Filed: January 19, 2022
Publication date: August 10, 2023
Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Jing Xu, Wen Pei Yu, Ji Hui Yang, Jing Jia
-
Publication number: 20230185879
Abstract: A computer implemented technique including: splitting data of a historical time series data set into subsets; updating a time series model by backwards data selection to obtain an interim version of the time series model; exploring pattern changes in the new data to obtain new predictors of pattern change; and updating the interim version of the time series model by applying the new predictors of pattern change to obtain an updated version of the time series model.
Type: Application
Filed: December 15, 2021
Publication date: June 15, 2023
Inventors: Si Er Han, Xue Ying Zhang, Jing Xu, Xiao Ming Ma, Ji Hui Yang
-
Publication number: 20230177119
Abstract: Embodiments are disclosed for a method. The method includes determining a correlation list of missing value predictors. The method also includes generating a cluster model having multiple clusters. The cluster model is based on a target value and predictor values. The method further includes determining an imputed value for a missing value of a row of original training data based on a linear regression model for multiple non-missing value predictor values for the clusters.
Type: Application
Filed: December 5, 2021
Publication date: June 8, 2023
Inventors: A Peng Zhang, Xiao Ming Ma, Lei Gao, Jin Wang, Kai Li
-
Publication number: 20230153282
Abstract: A computer-implemented method, system and computer program product for improving performance of a distributed database. A query is received to store version data in the distributed database. Upon receiving the query to store the version data, the version data is stored in a row of a data page of a main table of a heap organized table/index organized table of the distributed database, where the row of the data page of the main table of the heap organized table/index organized table of the distributed database contains a pointer pointing to a later/previous version of the version data if the later/previous version of the version data is stored in the data page thereby chaining version data bi-directionally.
Type: Application
Filed: November 15, 2021
Publication date: May 18, 2023
Inventors: Sheng Yan Sun, Shuo Li, Xiaobo Wang, Xiao Ming Ma
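The bi-directional version chaining described in the abstract of publication 20230153282 resembles a doubly linked list of row versions. The sketch below shows that data structure in memory only; the class names `VersionRow` and `DataPage` and the `store` method are hypothetical, and nothing here reflects how the application lays out rows on disk or across nodes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False)   # identity comparison avoids recursing through the chain
class VersionRow:
    key: str
    value: str
    prev: Optional["VersionRow"] = None   # pointer to the previous version
    next: Optional["VersionRow"] = None   # pointer to the later version

class DataPage:
    """Hypothetical in-memory page that chains versions of each key."""

    def __init__(self):
        self.latest = {}   # key -> newest VersionRow stored on this page

    def store(self, key, value):
        row = VersionRow(key, value)
        older = self.latest.get(key)
        if older is not None:
            older.next = row   # older version points forward to the new one
            row.prev = older   # new version points back to the older one
        self.latest[key] = row
        return row

page = DataPage()
v1 = page.store("account-1", "balance=100")
v2 = page.store("account-1", "balance=80")
v3 = page.store("account-1", "balance=95")
```

With links in both directions, a reader that lands on any version can walk toward either older or newer versions without a separate lookup.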
-
Publication number: 20230137184
Abstract: A method, system, and computer program product for incremental machine learning for a parametric machine learning model are disclosed. The method may include processing samples comprising historical samples and new samples with an existing parametric machine learning model to obtain at least one prediction residual of each of the samples, wherein the existing parametric machine learning model was trained based on the historical samples. The method may further include clustering the samples based on the at least one prediction residual of each of the samples and features of each of the samples. The method may further include sampling samples in each cluster to ensure that each cluster includes a substantially similar number of sampled samples. The method may further include updating the existing parametric machine learning model to obtain an updated parametric machine learning model based on sampled samples in each cluster.
Type: Application
Filed: November 4, 2021
Publication date: May 4, 2023
Inventors: Si Er Han, Ji Hui Yang, Xiao Ming Ma, Jing Xu, Xue Ying Zhang
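The residual, cluster, and balanced-sampling steps in the abstract of publication 20230137184 can be sketched roughly as below. The simplifications are assumptions made for brevity: samples are grouped by binned absolute residual only (the abstract also uses sample features), the existing model is a hypothetical one-parameter function, and the final model update is not shown.

```python
import random
from collections import defaultdict

def residuals(predict, samples):
    """Prediction residual (actual minus predicted) for each (x, y) sample."""
    return [y - predict(x) for x, y in samples]

def cluster_by_residual(samples, res, bin_width):
    """Group samples by binned absolute residual (a stand-in for real clustering)."""
    clusters = defaultdict(list)
    for sample, r in zip(samples, res):
        clusters[int(abs(r) // bin_width)].append(sample)
    return clusters

def balanced_sample(clusters, per_cluster, seed=0):
    """Draw up to `per_cluster` samples from every cluster."""
    rng = random.Random(seed)
    return {
        label: rng.sample(members, min(per_cluster, len(members)))
        for label, members in clusters.items()
    }

# Hypothetical existing model and a mix of historical and new samples.
def existing_model(x):
    return 2 * x

samples = [(1, 2.1), (2, 4.0), (3, 5.9), (4, 8.2), (5, 13.0), (6, 15.1)]

res = residuals(existing_model, samples)
clusters = cluster_by_residual(samples, res, bin_width=1.0)
picked = balanced_sample(clusters, per_cluster=2)
```

Sampling the same number from each cluster keeps the poorly fitted samples from being drowned out by the many well-fitted ones when the model is refit.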
-
Publication number: 20230125621
Abstract: A computer-implemented method, system and computer program product for generating visualizations for semi-structured data. Visualization data is extracted from infographics depicting semi-structured data. The visualization data that is extracted includes the traits or characteristics of the semi-structured data depicted in the infographics (e.g., dimension), the characteristics of the infographics (e.g., location of the depicted data), and the constraints or display requirements (e.g., display target value in a particular axis). A trait and constraint rule set is then generated based on the extracted visualization data. The trait and constraint rule set includes a set of rules that maps the display requirements to the particular set of traits or characteristics exhibited by the semi-structured data displayed in the infographics.
Type: Application
Filed: October 25, 2021
Publication date: April 27, 2023
Inventors: Wen Pei Yu, Ji Hui Yang, Xiao Ming Ma, Rui Wang, Jing James Xu