Patents by Inventor Xiao-Ming Ma

Xiao-Ming Ma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240169614
    Abstract: A method, computer system, and a computer program product are provided for post-modeling feature evaluation. In one embodiment, at least one post model visual output and associated data is obtained that at least includes an individual conditional expectation (ICE) plot and a partial dependence (PDP) plot. Using the associated data and the plots, a Feature Importance (PI) plot is provided. A plurality of features is then determined for each of the PI, PDP and ICE plots to calculate at least one Interesting Value for each plot. An overall score is also calculated for each plurality of features based on the associated Interesting Values for each of the PDP, ICE and PI plots. At least one top feature is selected based on said scores. A final plot is then generated at least reflecting the top feature. The final plot combines the PI, PDP and ICE plots together.
    Type: Application
    Filed: November 17, 2022
    Publication date: May 23, 2024
    Inventors: Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Xue Ying Zhang, Si Er Han, Jing Xu, Jun Wang
  • Patent number: 11971796
    Abstract: An approach is provided in which the approach builds a combination model that includes a normal status model and an abnormal status model. The normal status model is built from a set of time-sequenced normal status records and the abnormal status model is built from a set of time-sequenced abnormal status records. The approach computes a set of time-sequenced coefficient combination values of the normal status model and the abnormal status model based on applying a set of fitting coefficient characteristics to the normal status model and the abnormal status model. The approach performs goal seek analysis on a system using the combination model and the set of time-sequenced coefficient combination values.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: April 30, 2024
    Assignee: International Business Machines Corporation
    Inventors: Xiao Ming Ma, Si Er Han, Lei Gao, A Peng Zhang, Chun Lei Xu, Rui Wang, Jing James Xu
  • Publication number: 20240054211
    Abstract: Detecting anomalous data by applying a plurality of models to a data set to yield detection results including anomalous data, applying evaluation methods to the detection results for each of the plurality of models, determining a combined score for the detection results according to the evaluation methods, determining a combined score threshold, and defining a set of detected anomalies according to the combined score and the combined score threshold.
    Type: Application
    Filed: August 10, 2022
    Publication date: February 15, 2024
    Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Wen Pei Yu
  • Patent number: 11893499
    Abstract: Automated development and training of deep forest models for analyzing data by growing a random forest of decision trees using data, determining Out-of-bag (OOB) predictions for the forest, appending the OOB predictions to the data set, and growing an additional forest using the data set including the appended OOB predictions, and combining the output of the additional forest, then utilizing the model to classify data outside the training data set.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: February 6, 2024
    Assignee: International Business Machines Corporation
    Inventors: Jing Xu, Rui Wang, Xiao Ming Ma, Ji Hui Yang, Xue Ying Zhang, Jing James Xu, Si Er Han
  • Patent number: 11893666
    Abstract: An approach is provided in which the approach generates a parallel chart based on multiple records that includes a set of variable values corresponding to a set of variables. To generate the parallel chart, the approach arranges the set of variables on the parallel chart in a variable order based on at least one variable arrangement rule. The approach arranges an initial variable value order for each one of the set of variables, and computes a lucidity score based on the variable order and the initial variable value order of each of the set of variables. The lucidity score is a measurement of the clarity of the parallel chart. The approach adjusts the variable value order of at least one of the set of variables to increase the lucidity score and optimizes the clarity of the parallel chart based on the adjusted variable value order.
    Type: Grant
    Filed: January 19, 2022
    Date of Patent: February 6, 2024
    Assignee: International Business Machines Corporation
    Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Jing Xu, Wen Pei Yu, Ji Hui Yang, Jing Jia
  • Publication number: 20230394326
    Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for predictive models. According to the method, a processor may provide a first list including at least one input variable of a predictive model and a second list including a plurality of input variables of the predictive model. For each of the input variables in the second list, the processor may determine a contribution of the input variable to the prediction of the predictive model with respect to the at least one input variable in the first list. The processor may update the first list by moving an input variable in the second list into the first list based on the determined contributions of the plurality of input variables. The processor may render one or more of the input variables in the updated first list based on an order of the input variables in the updated first list.
    Type: Application
    Filed: June 1, 2022
    Publication date: December 7, 2023
    Inventors: Si Er Han, Xue Ying Zhang, Xiao Ming Ma, Wen Pei Yu, Jing Xu, Jing James Xu, Rui Wang
  • Patent number: 11823077
    Abstract: Provided are a computer-implemented method, a system, and a computer program product. The method comprises extracting features from a plurality of base models in an ensemble model. The plurality of base models are configured to provide respective prediction results. The ensemble model is configured to provide an overall prediction result from the prediction results of the plurality of base models. The features are associated with time performance of the base models. The method further comprises clustering the plurality of base models into a plurality of clusters based on the extracted features. The method further comprises assigning the plurality of base models to a plurality of parallel computation units based on the plurality of clusters.
    Type: Grant
    Filed: August 13, 2020
    Date of Patent: November 21, 2023
    Assignee: International Business Machines Corporation
    Inventors: Dong Hai Yu, Jun Wang, Jing Xu, Ji Hui Yang, Xiao Ming Ma, Song Bo
  • Publication number: 20230367689
    Abstract: Disclosed are a computer-implemented method, a system and a computer program product for model exploration. Model feature importance of each model of a plurality of models can be obtained, the plurality of models can be grouped into a plurality of model clusters based on the model feature importance of each model, and the model feature importance can be presented by box-plot or confidence interval.
    Type: Application
    Filed: May 15, 2022
    Publication date: November 16, 2023
    Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Jun Wang, Wen Pei Yu
  • Publication number: 20230359758
    Abstract: The present disclosure relates to privacy protection in a search process. According to a method, a target emotion vector is extracted from a search interaction, the target emotion vector representing emotional information in the search interaction. Respective emotion distances between the target emotion vector and respective emotion vectors associated with a plurality of text clusters are determined. The plurality of text clusters is clustered from a dictionary of text elements. A first number of text clusters are selected from the plurality of text clusters based on the determined respective emotion distances. The first number of text clusters have emotion distances larger than at least one unselected text cluster among the plurality of text clusters. A plurality of confused search interactions are constructed for the search interaction based on the first number of text clusters, and the plurality of confused search interactions are performed.
    Type: Application
    Filed: May 3, 2022
    Publication date: November 9, 2023
    Inventors: Jin Wang, Lei Gao, A Peng Zhang, Kai Li, Jun Wang, Xiao Ming Ma, Xin Feng Zhu, Geng Wu Yang
  • Publication number: 20230306312
    Abstract: Examples described herein provide a computer-implemented method that includes determining a kernel width for the machine learning model. The method further includes building a local interpretable linear model using the kernel width. The method further includes computing a contribution and confidence for a feature of the local interpretable linear model. The method further includes updating the local interpretable linear model to generate a final model and computing an overall confidence for the final model.
    Type: Application
    Filed: March 21, 2022
    Publication date: September 28, 2023
    Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Wen Pei Yu, Jing Xu, Jing James Xu, Lei Gao, A Peng Zhang
  • Publication number: 20230297647
    Abstract: A method, computer program, and computer system are provided for training a machine learning model. A feature associated with training data derived from a dataset is identified. A machine learning model is generated based on the training data. At least a portion of the training data associated with maximizing an importance value associated with the identified feature is selected. The importance value corresponds to a need associated with the machine learning model. One or more weight values is assigned to the selected portion of the training data. The machine learning model is updated based on the assigned weight values.
    Type: Application
    Filed: March 18, 2022
    Publication date: September 21, 2023
    Inventors: Xiao Ming Ma, Jin Wang, Lei Gao, A Peng Zhang, Wen Pei Yu, Xin Feng Zhu
  • Publication number: 20230289693
    Abstract: A method, computer system, and a computer program product for performing an interactive outcome analysis is provided. The present invention may include generating, by a computer, a first estimation outcome from a first plurality of input conditions. The present invention may include generating, by the computer, a parallel estimation outcome from a second plurality of input conditions, wherein at least one of said input conditions in said first plurality of input conditions is different from any of said second plurality of input conditions. The present invention may include selecting, by the computer, either said first or said parallel estimation outcome by analyzing said outcomes with one another and with a target goal outcome.
    Type: Application
    Filed: March 14, 2022
    Publication date: September 14, 2023
    Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Rui Wang, Jun Wang, Ji Hui Yang
  • Patent number: 11741128
    Abstract: A method of clustering data generated by an unknown model, the method including accessing the data, wherein the data includes a prediction target and a confidence, extracting a data group with high prediction confidence from the data, wherein the data group comprises a plurality of data cases, and where each of the data cases is described by a plurality of predictors, identifying high rank predictors of each of the data cases in the data group, transforming the data group into a transformed data group including only the high rank predictors for each of the data cases, wherein the high rank predictors are ranked within each of the data cases included in the transformed data group, clustering the transformed data group to generate clusters, and profiling the clusters to extract an insight about the unknown model.
    Type: Grant
    Filed: May 6, 2021
    Date of Patent: August 29, 2023
    Assignee: International Business Machines Corporation
    Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Xiao Ming Ma, Ji Hui Yang
  • Patent number: 11727312
    Abstract: A computer-implemented method, system and computer program product for generating personalized recommendations to address a target problem. A machine learning prediction model directed to a target problem for an individual is built with historical data. After receiving data about the individual, a prediction for the individual is obtained in connection with the target problem by the built model using the received data about the individual. Key predictors (e.g., parameters) and their weight for the individual are generated using the prediction by an explanation model. Record(s) are identified from the historical data by performing similarity analysis of the historical data using the key predictors and their weight. Such records provide a population closely related to the individual with respect to the target problem. These records are then analyzed and recommendations are provided to a user to solve the target problem for the individual based on the analysis of the identified record(s).
    Type: Grant
    Filed: September 3, 2019
    Date of Patent: August 15, 2023
    Assignee: International Business Machines Corporation
    Inventors: Xue Ying Zhang, Jing Xu, Xiao Ming Ma, Jing James Xu, Ying Xu, Ang Chang
  • Publication number: 20230252699
    Abstract: An approach is provided in which the approach generates a parallel chart based on multiple records that includes a set of variable values corresponding to a set of variables. To generate the parallel chart, the approach arranges the set of variables on the parallel chart in a variable order based on at least one variable arrangement rule. The approach arranges an initial variable value order for each one of the set of variables, and computes a lucidity score based on the variable order and the initial variable value order of each of the set of variables. The lucidity score is a measurement of the clarity of the parallel chart. The approach adjusts the variable value order of at least one of the set of variables to increase the lucidity score and optimizes the clarity of the parallel chart based on the adjusted variable value order.
    Type: Application
    Filed: January 19, 2022
    Publication date: August 10, 2023
    Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Jing Xu, Wen Pei Yu, Ji Hui Yang, Jing Jia
  • Publication number: 20230185879
    Abstract: A computer implemented technique including: splitting data of a historical time series data set into subsets; updating a time series model by backwards data selection to obtain an interim version of the time series model; exploring pattern changes in the new data to obtain new predictors of pattern change; and updating the interim version of the time series model by applying the new predictors of pattern change to obtain an updated version of the time series model.
    Type: Application
    Filed: December 15, 2021
    Publication date: June 15, 2023
    Inventors: Si Er Han, Xue Ying Zhang, Jing Xu, Xiao Ming Ma, Ji Hui Yang
  • Publication number: 20230177119
    Abstract: Embodiments are disclosed for a method. The method includes determining a correlation list of missing value predictors. The method also includes generating a cluster model having multiple clusters. The cluster model is based on a target value and predictor values. The method further includes determining an imputed value for a missing value of a row of original training data based on a linear regression model for multiple non-missing value predictor values for the clusters.
    Type: Application
    Filed: December 5, 2021
    Publication date: June 8, 2023
    Inventors: A Peng Zhang, Xiao Ming Ma, Lei Gao, Jin Wang, Kai Li
  • Publication number: 20230153282
    Abstract: A computer-implemented method, system and computer program product for improving performance of a distributed database. A query is received to store version data in the distributed database. Upon receiving the query to store the version data, the version data is stored in a row of a data page of a main table of a heap organized table/index organized table of the distributed database, where the row of the data page of the main table of the heap organized table/index organized table of the distributed database contains a pointer pointing to a later/previous version of the version data if the later/previous version of the version data is stored in the data page thereby chaining version data bi-directionally.
    Type: Application
    Filed: November 15, 2021
    Publication date: May 18, 2023
    Inventors: Sheng Yan Sun, Shuo Li, Xiaobo Wang, Xiao Ming Ma
  • Publication number: 20230137184
    Abstract: A method, system, and computer program product for incremental machine learning for a parametric machine learning model are disclosed. The method may include processing samples comprising historical samples and new samples with an existing parametric machine learning model to obtain at least one prediction residual of each of the samples, wherein the existing parametric machine learning model was trained based on the historical samples. The method may further include clustering the samples based on the at least one prediction residual of each of the samples and features of each of the samples. The method may further include sampling samples in each cluster to ensure that each cluster includes a substantially similar number of sampled samples. The method may further include updating the existing parametric machine learning model to obtain an updated parametric machine learning model based on the sampled samples in each cluster.
    Type: Application
    Filed: November 4, 2021
    Publication date: May 4, 2023
    Inventors: Si Er Han, Ji Hui Yang, Xiao Ming Ma, Jing Xu, Xue Ying Zhang
  • Publication number: 20230125621
    Abstract: A computer-implemented method, system and computer program product for generating visualizations for semi-structured data. Visualization data is extracted from infographics depicting semi-structured data. The visualization data that is extracted includes the traits or characteristics of the semi-structured data depicted in the infographics (e.g., dimension), the characteristics of the infographics (e.g., location of the depicted data), and the constraints or display requirements (e.g., display target value in a particular axis). A trait and constraint rule set is then generated based on the extracted visualization data. The trait and constraint rule set includes a set of rules that maps the display requirements to the particular set of traits or characteristics exhibited by the semi-structured data displayed in the infographics.
    Type: Application
    Filed: October 25, 2021
    Publication date: April 27, 2023
    Inventors: Wen Pei Yu, Ji Hui Yang, Xiao Ming Ma, Rui Wang, Jing James Xu
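
Illustrative sketch for Publication number 20240054211 (anomaly detection with a combined score), listed above. This is not the method claimed in the application, only a minimal reading of its abstract under stated assumptions: the two detectors (a z-score and a median-absolute-deviation score), the max-normalization that stands in for the "evaluation methods", the averaging used for the combined score, and the fixed threshold of 0.5 are all hypothetical choices, as are the function names.

```python
from statistics import mean, median, pstdev


def zscore_scores(values):
    """Hypothetical model 1: absolute z-score of each value."""
    mu, sigma = mean(values), pstdev(values)
    return [abs(v - mu) / sigma if sigma else 0.0 for v in values]


def mad_scores(values):
    """Hypothetical model 2: distance from the median, in units of the
    median absolute deviation (MAD)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return [abs(v - med) / mad if mad else 0.0 for v in values]


def normalize(scores):
    """Assumed stand-in for an evaluation step: rescale one model's
    scores to [0, 1] so that different models are comparable."""
    top = max(scores)
    return [s / top if top else 0.0 for s in scores]


def detect_anomalies(values, detectors=(zscore_scores, mad_scores), threshold=0.5):
    """Apply several models, average their normalized scores into a
    combined score, and return the indices above the threshold."""
    per_model = [normalize(detector(values)) for detector in detectors]
    combined = [mean(column) for column in zip(*per_model)]
    return [i for i, score in enumerate(combined) if score > threshold]


readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(detect_anomalies(readings))  # prints [7]
```

Averaging normalized scores is only one way to combine models; the abstract leaves both the combination rule and the choice of threshold open.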