Patents by Inventor Wen Pei Yu

Wen Pei Yu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Key category identification and visualization

Patent number: 12314290

Abstract: A computer-implemented method for treating post-modeling data includes computing, sequentially for each category of a feature, a category importance (CI) value. The CI value is based on a model accuracy change when records of a category being examined are reassigned to a remaining set of categories of the feature according to a cumulative distribution of records among the remaining set of categories of the feature, wherein the remaining set of categories includes all categories of the feature except for the category being examined. A post-modeling category merge is performed on each category having a CI value less than a CI value threshold.

Type: Grant

Filed: June 12, 2023

Date of Patent: May 27, 2025

Assignee: International Business Machines Corporation

Inventors: Xue Ying Zhang, Si Er Han, Jing Xu, Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Jun Wang, Ji Hui Yang
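The category importance idea in the abstract above can be illustrated with a small sketch. This is not the patented implementation: the function names, the toy `model` callable, and the deterministic quantile-based reassignment (used here in place of random sampling from the cumulative distribution) are all assumptions made for illustration. It also assumes the feature has at least two categories.

```python
from bisect import bisect_right
from collections import Counter
from itertools import accumulate


def accuracy(model, categories, labels):
    """Fraction of records whose predicted label matches the true label."""
    hits = sum(model(c) == y for c, y in zip(categories, labels))
    return hits / len(labels)


def category_importance(model, categories, labels):
    """Return {category: CI}, where CI is the accuracy drop after reassignment."""
    baseline = accuracy(model, categories, labels)
    counts = Counter(categories)
    importance = {}
    for examined in sorted(counts):
        remaining = [c for c in sorted(counts) if c != examined]
        total = sum(counts[c] for c in remaining)
        # Cumulative distribution of records over the remaining categories.
        cdf = list(accumulate(counts[c] / total for c in remaining))
        n = counts[examined]
        reassigned, seen = [], 0
        for c in categories:
            if c == examined:
                # Assumption: deterministic quantile in (0, 1) instead of a random draw.
                u = (seen + 0.5) / n
                idx = min(bisect_right(cdf, u), len(remaining) - 1)
                reassigned.append(remaining[idx])
                seen += 1
            else:
                reassigned.append(c)
        importance[examined] = baseline - accuracy(model, reassigned, labels)
    return importance


def categories_to_merge(importance, threshold):
    """Categories whose CI falls below the threshold are merge candidates."""
    return sorted(c for c, ci in importance.items() if ci < threshold)
```

With a toy model that predicts 1 only for category "a", reassigning the "a" records hurts accuracy more than reassigning "b" or "c", so "b" and "c" become the merge candidates under a suitable threshold.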
Automatically drawing infographics for statistical data based on a data model

Patent number: 12299007

Abstract: A computer-implemented method, system and computer program product for automatically drawing infographics. Variables of a dataset that were selected by the user of a computing device are received from the computing device. For those selected variables that are associated with a data model, a procedure to draw infographics for variables assigned or not assigned the role of a target using the data model associated with each of the variables assigned or not assigned the role of target, respectively, is implemented. Alternatively, if the selected variables are not associated with a data model, then such variables are assigned a level of measurement as well as assigned the role of input. Such assignments become the data model which, along with the metadata (e.g., values of the variable) obtained by parsing the original data, is used to implement the procedure to draw infographics for variables not assigned the role of a target.

Type: Grant

Filed: January 21, 2022

Date of Patent: May 13, 2025

Assignee: International Business Machines Corporation

Inventors: Ye Fan, Qi Mao, Juan Wu, Jia Zhong Wu, Long Fan, Chong Liu, Wen Pei Yu, Yang Yang
Visualize data and significant records based on relationship with the model

Patent number: 12293438

Abstract: In an approach for post-modeling data visualization and analysis, a processor presents a first visualization of a training dataset in a first plot. Responsive to receiving a selection of a data group of the training dataset to analyze, a processor identifies three or fewer key model features of the data group of the training dataset. A processor ascertains a representative record of each key model feature of the three or fewer key model features using a Local Interpretable Model-Agnostic Explanation technique. A processor presents a second visualization of the three or fewer key model features and the representative record of each key model feature in a second plot.

Type: Grant

Filed: December 13, 2022

Date of Patent: May 6, 2025

Assignee: International Business Machines Corporation

Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Jun Wang
INTELLIGENT RECOMMENDATION OF TIME SERIES ANOMALY DETECTION MODEL PIPELINES

Publication number: 20250094267

Abstract: A time series anomaly detection method, system, and computer program product that processes time series data includes absorbing profiles of the time series data and anomaly types of a model as features, optimizing biased ranks to create optimized ranks through merging initial ranks with new ranks generated by real anomalies, and auto-suggesting the optimized ranks for saving a predetermined amount of data operation.

Type: Application

Filed: September 15, 2023

Publication date: March 20, 2025

Inventors: Jun Wang, Jing Xu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Wen Pei Yu
Visual representation using post modeling feature evaluation

Patent number: 12249012

Abstract: A method, computer system, and a computer program product are provided for post-modeling feature evaluation. In one embodiment, at least one post model visual output and associated data is obtained that at least includes an individual conditional expectation (ICE) plot and a partial dependence (PDP) plot. Using the associated data and the plots, a Feature Importance (PI) plot is provided. A plurality of features is then determined for each of the PI, PDP and ICE plots to calculate at least one Interesting Value for each plot. An overall score is also calculated for each plurality of features based on the associated Interesting Values for each of the PDP, ICE and PI plots. At least one top feature is selected based on said scores. A final plot is then generated at least reflecting the top feature. The final plot combines the PI, PDP and ICE plots together.

Type: Grant

Filed: November 17, 2022

Date of Patent: March 11, 2025

Assignee: International Business Machines Corporation

Inventors: Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Xue Ying Zhang, Si Er Han, Jing Xu, Jun Wang
Feature importance based model optimization

Patent number: 12242367

Abstract: Disclosed are a computer-implemented method, a system and a computer program product for model exploration. Model feature importance of each model of a plurality of models can be obtained, the plurality of models can be grouped into a plurality of model clusters based on the model feature importance of each model, and the model feature importance can be presented by box-plot or confidence interval.

Type: Grant

Filed: May 15, 2022

Date of Patent: March 4, 2025

Assignee: International Business Machines Corporation

Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Jun Wang, Wen Pei Yu
POST-MODELING VISUALIZATION

Publication number: 20250053858

Abstract: In an approach, a processor selects a top N features for a machine learning (ML) model; discretizes values of each continuous feature of the top N features; generates a set of combination values that each represent a unique combination of feature values for a data record; predicts, using the ML model, a target value for each record generating predicted target values; groups the predicted target values based on the combination value for each respective record; fits a distribution for each grouping of the predicted target values associated with a respective combination value generating a set of distributions; clusters and refits the set of distributions using a clustering algorithm resulting in a set of clusters and a refitted distribution for each cluster of the set of clusters; and outputs a visualization of the refitted distribution for each cluster as a distribution curve on a graph along with the associated records.

Type: Application

Filed: August 8, 2023

Publication date: February 13, 2025

Inventors: Si Er Han, Xiao Ming Ma, Wen Pei Yu, Xue Ying Zhang, Jing Xu, Jing James Xu, Jun Wang, Lei Tian
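The first steps of the pipeline in the abstract above (discretize, form combination values, group predictions, fit a distribution per group) can be sketched roughly as follows. This is a partial illustration only: it omits feature selection, the clustering and refitting step, and the visualization. The helper names, the fixed bin edges, the `predict` callable, and the choice of a normal distribution summarized by mean and standard deviation are all assumptions, not details from the application.

```python
from collections import defaultdict
from statistics import mean, pstdev


def discretize(value, edges):
    """Return the index of the bin containing value, given sorted bin edges."""
    return sum(value >= e for e in edges)


def fit_by_combination(records, bin_edges, predict):
    """Group predictions by discretized feature combination and fit (mean, std)."""
    groups = defaultdict(list)
    for rec in records:
        # The combination value is the tuple of bin indices for the chosen features.
        combo = tuple(discretize(rec[f], edges) for f, edges in bin_edges.items())
        groups[combo].append(predict(rec))
    # Assumption: each group is summarized as a normal distribution (mean, std).
    return {combo: (mean(p), pstdev(p)) for combo, p in groups.items()}
```

Each key of the returned dictionary identifies one combination of discretized feature values, and each value holds the parameters of the distribution fitted to the predictions for the records in that combination.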
ABNORMAL POINT SIMULATION

Publication number: 20240427684

Abstract: A computer-implemented method, a system and a computer program product for abnormal point simulation are disclosed. A processor analyzes a plurality of data blocks in first time series data to determine traits of respective data blocks. For the respective data blocks, a processor simulates one or more abnormal points based on the traits of the respective data blocks.

Type: Application

Filed: June 20, 2023

Publication date: December 26, 2024

Inventors: Si Er Han, Xiao Ming Ma, Jun Wang, Wen Pei Yu, Xue Ying Zhang, Jing James Xu, Jing Xu
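As a rough illustration of the abstract above, the sketch below splits a time series into fixed-size blocks, computes simple traits for each block, and derives one simulated abnormal value per block. The choice of mean and standard deviation as the traits, the `mean + k * std` rule, the fixed block size, and all function names are assumptions; the application does not specify them here.

```python
from statistics import mean, pstdev


def block_traits(series, block_size):
    """Split the series into consecutive blocks and compute per-block traits."""
    blocks = [series[i:i + block_size] for i in range(0, len(series), block_size)]
    return [
        {"start": i * block_size, "mean": mean(b), "std": pstdev(b)}
        for i, b in enumerate(blocks)
    ]


def simulate_abnormal_points(traits, k=3.0):
    """Return (block start index, simulated value) pairs, one per block.

    Assumption: an abnormal value lies k standard deviations above the block mean.
    """
    return [(t["start"], t["mean"] + k * t["std"]) for t in traits]
```

Because the simulated value scales with each block's own spread, a flat block yields a value equal to its mean, so a real generator would likely need a minimum offset for constant blocks.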
KEY CATEGORY IDENTIFICATION AND VISUALIZATION

Publication number: 20240411783

Abstract: A computer-implemented method for treating post-modeling data includes computing, sequentially for each category of a feature, a category importance (CI) value. The CI value is based on a model accuracy change when records of a category being examined are reassigned to a remaining set of categories of the feature according to a cumulative distribution of records among the remaining set of categories of the feature, wherein the remaining set of categories includes all categories of the feature except for the category being examined. A post-modeling category merge is performed on each category having a CI value less than a CI value threshold.

Type: Application

Filed: June 12, 2023

Publication date: December 12, 2024

Inventors: Xue Ying Zhang, Si Er Han, Jing Xu, Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Jun Wang, Ji Hui Yang
Data Classification Using Ensemble Models

Publication number: 20240256637

Abstract: A computer-implemented method manages an ensemble model system to classify records. A number of processor units cluster records into groups of records based on classification predictions generated by base models in the ensemble model system for the records. The number of processor units determines sets of weights for the base models that increase a probability that the base models in the ensemble model system correctly predict the groups of records. Each set of weights in the sets of weights is associated with a group of records in the groups of records.

Type: Application

Filed: January 27, 2023

Publication date: August 1, 2024

Inventors: Si Er Han, Xue Ying Zhang, Jing Xu, Jing James Xu, Xiao Ming Ma, Wen Pei Yu, Jun Wang, Ji Hui Yang
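One way to picture the per-group weighting in the abstract above is the following sketch. It groups records by the tuple of base-model predictions and then weights each base model by its accuracy within the group. Both choices are assumptions: the application does not say how records are clustered or how the weights are optimized, and the function names and `label_of` accessor are hypothetical.

```python
from collections import defaultdict


def group_by_predictions(records, base_models):
    """Group records that receive the same tuple of base-model predictions."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(m(rec) for m in base_models)
        groups[key].append(rec)
    return groups


def group_weights(groups, base_models, label_of):
    """Return one normalized weight vector per group of records."""
    weights = {}
    for key, recs in groups.items():
        # Assumption: weight each base model by its accuracy within the group.
        acc = [sum(m(r) == label_of(r) for r in recs) / len(recs) for m in base_models]
        total = sum(acc)
        # Fall back to equal weights when no base model is right in the group.
        weights[key] = [a / total if total else 1 / len(base_models) for a in acc]
    return weights
```

Within a group where one base model is always right and the other always wrong, the first model receives all of the weight; groups where the models agree and are correct get equal weights.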
VISUALIZE DATA AND SIGNIFICANT RECORDS BASED ON RELATIONSHIP WITH THE MODEL

Publication number: 20240193830

Abstract: In an approach for post-modeling data visualization and analysis, a processor presents a first visualization of a training dataset in a first plot. Responsive to receiving a selection of a data group of the training dataset to analyze, a processor identifies three or fewer key model features of the data group of the training dataset. A processor ascertains a representative record of each key model feature of the three or fewer key model features using a Local Interpretable Model-Agnostic Explanation technique. A processor presents a second visualization of the three or fewer key model features and the representative record of each key model feature in a second plot.

Type: Application

Filed: December 13, 2022

Publication date: June 13, 2024

Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Jun Wang
VISUAL REPRESENTATION USING POST MODELING FEATURE EVALUATION

Publication number: 20240169614

Abstract: A method, computer system, and a computer program product are provided for post-modeling feature evaluation. In one embodiment, at least one post model visual output and associated data is obtained that at least includes an individual conditional expectation (ICE) plot and a partial dependence (PDP) plot. Using the associated data and the plots, a Feature Importance (PI) plot is provided. A plurality of features is then determined for each of the PI, PDP and ICE plots to calculate at least one Interesting Value for each plot. An overall score is also calculated for each plurality of features based on the associated Interesting Values for each of the PDP, ICE and PI plots. At least one top feature is selected based on said scores. A final plot is then generated at least reflecting the top feature. The final plot combines the PI, PDP and ICE plots together.

Type: Application

Filed: November 17, 2022

Publication date: May 23, 2024

Inventors: Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Xue Ying Zhang, Si Er Han, Jing Xu, Jun Wang
DETECTING ANOMALOUS DATA

Publication number: 20240054211

Abstract: Detecting anomalous data by applying a plurality of models to a data set to yield detection results including anomalous data, applying evaluation methods to the detection results for each of the plurality of models, determining a combined score for the detection results according to the evaluation methods, determining a combined score threshold, and defining a set of detected anomalies according to the combined score and the combined score threshold.

Type: Application

Filed: August 10, 2022

Publication date: February 15, 2024

Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Wen Pei Yu
Parallel chart generator

Patent number: 11893666

Abstract: An approach is provided in which the approach generates a parallel chart based on multiple records that include a set of variable values corresponding to a set of variables. To generate the parallel chart, the approach arranges the set of variables on the parallel chart in a variable order based on at least one variable arrangement rule. The approach arranges an initial variable value order for each one of the set of variables, and computes a lucidity score based on the variable order and the initial variable value order of each of the set of variables. The lucidity score is a measurement of the clarity of the parallel chart. The approach adjusts the variable value order of at least one of the set of variables to increase the lucidity score and optimizes the clarity of the parallel chart based on the adjusted variable value order.

Type: Grant

Filed: January 19, 2022

Date of Patent: February 6, 2024

Assignee: International Business Machines Corporation

Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Jing Xu, Wen Pei Yu, Ji Hui Yang, Jing Jia
PARTIAL IMPORTANCE OF INPUT VARIABLE OF PREDICTIVE MODELS

Publication number: 20230394326

Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for predictive models. According to the method, a processor may provide a first list including at least one input variable of a predictive model and a second list including a plurality of input variables of the predictive model. For each of the input variables in the second list, the processor may determine a contribution of the input variable to prediction of the predictive model with respect to the at least one input variable in the first list. The processor may update the first list by moving an input variable in the second list into the first list based on the determined contributions of the plurality of input variables. The processor may render one or more of the input variables in the updated first list based on an order of the input variables in the updated first list.

Type: Application

Filed: June 1, 2022

Publication date: December 7, 2023

Inventors: Si Er Han, Xue Ying Zhang, Xiao Ming Ma, Wen Pei Yu, Jing Xu, Jing James Xu, Rui Wang
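The two-list procedure in the abstract above resembles a greedy forward ordering, which might be sketched as below. The `contribution` callable is hypothetical and stands in for whatever measure the application uses, and moving the variable with the largest contribution on each pass is an assumption; the abstract only says the move is based on the determined contributions.

```python
def rank_by_partial_importance(first, second, contribution):
    """Move variables from `second` into `first`, one per pass.

    `contribution(variable, chosen)` is a hypothetical callable returning the
    contribution of `variable` given the variables already in the first list.
    """
    first, second = list(first), list(second)  # do not mutate the caller's lists
    while second:
        # Assumption: the variable with the largest contribution moves next.
        best = max(second, key=lambda v: contribution(v, first))
        second.remove(best)
        first.append(best)
    return first
```

The order of the returned list is the order in which the variables would be rendered, with the seed variables from the original first list at the front.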
FEATURE IMPORTANCE BASED MODEL OPTIMIZATION

Publication number: 20230367689

Abstract: Disclosed are a computer-implemented method, a system and a computer program product for model exploration. Model feature importance of each model of a plurality of models can be obtained, the plurality of models can be grouped into a plurality of model clusters based on the model feature importance of each model, and the model feature importance can be presented by box-plot or confidence interval.

Type: Application

Filed: May 15, 2022

Publication date: November 16, 2023

Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Jun Wang, Wen Pei Yu
STABLE LOCAL INTERPRETABLE MODEL FOR PREDICTION

Publication number: 20230306312

Abstract: Examples described herein provide a computer-implemented method that includes determining a kernel width for the machine learning model. The method further includes building a local interpretable linear model using the kernel width. The method further includes computing a contribution and confidence for a feature of the local interpretable linear model. The method further includes updating the local interpretable linear model to generate a final model and computing an overall confidence for the final model.

Type: Application

Filed: March 21, 2022

Publication date: September 28, 2023

Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Wen Pei Yu, Jing Xu, Jing James Xu, Lei Gao, A Peng Zhang
BUILDING MODELS WITH EXPECTED FEATURE IMPORTANCE

Publication number: 20230297647

Abstract: A method, computer program, and computer system are provided for training a machine learning model. A feature associated with training data derived from a dataset is identified. A machine learning model is generated based on the training data. At least a portion of the training data associated with maximizing an importance value associated with the identified feature is selected. The importance value corresponds to a need associated with the machine learning model. One or more weight values are assigned to the selected portion of the training data. The machine learning model is updated based on the assigned weight values.

Type: Application

Filed: March 18, 2022

Publication date: September 21, 2023

Inventors: Xiao Ming Ma, Jin Wang, Lei Gao, A Peng Zhang, Wen Pei Yu, Xin Feng Zhu
INTERACTIVE WHAT-IF ANALYSIS SYSTEM BASED ON IMPRECISION SCORING SIMULATION TECHNIQUES

Publication number: 20230289693

Abstract: A method, computer system, and a computer program product for performing an interactive outcome analysis is provided. The present invention may include generating, by a computer, a first estimation outcome from a first plurality of input conditions. The present invention may include generating, by the computer, a parallel estimation outcome from a second plurality of input conditions, wherein at least one of said input conditions in said first plurality of input conditions is different from any of said second plurality of input conditions. The present invention may include selecting, by the computer, either said first or said parallel estimation outcome by analyzing said outcomes with one another and with a target goal outcome.

Type: Application

Filed: March 14, 2022

Publication date: September 14, 2023

Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Rui Wang, Jun Wang, Ji Hui Yang
OBJECT TRAIL ANALYTICS

Publication number: 20230267622

Abstract: A method, a structure, and a computer system for object trail analytics. The exemplary embodiments may include obtaining time series data detailing an average speed of one or more roads within a traffic network at one or more times. The exemplary embodiments may further include extracting one or more features corresponding to the time series data, and generating one or more time series forecasting models based on the time series data and the one or more features. Additionally, the exemplary embodiments may include identifying a current location of a moving object within the traffic network, and predicting a speed of the moving object based on applying the one or more time series forecasting models to the current location.

Type: Application

Filed: February 21, 2022

Publication date: August 24, 2023

Inventors: Jun Wang, Jing Xu, Wen Pei Yu, Lei Gao, Jin Wang, A Peng Zhang
