Patents by Inventor Wen Pei Yu
Wen Pei Yu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12293438
Abstract: In an approach for post-modeling data visualization and analysis, a processor presents a first visualization of a training dataset in a first plot. Responsive to receiving a selection of a data group of the training dataset to analyze, a processor identifies three or fewer key model features of the data group of the training dataset. A processor ascertains a representative record of each key model feature of the three or fewer key model features using a Local Interpretable Model-Agnostic Explanation technique. A processor presents a second visualization of the three or fewer key model features and the representative record of each key model feature in a second plot.
Type: Grant
Filed: December 13, 2022
Date of Patent: May 6, 2025
Assignee: International Business Machines Corporation
Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Jun Wang
-
Publication number: 20250094267
Abstract: A time series anomaly detection method, system, and computer program product that processes time series data includes absorbing profiles of the time series data and anomaly types of a model as features, optimizing biased ranks to create optimized ranks through merging initial ranks with new ranks generated by real anomalies, and auto-suggesting the optimized ranks for saving a predetermined amount of data operation.
Type: Application
Filed: September 15, 2023
Publication date: March 20, 2025
Inventors: Jun Wang, Jing Xu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Wen Pei Yu
-
Patent number: 12249012
Abstract: A method, computer system, and a computer program product are provided for post-modeling feature evaluation. In one embodiment, at least one post model visual output and associated data is obtained that at least includes an individual conditional expectation (ICE) plot and a partial dependence (PDP) plot. Using the associated data and the plots, a Feature Importance (PI) plot is provided. A plurality of features is then determined for each PI, PDP and ICE plots to calculate at least one Interesting Value for each plot. An overall score is also calculated for each plurality of features based on the associated Interesting Values for each PDP, ICE and PI plots. At least one top feature is selected based on said scores. A final plot is then generated at least reflecting the top feature. The final plot combines the PI, PDP and ICE plots together.
Type: Grant
Filed: November 17, 2022
Date of Patent: March 11, 2025
Assignee: International Business Machines Corporation
Inventors: Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Xue Ying Zhang, Si Er Han, Jing Xu, Jun Wang
-
Patent number: 12242367
Abstract: Disclosed are a computer-implemented method, a system and a computer program product for model exploration. Model feature importance of each model of a plurality of models can be obtained, the plurality of models can be grouped into a plurality of model clusters based on the model feature importance of each model, and the model feature importance can be presented by box-plot or confidence interval.
Type: Grant
Filed: May 15, 2022
Date of Patent: March 4, 2025
Assignee: International Business Machines Corporation
Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Jun Wang, Wen Pei Yu
-
Publication number: 20250053858
Abstract: In an approach, a processor selects a top N features for a machine learning (ML) model; discretizes values of each continuous feature of the top N features; generates a set of combination values that each represent a unique combination of feature values for a data record; predicts, using the ML model, a target value for each record generating predicted target values; groups the predicted target values based on the combination value for each respective record; fits a distribution for each grouping of the predicted target values associated with a respective combination value generating a set of distributions; clusters and refits the set of distributions using a clustering algorithm resulting in a set of clusters and a refitted distribution for each cluster of the set of clusters; and outputs a visualization of the refitted distribution for each cluster as a distribution curve on a graph along with the associated records.
Type: Application
Filed: August 8, 2023
Publication date: February 13, 2025
Inventors: Si Er Han, Xiao Ming Ma, Wen Pei Yu, Xue Ying Zhang, Jing Xu, Jing James Xu, Jun Wang, Lei Tian
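The first steps of the abstract above (discretizing feature values, building a combination value per record, grouping the predicted targets, and fitting a distribution per group) can be sketched roughly as follows. This is an illustrative Python sketch, not the method claimed in the application: the equal-width binning, the normal fit, and the names `discretize`, `group_predictions`, and `fit_normal` are all assumptions, and the clustering and refitting steps are left out.

```python
from collections import defaultdict
from statistics import mean, pstdev


def discretize(value, low, high, bins):
    """Map a continuous value to an equal-width bin index in [0, bins - 1].

    Equal-width binning is an assumption; the abstract does not say how
    values are discretized.
    """
    if value >= high:
        return bins - 1
    return max(0, int((value - low) / (high - low) * bins))


def group_predictions(records, predict, ranges, bins=2):
    """Group predicted targets by each record's combination of bin indices.

    records: list of dicts mapping feature name -> numeric value
    predict: hypothetical stand-in for the ML model (record -> number)
    ranges:  dict mapping feature name -> (low, high) used for binning
    """
    groups = defaultdict(list)
    for record in records:
        # The combination value is the tuple of bin indices, one per feature.
        key = tuple(
            discretize(record[name], low, high, bins)
            for name, (low, high) in ranges.items()
        )
        groups[key].append(predict(record))
    return dict(groups)


def fit_normal(values):
    """Fit a normal distribution (an assumed family): return (mean, std)."""
    return mean(values), pstdev(values)


if __name__ == "__main__":
    records = [
        {"age": 20, "income": 30},
        {"age": 25, "income": 35},
        {"age": 60, "income": 90},
    ]
    ranges = {"age": (0, 100), "income": (0, 100)}
    groups = group_predictions(records, lambda r: r["age"] + r["income"], ranges)
    for key, values in groups.items():
        print(key, fit_normal(values))
```

Using a tuple of bin indices as the combination value keeps the grouping key hashable and easy to map back to the original feature ranges when plotting.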
-
Publication number: 20240427684
Abstract: A computer-implemented method, a system and a computer program product for abnormal point simulation are disclosed. A processor analyzes a plurality of data blocks in first time series data to determine traits of respective data blocks. For the respective data blocks, a processor simulates one or more abnormal points based on the traits of the respective data blocks.
Type: Application
Filed: June 20, 2023
Publication date: December 26, 2024
Inventors: Si Er Han, Xiao Ming Ma, Jun Wang, Wen Pei Yu, Xue Ying Zhang, Jing James Xu, Jing Xu
-
Publication number: 20240411783
Abstract: A computer-implemented method for treating post-modeling data includes computing, sequentially for each category of a feature, a category importance (CI) value. The CI value is based on a model accuracy change when records of a category being examined are reassigned to a remaining set of categories of the feature according to a cumulative distribution of records among the remaining set of categories of the feature, wherein the remaining set of categories includes all categories of the feature, except for the category being examined. A post-modeling category merge is performed on each category having the CI value less than a CI value threshold.
Type: Application
Filed: June 12, 2023
Publication date: December 12, 2024
Inventors: Xue Ying Zhang, Si Er Han, Jing Xu, Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Jun Wang, Ji Hui Yang
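One way to read the category importance (CI) computation in the abstract above is sketched below. It is a toy Python illustration under stated assumptions, not the claimed method: the model is reduced to a hypothetical `predict(category)` function over a single categorical feature, CI is taken as baseline accuracy minus accuracy after reassignment, and the helper names and threshold are invented for the example. The feature is assumed to have at least two categories.

```python
import random


def accuracy(records, predict):
    """Fraction of (category, label) records the model predicts correctly."""
    return sum(predict(cat) == label for cat, label in records) / len(records)


def category_importance(records, predict, category, rng):
    """Accuracy drop when `category` records are moved to other categories.

    Drawing uniformly from the list of remaining records' categories picks
    each remaining category in proportion to its record count, which is
    equivalent to sampling from their cumulative distribution.
    """
    remaining = [cat for cat, _ in records if cat != category]
    reassigned = [
        (rng.choice(remaining), label) if cat == category else (cat, label)
        for cat, label in records
    ]
    return accuracy(records, predict) - accuracy(reassigned, predict)


def categories_to_merge(records, predict, threshold, rng):
    """Return categories whose CI falls below the (assumed) threshold."""
    categories = sorted({cat for cat, _ in records})
    return [
        cat
        for cat in categories
        if category_importance(records, predict, cat, rng) < threshold
    ]


if __name__ == "__main__":
    # Toy data: category A always has label 1, B and C always have label 0.
    records = [("A", 1)] * 4 + [("B", 0)] * 3 + [("C", 0)] * 3
    predict = lambda cat: 1 if cat == "A" else 0
    rng = random.Random(0)
    for cat in "ABC":
        print(cat, category_importance(records, predict, cat, rng))
```

In this toy data, moving the records of A always costs accuracy, because every other category predicts label 0, while moving B or C only costs accuracy for the records that happen to land in A.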
-
Publication number: 20240256637
Abstract: A computer implemented method manages an ensemble model system to classify records. A number of processor units cluster records into groups of records based on classification predictions generated by base models in the ensemble model system for the records. The number of processor units determines sets of weights for the base models that increase a probability that the base models in the ensemble model system correctly predict the groups of records. Each set of weights in the sets of weights is associated with a group of records in the groups of records.
Type: Application
Filed: January 27, 2023
Publication date: August 1, 2024
Inventors: Si Er Han, Xue Ying Zhang, Jing Xu, Jing James Xu, Xiao Ming Ma, Wen Pei Yu, Jun Wang, Ji Hui Yang
-
Publication number: 20240193830
Abstract: In an approach for post-modeling data visualization and analysis, a processor presents a first visualization of a training dataset in a first plot. Responsive to receiving a selection of a data group of the training dataset to analyze, a processor identifies three or fewer key model features of the data group of the training dataset. A processor ascertains a representative record of each key model feature of the three or fewer key model features using a Local Interpretable Model-Agnostic Explanation technique. A processor presents a second visualization of the three or fewer key model features and the representative record of each key model feature in a second plot.
Type: Application
Filed: December 13, 2022
Publication date: June 13, 2024
Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Jun Wang
-
Publication number: 20240169614
Abstract: A method, computer system, and a computer program product are provided for post-modeling feature evaluation. In one embodiment, at least one post model visual output and associated data is obtained that at least includes an individual conditional expectation (ICE) plot and a partial dependence (PDP) plot. Using the associated data and the plots, a Feature Importance (PI) plot is provided. A plurality of features is then determined for each PI, PDP and ICE plots to calculate at least one Interesting Value for each plot. An overall score is also calculated for each plurality of features based on the associated Interesting Values for each PDP, ICE and PI plots. At least one top feature is selected based on said scores. A final plot is then generated at least reflecting the top feature. The final plot combines the PI, PDP and ICE plots together.
Type: Application
Filed: November 17, 2022
Publication date: May 23, 2024
Inventors: Xiao Ming Ma, Wen Pei Yu, Jing James Xu, Xue Ying Zhang, Si Er Han, Jing Xu, Jun Wang
-
Publication number: 20240054211
Abstract: Detecting anomalous data by applying a plurality of models to a data set to yield detection results including anomalous data, applying evaluation methods to the detection results for each of the plurality of models, determining a combined score for the detection results according to the evaluation methods, determining a combined score threshold, and defining a set of detected anomalies according to the combined score and the combined score threshold.
Type: Application
Filed: August 10, 2022
Publication date: February 15, 2024
Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Wen Pei Yu
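The flow in the abstract above (several models, a combined score, a combined score threshold) can be illustrated with a small Python sketch. Everything specific here is an assumption made for illustration: the two toy detectors, the per-model weights standing in for the output of the evaluation methods, the weighted-vote combination, and the names `zscore_detector`, `median_detector`, and `detect`. The application does not state which models or evaluation methods are used.

```python
from statistics import mean, median, pstdev


def zscore_detector(data, limit=2.0):
    """Hypothetical model 1: flag indices whose z-score exceeds `limit`."""
    mu, sigma = mean(data), pstdev(data)
    if sigma == 0:
        return set()
    return {i for i, x in enumerate(data) if abs(x - mu) / sigma > limit}


def median_detector(data, limit=1.5):
    """Hypothetical model 2: flag indices far from the median, in MAD units."""
    med = median(data)
    mad = median(abs(x - med) for x in data)
    if mad == 0:
        return set()
    return {i for i, x in enumerate(data) if abs(x - med) / mad > limit}


def detect(data, models, threshold):
    """Combine detection results and keep points at or above `threshold`.

    models: list of (detector, weight) pairs; the weight is an assumed
    stand-in for whatever the evaluation methods say about each model.
    The combined score of a point is the weight share of models flagging it.
    """
    total = sum(weight for _, weight in models)
    combined = [0.0] * len(data)
    for detector, weight in models:
        for i in detector(data):
            combined[i] += weight / total
    return [i for i, score in enumerate(combined) if score >= threshold]


if __name__ == "__main__":
    data = [10, 11, 9, 10, 12, 10, 50, 11, 9, 10]
    models = [(zscore_detector, 0.6), (median_detector, 0.4)]
    print(detect(data, models, threshold=0.5))
```

With these toy inputs, both detectors flag the value 50 while only the median-based detector flags the value 12, so raising the threshold keeps only the point the models agree on.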
-
Patent number: 11893666
Abstract: An approach is provided in which the approach generates a parallel chart based on multiple records that include a set of variable values corresponding to a set of variables. To generate the parallel chart, the approach arranges the set of variables on the parallel chart in a variable order based on at least one variable arrangement rule. The approach arranges an initial variable value order for each one of the set of variables, and computes a lucidity score based on the variable order and the initial variable value order of each of the set of variables. The lucidity score is a measurement of the clarity of the parallel chart. The approach adjusts the variable value order of at least one of the set of variables to increase the lucidity score and optimizes the clarity of the parallel chart based on the adjusted variable value order.
Type: Grant
Filed: January 19, 2022
Date of Patent: February 6, 2024
Assignee: International Business Machines Corporation
Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Jing Xu, Wen Pei Yu, Ji Hui Yang, Jing Jia
-
Publication number: 20230394326
Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for predictive models. According to the method, a processor may provide a first list including at least one input variable of a predictive model and a second list including a plurality of variables of the predictive model. For each of the input variables in the second list, the processor may determine contribution of the input variable to prediction of the predictive model with respect to the at least one input variable in the first list. The processor may update the first list by moving an input variable in the second list into the first list based on the determined contribution of the plurality of input variables. The processor may render one or more of the input variables in the updated first list based on an order of the input variables in the updated first list.
Type: Application
Filed: June 1, 2022
Publication date: December 7, 2023
Inventors: Si Er Han, Xue Ying Zhang, Xiao Ming Ma, Wen Pei Yu, Jing Xu, Jing James Xu, Rui Wang
-
Publication number: 20230367689
Abstract: Disclosed are a computer-implemented method, a system and a computer program product for model exploration. Model feature importance of each model of a plurality of models can be obtained, the plurality of models can be grouped into a plurality of model clusters based on the model feature importance of each model, and the model feature importance can be presented by box-plot or confidence interval.
Type: Application
Filed: May 15, 2022
Publication date: November 16, 2023
Inventors: Jing Xu, Xue Ying Zhang, Si Er Han, Jing James Xu, Xiao Ming Ma, Jun Wang, Wen Pei Yu
-
Publication number: 20230306312
Abstract: Examples described herein provide a computer-implemented method that includes determining a kernel width for the machine learning model. The method further includes building a local interpretable linear model using the kernel width. The method further includes computing a contribution and confidence for a feature of the local interpretable linear model. The method further includes updating the local interpretable linear model to generate a final model and computing an overall confidence for the final model.
Type: Application
Filed: March 21, 2022
Publication date: September 28, 2023
Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Wen Pei Yu, Jing Xu, Jing James Xu, Lei Gao, A Peng Zhang
-
Publication number: 20230297647
Abstract: A method, computer program, and computer system are provided for training a machine learning model. A feature associated with training data derived from a dataset is identified. A machine learning model is generated based on the training data. At least a portion of the training data associated with maximizing an importance value associated with the identified feature is selected. The importance value corresponds to a need associated with the machine learning model. One or more weight values are assigned to the selected portion of the training data. The machine learning model is updated based on the assigned weight values.
Type: Application
Filed: March 18, 2022
Publication date: September 21, 2023
Inventors: Xiao Ming Ma, Jin Wang, Lei Gao, A Peng Zhang, Wen Pei Yu, Xin Feng Zhu
-
Publication number: 20230289693
Abstract: A method, computer system, and a computer program product for performing an interactive outcome analysis is provided. The present invention may include generating, by a computer, a first estimation outcome from a first plurality of input conditions. The present invention may include generating, by the computer, a parallel estimation outcome from a second plurality of input conditions, wherein at least one of said input conditions in said first plurality of input conditions is different from any of said second plurality of input conditions. The present invention may include selecting, by the computer, either said first or said parallel estimation outcome by analyzing said outcomes with one another and with a target goal outcome.
Type: Application
Filed: March 14, 2022
Publication date: September 14, 2023
Inventors: Wen Pei Yu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Jing Xu, Rui Wang, Jun Wang, Ji Hui Yang
-
Publication number: 20230267622
Abstract: A method, a structure, and a computer system for object trail analytics. The exemplary embodiments may include obtaining time series data detailing an average speed of one or more roads within a traffic network at one or more times. The exemplary embodiments may further include extracting one or more features corresponding to the time series data, and generating one or more time series forecasting models based on the time series data and the one or more features. Additionally, the exemplary embodiments may include identifying a current location of a moving object within the traffic network, and predicting a speed of the moving object based on applying the one or more time series forecasting models to the current location.
Type: Application
Filed: February 21, 2022
Publication date: August 24, 2023
Inventors: Jun Wang, Jing Xu, Wen Pei Yu, Lei Gao, Jin Wang, A Peng Zhang
-
Publication number: 20230252699
Abstract: An approach is provided in which the approach generates a parallel chart based on multiple records that include a set of variable values corresponding to a set of variables. To generate the parallel chart, the approach arranges the set of variables on the parallel chart in a variable order based on at least one variable arrangement rule. The approach arranges an initial variable value order for each one of the set of variables, and computes a lucidity score based on the variable order and the initial variable value order of each of the set of variables. The lucidity score is a measurement of the clarity of the parallel chart. The approach adjusts the variable value order of at least one of the set of variables to increase the lucidity score and optimizes the clarity of the parallel chart based on the adjusted variable value order.
Type: Application
Filed: January 19, 2022
Publication date: August 10, 2023
Inventors: Xiao Ming Ma, Si Er Han, Xue Ying Zhang, Jing Xu, Wen Pei Yu, Ji Hui Yang, Jing Jia
-
Publication number: 20230237076
Abstract: A computer-implemented method, system and computer program product for automatically drawing infographics. Variables of a dataset are received from a computing device that were selected by the user of the computing device. For those selected variables that are associated with a data model, a procedure to draw infographics for variables assigned or not assigned the role of a target using the data model associated with each of the variables assigned or not assigned the role of target, respectively, is implemented. Alternatively, if the selected variables are not associated with a data model, then such variables are assigned a level of measurement as well as assigned the role of input. Such assignments become the data model which, along with the metadata (e.g., values of the variable) obtained by parsing the original data, are used to implement the procedure to draw infographics for variables not assigned the role of a target.
Type: Application
Filed: January 21, 2022
Publication date: July 27, 2023
Inventors: Ye Fan, Qi Mao, Juan Wu, Jia Zhong Wu, Long Fan, Chong Liu, Wen Pei Yu, Yang Yang