Patents by Inventor Jing-Yun Shyr

Jing-Yun Shyr has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

AUTOMATED SELECTION OF GENERALIZED LINEAR MODEL COMPONENTS FOR BUSINESS INTELLIGENCE ANALYTICS

Publication number: 20170004409

Abstract: Techniques are described for automated selection of components for a generalized linear model. In one example, a method includes determining a candidate set of distributions, a candidate set of link functions, and a candidate set of predictor variables, based at least in part on a dataset of interest. The method further includes selecting a distribution from the initial candidate set of distributions and a link function from the initial candidate set of link functions, based at least in part on the candidate set of predictor variables; and selecting predictor variables from the candidate set of predictor variables, based at least in part on the selected distribution and the selected link function. The method further includes reiterating the selecting processes until a stopping criterion is fulfilled, and generating a generalized linear model output comprising the selected distribution, the selected link function, and the selected predictor variables.

Type: Application

Filed: June 30, 2015

Publication date: January 5, 2017

Inventors: Yea Jane Chu, Jing-Yun Shyr, Weicai Zhong
AUTOMATIC TIME SERIES EXPLORATION FOR BUSINESS INTELLIGENCE ANALYTICS

Publication number: 20160342909

Abstract: Techniques are described for generating characterizations of time series data. In one example, a method includes extracting a trend-cycle component, a seasonal component, and an irregular component from a time series of data. The method further includes performing one or more pattern analyses on the trend-cycle component, the seasonal component, and the irregular component. The method further includes, for each pattern analysis of the one or more pattern analyses, performing a comparison of an analytic result of the respective pattern analysis to a selected significance threshold for the respective pattern analysis to determine if the analytic result passes the significance threshold for the respective pattern analysis. The method further includes generating an output for each of the analytic results that pass the significance threshold for the respective pattern analysis.

Type: Application

Filed: May 18, 2015

Publication date: November 24, 2016

Inventors: Yea Jane Chu, Sier Han, Jing-Yun Shyr
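
Illustrative note: the steps named in the abstract above (extract a trend-cycle, a seasonal, and an irregular component, run pattern analyses, and report only the results that pass a threshold) can be sketched in a few lines of Python. This is a minimal sketch, not the method claimed in the application. The classical additive moving-average decomposition, the two "strength" measures, the function names `decompose` and `explore`, and the 0.6 thresholds are all assumptions chosen for illustration.

```python
from statistics import mean


def decompose(series, period):
    """Classical additive decomposition (an assumed technique, not the patent's)."""
    n, half = len(series), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2:   # odd period: plain centered moving average
            trend[t] = mean(window)
        else:            # even period: centered 2 x period moving average
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    # Seasonal component: mean detrended value per cycle position, centered on zero.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    raw = [mean(b) for b in buckets]
    offset = mean(raw)
    seasonal = [raw[t % period] - offset for t in range(n)]
    irregular = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
                 for t in range(n)]
    return trend, seasonal, irregular


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def strength(irregular, total):
    """Share of variance in `total` not explained by the irregular part (hypothetical measure)."""
    tv = variance(total)
    return 0.0 if tv == 0 else max(0.0, 1.0 - variance(irregular) / tv)


def explore(series, period, thresholds):
    """Run each pattern analysis and keep only results that pass its threshold."""
    trend, seasonal, irregular = decompose(series, period)
    idx = [t for t in range(len(series)) if trend[t] is not None]
    irr = [irregular[t] for t in idx]
    results = {
        "seasonal_strength": strength(irr, [seasonal[t] + irregular[t] for t in idx]),
        "trend_strength": strength(irr, [trend[t] + irregular[t] for t in idx]),
    }
    return {name: value for name, value in results.items() if value >= thresholds[name]}


# Hypothetical thresholds and a synthetic series: linear trend plus a period-4 pattern.
THRESHOLDS = {"seasonal_strength": 0.6, "trend_strength": 0.6}
series = [10 + 0.5 * t + (3, -1, -2, 0)[t % 4] for t in range(24)]
findings = explore(series, period=4, thresholds=THRESHOLDS)
print(findings)
```

On the synthetic series both analyses pass their thresholds and are reported; a constant series yields no output because neither analysis finds a pattern.
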
AUTOMATIC TIME SERIES EXPLORATION FOR BUSINESS INTELLIGENCE ANALYTICS

Publication number: 20160342910

Abstract: Techniques are described for generating characterizations of time series data. In one example, a method includes extracting a trend-cycle component, a seasonal component, and an irregular component from a time series of data. The method further includes performing one or more pattern analyses on the trend-cycle component, the seasonal component, and the irregular component. The method further includes, for each pattern analysis of the one or more pattern analyses, performing a comparison of an analytic result of the respective pattern analysis to a selected significance threshold for the respective pattern analysis to determine if the analytic result passes the significance threshold for the respective pattern analysis. The method further includes generating an output for each of the analytic results that pass the significance threshold for the respective pattern analysis.

Type: Application

Filed: May 13, 2016

Publication date: November 24, 2016

Inventors: Yea Jane Chu, Sier Han, Jing-Yun Shyr
Adaptive variable selection for data clustering

Patent number: 9477781

Abstract: One or more processors generate subsets of cluster feature (CF)-trees, which represent respective sets of local data as leaf entries. One or more processors collect variables that were used to generate the CF-trees included in the subsets. One or more processors generate respective approximate clustering solutions for the subsets by applying hierarchical agglomerative clustering to the collected variables and leaf entries of the plurality of CF-trees. One or more processors select candidate sets of variables with maximal goodness that are locally optimal for respective subsets based on the approximate clustering solutions. One or more processors select a set of variables, which produce an overall clustering solution, from the candidate sets of variables.

Type: Grant

Filed: April 8, 2014

Date of Patent: October 25, 2016

Assignee: International Business Machines Corporation

Inventors: Jing-Yun Shyr, Damir Spisic, Jing Xu
Condensing hierarchical data

Patent number: 9460402

Abstract: A computing device includes at least one processor, and at least one module operable by the at least one processor to receive data representing a hierarchy, wherein the hierarchy comprises at least one set of sibling nodes and a respective parent node, generate a condensed hierarchy by determining a grouping for the at least one set of sibling nodes, determine whether the at least one set of sibling nodes can be represented by the respective parent node, based at least in part on the grouping for the at least one set of sibling nodes, and responsive to determining that the at least one set of sibling nodes can be represented by the respective parent node, remove the at least one set of sibling nodes from the condensed hierarchy. The at least one module may further be operable by the at least one processor to output the condensed hierarchy for display.

Type: Grant

Filed: December 27, 2013

Date of Patent: October 4, 2016

Assignee: International Business Machines Corporation

Inventors: Daniel J. Rope, Jing-Yun Shyr, Damir Spisic
Adaptive variable selection for data clustering

Patent number: 9460236

Abstract: One or more processors generate subsets of cluster feature (CF)-trees, which represent respective sets of local data as leaf entries. One or more processors collect variables that were used to generate the CF-trees included in the subsets. One or more processors generate respective approximate clustering solutions for the subsets by applying hierarchical agglomerative clustering to the collected variables and leaf entries of the plurality of CF-trees. One or more processors select candidate sets of variables with maximal goodness that are locally optimal for respective subsets based on the approximate clustering solutions. One or more processors select a set of variables, which produce an overall clustering solution, from the candidate sets of variables.

Type: Grant

Filed: November 26, 2014

Date of Patent: October 4, 2016

Assignee: International Business Machines Corporation

Inventors: Jing-Yun Shyr, Damir Spisic, Jing Xu
Missing value imputation for predictive models

Patent number: 9443194

Abstract: Provided are techniques for imputing a missing value for each of one or more predictor variables. Data is received from one or more data sources. For each of the one or more predictor variables, an imputation model is built based on information of a target variable; a type of imputation model to construct is determined based on the one or more data sources, a measurement level of the predictor variable, and a measurement level of the target variable; and the determined type of imputation model is constructed using basic statistics of the predictor variable and the target variable. The missing value is imputed for each of the one or more predictor variables using the data from the one or more data sources and one or more built imputation models to generate a completed data set.

Type: Grant

Filed: April 12, 2012

Date of Patent: September 13, 2016

Assignee: International Business Machines Corporation

Inventors: Yea J. Chu, Sier Han, Jing-Yun Shyr, Jing Xu
PREDICTIVE MODEL SEARCH BY COMMUNICATING COMPARATIVE STRENGTH

Publication number: 20160253330

Abstract: A method for comparing predictive data models based on a predictive model search is provided. The method may include receiving a first and second portion of a set of data. The method may also include identifying a first and second variation of the second portion, wherein the first variation is different from the second variation. The method may further include generating first predictive data models based on the first variation, and second predictive data models based on the second variation. Additionally, the method may include applying a criteria to rank the first and second predictive data models based on predictive strength. The method may also include presenting a display of the ranked criteria, comprising the first portion, and a portion of the first and second predictive data models, wherein the portion of the first and second predictive data models are collectively ranked and presented according to the predictive strength.

Type: Application

Filed: March 22, 2016

Publication date: September 1, 2016

Inventors: Marc Altshuller, Jing-Yun Shyr, Damir Spisic, Margaret J. Vais, Neil Whitney
PREDICTIVE MODEL SEARCH BY COMMUNICATING COMPARATIVE STRENGTH

Publication number: 20160253324

Abstract: A method for comparing predictive data models based on a predictive model search is provided. The method may include receiving a first and second portion of a set of data. The method may also include identifying a first and second variation of the second portion, wherein the first variation is different from the second variation. The method may further include generating first predictive data models based on the first variation, and second predictive data models based on the second variation. Additionally, the method may include applying a criteria to rank the first and second predictive data models based on predictive strength. The method may also include presenting a display of the ranked criteria, comprising the first portion, and a portion of the first and second predictive data models, wherein the portion of the first and second predictive data models are collectively ranked and presented according to the predictive strength.

Type: Application

Filed: February 27, 2015

Publication date: September 1, 2016

Inventors: Marc Altshuller, Jing-Yun Shyr, Damir Spisic, Margaret J. Vais, Neil Whitney
Interaction detection for generalized linear models for a purchase decision

Patent number: 9361274

Abstract: Provided are techniques for interaction detection for generalized linear models. Basic statistics are calculated for a pair of categorical predictor variables and a target variable from a dataset during a single pass over the dataset. It is determined whether there is a significant interaction effect for the pair of categorical predictor variables on the target variable by: calculating a log-likelihood value for a full generalized linear model without estimating model parameters; calculating the model parameters for a reduced generalized linear model with a recursive marginal mean accumulation technique using the basic statistics; calculating a log-likelihood value for the reduced generalized linear model; calculating a likelihood ratio test statistic using the log-likelihood value for the full generalized linear model and the log-likelihood value for the reduced generalized linear model; calculating a p-value of the likelihood ratio test statistic; and comparing the p-value to a significance level.

Type: Grant

Filed: March 11, 2013

Date of Patent: June 7, 2016

Assignee: International Business Machines Corporation

Inventors: Yea J. Chu, Sier Han, Jing-Yun Shyr
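
Illustrative note: the last three steps in the abstract above (likelihood ratio test statistic, p-value, comparison to a significance level) can be sketched as follows. This is a minimal sketch under stated assumptions. It takes the two log-likelihood values as given and does not reproduce the patent's single-pass statistics or its recursive marginal mean accumulation technique. The function names, the chi-square reference distribution, and the choice of degrees of freedom are assumptions. For two categorical predictors with a and b levels and a main-effects reduced model, the usual choice is (a - 1)(b - 1).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    t = x / 2.0
    m, odd = divmod(df, 2)
    if odd:
        # Q(m + 1/2, t) = erfc(sqrt(t)) + exp(-t) * sum_{k=1..m} t^(k - 1/2) / Gamma(k + 1/2)
        tail = math.erfc(math.sqrt(t))
        tail += math.exp(-t) * sum(t ** (k - 0.5) / math.gamma(k + 0.5)
                                   for k in range(1, m + 1))
    else:
        # Q(m, t) = exp(-t) * sum_{k=0..m-1} t^k / k!
        tail = math.exp(-t) * sum(t ** k / math.factorial(k) for k in range(m))
    return tail


def likelihood_ratio_test(loglik_full, loglik_reduced, df, alpha=0.05):
    """Return (statistic, p-value, significant?) for nested models."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    p_value = chi2_sf(stat, df)
    return stat, p_value, p_value < alpha


# Hypothetical log-likelihood values for a 3-level by 2-level predictor pair (df = 2).
stat, p_value, significant = likelihood_ratio_test(-100.0, -103.0, df=2)
print(stat, p_value, significant)
```

The closed-form tail probability avoids a dependency on a statistics library; in practice a library routine for the chi-square survival function would serve the same purpose.
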
Decision tree insight discovery

Patent number: 9348887

Abstract: Techniques for presenting insight into classification trees may include performing a grouping analysis to group leaf nodes of a classification tree into a significant group and an insignificant group, performing influential target category analysis to identify one or more influential target categories for the leaf nodes of the classification tree in the significant group, and presenting one or more insights into the classification tree based on the grouping analysis and the influential target category analysis. Techniques for presenting insight into regression trees may include performing a grouping analysis to group leaf nodes of a regression tree into a high group and a low group, performing unusual node detection analysis to detect one or more outlier nodes in the high group and in the low group, and presenting one or more insights into the regression tree based on the grouping analysis and the unusual node detection analysis.

Type: Grant

Filed: September 17, 2014

Date of Patent: May 24, 2016

Assignee: International Business Machines Corporation

Inventors: Jane Y. Chu, Jing-Yun Shyr, Weicai Zhong
Decision tree insight discovery

Patent number: 9317578

Abstract: Techniques for presenting insight into classification trees may include performing a grouping analysis to group leaf nodes of a classification tree into a significant group and an insignificant group, performing influential target category analysis to identify one or more influential target categories for the leaf nodes of the classification tree in the significant group, and presenting one or more insights into the classification tree based on the grouping analysis and the influential target category analysis. Techniques for presenting insight into regression trees may include performing a grouping analysis to group leaf nodes of a regression tree into a high group and a low group, performing unusual node detection analysis to detect one or more outlier nodes in the high group and in the low group, and presenting one or more insights into the regression tree based on the grouping analysis and the unusual node detection analysis.

Type: Grant

Filed: March 14, 2013

Date of Patent: April 19, 2016

Assignee: International Business Machines Corporation

Inventors: Jane Y. Chu, Jing-Yun Shyr, Weicai Zhong
PREDICTING CUSTOMER VALUE

Publication number: 20150332293

Abstract: In one example, a method includes determining, based on historical purchase data for a customer, an expectancy value that indicates when the customer is expected to make a purchase from a business, determining, based on the historical purchase data for the customer, a frequency value that indicates at what frequency the customer is expected to make purchases from the business during a future time period, and determining, based on the historical purchase data for the customer, a monetary value that indicates how much the customer is expected to spend during the future time period. In this example, the method includes determining, based on the expectancy value, the frequency value, and the monetary value, a future customer value score that indicates how valuable the customer is likely to be in the future time period.

Type: Application

Filed: February 6, 2015

Publication date: November 19, 2015

Inventors: Yea Jane Chu, Mohit Sewak, Jing-Yun Shyr
PREDICTING CUSTOMER VALUE

Publication number: 20150332296

Abstract: In one example, a method includes determining, based on historical purchase data for a customer, an expectancy value that indicates when the customer is expected to make a purchase from a business, determining, based on the historical purchase data for the customer, a frequency value that indicates at what frequency the customer is expected to make purchases from the business during a future time period, and determining, based on the historical purchase data for the customer, a monetary value that indicates how much the customer is expected to spend during the future time period. In this example, the method includes determining, based on the expectancy value, the frequency value, and the monetary value, a future customer value score that indicates how valuable the customer is likely to be in the future time period.

Type: Application

Filed: May 19, 2014

Publication date: November 19, 2015

Applicant: International Business Machines Corporation

Inventors: Yea Jane Chu, Mohit Sewak, Jing-Yun Shyr
Computing regression models

Patent number: 9159028

Abstract: Provided are techniques for computing a task result. A processing data set of records is created, wherein each of the records contains data specific to a sub-task from a set of actual sub-tasks and contains a reference to data shared by the set of actual sub-tasks, and wherein a number of the records is equivalent to a number of the actual sub-tasks in the set of actual sub-tasks. With each mapper in a set of mappers, one of the records of the processing data set is received and an assigned sub-task is executed using the received one of the records to generate output. With a single reducer, the output from each mapper in the set of mappers is reduced to determine a task result.

Type: Grant

Filed: January 11, 2013

Date of Patent: October 13, 2015

Assignee: International Business Machines Corporation

Inventors: Yea J. Chu, Dong Liang, Jing-Yun Shyr
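
Illustrative note: the record layout and the mapper/reducer split described in the abstract above can be sketched in plain Python. This is a minimal sketch, not the patented implementation. It runs in a single process with the built-in `map` rather than on a distributed framework, and it assumes that each sub-task fits a one-predictor least-squares regression and that the reducer keeps the fit with the smallest squared error. The names `build_records`, `mapper`, `reducer`, and the sample data are hypothetical.

```python
# Data shared by every sub-task; records hold only a reference to it.
SHARED = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.0, 1.0, 4.0, 3.0, 5.0],
    "y": [3.0, 5.0, 7.0, 9.0, 11.0],
}


def build_records(predictors, shared):
    """One record per actual sub-task: sub-task-specific data plus a shared reference."""
    return [{"predictor": name, "shared": shared} for name in predictors]


def mapper(record):
    """Execute the assigned sub-task: fit y = intercept + slope * x by least squares."""
    data = record["shared"]
    x, y = data[record["predictor"]], data["y"]
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return {"predictor": record["predictor"], "slope": slope,
            "intercept": intercept, "sse": sse}


def reducer(outputs):
    """Single reducer: combine all mapper outputs into one task result."""
    return min(outputs, key=lambda out: out["sse"])


records = build_records(["x1", "x2"], SHARED)   # number of records == number of sub-tasks
result = reducer(map(mapper, records))
print(result["predictor"], result["slope"], result["intercept"])
```

Keeping the shared data out of the records means each record stays small no matter how large the data set is, which is the property the abstract emphasizes.
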
ADAPTIVE VARIABLE SELECTION FOR DATA CLUSTERING

Publication number: 20150286704

Abstract: One or more processors generate subsets of cluster feature (CF)-trees, which represent respective sets of local data as leaf entries. One or more processors collect variables that were used to generate the CF-trees included in the subsets. One or more processors generate respective approximate clustering solutions for the subsets by applying hierarchical agglomerative clustering to the collected variables and leaf entries of the plurality of CF-trees. One or more processors select candidate sets of variables with maximal goodness that are locally optimal for respective subsets based on the approximate clustering solutions. One or more processors select a set of variables, which produce an overall clustering solution, from the candidate sets of variables.

Type: Application

Filed: November 26, 2014

Publication date: October 8, 2015

Inventors: Jing-Yun Shyr, Damir Spisic, Jing Xu
ADAPTIVE VARIABLE SELECTION FOR DATA CLUSTERING

Publication number: 20150286703

Abstract: One or more processors initiate cluster feature (CF)-tree based hierarchical clustering on leaf entries of CF-trees included in a plurality of subsets. One or more processors generate respective partial clustering solutions for the subsets. A partial clustering solution includes a set of regular sub-clusters and candidate outlier sub-clusters. One or more processors generate initial regular clusters by performing hierarchical clustering using the regular sub-clusters. For a candidate outlier sub-cluster, one or more processors determine a closest initial regular cluster, and a distance separating the candidate outlier sub-cluster and the closest initial regular cluster. One or more processors determine which candidate outlier sub-clusters are outlier clusters based on which candidate outlier sub-clusters have a computed distance to their respective closest initial regular cluster that is greater than a corresponding distance threshold associated with their respective closest initial regular cluster.

Type: Application

Filed: November 25, 2014

Publication date: October 8, 2015

Inventors: Svetlana Levitan, Jing-Yun Shyr, Damir Spisic, Jing Xu
ADAPTIVE VARIABLE SELECTION FOR DATA CLUSTERING

Publication number: 20150286702

Abstract: One or more processors generate subsets of cluster feature (CF)-trees, which represent respective sets of local data as leaf entries. One or more processors collect variables that were used to generate the CF-trees included in the subsets. One or more processors generate respective approximate clustering solutions for the subsets by applying hierarchical agglomerative clustering to the collected variables and leaf entries of the plurality of CF-trees. One or more processors select candidate sets of variables with maximal goodness that are locally optimal for respective subsets based on the approximate clustering solutions. One or more processors select a set of variables, which produce an overall clustering solution, from the candidate sets of variables.

Type: Application

Filed: April 8, 2014

Publication date: October 8, 2015

Applicant: International Business Machines Corporation

Inventors: Jing-Yun Shyr, Damir Spisic, Jing Xu
DISTRIBUTED CLUSTERING WITH OUTLIER DETECTION

Publication number: 20150286707

Abstract: One or more processors initiate cluster feature (CF)-tree based hierarchical clustering on leaf entries of CF-trees included in a plurality of subsets. One or more processors generate respective partial clustering solutions for the subsets. A partial clustering solution includes a set of regular sub-clusters and candidate outlier sub-clusters. One or more processors generate initial regular clusters by performing hierarchical clustering using the regular sub-clusters. For a candidate outlier sub-cluster, one or more processors determine a closest initial regular cluster, and a distance separating the candidate outlier sub-cluster and the closest initial regular cluster. One or more processors determine which candidate outlier sub-clusters are outlier clusters based on which candidate outlier sub-clusters have a computed distance to their respective closest initial regular cluster that is greater than a corresponding distance threshold associated with their respective closest initial regular cluster.

Type: Application

Filed: April 8, 2014

Publication date: October 8, 2015

Applicant: International Business Machines Corporation

Inventors: Svetlana Levitan, Jing-Yun Shyr, Damir Spisic, Jing Xu
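
Illustrative note: the final decision rule in the abstract above (a candidate outlier sub-cluster is an outlier when its distance to the closest initial regular cluster exceeds that cluster's threshold) can be sketched as follows. This is a minimal sketch under stated assumptions. It represents each cluster by a center point, uses Euclidean distance, and takes the per-cluster thresholds as given; the abstract does not specify the distance measure or how thresholds are derived. The function name `flag_outliers` and the sample values are hypothetical.

```python
import math


def flag_outliers(candidates, regular_clusters):
    """Return {candidate: (closest regular cluster, distance)} for confirmed outliers.

    candidates:       {name: center}
    regular_clusters: {name: (center, distance_threshold)}
    """
    outliers = {}
    for name, center in candidates.items():
        # Find the closest initial regular cluster and the separating distance.
        closest, distance = min(
            ((reg_name, math.dist(center, reg_center))
             for reg_name, (reg_center, _) in regular_clusters.items()),
            key=lambda pair: pair[1],
        )
        # Compare against the threshold associated with that closest cluster.
        if distance > regular_clusters[closest][1]:
            outliers[name] = (closest, distance)
    return outliers


# Hypothetical initial regular clusters and candidate outlier sub-clusters.
regular = {"A": ((0.0, 0.0), 2.0), "B": ((10.0, 0.0), 1.5)}
candidates = {"c1": (1.0, 1.0), "c2": (10.0, 4.0), "c3": (4.0, 0.0)}
outliers = flag_outliers(candidates, regular)
print(sorted(outliers))
```

Because each regular cluster carries its own threshold, a tight cluster can reject a nearby candidate that a more diffuse cluster would absorb.
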
Computing regression models

Patent number: 9152921

Abstract: Provided are techniques for computing a task result. A processing data set of records is created, wherein each of the records contains data specific to a sub-task from a set of actual sub-tasks and contains a reference to data shared by the set of actual sub-tasks, and wherein a number of the records is equivalent to a number of the actual sub-tasks in the set of actual sub-tasks. With each mapper in a set of mappers, one of the records of the processing data set is received and an assigned sub-task is executed using the received one of the records to generate output. With a single reducer, the output from each mapper in the set of mappers is reduced to determine a task result.

Type: Grant

Filed: March 21, 2014

Date of Patent: October 6, 2015

Assignee: International Business Machines Corporation

Inventors: Yea J. Chu, Dong Liang, Jing-Yun Shyr
