Patents by Inventor Jing-Yun Shyr

Jing-Yun Shyr has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20130218909
    Abstract: Provided are techniques for generating order statistics and error bounds. For each of multiple, distributed data sources, a finite number of data bins are created for each field in that data source. Data values in each of the multiple, distributed data sources are processed to generate basic summaries for each of the data bins in a single pass of the data values. The data bins from each of the multiple, distributed data sources are sorted. One or more approximate order statistics are computed for a data set by accumulating counts from a number of the sorted data bins. Lower and upper error bounds are provided for each of the computed one or more approximate order statistics, wherein the lower and upper error bounds are values delimiting an interval containing a true value of an order statistic.
    Type: Application
    Filed: April 11, 2012
    Publication date: August 22, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu
  • Publication number: 20130218908
    Abstract: Provided are techniques for generating order statistics and error bounds. For each of multiple, distributed data sources, a finite number of data bins are created for each field in that data source. Data values in each of the multiple, distributed data sources are processed to generate basic summaries for each of the data bins in a single pass of the data values. The data bins from each of the multiple, distributed data sources are sorted. One or more approximate order statistics are computed for a data set by accumulating counts from a number of the sorted data bins. Lower and upper error bounds are provided for each of the computed one or more approximate order statistics, wherein the lower and upper error bounds are values delimiting an interval containing a true value of an order statistic.
    Type: Application
    Filed: February 17, 2012
    Publication date: August 22, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu
  • Publication number: 20130006998
    Abstract: Provided are techniques for analyzing fields. Statistical metrics for each field in a data set are received. A general interestingness index is generated for each field using one or more combination functions that aggregate standardized interestingness sub-indexes. One or more fields are identified as interesting for further analysis using the general interestingness index. One or more expert recommendations for field transformations are constructed for the identified one or more fields.
    Type: Application
    Filed: June 29, 2011
    Publication date: January 3, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jing-Yun Shyr, Damir Spisic, Raymond Wright, Jing Xu, Xueying Zhang
  • Publication number: 20130007003
    Abstract: Provided are techniques for analyzing fields. Statistical metrics for each field in a data set are received. A general interestingness index is generated for each field using one or more combination functions that aggregate standardized interestingness sub-indexes. One or more fields are identified as interesting for further analysis using the general interestingness index. One or more expert recommendations for field transformations are constructed for the identified one or more fields.
    Type: Application
    Filed: September 13, 2012
    Publication date: January 3, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jing-Yun Shyr, Damir Spisic, Raymond Wright, Jing Xu, Xueying Zhang
  • Publication number: 20120278275
    Abstract: Techniques are disclosed for generating an ensemble model from multiple data sources. In one embodiment, the ensemble model is generated using a global validation sample, a global holdout sample and base models generated from the multiple data sources. An accuracy value may be determined for each base model, on the basis of the global validation dataset. The ensemble model may be generated from a subset of the base models, where the subset is selected on the basis of the determined accuracy values.
    Type: Application
    Filed: July 10, 2012
    Publication date: November 1, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu
  • Publication number: 20120239613
    Abstract: Techniques are disclosed for generating an ensemble model from multiple data sources. In one embodiment, the ensemble model is generated using a global validation sample, a global holdout sample and base models generated from the multiple data sources. An accuracy value may be determined for each base model, on the basis of the global validation dataset. The ensemble model may be generated from a subset of the base models, where the subset is selected on the basis of the determined accuracy values.
    Type: Application
    Filed: March 15, 2011
    Publication date: September 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu
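
The ensemble-model abstracts above describe scoring each base model on a global validation sample and building the ensemble from a subset chosen by accuracy. The following is a minimal Python sketch of that idea, not the method claimed in the applications. Several details are assumptions made only for illustration: base models are plain functions from one numeric value to a class label, the subset is the top `keep` models by accuracy, the ensemble combines them by majority vote, and the global holdout sample is used to evaluate the finished ensemble. All names (`accuracy`, `select_base_models`, `ensemble_predict`) are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Assumption: a base model maps one numeric feature value to a class label.
Model = Callable[[float], int]
Sample = Sequence[Tuple[float, int]]  # (feature value, true label) pairs


def accuracy(model: Model, sample: Sample) -> float:
    """Fraction of records in the sample that the model labels correctly."""
    return sum(model(x) == y for x, y in sample) / len(sample)


def select_base_models(models: Sequence[Model], validation: Sample, keep: int) -> List[Model]:
    """Rank base models by accuracy on the validation sample and keep the best ones.

    Assumption: a simple top-`keep` rule; the abstract only says the subset
    is selected on the basis of the accuracy values.
    """
    ranked = sorted(models, key=lambda m: accuracy(m, validation), reverse=True)
    return ranked[:keep]


def ensemble_predict(models: Sequence[Model], x: float) -> int:
    """Combine the selected base models by majority vote (an assumed combiner)."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]


# Toy base models standing in for models built from separate data sources.
base_models: List[Model] = [
    lambda x: int(x >= 0.50),
    lambda x: int(x >= 0.45),
    lambda x: int(x >= 0.55),
    lambda x: 1,               # always predicts class 1
    lambda x: int(x < 0.50),   # systematically wrong
]

# Toy global validation and holdout samples.
validation = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
holdout = [(0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1)]

selected = select_base_models(base_models, validation, keep=3)
holdout_accuracy = accuracy(lambda x: ensemble_predict(selected, x), holdout)
print(f"kept {len(selected)} base models, holdout accuracy = {holdout_accuracy:.2f}")
```

In this toy setup the two weak models score 0.5 and 0.0 on the validation sample, so they are dropped before the vote. Keeping the validation and holdout samples separate means the ensemble is evaluated on data that played no part in choosing its members.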