Patents by Inventor Pyungchul Kim

Pyungchul Kim has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20050060331
    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of the data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
    Type: Application
    Filed: October 25, 2004
    Publication date: March 17, 2005
    Applicant: Microsoft Corporation
    Inventors: Zhaohui Tang, Pyungchul Kim
  • Publication number: 20050027478
    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of the data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
    Type: Application
    Filed: September 1, 2004
    Publication date: February 3, 2005
    Applicant: Microsoft Corporation
    Inventors: Zhaohui Tang, Pyungchul Kim
  • Publication number: 20050021482
    Abstract: A universal drill-through feature is provided that gives access to mining model source data from a trained mining model. To obtain model content information on a given node of a model, a user or application calls a universal function, specifying the node for a model and data set, and the cases underlying that node for that model and data set are returned. Where only a sampling of the cases represented in the node is requested, a sampling of the underlying cases may be provided.
    Type: Application
    Filed: June 30, 2003
    Publication date: January 27, 2005
    Inventors: Pyungchul Kim, C. MacLennan, Zhaohui Tang, Raman Iyer
  • Publication number: 20050021489
    Abstract: A mining structure is created which contains processed data from a data set. This data may be used to train one or more models. In one embodiment, processing parameters are set in addition to selecting the data from the data set to be used by a model. For example, the discretization of a continuous variable into buckets, the number of buckets, and/or the sub-range corresponding to each bucket is set when the mining structure is created. The mining structure is then processed, which causes data from the data set to be processed and stored in the mining structure. After processing, the mining structure can be used by one or more models.
    Type: Application
    Filed: July 22, 2003
    Publication date: January 27, 2005
    Inventors: C. MacLennan, Zhaohui Tang, Pyungchul Kim, Raman Iyer
  • Patent number: 6810357
    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of the data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
    Type: Grant
    Filed: June 28, 2002
    Date of Patent: October 26, 2004
    Assignee: Microsoft Corporation
    Inventors: Zhaohui Tang, Pyungchul Kim
  • Publication number: 20040002981
    Abstract: High-cardinality attributes are used as input attributes and as output attributes in decision tree creation. When determining which attribute test to use at a node, a distribution of states for the high-cardinality attribute in the testing data at the node is created. A certain number of the most common states for the high-cardinality attribute are selected. The most common states are used as the states for the high-cardinality attribute in determining which attribute test to use. The remaining states are combined into one state and used as a single state for the high-cardinality attribute in determining which attribute test to use. The high-cardinality attribute may be either an input attribute or an output attribute to the decision tree.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Applicant: Microsoft Corporation
    Inventors: Jeffrey R. Bernhardt, Pyungchul Kim, C. James MacLennan
  • Publication number: 20040002980
    Abstract: Continuous attributes are used as input attributes in decision tree creation. Buckets are created by dividing the range of values for the continuous attribute into sub-ranges of equal extent. These buckets form initial partitions. Mergers of adjacent partitions are considered to determine score gains from such mergers, and the most useful mergers are performed. The resulting partitions are used as the discretization of the continuous attribute for use as an input attribute.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Applicant: Microsoft Corporation
    Inventors: Jeffrey R. Bernhardt, Pyungchul Kim, C. James MacLennan, David Maxwell Chickering
  • Publication number: 20040002879
    Abstract: Selection of certain attributes as output and input attributes is provided so a decision tree may be created more efficiently. For each possible output attribute an interestingness score is calculated. This interestingness score is based on entropy of the output attribute and a desirable entropy constant. The attributes with the highest interestingness scores are used as output attributes in the creation of the decision tree. Score gains for each input attribute over the output attributes are calculated using a conventional scoring algorithm. The sum of the score gains over all output attributes for each input attribute is calculated. The attributes with the highest score gain sums are used as input attributes in the creation of the decision tree.
    Type: Application
    Filed: June 27, 2002
    Publication date: January 1, 2004
    Applicant: Microsoft Corporation
    Inventors: Jeffrey R. Bernhardt, Pyungchul Kim, C. James MacLennan
  • Publication number: 20040002833
    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of the data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Applicant: Microsoft Corporation
    Inventors: Zhaohui Tang, Pyungchul Kim
  • Publication number: 20040002929
    Abstract: Systems and methods are provided for producing displays of the accuracy of data mining or statistical models that produce associative predictions. For all cases in a testing data set, the model makes predictions and provides associated probabilities. The cases are sorted by their probability of making accurate predictions and a graph is made of the accuracy of the model over various subsets containing the highest probability cases as evaluated by the model. Where a number of probabilities are presented for the predictions in a basket of predictions, those probabilities are combined to yield a probability score for the entire basket. Additionally, the accuracy of a model over different basket sizes may be graphed. The accuracy graph may also be produced for any model making a prediction, by sorting the cases by their probability of making accurate predictions and graphing the accuracy of the model over various subsets of the data containing the highest probability cases.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Applicant: Microsoft Corporation
    Inventors: Pyungchul Kim, Zhaohui Tang, David Earl Heckerman, Scott Conrad Oveson
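
The accuracy display described in the abstract shared by patent 6810357 and its related applications can be illustrated with a minimal Python sketch. It is not taken from the patent text: the function name `accuracy_points` and the `(predicted_state, probability, actual_state)` tuple layout are assumptions made here for illustration. Cases are ranked by predicted probability, and each point pairs N with the number of correct predictions among the top N cases.

```python
from typing import List, Tuple

# Hypothetical case layout: (predicted_state, probability, actual_state).
Case = Tuple[str, float, str]


def accuracy_points(cases: List[Case]) -> List[Tuple[int, int]]:
    """Return (N, correct predictions in the top N cases by probability)."""
    # Rank cases from most to least confident prediction.
    ranked = sorted(cases, key=lambda case: case[1], reverse=True)
    points = []
    correct = 0
    for n, (predicted, _probability, actual) in enumerate(ranked, start=1):
        if predicted == actual:
            correct += 1
        points.append((n, correct))
    return points


cases = [
    ("A", 0.9, "A"),
    ("B", 0.6, "A"),
    ("C", 0.8, "C"),
    ("A", 0.4, "A"),
]
print(accuracy_points(cases))  # [(1, 1), (2, 2), (3, 2), (4, 3)]
```

Plotting the first coordinate against the second gives a curve whose steepness near the origin shows how reliable the model's most confident predictions are.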
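
The handling of high-cardinality attributes described in the abstract of publication 20040002981 (keep the most common states, combine the rest into one) can be sketched as follows. This is an illustrative Python sketch under assumptions: the function name `collapse_states`, the `keep` parameter, and the `OTHER` label are hypothetical and do not appear in the abstract.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable

OTHER = "__other__"  # hypothetical label for the combined remaining states


def collapse_states(values: Iterable[Hashable], keep: int) -> Dict[Hashable, int]:
    """Keep the `keep` most common states and merge all others into OTHER."""
    counts = Counter(values)
    top = {state for state, _count in counts.most_common(keep)}
    collapsed: Counter = Counter()
    for state, count in counts.items():
        collapsed[state if state in top else OTHER] += count
    return dict(collapsed)


# Values of the attribute among the cases at one node of the tree.
cities_at_node = ["nyc", "nyc", "nyc", "sf", "sf", "la", "boston"]
print(collapse_states(cities_at_node, keep=2))
```

The collapsed distribution is what a split-scoring routine would then see, so the number of states it evaluates is bounded by `keep + 1` regardless of how many distinct values the attribute has.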
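
The discretization described in the abstract of publication 20040002980 (equal-width buckets, then merging adjacent partitions by score gain) can be sketched in Python. The abstract does not say how a merger is scored, so the score here is an assumption made only for illustration: a merger gains a fixed `penalty` for removing one partition and loses the increase in count-weighted class entropy. The names `discretize`, `weighted_entropy`, `n_buckets`, and `penalty` are likewise hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def weighted_entropy(labels: Counter) -> float:
    """Count-weighted entropy (n * H) of the class labels in one partition."""
    total = sum(labels.values())
    return -sum(c * math.log2(c / total) for c in labels.values() if c)


def discretize(values: Sequence[float], labels: Sequence[str],
               n_buckets: int = 8, penalty: float = 1.0) -> List[Tuple[float, float]]:
    """Greedily merge equal-width buckets while a merger has positive gain."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_buckets or 1.0  # avoid zero width for constant data
    # Initial partitions: equal-width buckets with their label counts.
    parts = [[lo + i * width, lo + (i + 1) * width, Counter()]
             for i in range(n_buckets)]
    for value, label in zip(values, labels):
        index = min(int((value - lo) / width), n_buckets - 1)
        parts[index][2][label] += 1
    while len(parts) > 1:
        # Assumed score gain of merging each adjacent pair of partitions.
        gains = [
            penalty - (weighted_entropy(a[2] + b[2])
                       - weighted_entropy(a[2]) - weighted_entropy(b[2]))
            for a, b in zip(parts, parts[1:])
        ]
        best = max(range(len(gains)), key=gains.__getitem__)
        if gains[best] <= 0:
            break  # no remaining merger is useful
        a, b = parts[best], parts[best + 1]
        parts[best:best + 2] = [[a[0], b[1], a[2] + b[2]]]
    return [(start, end) for start, end, _counts in parts]


values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(discretize(values, labels, n_buckets=4))  # [(0.0, 3.5), (3.5, 7.0)]
```

In this toy data, buckets that contain only one class merge freely, while merging across the class boundary at 3.5 would raise entropy by more than the penalty, so that boundary survives.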