Patents by Inventor Pyungchul Kim

Pyungchul Kim has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8019758
    Abstract: Methods for updating an information retrieval system are disclosed. In one embodiment, search terms affiliated with mappings or associations that represent a connection of relevancy between a query and an asset are pushed as content updates to a client system (e.g., as new updates or utilized to replace older data). The search terms are inserted (e.g., inserted as metadata) into corresponding content (the content associated with the asset). In this manner, content-searching data can be updated (e.g., remotely updated) as frequently as desired, even periodically, or selectively as new manually and/or automatically derived data becomes available. In another embodiment, the update data is already built into the content when it is delivered to a client machine. Other disclosed embodiments pertain to methods for generating a data mining classification model that is a blended representation of associations (e.g., query-asset associations) having different characteristics and/or different originating sources.
    Type: Grant
    Filed: June 21, 2005
    Date of Patent: September 13, 2011
    Assignee: Microsoft Corporation
    Inventors: Zijian Zheng, Frederic Behr, Pyungchul Kim, Steven Fox
  • Patent number: 7802197
    Abstract: A system for dynamically updating user accessible features of a software application on a client computer has a user interface, a local usage data file, and a data mining engine. The user interface is adapted to receive operator inputs. The local usage data file is adapted to store usage information corresponding to the operator inputs. The data mining engine is adapted to process the stored usage information and to generate local adjustments to a user interface of the software application based on the operator inputs. In one embodiment, a server is adapted to receive usage data from a plurality of application instances on a plurality of client computers and to generate global adjustments based on the received usage data. In one embodiment, the system has a merge feature adapted to blend and resolve conflicts between local and global adjustments to generate an interface adjustment for the user interface.
    Type: Grant
    Filed: April 22, 2005
    Date of Patent: September 21, 2010
    Assignee: Microsoft Corporation
    Inventors: Sin Shyh Lew, Pyungchul Kim, Sanjeev Katariya, Zijian Zheng
  • Patent number: 7747641
    Abstract: The subject invention relates to systems and methods to extend the capabilities of declarative data modeling languages. In one aspect, a declarative data modeling language system is provided. The system includes a data modeling language component that generates one or more data mining models to extract predictive information from local or remote databases. A language extension component facilitates modeling capability in the data modeling language by providing a data sequence model or a time series model within the data modeling language to support various data mining applications.
    Type: Grant
    Filed: April 28, 2005
    Date of Patent: June 29, 2010
    Assignee: Microsoft Corporation
    Inventors: Pyungchul Kim, C. James MacLennan, ZhaoHui Tang
  • Patent number: 7653611
    Abstract: The subject invention leverages data logging of responses to diagnostic reports to provide data that can be mined for diagnostic report quality information. Instances of the subject invention provide an initial diagnostic report assessment means to facilitate review by an entity. The entity's responses to the sorted diagnostic reports are logged unobtrusively to create diagnostic report quality data. This data is then analyzed by an analysis means that can then adjust the assessment means to improve its performance. In this manner, the performance of the assessment means is increased while reducing the workload of the entity reviewing the diagnostic reports. Other instances of the subject invention also help increase the performance of a diagnostic report generating means. Instances of the subject invention can also employ machine learning techniques to facilitate analyzing the quality data and/or assessing the diagnostic reports.
    Type: Grant
    Filed: March 30, 2005
    Date of Patent: January 26, 2010
    Assignee: Microsoft Corporation
    Inventors: Zijian Zheng, Mark B. Mydland, Pyungchul Kim, Nancy E. Jacobs
  • Patent number: 7627555
    Abstract: A language schema that integrates multidimensional extensions (e.g., MDX) and data mining extensions (e.g., DMX) for performing data mining operations on data residing in OLAP cubes. The schema provides that the <source-data-query> can be not only a relational query but also a multidimensional query formed using MDX, for example. The operations of model creation, training and prediction are described.
    Type: Grant
    Filed: June 22, 2004
    Date of Patent: December 1, 2009
    Assignee: Microsoft Corporation
    Inventors: C. James MacLennan, Pyungchul Kim, ZhaoHui Tang
  • Patent number: 7398268
    Abstract: A system that facilitates data mining comprises a reception component that receives command(s) in a declarative language that relate to utilizing an output of a first data mining model as an input to a second data mining model. An implementation component analyzes the received command(s) and implements the command(s) with respect to the first and second data mining models. In another aspect of the subject invention, the reception component can receive further command(s) in a declarative language with respect to causing one or more of the first and second data mining models to output a prediction, the prediction desirably generated without prediction input; the implementation component then causes the one or more of the first and second data mining models to output the prediction.
    Type: Grant
    Filed: February 2, 2005
    Date of Patent: July 8, 2008
    Assignee: Microsoft Corporation
    Inventors: Pyungchul Kim, ZhaoHui Tang, Ioan Bogdan Crivat, C. James MacLennan, Raman S. Iyer, Irina G. Gorbach
  • Patent number: 7383234
    Abstract: The subject disclosure pertains to extensible data mining systems, means, and methodologies. For example, a data mining system is disclosed that supports plug-in or integration of non-native mining algorithms, perhaps provided by third parties, such that they function the same as built-in algorithms. Furthermore, non-native data mining viewers may also be seamlessly integrated into the system for displaying the results of one or more algorithms, including those provided by third parties as well as those built in. In addition, support is provided for extending data mining languages to include user-defined functions (UDFs).
    Type: Grant
    Filed: June 21, 2005
    Date of Patent: June 3, 2008
    Assignee: Microsoft Corporation
    Inventors: Raman S. Iyer, Ioan Bogdan Crivat, C. James MacLennan, Scott C. Oveson, Rong J. Guan, ZhaoHui Tang, Pyungchul Kim, Irina G. Gorbach
  • Patent number: 7379843
    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of each data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
    Type: Grant
    Filed: September 1, 2004
    Date of Patent: May 27, 2008
    Assignee: Microsoft Corporation
    Inventors: Zhaohui Tang, Pyungchul Kim
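
The accuracy display described in the abstract of patent 7379843 can be illustrated with a minimal sketch. It assumes each test case is available as a `(predicted_state, probability, actual_state)` tuple; the function name `accuracy_curve` and that tuple layout are hypothetical and not taken from the patent.

```python
def accuracy_curve(cases):
    """Return (N, correct_in_top_N) points for a mining model accuracy chart.

    cases: iterable of (predicted_state, probability, actual_state) tuples.
    This tuple layout is an assumption made for illustration.
    """
    # Rank cases from the most to the least confident prediction.
    ranked = sorted(cases, key=lambda case: case[1], reverse=True)

    points = []
    correct = 0
    for n, (predicted, _probability, actual) in enumerate(ranked, start=1):
        # Count a hit when the predicted state matches the actual state.
        if predicted == actual:
            correct += 1
        # One coordinate is N, the other is the number correct in the top N.
        points.append((n, correct))
    return points


if __name__ == "__main__":
    sample = [
        ("a", 0.9, "a"),
        ("b", 0.6, "a"),
        ("a", 0.8, "a"),
        ("b", 0.7, "b"),
    ]
    for n, hits in accuracy_curve(sample):
        print(n, hits)
```

Plotting these points against the diagonal (every prediction correct) gives a quick visual read of how accuracy falls off as lower-probability cases are included.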
  • Patent number: 7251639
    Abstract: Selection of certain attributes as output and input attributes is provided so a decision tree may be created more efficiently. For each possible output attribute an interestingness score is calculated. This interestingness score is based on entropy of the output attribute and a desirable entropy constant. The attributes with the highest interestingness score are used as output attributes in the creation of the decision tree. Score gains for each candidate input attribute over the output attributes are calculated using a conventional scoring algorithm. The sum of the score gains over all output attributes for each input attribute is calculated. The attributes with the highest score gain sums are used as input attributes in the creation of the decision tree.
    Type: Grant
    Filed: June 27, 2002
    Date of Patent: July 31, 2007
    Assignee: Microsoft Corporation
    Inventors: Jeffrey R. Bernhardt, Pyungchul Kim, C. James MacLennan
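
The selection procedure in the abstract of patent 7251639 can be sketched as follows. The abstract does not give the interestingness formula or name the scoring algorithm, so this sketch makes two loud assumptions: interestingness is taken as the negative distance between an attribute's entropy and a target constant, and information gain stands in for the "conventional scoring algorithm". As a further simplification, chosen output attributes are excluded from the input candidates. All function names are hypothetical.

```python
import math
from collections import Counter


def entropy(column):
    """Shannon entropy (bits) of a list of discrete values."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())


def interestingness(column, target_entropy=1.0):
    # ASSUMPTION: the closer the entropy is to the desirable constant,
    # the higher the score. The patent's actual formula is not given here.
    return -abs(entropy(column) - target_entropy)


def info_gain(input_column, output_column):
    # ASSUMPTION: information gain plays the role of the scoring algorithm.
    n = len(output_column)
    groups = {}
    for x, y in zip(input_column, output_column):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(output_column) - conditional


def select_attributes(table, n_outputs, n_inputs, target_entropy=1.0):
    """table maps attribute name -> list of values (one per case)."""
    names = list(table)
    # Outputs: attributes with the highest interestingness score.
    outputs = sorted(
        names,
        key=lambda a: interestingness(table[a], target_entropy),
        reverse=True,
    )[:n_outputs]
    # Inputs: attributes with the highest summed gain over all outputs.
    candidates = [a for a in names if a not in outputs]
    gain_sum = {
        a: sum(info_gain(table[a], table[o]) for o in outputs)
        for a in candidates
    }
    inputs = sorted(candidates, key=gain_sum.get, reverse=True)[:n_inputs]
    return outputs, inputs
```

Restricting tree construction to the selected attributes is what reduces the work: the tree learner never has to score the discarded columns.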
  • Patent number: 7209924
    Abstract: Continuous attributes are used as input attributes in decision tree creation. Buckets are created by dividing the range of values for the continuous attribute into sub-ranges of equal extent. These buckets form initial partitions. Mergers of adjacent partitions are considered to determine score gains from such mergers, and the most useful mergers are performed. The resulting partitions are used as the discretization of the continuous attribute for use as an input attribute.
    Type: Grant
    Filed: June 28, 2002
    Date of Patent: April 24, 2007
    Assignee: Microsoft Corporation
    Inventors: Jeffrey R. Bernhardt, Pyungchul Kim, C. James MacLennan, David Maxwell Chickering
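
The bucket-and-merge discretization in the abstract of patent 7209924 can be sketched as below. The abstract does not specify the score, so this sketch assumes a hypothetical one: the count-weighted entropy of the output labels in each partition plus a fixed `penalty` per partition, so that a merger gains `penalty` minus the entropy it adds. The greedy best-first merge order and all names are also assumptions.

```python
import math
from collections import Counter


def entropy_cost(labels):
    """Count-weighted entropy (bits) of the labels in one partition."""
    n = len(labels)
    return -sum(c * math.log2(c / n) for c in Counter(labels).values())


def discretize(values, labels, n_buckets=8, penalty=1.0):
    """Return cut points for a continuous attribute.

    Starts from equal-width buckets, then greedily merges the adjacent
    pair with the largest positive score gain (hypothetical score).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_buckets or 1.0  # guard against a constant column

    # Initial partitions: equal-width buckets holding the output labels.
    buckets = [[] for _ in range(n_buckets)]
    for v, y in zip(values, labels):
        idx = min(int((v - lo) / width), n_buckets - 1)
        buckets[idx].append(y)
    parts = [(lo + (i + 1) * width, b) for i, b in enumerate(buckets)]

    while len(parts) > 1:
        best_gain, best_i = 0.0, None
        for i in range(len(parts) - 1):
            a, b = parts[i][1], parts[i + 1][1]
            # Merging saves one partition penalty but may add entropy.
            gain = penalty + entropy_cost(a) + entropy_cost(b) - entropy_cost(a + b)
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:
            break  # no merger improves the score
        upper = parts[best_i + 1][0]
        merged_labels = parts[best_i][1] + parts[best_i + 1][1]
        parts[best_i:best_i + 2] = [(upper, merged_labels)]

    # The upper bounds of all but the last partition are the cut points.
    return [upper for upper, _ in parts[:-1]]
```

For example, sixteen evenly spaced values whose label switches halfway collapse from eight buckets to two partitions with a single cut point at the switch.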
  • Patent number: 7188090
    Abstract: A drill-through feature provides universal drill-through to mining model source data from a trained mining model. In order for a user or application to obtain model content information on a given node of a model, a universal function is provided whereby the user specifies the node for a model and data set, and the cases underlying that node for that model and data set are returned. A sampling of underlying cases may be provided, where only a sampling of the cases represented in the node is requested.
    Type: Grant
    Filed: June 30, 2003
    Date of Patent: March 6, 2007
    Assignee: Microsoft Corporation
    Inventors: Pyungchul Kim, C. James MacLennan, Zhaohui Tang, Raman Iyer
  • Publication number: 20070010966
    Abstract: Systems and methods are provided for producing displays of the accuracy of data mining or statistical models that produce associative predictions. For all cases in a testing data set, the model makes predictions and provides associated probabilities. The cases are sorted by their probability of making accurate predictions and a graph is made of the accuracy of the model over various subsets containing the highest probability cases as evaluated by the model. Where a number of probabilities are presented for the predictions in a basket of predictions, those probabilities are combined to yield a probability score for the entire basket. Additionally, the accuracy of a model over different basket sizes may be graphed. The accuracy graph may also be produced for any model making a prediction, by sorting the cases by their probability of making accurate predictions and graphing the accuracy of the model over various subsets of the data containing the highest probability cases.
    Type: Application
    Filed: September 11, 2006
    Publication date: January 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Pyungchul Kim, Zhaohui Tang, David Heckerman, Scott Oveson
  • Publication number: 20060288038
    Abstract: A computer-implemented method includes training a data mining classification model to statistically account for query-to-asset associations. This data mining classification model can be utilized as a component of an information retrieval system.
    Type: Application
    Filed: June 21, 2005
    Publication date: December 21, 2006
    Applicant: Microsoft Corporation
    Inventors: Zijian Zheng, Frederic Behr, Pyungchul Kim, Steven Fox
  • Publication number: 20060241908
    Abstract: The subject invention leverages data logging of responses to diagnostic reports to provide data that can be mined for diagnostic report quality information. Instances of the subject invention provide an initial diagnostic report assessment means to facilitate review by an entity. The entity's responses to the sorted diagnostic reports are logged unobtrusively to create diagnostic report quality data. This data is then analyzed by an analysis means that can then adjust the assessment means to improve its performance. In this manner, the performance of the assessment means is increased while reducing the workload of the entity reviewing the diagnostic reports. Other instances of the subject invention also help increase the performance of a diagnostic report generating means. Instances of the subject invention can also employ machine learning techniques to facilitate analyzing the quality data and/or assessing the diagnostic reports.
    Type: Application
    Filed: March 30, 2005
    Publication date: October 26, 2006
    Applicant: Microsoft Corporation
    Inventors: Zijian Zheng, Mark Mydland, Pyungchul Kim, Nancy Jacobs
  • Publication number: 20060242638
    Abstract: A system for dynamically updating user accessible features of a software application on a client computer has a user interface, a local usage data file, and a data mining engine. The user interface is adapted to receive operator inputs. The local usage data file is adapted to store usage information corresponding to the operator inputs. The data mining engine is adapted to process the stored usage information and to generate local adjustments to a user interface of the software application based on the operator inputs. In one embodiment, a server is adapted to receive usage data from a plurality of application instances on a plurality of client computers and to generate global adjustments based on the received usage data. In one embodiment, the system has a merge feature adapted to blend and resolve conflicts between local and global adjustments to generate an interface adjustment for the user interface.
    Type: Application
    Filed: April 22, 2005
    Publication date: October 26, 2006
    Applicant: Microsoft Corporation
    Inventors: Sin Shyh Lew, Pyungchul Kim, Sanjeev Katariya, Zijian Zheng
  • Patent number: 7124054
    Abstract: Systems and methods are provided for producing displays of the accuracy of data mining or statistical models that produce associative predictions. For all cases in a testing data set, the model makes predictions and provides associated probabilities. The cases are sorted by their probability of making accurate predictions and a graph is made of the accuracy of the model over various subsets containing the highest probability cases as evaluated by the model. Where a number of probabilities are presented for the predictions in a basket of predictions, those probabilities are combined to yield a probability score for the entire basket. Additionally, the accuracy of a model over different basket sizes may be graphed. The accuracy graph may also be produced for any model making a prediction, by sorting the cases by their probability of making accurate predictions and graphing the accuracy of the model over various subsets of the data containing the highest probability cases.
    Type: Grant
    Filed: June 28, 2002
    Date of Patent: October 17, 2006
    Assignee: Microsoft Corporation
    Inventors: Pyungchul Kim, Zhaohui Tang, David Earl Heckerman, Scott Conrad Oveson
  • Publication number: 20060020620
    Abstract: The subject disclosure pertains to extensible data mining systems, means, and methodologies. For example, a data mining system is disclosed that supports plug-in or integration of non-native mining algorithms, perhaps provided by third parties, such that they function the same as built-in algorithms. Furthermore, non-native data mining viewers may also be seamlessly integrated into the system for displaying the results of one or more algorithms, including those provided by third parties as well as those built in. In addition, support is provided for extending data mining languages to include user-defined functions (UDFs).
    Type: Application
    Filed: June 21, 2005
    Publication date: January 26, 2006
    Applicant: Microsoft Corporation
    Inventors: Raman Iyer, Ioan Crivat, C. MacLennan, Scott Oveson, Rong Guan, ZhaoHui Tang, Pyungchul Kim, Irina Gorbach
  • Publication number: 20060010142
    Abstract: The subject invention relates to systems and methods to extend the capabilities of declarative data modeling languages. In one aspect, a declarative data modeling language system is provided. The system includes a data modeling language component that generates one or more data mining models to extract predictive information from local or remote databases. A language extension component facilitates modeling capability in the data modeling language by providing a data sequence model or a time series model within the data modeling language to support various data mining applications.
    Type: Application
    Filed: April 28, 2005
    Publication date: January 12, 2006
    Applicant: Microsoft Corporation
    Inventors: Pyungchul Kim, C. MacLennan, ZhaoHui Tang
  • Publication number: 20060010110
    Abstract: A system that facilitates data mining comprises a reception component that receives command(s) in a declarative language that relate to utilizing an output of a first data mining model as an input to a second data mining model. An implementation component analyzes the received command(s) and implements the command(s) with respect to the first and second data mining models. In another aspect of the subject invention, the reception component can receive further command(s) in a declarative language with respect to causing one or more of the first and second data mining models to output a prediction, the prediction desirably generated without prediction input; the implementation component then causes the one or more of the first and second data mining models to output the prediction.
    Type: Application
    Filed: February 2, 2005
    Publication date: January 12, 2006
    Applicant: Microsoft Corporation
    Inventors: Pyungchul Kim, ZhaoHui Tang, Ioan Crivat, C. MacLennan, Raman Iyer, Irina Gorbach
  • Publication number: 20050283459
    Abstract: A language schema that integrates multidimensional extensions (e.g., MDX) and data mining extensions (e.g., DMX) for performing data mining operations on data residing in OLAP cubes. The schema provides that the <source-data-query> can be not only a relational query but also a multidimensional query formed using MDX, for example. The operations of model creation, training and prediction are described.
    Type: Application
    Filed: June 22, 2004
    Publication date: December 22, 2005
    Applicant: Microsoft Corporation
    Inventors: C. MacLennan, Pyungchul Kim, ZhaoHui Tang