Patents by Inventor Zhaohui Tang

Zhaohui Tang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7062408
    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of the data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
    Type: Grant
    Filed: October 25, 2004
    Date of Patent: June 13, 2006
    Assignee: Microsoft Corporation
    Inventors: Zhaohui Tang, Pyungchul Kim
  • Patent number: 7028036
    Abstract: Distribution displays for categories are provided which illuminate the distribution of continuous attributes over all cases in a category, and which provide a histogram of the population of the different states of categorical attributes. An array of such displays by attribute (in one dimension) and category (in another dimension) may be provided. Category diagram displays are also provided for visualizing the different categories, and their distributions, populations, and similarities. These are displayed through different shading of nodes and edges representing categories and the relationship between two categories, and through proximity of nodes.
    Type: Grant
    Filed: June 28, 2002
    Date of Patent: April 11, 2006
    Assignee: Microsoft Corporation
    Inventors: David Maxwell Chickering, Zhaohui Tang, David Earl Heckerman, Robert L. Rounthwaite, Alexei V. Bocharov, Scott Conrad Oveson
  • Publication number: 20060020620
    Abstract: The subject disclosure pertains to extensible data mining systems, means, and methodologies. For example, a data mining system is disclosed that supports plug-in or integration of non-native mining algorithms, perhaps provided by third parties, such that they function the same as built-in algorithms. Furthermore, non-native data mining viewers may also be seamlessly integrated into the system for displaying the results of one or more algorithms including those provided by third parties as well as those built-in. Still further yet, support is provided for extending data mining languages to include user-defined functions (UDFs).
    Type: Application
    Filed: June 21, 2005
    Publication date: January 26, 2006
    Applicant: Microsoft Corporation
    Inventors: Raman Iyer, Ioan Crivat, C. MacLennan, Scott Oveson, Rong Guan, ZhaoHui Tang, Pyungchul Kim, Irina Gorbach
  • Publication number: 20060010110
    Abstract: A system that facilitates data mining comprises a reception component that receives command(s) in a declarative language that relate to utilizing an output of a first data mining model as an input to a second data mining model. An implementation component analyzes the received command(s) and implements the command(s) with respect to the first and second data mining models. In another aspect of the subject invention, the reception component can receive further command(s) in a declarative language with respect to causing one or more of the first and second data mining models to output a prediction, the prediction desirably generated without prediction input; the implementation component then causes the one or more of the first and second data mining models to output the prediction.
    Type: Application
    Filed: February 2, 2005
    Publication date: January 12, 2006
    Applicant: Microsoft Corporation
    Inventors: Pyungchul Kim, ZhaoHui Tang, Ioan Crivat, C. MacLennan, Raman Iyer, Irina Gorbach
  • Publication number: 20060010142
    Abstract: The subject invention relates to systems and methods to extend the capabilities of declarative data modeling languages. In one aspect, a declarative data modeling language system is provided. The system includes a data modeling language component that generates one or more data mining models to extract predictive information from local or remote databases. A language extension component facilitates modeling capability in the data modeling language by providing a data sequence model or a time series model within the data modeling language to support various data mining applications.
    Type: Application
    Filed: April 28, 2005
    Publication date: January 12, 2006
    Applicant: Microsoft Corporation
    Inventors: Pyungchul Kim, C. MacLennan, ZhaoHui Tang
  • Publication number: 20050283357
    Abstract: A method for performing data mining is provided. The method includes selecting at least one data source of unstructured text. Additionally, a transformation is selected to identify a list of terms in the unstructured text. A run-time path is established to connect the data source to the transformation to load the list of terms identified into a destination database.
    Type: Application
    Filed: October 21, 2004
    Publication date: December 22, 2005
    Applicant: Microsoft Corporation
    Inventors: C. MacLennan, Hang Li, Ming Zhou, Yunbo Cao, ZhaoHui Tang
  • Publication number: 20050283459
    Abstract: A language schema that integrates multidimensional extensions (e.g., MDX) and data mining extensions (e.g., DMX) for performing data mining operations on data residing in OLAP cubes. The schema provides that the <source-data-query> can be not only a relational query but also a multidimensional query formed using MDX, for example. The operations of model creation, training and prediction are described.
    Type: Application
    Filed: June 22, 2004
    Publication date: December 22, 2005
    Applicant: Microsoft Corporation
    Inventors: C. MacLennan, Pyungchul Kim, ZhaoHui Tang
  • Patent number: 6931391
    Abstract: Systems and methods are provided for generating prediction queries to help a user build and execute prediction queries. A user interface (UI) is provided that is easy to use and understand in connection with the generation of a prediction query for data mining. The UI can be instantiated from a variety of disparate sources that may request query building services. While prediction queries and relational queries are quite different, the UI enables prediction queries to be built in a manner that is similar to the way relational queries are built. In one embodiment, the main screen of the UI includes four main components: (1) a table column mapping area, (2) a selection grid area, (3) a query text display area and (4) a query result grid area. In one embodiment, the query text display area and the query result grid area are initially not presented to the user.
    Type: Grant
    Filed: June 21, 2002
    Date of Patent: August 16, 2005
    Assignee: Microsoft Corporation
    Inventors: Zhaohui Tang, Rong Jian Guan, Amir M. Netz, Scott Conrad Oveson
  • Publication number: 20050144163
    Abstract: Systems and methods are provided for generating prediction queries to help a user build and execute prediction queries. A user interface (UI) is provided that is easy to use and understand in connection with the generation of a prediction query for data mining. The UI can be instantiated from a variety of disparate sources that may request query building services. While prediction queries and relational queries are quite different, the UI enables prediction queries to be built in a manner that is similar to the way relational queries are built. In one embodiment, the main screen of the UI includes four main components: (1) a table column mapping area, (2) a selection grid area, (3) a query text display area and (4) a query result grid area. In one embodiment, the query text display area and the query result grid area are initially not presented to the user.
    Type: Application
    Filed: January 7, 2005
    Publication date: June 30, 2005
    Applicant: Microsoft Corporation
    Inventors: Zhaohui Tang, Rong Guan, Amir Netz, Scott Oveson
  • Publication number: 20050108285
    Abstract: Distribution displays for categories are provided which illuminate the distribution of continuous attributes over all cases in a category, and which provide a histogram of the population of the different states of categorical attributes. An array of such displays by attribute (in one dimension) and category (in another dimension) may be provided. Category diagram displays are also provided for visualizing the different categories, and their distributions, populations, and similarities. These are displayed through different shading of nodes and edges representing categories and the relationship between two categories, and through proximity of nodes.
    Type: Application
    Filed: September 30, 2004
    Publication date: May 19, 2005
    Applicant: Microsoft Corporation
    Inventors: David Chickering, Zhaohui Tang, David Heckerman, Robert Rounthwaite, Alexei Bocharov, Scott Oveson
  • Publication number: 20050108196
    Abstract: Distribution displays for categories are provided which illuminate the distribution of continuous attributes over all cases in a category, and which provide a histogram of the population of the different states of categorical attributes. An array of such displays by attribute (in one dimension) and category (in another dimension) may be provided. Category diagram displays are also provided for visualizing the different categories, and their distributions, populations, and similarities. These are displayed through different shading of nodes and edges representing categories and the relationship between two categories, and through proximity of nodes.
    Type: Application
    Filed: September 30, 2004
    Publication date: May 19, 2005
    Applicant: Microsoft Corporation
    Inventors: David Chickering, Zhaohui Tang, David Heckerman, Robert Rounthwaite, Alexei Bocharov, Scott Oveson
  • Publication number: 20050108284
    Abstract: Distribution displays for categories are provided which illuminate the distribution of continuous attributes over all cases in a category, and which provide a histogram of the population of the different states of categorical attributes. An array of such displays by attribute (in one dimension) and category (in another dimension) may be provided. Category diagram displays are also provided for visualizing the different categories, and their distributions, populations, and similarities. These are displayed through different shading of nodes and edges representing categories and the relationship between two categories, and through proximity of nodes.
    Type: Application
    Filed: September 30, 2004
    Publication date: May 19, 2005
    Applicant: Microsoft Corporation
    Inventors: David Chickering, Zhaohui Tang, David Heckerman, Robert Rounthwaite, Alexei Bocharov, Scott Oveson
  • Publication number: 20050060331
    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of the data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
    Type: Application
    Filed: October 25, 2004
    Publication date: March 17, 2005
    Applicant: Microsoft Corporation
    Inventors: Zhaohui Tang, Pyungchul Kim
  • Publication number: 20050041027
    Abstract: Distribution displays for categories are provided which illuminate the distribution of continuous attributes over all cases in a category, and which provide a histogram of the population of the different states of categorical attributes. An array of such displays by attribute (in one dimension) and category (in another dimension) may be provided. Category diagram displays are also provided for visualizing the different categories, and their distributions, populations, and similarities. These are displayed through different shading of nodes and edges representing categories and the relationship between two categories, and through proximity of nodes.
    Type: Application
    Filed: September 30, 2004
    Publication date: February 24, 2005
    Applicant: Microsoft Corporation
    Inventors: David Chickering, Zhaohui Tang, David Heckerman, Robert Rounthwaite, Alexei Bocharov, Scott Oveson
  • Publication number: 20050027478
    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of the data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
    Type: Application
    Filed: September 1, 2004
    Publication date: February 3, 2005
    Applicant: Microsoft Corporation
    Inventors: Zhaohui Tang, Pyungchul Kim
  • Publication number: 20050021489
    Abstract: A mining structure is created which contains processed data from a data set. This data may be used to train one or more models. In one embodiment, in addition to the selection of data from the data set to be used by a model, processing parameters are set. For example, the discretization of a continuous variable into buckets, the number of buckets, and/or the sub-range corresponding to each bucket is set when the mining structure is created. The mining structure is processed, which causes the processing and storage of data from the data set in the mining structure. After processing, the mining structure can be used by one or more models.
    Type: Application
    Filed: July 22, 2003
    Publication date: January 27, 2005
    Inventors: C. MacLennan, Zhaohui Tang, Pyungchul Kim, Raman Iyer
  • Publication number: 20050021482
    Abstract: A drill-through feature is provided which provides a universal drill-through to mining model source data from a trained mining model. In order for a user or application to obtain model content information on a given node of a model, a universal function is provided whereby the user specifies the node for a model and data set, and the cases underlying that node for that model and data set are returned. A sampling of underlying cases may be provided, where only a sampling of the cases represented in the node is requested.
    Type: Application
    Filed: June 30, 2003
    Publication date: January 27, 2005
    Inventors: Pyungchul Kim, C. MacLennan, Zhaohui Tang, Raman Iyer
  • Patent number: 6810357
    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of the data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
    Type: Grant
    Filed: June 28, 2002
    Date of Patent: October 26, 2004
    Assignee: Microsoft Corporation
    Inventors: Zhaohui Tang, Pyungchul Kim
  • Publication number: 20040073528
    Abstract: The present invention relates to a system and methodology to generate and provide a lift chart to determine accuracy of one or more models that predict continuous variable data. Systems and processes are provided that process continuous variable prediction data in accordance with various analytical techniques. The processed data is then formatted for display, wherein model performance can then be determined by comparisons between models and/or by comparisons to idealized model performance. In one aspect, a system is provided that generates a continuous variable prediction lift chart. The system includes an analyzer that receives data from one or more models and a continuous variable test data set, and a formatter that then generates a lift chart based on the analyzed models and the continuous variable test data set.
    Type: Application
    Filed: October 15, 2002
    Publication date: April 15, 2004
    Inventors: Zhaohui Tang, David E. Heckerman, David M. Chickering
  • Publication number: 20040002929
    Abstract: Systems and methods are provided for producing displays of the accuracy of data mining or statistical models that produce associative predictions. For all cases in a testing data set, the model makes predictions and provides associated probabilities. The cases are sorted by their probability of making accurate predictions and a graph is made of the accuracy of the model over various subsets containing the highest probability cases as evaluated by the model. Where a number of probabilities are presented for the predictions in a basket of predictions, those probabilities are combined to yield a probability score for the entire basket. Additionally, the accuracy of a model over different basket sizes may be graphed. The accuracy graph may also be produced for any models making a prediction, by graphing the probability of making accurate predictions and a graph made of the accuracy of the model over various subsets of the data containing the highest probability cases.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Applicant: Microsoft Corporation
    Inventors: Pyungchul Kim, Zhaohui Tang, David Earl Heckerman, Scott Conrad Oveson
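
Illustrative sketch: several abstracts in this listing (for example, patent 7062408) describe an accuracy display in which one coordinate is a number N of cases and the other is the number of correct predictions among the top N cases ranked by probability. The Python below is a minimal sketch of how such points could be computed. It is not taken from any of the patents; the function name, the tuple layout of a case, and the sample data are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical case layout: (predicted_state, probability, actual_state).
Case = Tuple[str, float, str]


def accuracy_chart_points(cases: List[Case]) -> List[Tuple[int, int]]:
    """Return (N, correct predictions among the top N cases) for N = 1..len(cases)."""
    # Rank cases from the highest predicted probability to the lowest.
    ranked = sorted(cases, key=lambda case: case[1], reverse=True)

    points = []
    correct = 0
    for n, (predicted, _probability, actual) in enumerate(ranked, start=1):
        # A prediction counts as correct when it matches the actual state.
        if predicted == actual:
            correct += 1
        points.append((n, correct))
    return points


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        ("A", 0.9, "A"),
        ("B", 0.4, "A"),
        ("A", 0.7, "A"),
        ("C", 0.6, "B"),
    ]
    for n, correct in accuracy_chart_points(sample):
        print(n, correct)
```

Plotting N against the running count of correct predictions gives a curve that rises steeply when the model's high-probability predictions tend to be right, and flattens where they are not.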