Patents by Inventor David M. Chickering

David M. Chickering has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7184993
    Abstract: The present invention leverages approximations of distributions to provide tractable variational approximations, based on at least one continuous variable, for inference utilization in Bayesian networks where local distributions are decision-graphs. These tractable approximations are employed in lieu of exact inferences that are normally NP-hard to solve. By utilizing Jensen's inequality applied to logarithmic distributions composed of a generalized sum including an introduced arbitrary conditional distribution, a means is acquired to resolve a tightly bound likelihood distribution. The means includes application of Mean-Field Theory, approximations of conditional probability distributions, and/or other means that allow for a tractable variational approximation to be achieved.
    Type: Grant
    Filed: June 10, 2003
    Date of Patent: February 27, 2007
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, Christopher A. Meek, David M. Chickering
  • Patent number: 7162489
    Abstract: The present invention leverages curve fitting data techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of data anomalies such as on-screen, drill down, and drill across data anomalies in, for example, pivot tables and/or OLAP cubes. It determines if data substantially deviates from a predicted value established by a curve fitting process such as, for example, a piece-wise linear function applied to the data tube. A threshold value can also be employed by the present invention to facilitate in determining a degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention provides an indication to a user of the type and location of a detected anomaly from a top level data perspective.
    Type: Grant
    Filed: December 12, 2005
    Date of Patent: January 9, 2007
    Assignee: Microsoft Corporation
    Inventors: Allan Folting, Bo Thiesson, David E. Heckerman, David M. Chickering, Eric Barber Vigesaa
  • Patent number: 7065534
    Abstract: The present invention leverages curve fitting data techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of data anomalies such as on-screen, drill down, and drill across data anomalies in, for example, pivot tables and/or OLAP cubes. It determines if data substantially deviates from a predicted value established by a curve fitting process such as, for example, a piece-wise linear function applied to the data tube. A threshold value can also be employed by the present invention to facilitate in determining a degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention provides an indication to a user of the type and location of a detected anomaly from a top level data perspective.
    Type: Grant
    Filed: June 23, 2004
    Date of Patent: June 20, 2006
    Assignee: Microsoft Corporation
    Inventors: Allan Folting, Bo Thiesson, David E. Heckerman, David M. Chickering, Eric Barber Vigesaa
  • Publication number: 20040260664
    Abstract: The present invention utilizes a cross-prediction scheme to predict values of discrete and continuous time observation data, wherein conditional variance of each continuous time tube variable is fixed to a small positive value. By allowing cross-predictions in an ARMA based model, values of continuous and discrete observations in a time series are accurately predicted. The present invention accomplishes this by extending an ARMA model such that a first time series “tube” is utilized to facilitate or “cross-predict” values in a second time series tube to form an “ARMAxp” model. In general, in the ARMAxp model, the distribution of each continuous variable is a decision graph having splits only on discrete variables and having linear regressions with continuous regressors at all leaves, and the distribution of each discrete variable is a decision graph having splits only on discrete variables and having additional distributions at all leaves.
    Type: Application
    Filed: June 17, 2003
    Publication date: December 23, 2004
    Inventors: Bo Thiesson, Christopher A. Meek, David M. Chickering, David E. Heckerman
  • Publication number: 20040254903
    Abstract: The present invention leverages approximations of distributions to provide tractable variational approximations, based on at least one continuous variable, for inference utilization in Bayesian networks where local distributions are decision-graphs. These tractable approximations are employed in lieu of exact inferences that are normally NP-hard to solve. By utilizing Jensen's inequality applied to logarithmic distributions composed of a generalized sum including an introduced arbitrary conditional distribution, a means is acquired to resolve a tightly bound likelihood distribution. The means includes application of Mean-Field Theory, approximations of conditional probability distributions, and/or other means that allow for a tractable variational approximation to be achieved.
    Type: Application
    Filed: June 10, 2003
    Publication date: December 16, 2004
    Inventors: David E. Heckerman, Christopher A. Meek, David M. Chickering
  • Publication number: 20040243548
    Abstract: A dependency network is created from a training data set utilizing a scalable method. A statistical model (or pattern), such as for example a Bayesian network, is then constructed to allow more convenient inferencing. The model (or pattern) is employed in lieu of the training data set for data access. The computational complexity of the method that produces the model (or pattern) is independent of the size of the original data set. The dependency network directly returns explicitly encoded data in the conditional probability distributions of the dependency network. Non-explicitly encoded data is generated via Gibbs sampling, approximated, or ignored.
    Type: Application
    Filed: May 29, 2003
    Publication date: December 2, 2004
    Inventors: Geoffrey J. Hulten, David M. Chickering, David E. Heckerman
  • Publication number: 20040181554
    Abstract: A system that incorporates an interactive graphical user interface for visualizing clusters (categories) and segments (summarized clusters) of data. Specifically, the system automatically categorizes incoming case data into clusters, summarizes those clusters into segments, determines similarity measures for the segments, scores the selected segments through the similarity measures, and then forms and visually depicts hierarchical organizations of those selected clusters. The system also automatically and dynamically reduces, as necessary, a depth of the hierarchical organization, through elimination of unnecessary hierarchical levels and inter-nodal links, based on similarity measures of segments or segment groups. Attribute/value data that tends to meaningfully characterize each segment is also scored, rank ordered based on normalized scores, and then graphically displayed.
    Type: Application
    Filed: March 24, 2004
    Publication date: September 16, 2004
    Inventors: David E. Heckerman, Paul S. Bradley, David M. Chickering, Christopher A. Meek
  • Patent number: 6742003
    Abstract: A system that incorporates an interactive graphical user interface for visualizing clusters (categories) and segments (summarized clusters) of data. Specifically, the system automatically categorizes incoming case data into clusters, summarizes those clusters into segments, determines similarity measures for the segments, scores the selected segments through the similarity measures, and then forms and visually depicts hierarchical organizations of those selected clusters. The system also automatically and dynamically reduces, as necessary, a depth of the hierarchical organization, through elimination of unnecessary hierarchical levels and inter-nodal links, based on similarity measures of segments or segment groups. Attribute/value data that tends to meaningfully characterize each segment is also scored, rank ordered based on normalized scores, and then graphically displayed.
    Type: Grant
    Filed: April 30, 2001
    Date of Patent: May 25, 2004
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, Paul S. Bradley, David M. Chickering, Christopher A. Meek
  • Publication number: 20040073528
    Abstract: The present invention relates to a system and methodology to generate and provide a lift chart to determine accuracy of one or more models that predict continuous variable data. Systems and processes are provided that process continuous variable prediction data in accordance with various analytical techniques. The processed data is then formatted for display, wherein model performance can then be determined by comparisons between models and/or by comparisons to idealized model performance. In one aspect, a system is provided that generates a continuous variable prediction lift chart. The system includes an analyzer that receives data from one or more models and a continuous variable test data set, wherein the formatter then generates a lift chart based on the analyzed models and the continuous variable test data set.
    Type: Application
    Filed: October 15, 2002
    Publication date: April 15, 2004
    Inventors: Zhaohui Tang, David E. Heckerman, David M. Chickering
  • Patent number: 6718315
    Abstract: Disclosed is a system for approximating conditional probabilities using an annotated decision tree where predictor values that did not exist in training data for the system are tracked, stored, and referenced to determine if statistical aggregation should be invoked. Further disclosed is a system for storing statistics for deriving a non-leaf probability corresponding to predictor values, and a system for aggregating such statistics to approximate conditional probabilities.
    Type: Grant
    Filed: December 18, 2000
    Date of Patent: April 6, 2004
    Assignee: Microsoft Corporation
    Inventors: Christopher A. Meek, David M. Chickering, Jeffrey R. Bernhardt, Robert L. Rounthwaite
  • Patent number: 6529888
    Abstract: An improved belief network generator is provided. A belief network is generated utilizing expert knowledge retrieved from an expert in a given field of expertise and empirical data reflecting observations made in the given field of the expert. In addition to utilizing expert knowledge and empirical data, the belief network generator provides for the use of continuous variables in the generated belief network and missing data in the empirical data.
    Type: Grant
    Filed: October 30, 1996
    Date of Patent: March 4, 2003
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, Dan Geiger, David M. Chickering
  • Publication number: 20030018652
    Abstract: A system that incorporates an interactive graphical user interface for visualizing clusters (categories) and segments (summarized clusters) of data. Specifically, the system automatically categorizes incoming case data into clusters, summarizes those clusters into segments, determines similarity measures for the segments, scores the selected segments through the similarity measures, and then forms and visually depicts hierarchical organizations of those selected clusters. The system also automatically and dynamically reduces, as necessary, a depth of the hierarchical organization, through elimination of unnecessary hierarchical levels and inter-nodal links, based on similarity measures of segments or segment groups. Attribute/value data that tends to meaningfully characterize each segment is also scored, rank ordered based on normalized scores, and then graphically displayed.
    Type: Application
    Filed: April 30, 2001
    Publication date: January 23, 2003
    Applicant: Microsoft Corporation
    Inventors: David E. Heckerman, Paul S. Bradley, David M. Chickering, Christopher A. Meek
  • Patent number: 5802256
    Abstract: An improved belief network generator is provided. A belief network is generated utilizing expert knowledge retrieved from an expert in a given field of expertise and empirical data reflecting observations made in the given field of the expert. In addition to utilizing expert knowledge and empirical data, the belief network generator provides for the use of continuous variables in the generated belief network and missing data in the empirical data.
    Type: Grant
    Filed: May 23, 1997
    Date of Patent: September 1, 1998
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, Dan Geiger, David M. Chickering
  • Patent number: 5704018
    Abstract: An improved belief network generator is provided. In a preferred embodiment of the present invention, a belief network is generated utilizing expert knowledge retrieved from an expert in a given field of expertise and empirical data reflecting observations made in the given field of the expert. In addition to utilizing expert knowledge and empirical data, the belief network generator of the preferred embodiment provides for the use of continuous variables in the generated belief network and missing data in the empirical data.
    Type: Grant
    Filed: May 9, 1994
    Date of Patent: December 30, 1997
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, Dan Geiger, David M. Chickering
  • Patent number: 5696884
    Abstract: An improved belief network generator is provided. A belief network is generated utilizing expert knowledge retrieved from an expert in a given field of expertise and empirical data reflecting observations made in the given field of the expert. In addition to utilizing expert knowledge and empirical data, the belief network generator of the preferred embodiment provides for the use of continuous variables in the generated belief network and missing data in the empirical data.
    Type: Grant
    Filed: October 30, 1996
    Date of Patent: December 9, 1997
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, Dan Geiger, David M. Chickering