Patents by Inventor David E. Heckerman

David E. Heckerman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7249162
    Abstract: The invention relates to a system for filtering messages. The system includes a seed filter that has an associated false positive rate and false negative rate. A new filter is also provided for filtering the messages. The new filter is evaluated against the false positive rate and the false negative rate of the seed filter: the data used to determine the seed filter's rates are also used to determine a new false positive rate and a new false negative rate for the new filter as a function of threshold. The new filter is employed in lieu of the seed filter if a threshold exists for the new filter such that the new false positive rate and new false negative rate are together considered better than the false positive rate and the false negative rate of the seed filter.
    Type: Grant
    Filed: February 25, 2003
    Date of Patent: July 24, 2007
    Assignee: Microsoft Corporation
    Inventors: Robert L. Rounthwaite, Joshua T. Goodman, David E. Heckerman, John C. Platt, Carl M. Kadie
  • Patent number: 7246048
    Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.
    Type: Grant
    Filed: July 8, 2005
    Date of Patent: July 17, 2007
    Assignee: Microsoft Corporation
    Inventors: Bo Thiesson, Christopher A. Meek, David E. Heckerman
  • Patent number: 7225200
    Abstract: The present invention leverages machine learning techniques to provide automatic generation of conditioning variables for constructing a data perspective for a given target variable. The present invention determines and analyzes the best target variable predictors for a given target variable, employing them to facilitate the conveying of information about the target variable to a user. It automatically discretizes continuous and discrete variables utilized as target variable predictors to establish their granularity. In other instances of the present invention, a complexity and/or utility parameter can be specified to facilitate generation of the data perspective via analyzing a best target variable predictor versus the complexity of the conditioning variable(s) and/or utility. The present invention can also adjust the conditioning variables (i.e.
    Type: Grant
    Filed: April 14, 2004
    Date of Patent: May 29, 2007
    Assignee: Microsoft Corporation
    Inventors: David M. Chickering, Bo Thiesson, Carl M. Kadie, David E. Heckerman, Christopher A. Meek, Allan Folting, Eric B. Vigesaa
  • Patent number: 7219148
    Abstract: The subject invention provides for a feedback loop system and method that facilitate classifying items in connection with spam prevention in server and/or client-based architectures. The invention makes use of a machine-learning approach as applied to spam filters, and in particular, randomly samples incoming email messages so that examples of both legitimate and junk/spam mail are obtained to generate sets of training data. Users who are identified as spam-fighters are asked to vote on whether a selection of their incoming email messages is individually either legitimate mail or junk mail. A database stores the properties for each mail and voting transaction such as user information, message properties and content summary, and polling results for each message to generate training data for machine learning systems. The machine learning systems facilitate creating improved spam filter(s) that are trained to recognize both legitimate mail and spam mail and to distinguish between them.
    Type: Grant
    Filed: March 3, 2003
    Date of Patent: May 15, 2007
    Assignee: Microsoft Corporation
    Inventors: Robert L. Rounthwaite, Joshua T. Goodman, David E. Heckerman, John D. Mehr, Nathan D. Howell, Micah C. Rupersburg, Dean A. Slawson
  • Patent number: 7200267
    Abstract: The invention performs handwriting recognition using mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. Each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of its states. The MBNs encode the probabilities of observing the sets of visual observations corresponding to a handwritten character. Each of the HSBNs encodes the probabilities of observing the sets of visual observations corresponding to a handwritten character and given a hidden common variable being in a particular state.
    Type: Grant
    Filed: December 30, 2005
    Date of Patent: April 3, 2007
    Assignee: Microsoft Corporation
    Inventors: John Bennett, David E. Heckerman, Christopher A. Meek, Bo Thiesson
  • Patent number: 7184993
    Abstract: The present invention leverages approximations of distributions to provide tractable variational approximations, based on at least one continuous variable, for inference utilization in Bayesian networks where local distributions are decision-graphs. These tractable approximations are employed in lieu of exact inferences that are normally NP-hard to solve. By utilizing Jensen's inequality applied to logarithmic distributions composed of a generalized sum including an introduced arbitrary conditional distribution, a means is acquired to resolve a tightly bound likelihood distribution. The means includes application of Mean-Field Theory, approximations of conditional probability distributions, and/or other means that allow for a tractable variational approximation to be achieved.
    Type: Grant
    Filed: June 10, 2003
    Date of Patent: February 27, 2007
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, Christopher A. Meek, David M. Chickering
  • Patent number: 7162489
    Abstract: The present invention leverages curve fitting data techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of data anomalies such as on-screen, drill down, and drill across data anomalies in, for example, pivot tables and/or OLAP cubes. It determines if data substantially deviates from a predicted value established by a curve fitting process such as, for example, a piece-wise linear function applied to the data tube. A threshold value can also be employed by the present invention to help determine the degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention provides an indication to a user of the type and location of a detected anomaly from a top level data perspective.
    Type: Grant
    Filed: December 12, 2005
    Date of Patent: January 9, 2007
    Assignee: Microsoft Corporation
    Inventors: Allan Folting, Bo Thiesson, David E. Heckerman, David M. Chickering, Eric Barber Vigesaa
  • Patent number: 7158959
    Abstract: The invention provides systems and methods that can be used for targeted advertising. The system determines where to present impressions, such as advertisements, to maximize an expected utility subject to one or more constraints, which can include quotas and minimum utilities for groups of one or more impressions. The traditional measure of utility in web-based advertising is click-through rates, but the present invention provides a broader definition of utility, including measures of sales, profits, or brand awareness, for example. This broader definition permits advertisements to be allocated more in accordance with the actual interests of advertisers.
    Type: Grant
    Filed: January 31, 2005
    Date of Patent: January 2, 2007
    Assignee: Microsoft Corporation
    Inventors: David Maxwell Chickering, David E. Heckerman
  • Patent number: 7143075
    Abstract: The invention provides systems and methods that can be used for targeted advertising. The system determines where to present impressions, such as advertisements, to maximize an expected utility subject to one or more constraints, which can include quotas and minimum utilities for groups of one or more impressions. The traditional measure of utility in web-based advertising is click-through rates, but the present invention provides a broader definition of utility, including measures of sales, profits, or brand awareness, for example. This broader definition permits advertisements to be allocated more in accordance with the actual interests of advertisers.
    Type: Grant
    Filed: March 6, 2001
    Date of Patent: November 28, 2006
    Assignee: Microsoft Corporation
    Inventors: David Maxwell Chickering, David E. Heckerman
  • Patent number: 7133811
    Abstract: A system and method for generating staged mixture model(s) is provided. The staged mixture model includes a plurality of mixture components each having an associated mixture weight, and an added mixture component having an initial structure, parameters and associated mixture weight. The added mixture component is modified based, at least in part, upon a case that is undesirably addressed by the plurality of mixture components using a structural expectation maximization (SEM) algorithm to modify the structure, parameters and/or associated mixture weight of the added mixture component. The staged mixture model employs a data-driven staged mixture modeling technique, for example, for building density, regression, and classification model(s). The basic approach is to add mixture component(s) (e.g., sequentially) to the staged mixture model using an SEM algorithm.
    Type: Grant
    Filed: October 15, 2002
    Date of Patent: November 7, 2006
    Assignee: Microsoft Corporation
    Inventors: Bo Thiesson, Christopher A. Meek, David E. Heckerman
  • Patent number: 7076544
    Abstract: A streaming media caching mechanism and cache manager efficiently establish and maintain the contents of a streaming media cache for use in serving streaming media requests from cache rather than from an original data source when appropriate. The cost of caching is incurred only when the benefits of caching are likely to be experienced. The caching mechanism and cache manager evaluate the request count for each requested URL to determine whether the URL represents a cache candidate, and further analyze the URL request rate to determine whether the content associated with the URL will be cached. In an embodiment, the streaming media cache is maintained with a predetermined amount of reserve capacity rather than being filled to capacity whenever possible.
    Type: Grant
    Filed: April 8, 2002
    Date of Patent: July 11, 2006
    Assignee: Microsoft Corporation
    Inventors: Ariel Katz, Yifat Sagiv, Guy Friedel, David E. Heckerman, John R. Douceur, Joshua Goodman
  • Patent number: 7065534
    Abstract: The present invention leverages curve fitting data techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of data anomalies such as on-screen, drill down, and drill across data anomalies in, for example, pivot tables and/or OLAP cubes. It determines if data substantially deviates from a predicted value established by a curve fitting process such as, for example, a piece-wise linear function applied to the data tube. A threshold value can also be employed by the present invention to help determine the degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention provides an indication to a user of the type and location of a detected anomaly from a top level data perspective.
    Type: Grant
    Filed: June 23, 2004
    Date of Patent: June 20, 2006
    Assignee: Microsoft Corporation
    Inventors: Allan Folting, Bo Thiesson, David E. Heckerman, David M. Chickering, Eric Barber Vigesaa
  • Patent number: 7058592
    Abstract: The transmission of information during ad click-through is disclosed. In one embodiment, a computer-implemented method selects an ad to be displayed on a web page, as one of a plurality of ads within a current cluster in which each of the ads has a probability to be selected. The method displays the ad on the web page, and then detects activation (for example, click-through) of the displayed ad. The method transmits information to an entity associated with the ad, such as an advertiser, upon detecting click-through or other activation of the ad. In one embodiment, the information transmitted includes information regarding the current cluster.
    Type: Grant
    Filed: November 29, 1999
    Date of Patent: June 6, 2006
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, D. Maxwell Chickering, Daniel Rosen
  • Patent number: 7003158
    Abstract: The invention performs handwriting recognition using mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. Each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of its states. The MBNs encode the probabilities of observing the sets of visual observations corresponding to a handwritten character. Each of the HSBNs encodes the probabilities of observing the sets of visual observations corresponding to a handwritten character and given a hidden common variable being in a particular state.
    Type: Grant
    Filed: February 14, 2002
    Date of Patent: February 21, 2006
    Assignee: Microsoft Corporation
    Inventors: John Bennett, David E. Heckerman, Christopher A. Meek, Bo Thiesson
  • Patent number: 6922660
    Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.
    Type: Grant
    Filed: December 1, 2000
    Date of Patent: July 26, 2005
    Assignee: Microsoft Corporation
    Inventors: Bo Thiesson, Christopher A. Meek, David E. Heckerman
  • Publication number: 20040267596
    Abstract: The present invention provides collaborative filtering systems and methods employing statistical smoothing to provide quickly creatable models that can efficiently predict probability that a user likes an item and/or similarities between items. Smoothing is accomplished by utilizing statistical methods such as support cutoff, single and multiple prior on counts, and prior on measure of association and the like. By improving model-based collaborative filtering with such techniques, performance is increased with regard to product-to-product recommendations. The present invention also provides improvements over systems based on dependency nets (DN) in both areas of quality of recommendations and speed of model creation. It can also be complementary to DN to improve the value of an existing collaborative filtering system's overall efficiency. It is also employable with low frequency user preference data.
    Type: Application
    Filed: June 25, 2003
    Publication date: December 30, 2004
    Inventors: Jesper B. Lind, Carl M. Kadie, Christopher A. Meek, David E. Heckerman
  • Publication number: 20040260664
    Abstract: The present invention utilizes a cross-prediction scheme to predict values of discrete and continuous time observation data, wherein conditional variance of each continuous time tube variable is fixed to a small positive value. By allowing cross-predictions in an ARMA based model, values of continuous and discrete observations in a time series are accurately predicted. The present invention accomplishes this by extending an ARMA model such that a first time series “tube” is utilized to facilitate or “cross-predict” values in a second time series tube to form an “ARMAxp” model. In general, in the ARMAxp model, the distribution of each continuous variable is a decision graph having splits only on discrete variables and having linear regressions with continuous regressors at all leaves, and the distribution of each discrete variable is a decision graph having splits only on discrete variables and having additional distributions at all leaves.
    Type: Application
    Filed: June 17, 2003
    Publication date: December 23, 2004
    Inventors: Bo Thiesson, Christopher A. Meek, David M. Chickering, David E. Heckerman
  • Publication number: 20040260776
    Abstract: The subject invention provides for an advanced and robust system and method that facilitates detecting spam. The system and method include components as well as other operations which enhance or promote finding characteristics that are difficult for the spammer to avoid and finding characteristics in non-spam that are difficult for spammers to duplicate. Exemplary characteristics include examining origination features in pairs, analyzing character and/or number sequences, strings, and sub-strings, detecting various entropy levels of one or more character sequences, strings and/or sub-strings, as well as analyzing message and/or feature sizes.
    Type: Application
    Filed: June 23, 2003
    Publication date: December 23, 2004
    Inventors: Bryan T. Starbuck, Robert L. Rounthwaite, David E. Heckerman, Joshua T. Goodman, Eliot C. Gillum, Nathan D. Howell, Kenneth R. Aldinger
  • Publication number: 20040254903
    Abstract: The present invention leverages approximations of distributions to provide tractable variational approximations, based on at least one continuous variable, for inference utilization in Bayesian networks where local distributions are decision-graphs. These tractable approximations are employed in lieu of exact inferences that are normally NP-hard to solve. By utilizing Jensen's inequality applied to logarithmic distributions composed of a generalized sum including an introduced arbitrary conditional distribution, a means is acquired to resolve a tightly bound likelihood distribution. The means includes application of Mean-Field Theory, approximations of conditional probability distributions, and/or other means that allow for a tractable variational approximation to be achieved.
    Type: Application
    Filed: June 10, 2003
    Publication date: December 16, 2004
    Inventors: David E. Heckerman, Christopher A. Meek, David M. Chickering
  • Patent number: 6831663
    Abstract: The system and method of the present invention automatically assigns “scores” to the predictor/variable value pairs of a conventional probabilistic model to measure the relative impact or influence of particular elements of a set of topics, items, products, etc. in making specific predictions using the probabilistic model. In particular, these scores measure the relative impact, either positive or negative, that the value of each individual predictor variable has on the posterior distribution of the target topic, item, product, etc., for which a probability is being determined. These scores are useful for understanding why each prediction is made, and how much impact each predictor has on the prediction. Consequently, such scores are useful for explaining why a particular prediction or recommendation was made.
    Type: Grant
    Filed: May 24, 2001
    Date of Patent: December 14, 2004
    Assignee: Microsoft Corporation
    Inventors: David Maxwell Chickering, David E. Heckerman, Robert Rounthwaite
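
The scoring idea described in the final entry above (patent 6831663) can be illustrated with a minimal sketch. The abstract does not give a formula, so this example assumes a naive Bayes model and uses the log-likelihood ratio of each observed predictor value as its score; that choice, the function names, and all numbers are hypothetical and are not taken from the patent. A positive score pushes the posterior probability of the target up, and a negative score pushes it down.

```python
import math


def evidence_scores(likelihoods, observed):
    """Score each observed predictor value by its log-likelihood ratio.

    likelihoods[name][value] holds the pair
    (P(value | target), P(value | not target)).
    Assumption: predictors are conditionally independent (naive Bayes).
    """
    scores = {}
    for name, value in observed.items():
        p_given_target, p_given_not = likelihoods[name][value]
        scores[name] = math.log(p_given_target / p_given_not)
    return scores


def posterior(prior, scores):
    """Combine the prior log-odds with the per-predictor scores."""
    log_odds = math.log(prior / (1.0 - prior)) + sum(scores.values())
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical model: will the user like a camping stove?
likelihoods = {
    "bought_tent": {True: (0.6, 0.1), False: (0.4, 0.9)},
    "bought_novel": {True: (0.2, 0.4), False: (0.8, 0.6)},
}
observed = {"bought_tent": True, "bought_novel": True}

scores = evidence_scores(likelihoods, observed)

# Rank predictors by the size of their impact, largest first.
ranked = sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, score in ranked:
    direction = "raises" if score > 0 else "lowers"
    print(f"{name}: score {score:+.2f} ({direction} the prediction)")

print(f"posterior: {posterior(0.1, scores):.2f}")
```

Because the scores add up in log-odds space, each one can be read on its own as that predictor's contribution to the prediction, which is what makes them usable as an explanation of a recommendation.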