Patents by Inventor Bo Thiesson

Bo Thiesson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20050288883
    Abstract: The present invention leverages curve fitting data techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of data anomalies such as on-screen, drill down, and drill across data anomalies in, for example, pivot tables and/or OLAP cubes. It determines if data substantially deviates from a predicted value established by a curve fitting process such as, for example, a piece-wise linear function applied to the data tube. A threshold value can also be employed by the present invention to facilitate in determining a degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention provides an indication to a user of the type and location of a detected anomaly from a top level data perspective.
    Type: Application
    Filed: June 23, 2004
    Publication date: December 29, 2005
    Applicant: Microsoft Corporation
    Inventors: Allan Folting, Bo Thiesson, David Heckerman, David Chickering, Eric Vigesaa
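The core idea in the abstract above — fit a curve to the data, then flag values that deviate from the fit by more than a threshold — can be sketched in a few lines. This is an illustrative least-squares version, not the patented method; the piece-wise linear fitting and pivot-table/OLAP integration are omitted.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    return a, mean_y - a * mean_x

def find_anomalies(xs, ys, threshold):
    """Flag indices whose value deviates from the fitted curve by more
    than the (dynamically or statically supplied) threshold."""
    a, b = fit_line(xs, ys)
    return [i for i, (x, y) in enumerate(zip(xs, ys))
            if abs(y - (a * x + b)) > threshold]

xs = list(range(8))
ys = [0, 1, 2, 3, 10, 5, 6, 7]                 # the value at index 4 breaks the trend
print(find_anomalies(xs, ys, threshold=2.0))   # → [4]
```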
  • Publication number: 20050267717
    Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.
    Type: Application
    Filed: July 8, 2005
    Publication date: December 1, 2005
    Applicant: Microsoft Corporation
    Inventors: Bo Thiesson, Christopher Meek, David Heckerman
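The block-size search described above can be sketched as a loop over candidate sizes that keeps the size with the greatest speed increase seen so far. The `estimate_speedup` stub below is a placeholder for actually timing short, non-converged EM runs, and its plateau shape is an assumption for illustration.

```python
def estimate_speedup(block_size, n_cases=10000):
    """Toy stand-in for timing a partial EM run: a plateau-shaped
    speedup curve over standard (full-batch) EM. Illustrative only."""
    if block_size >= n_cases:
        return 1.0  # a single block is just standard EM
    return 3.0 - abs(block_size - 500) / 1000.0

def choose_block_size(n_cases=10000):
    best_size, best_speedup = n_cases, 1.0
    size = n_cases // 2
    while size >= 1:                      # stop when no new sizes remain
        speedup = estimate_speedup(size, n_cases)
        if speedup > best_speedup:        # greatest speedup so far wins
            best_size, best_speedup = size, speedup
        size //= 2                        # next candidate block size
    return best_size

print(choose_block_size())   # → 625
```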
  • Publication number: 20050234960
    Abstract: The present invention leverages machine learning techniques to provide automatic generation of conditioning variables for constructing a data perspective for a given target variable. The present invention determines and analyzes the best target variable predictors for a given target variable, employing them to facilitate the conveying of information about the target variable to a user. It automatically discretizes continuous and discrete variables utilized as target variable predictors to establish their granularity. In other instances of the present invention, a complexity and/or utility parameter can be specified to facilitate generation of the data perspective via analyzing a best target variable predictor versus the complexity of the conditioning variable(s) and/or utility. The present invention can also adjust the conditioning variables (i.e.
    Type: Application
    Filed: April 14, 2004
    Publication date: October 20, 2005
    Applicant: Microsoft Corporation
    Inventors: David Chickering, Bo Thiesson, Carl Kadie, David Heckerman, Christopher Meek, Allan Folting, Eric Vigesaa
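One step the abstract mentions is discretizing continuous variables before they are used as conditioning variables. A minimal equal-frequency binning sketch is shown below; the bin count and quantile cut are assumptions, not the granularity the patented method learns.

```python
def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index 0..n_bins-1 so that bins hold
    roughly equal numbers of cases."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    per_bin = len(values) / n_bins
    for rank, i in enumerate(order):
        bins[i] = min(int(rank / per_bin), n_bins - 1)
    return bins

values = [3.2, 0.1, 7.8, 4.4, 1.0, 9.9]
print(equal_frequency_bins(values, 3))   # → [1, 0, 2, 1, 0, 2]
```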
  • Patent number: 6922660
    Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.
    Type: Grant
    Filed: December 1, 2000
    Date of Patent: July 26, 2005
    Assignee: Microsoft Corporation
    Inventors: Bo Thiesson, Christopher A. Meek, David E. Heckerman
  • Publication number: 20050027665
    Abstract: The present invention relates to a system and method to facilitate data mining applications and automated evaluation of models for continuous variable data. In one aspect, a system is provided that facilitates decision tree learning. The system includes a learning component that generates non-standardized data that relates to a split in a decision tree and a scoring component that scores the split as if the non-standardized data at a subset of leaves of the decision tree had been shifted and/or scaled. A modification component can also be provided for a respective candidate split score on the decision tree, wherein the above data or data subset can be modified by shifting and/or scaling the data and a new score is computed on the modified data. Furthermore, an optimization component can be provided that analyzes the data and determines whether to treat the data as if it was: (1) shifted, (2) scaled, or (3) shifted and scaled.
    Type: Application
    Filed: July 28, 2003
    Publication date: February 3, 2005
    Inventors: Bo Thiesson, David Chickering
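A toy version of scoring a candidate split on continuous data: each leaf is scored under its own maximum-likelihood Gaussian (fitting the leaf mean corresponds to shifting the data, fitting the variance to scaling it), and the split's score is the gain over leaving the node unsplit. This is an illustration of the objective, not the patent's scoring or optimization components.

```python
import math

def gaussian_loglik(values):
    """Max-likelihood Gaussian log-likelihood of a leaf's values."""
    n = len(values)
    mean = sum(values) / n
    var = max(sum((v - mean) ** 2 for v in values) / n, 1e-9)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def split_gain(left, right):
    """Score of a candidate split relative to the unsplit node."""
    return (gaussian_loglik(left) + gaussian_loglik(right)
            - gaussian_loglik(left + right))

left = [1.0, 1.2, 0.9, 1.1]
right = [5.0, 5.2, 4.8, 5.1]
print(split_gain(left, right) > 0)   # well-separated leaves score higher
```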
  • Publication number: 20040260664
    Abstract: The present invention utilizes a cross-prediction scheme to predict values of discrete and continuous time observation data, wherein conditional variance of each continuous time tube variable is fixed to a small positive value. By allowing cross-predictions in an ARMA based model, values of continuous and discrete observations in a time series are accurately predicted. The present invention accomplishes this by extending an ARMA model such that a first time series “tube” is utilized to facilitate or “cross-predict” values in a second time series tube to form an “ARMAxp” model. In general, in the ARMAxp model, the distribution of each continuous variable is a decision graph having splits only on discrete variables and having linear regressions with continuous regressors at all leaves, and the distribution of each discrete variable is a decision graph having splits only on discrete variables and having additional distributions at all leaves.
    Type: Application
    Filed: June 17, 2003
    Publication date: December 23, 2004
    Inventors: Bo Thiesson, Christopher A. Meek, David M. Chickering, David E. Heckerman
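The cross-prediction idea — values in one time "tube" predicting values in another — can be made concrete with a single lagged regressor fit by least squares. This sketch illustrates only the cross-prediction structure, not the ARMAxp model's decision graphs or estimation procedure.

```python
def fit_cross_coeff(y, x):
    """Least-squares b for the cross-prediction y[t] ≈ b * x[t-1]:
    tube x helps predict tube y one step ahead."""
    num = sum(x[t - 1] * y[t] for t in range(1, len(y)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 2.0, 4.0, 6.0, 8.0]    # constructed so that y[t] = 2 * x[t-1]
print(round(fit_cross_coeff(y, x), 6))   # → 2.0
```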
  • Publication number: 20040234128
    Abstract: The present invention utilizes generic and user-specific features of handwriting samples to provide adaptive handwriting recognition with a minimum level of user-specific enrollment data. By allowing generic and user-specific classifiers to facilitate in a recognition process, the features of a specific user's handwriting can be exploited to quickly ascertain characteristics of handwriting characters not yet entered by the user. Thus, new characters can be recognized without requiring a user to first enter that character as enrollment or “training” data. In one instance of the present invention, processing of generic features is accomplished by a generic classifier trained on multiple users. In another instance of the present invention, a user-specific classifier is employed to modify a generic classifier's classification as required to provide user-specific handwriting recognition.
    Type: Application
    Filed: May 21, 2003
    Publication date: November 25, 2004
    Inventors: Bo Thiesson, Christopher A. Meek
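A toy rendering of the generic-plus-user-specific combination: characters the user has enrolled receive a per-user score adjustment, while unenrolled characters fall back to the generic classifier alone. The additive combination, the score tables (which stand in for classifiers over extracted ink features), and all names are assumptions for illustration.

```python
GENERIC_SCORES = {               # generic classifier trained on many users
    "a": {"a": 0.4, "o": 0.6},   # this user's "a" confuses the generic model
    "o": {"a": 0.2, "o": 0.8},
}

def classify(ink, user_bias):
    """Combine generic scores with any user-specific adjustments."""
    scores = dict(GENERIC_SCORES[ink])
    for cls, bump in user_bias.get(ink, {}).items():
        scores[cls] += bump      # user-specific classifier modifies the call
    return max(scores, key=scores.get)

user_bias = {"a": {"a": 0.3}}    # learned from a few enrollment samples of "a"
print(classify("a", user_bias))  # enrollment corrects the generic call: a
print(classify("o", user_bias))  # never enrolled; generic model suffices: o
```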
  • Publication number: 20040236576
    Abstract: The present invention utilizes a discriminative density model selection method to provide an optimized density model subset employable in constructing a classifier. By allowing multiple alternative density models to be considered for each class in a multi-class classification system and then developing an optimal configuration comprised of a single density model for each class, the classifier can be tuned to exhibit a desired characteristic such as, for example, high classification accuracy, low cost, and/or a balance of both. In one instance of the present invention, error graph, junction tree, and min-sum propagation algorithms are utilized to obtain an optimization from discriminatively selected density models.
    Type: Application
    Filed: May 20, 2003
    Publication date: November 25, 2004
    Inventors: Bo Thiesson, Christopher A. Meek
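To make the selection objective concrete, the sketch below brute-forces every configuration of one density model per class and keeps the one with the highest classification accuracy on labeled data. The patented method uses error graphs, junction trees, and min-sum propagation rather than exhaustive search; the candidates and data here are invented for illustration.

```python
import math
from itertools import product

def gauss(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / sd

# Two candidate density models (mean, sd) per class; one must be chosen per class.
CANDIDATES = {
    "A": [(0.0, 1.0), (0.0, 3.0)],
    "B": [(4.0, 1.0), (2.0, 1.0)],
}
DATA = [(0.2, "A"), (-0.5, "A"), (3.8, "B"), (4.3, "B")]

def accuracy(config):
    correct = 0
    for x, label in DATA:
        pred = max(config, key=lambda c: gauss(x, *config[c]))
        correct += pred == label
    return correct / len(DATA)

best = max(
    ({"A": a, "B": b} for a, b in product(CANDIDATES["A"], CANDIDATES["B"])),
    key=accuracy,
)
print(accuracy(best))   # → 1.0
```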
  • Patent number: 6807537
    Abstract: One aspect of the invention is the construction of mixtures of Bayesian networks. Another aspect of the invention is the use of such mixtures of Bayesian networks to perform inferencing. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN is based upon the hypothesis that the common external hidden variable is in a corresponding one of those states. In one mode of the invention, the MBN having the highest MBN score is selected for use in performing inferencing.
    Type: Grant
    Filed: December 4, 1997
    Date of Patent: October 19, 2004
    Assignee: Microsoft Corporation
    Inventors: Bo Thiesson, Christopher A. Meek, David Maxwell Chickering, David Earl Heckerman
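The mixture structure described above can be sketched by collapsing each hypothesis-specific network to a likelihood function: the hidden variable C selects among them, and inference averages over its states. Clustering (as in related patent 6345265) then amounts to the posterior over C. The two toy "HSBNs" and the prior are illustrative assumptions.

```python
def hsbn_0(x):   # P(x | C=0): a stand-in HSBN favoring small x
    return 0.8 if x < 0.5 else 0.2

def hsbn_1(x):   # P(x | C=1): a stand-in HSBN favoring large x
    return 0.1 if x < 0.5 else 0.9

PRIOR = [0.5, 0.5]               # P(C=c), the common external hidden variable
HSBNS = [hsbn_0, hsbn_1]

def mixture_likelihood(x):
    """P(x) under the MBN: average the per-HSBN likelihoods by P(C)."""
    return sum(p * h(x) for p, h in zip(PRIOR, HSBNS))

def posterior_over_c(x):
    """P(C=c | x): the probability each HSBN generated the case."""
    joint = [p * h(x) for p, h in zip(PRIOR, HSBNS)]
    z = sum(joint)
    return [j / z for j in joint]

print(posterior_over_c(0.9))   # the large-x hypothesis dominates
```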
  • Publication number: 20040073537
    Abstract: A system and method for generating staged mixture model(s) is provided. The staged mixture model includes a plurality of mixture components each having an associated mixture weight, and, an added mixture component having an initial structure, parameters and associated mixture weight. The added mixture component is modified based, at least in part, upon a case that is undesirably addressed by the plurality of mixture components, using a structural expectation maximization (SEM) algorithm to modify the structure, parameters, and/or associated mixture weight of the added mixture component.
    Type: Application
    Filed: October 15, 2002
    Publication date: April 15, 2004
    Inventors: Bo Thiesson, Christopher A. Meek, David E. Heckerman
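The staging step can be sketched for a one-dimensional Gaussian mixture: find the case the current mixture addresses worst, seed a new component there, and give it a small initial weight. The seeding rule and fixed component shape are assumptions; the patent refines the added component with structural EM, which is omitted here.

```python
import math

def gauss(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def mixture_density(x, components):
    """components: list of (weight, mean, sd) triples."""
    return sum(w * gauss(x, m, s) for w, m, s in components)

def add_staged_component(data, components, new_weight=0.1):
    # The case with the lowest density is "undesirably addressed".
    worst = min(data, key=lambda x: mixture_density(x, components))
    rescale = 1.0 - new_weight
    staged = [(w * rescale, m, s) for w, m, s in components]
    staged.append((new_weight, worst, 1.0))   # seed new component at worst case
    return staged

components = [(1.0, 0.0, 1.0)]
data = [0.1, -0.2, 0.3, 8.0]          # 8.0 is poorly covered by the mixture
staged = add_staged_component(data, components)
print(mixture_density(8.0, staged) > mixture_density(8.0, components))   # → True
```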
  • Patent number: 6694301
    Abstract: Clustering for purposes of data visualization and making predictions is disclosed. Embodiments of the invention are operable on a number of variables that have a predetermined representation. The variables include input-only variables, output-only variables, and both input-and-output variables. Embodiments of the invention generate a model that has a bottleneck architecture. The model includes a top layer of nodes of at least the input-only variables, one or more middle layer of hidden nodes, and a bottom layer of nodes of the output-only and the input-and-output variables. At least one cluster is determined from this model. The model can be a probabilistic neural network and/or a Bayesian network.
    Type: Grant
    Filed: March 31, 2000
    Date of Patent: February 17, 2004
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, D. Maxwell Chickering, John C. Platt, Christopher A. Meek, Bo Thiesson
  • Publication number: 20040002940
    Abstract: A technique for reducing a model database for use with handwriting recognizers. The model database is processed with a tuning set to identify a set of models that would result in the greatest character recognition accuracy. If further model database reduction is desired, the technique iteratively identifies smaller models that have the least adverse effect on the error rate. The technique continues identifying smaller models until a desired model database size has been achieved.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Applicant: Microsoft Corporation
    Inventors: Christopher Meek, Bo Thiesson, John R. Bennett
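The iterative reduction can be sketched as a greedy loop: while the database exceeds the size budget, drop the model whose removal costs the least tuning-set accuracy. The accuracy oracle, model sizes, and names below are toy assumptions standing in for running the recognizer on the tuning set.

```python
def accuracy(models, tuning_set):
    """Toy oracle: a tuning case is recognized iff its best model is retained."""
    return sum(1 for case in tuning_set if case in models) / len(tuning_set)

def reduce_database(models, sizes, tuning_set, budget):
    models = set(models)
    while sum(sizes[m] for m in models) > budget and len(models) > 1:
        # Remove the model with the least adverse effect on the error rate.
        victim = max(models, key=lambda m: accuracy(models - {m}, tuning_set))
        models.remove(victim)
    return models

models = {"m1", "m2", "m3"}
sizes = {"m1": 5, "m2": 3, "m3": 4}
tuning_set = ["m1", "m1", "m2", "m1"]   # m3 never helps on the tuning set
print(sorted(reduce_database(models, sizes, tuning_set, budget=8)))   # → ['m1', 'm2']
```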
  • Patent number: 6496816
    Abstract: One aspect of the invention is the construction of mixtures of Bayesian networks. Another aspect of the invention is the use of such mixtures of Bayesian networks to perform inferencing. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN is based upon the hypothesis that the common external hidden variable is in a corresponding one of those states. In one mode of the invention, the MBN having the highest MBN score is selected for use in performing inferencing.
    Type: Grant
    Filed: December 23, 1998
    Date of Patent: December 17, 2002
    Assignee: Microsoft Corporation
    Inventors: Bo Thiesson, Christopher A. Meek, David Maxwell Chickering, David Earl Heckerman
  • Publication number: 20020095277
    Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.
    Type: Application
    Filed: December 1, 2000
    Publication date: July 18, 2002
    Inventors: Bo Thiesson, Christopher A. Meek, David E. Heckerman
  • Patent number: 6408290
    Abstract: One aspect of the invention is the construction of mixtures of Bayesian networks. Another aspect of the invention is the use of such mixtures of Bayesian networks to perform inferencing. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN is based upon the hypothesis that the common external hidden variable is in a corresponding one of those states. In one mode of the invention, the MBN having the highest MBN score is selected for use in performing inferencing.
    Type: Grant
    Filed: December 23, 1998
    Date of Patent: June 18, 2002
    Assignee: Microsoft Corporation
    Inventors: Bo Thiesson, Christopher A. Meek, David Maxwell Chickering, David Earl Heckerman
  • Patent number: 6345265
    Abstract: The invention employs mixtures of Bayesian networks to perform clustering. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN is based upon the hypothesis that the common external hidden variable is in a corresponding one of those states. In one mode of the invention, the MBN having the highest MBN score is selected for use in performing inferencing. The invention determines membership of an individual case in a cluster based upon a set of data of plural individual cases by first learning the structure and parameters of an MBN given that data and then using the MBN to compute the probability of each HSBN generating the data of the individual case.
    Type: Grant
    Filed: December 23, 1998
    Date of Patent: February 5, 2002
    Inventors: Bo Thiesson, Christopher A. Meek, David Maxwell Chickering, David Earl Heckerman
  • Patent number: 6336108
    Abstract: The invention performs speech recognition using an array of mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of those states. In accordance with the invention, the MBNs encode the probabilities of observing the sets of acoustic observations given the utterance of a respective one of said parts of speech. Each of the HSBNs encodes the probabilities of observing the sets of acoustic observations given the utterance of a respective one of the parts of speech and given a hidden common variable being in a particular state.
    Type: Grant
    Filed: December 23, 1998
    Date of Patent: January 1, 2002
    Assignee: Microsoft Corporation
    Inventors: Bo Thiesson, Christopher A. Meek, David Maxwell Chickering, David Earl Heckerman, Fileno A. Alleva, Mei-Yuh Hwang