Patents by Inventor Hagai Attias

Hagai Attias has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20060074973
    Abstract: A method and apparatus are provided for organizing media objects in a database using contextual information for a media object and known media objects, categories, indexes and searches, to arrive at an inference for cataloging the media object in a database. The media object may then be cataloged in the database according to the inference. A method and apparatus are provided for clustering media objects by forming groups of unlabeled data and applying a distance metric to said group.
    Type: Application
    Filed: November 21, 2005
    Publication date: April 6, 2006
    Applicant: Microsoft Corporation
    Inventors: John Platt, Jonathan Kagle, Hagai Attias, Victoria Milton
  • Patent number: 6990447
    Abstract: A probability distribution for speech model parameters, such as auto-regression parameters, is used to identify a distribution of denoised values from a noisy signal. Under one embodiment, the probability distributions of the speech model parameters and the denoised values are adjusted to improve a variational inference so that the variational inference better approximates the joint probability of the speech model parameters and the denoised values given a noisy signal. In some embodiments, this improvement is performed during an expectation step in an expectation-maximization algorithm. The statistical model can also be used to identify an average spectrum for the clean signal and this average spectrum may be provided to a speech recognizer instead of the estimate of the clean signal.
    Type: Grant
    Filed: November 15, 2001
    Date of Patent: January 24, 2006
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, John Carlton Platt, Li Deng, Alejandro Acero
  • Publication number: 20050262068
    Abstract: A system and method for generating responsibility vectors associated with multi-media files (e.g., audio and/or video files) is provided. The responsibility vectors are based upon responsibility of mixture components fitted to a mixture model for frames of the files. The responsibility vectors can be grouped based upon clustering related to extracted identifiable features of frames of the multi-media files. Once generated, responsibility vectors can be searched by a multi-media searching system. Also provided is a system for multi-media searching based, at least in part upon responsibility vectors associated with a query segment and multi-media files. The system can generate a query profile based, at least in part, upon responsibility vectors of frames of the query segment. The system can further generate segment profiles of segments of the multi-media files.
    Type: Application
    Filed: July 20, 2005
    Publication date: November 24, 2005
    Applicant: Microsoft Corporation
    Inventor: Hagai Attias
  • Patent number: 6957226
    Abstract: A system and method for generating responsibility vectors associated with multi-media files (e.g., audio and/or video files) is provided. The responsibility vectors are based upon responsibility of mixture components fitted to a mixture model for frames of the files. The responsibility vectors can be grouped based upon clustering related to extracted identifiable features of frames of the multi-media files. Once generated, responsibility vectors can be searched by a multi-media searching system. Also provided is a system for multi-media searching based, at least in part upon responsibility vectors associated with a query segment and multi-media files. The system can generate a query profile based, at least in part, upon responsibility vectors of frames of the query segment. The system can further generate segment profiles of segments of the multi-media files.
    Type: Grant
    Filed: June 27, 2002
    Date of Patent: October 18, 2005
    Assignee: Microsoft Corporation
    Inventor: Hagai Attias
  • Patent number: 6940540
    Abstract: A system and method facilitating object tracking is provided. The system includes an audio model that receives at least two audio input signals and a video model that receives a video input. The audio model and the video model employ probabilistic generative models which are combined to facilitate object tracking. Expectation maximization can be employed to modify trainable parameters of the audio model and the video model.
    Type: Grant
    Filed: June 27, 2002
    Date of Patent: September 6, 2005
    Assignee: Microsoft Corporation
    Inventors: Matthew James Beal, Nebojsa Jojic, Hagai Attias
  • Publication number: 20050185834
    Abstract: A method and apparatus are provided for learning a model for the appearance of an object while tracking the position of the object in three dimensions. Under embodiments of the present invention, this is achieved by combining a particle filtering technique for tracking the object's position with an expectation-maximization technique for learning the appearance of the object. Two stereo cameras are used to generate data for the learning and tracking.
    Type: Application
    Filed: February 20, 2004
    Publication date: August 25, 2005
    Applicant: Microsoft Corporation
    Inventors: Trausti Kristjansson, Hagai Attias, John Hershey
  • Patent number: 6931374
    Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.
    Type: Grant
    Filed: April 1, 2003
    Date of Patent: August 16, 2005
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, Leo Jingyu Lee, Li Deng
  • Publication number: 20050171971
    Abstract: A system and method facilitating object tracking is provided. The invention includes an audio model that receives at least two audio input signals and a video model that receives a video input. The audio model and the video model employ probabilistic generative models which are combined to facilitate object tracking. Expectation maximization can be employed to modify trainable parameters of the audio model and the video model.
    Type: Application
    Filed: March 31, 2005
    Publication date: August 4, 2005
    Applicant: Microsoft Corporation
    Inventors: Matthew Beal, Nebojsa Jojic, Hagai Attias
  • Publication number: 20050159951
    Abstract: A method of efficiently setting posterior probability parameters for a switching state space model begins by defining a window containing at least two but fewer than all of the frames. A separate posterior probability parameter is determined for each frame in the window. The window is then shifted sequentially from left to right in time so that it includes one or more subsequent frames in the sequence of frames. A separate posterior probability parameter is then determined for each frame in the shifted window. This method closely approximates a more rigorous solution but saves computational cost by two to three orders of magnitude. Further, a method of determining the optimal discrete state sequence in the switching state space model is invented that directly exploits the observation vector on a frame-by-frame basis and operates from left to right in time.
    Type: Application
    Filed: January 20, 2004
    Publication date: July 21, 2005
    Applicant: Microsoft Corporation
    Inventors: Hagai Attias, Li Deng, Leo Lee
  • Publication number: 20050119887
    Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.
    Type: Application
    Filed: November 9, 2004
    Publication date: June 2, 2005
    Applicant: Microsoft Corporation
    Inventors: Hagai Attias, Leo Lee, Li Deng
  • Publication number: 20050114134
    Abstract: A method and apparatus track vocal tract resonance components, including both frequencies and bandwidths, in a speech signal. The components are tracked by defining a state equation that is linear with respect to a past vocal tract resonance vector and that predicts a current vocal tract resonance vector. An observation equation is also defined that is linear with respect to a current vocal tract resonance vector and that predicts at least one component of an observation vector. The state equation, the observation equation, and a sequence of observation vectors are used to identify a sequence of vocal tract resonance vectors using a Kalman filter algorithm. Under one embodiment, the observation equation is defined based on a piecewise linear approximation to a non-linear function. The parameters of the linear approximation are selected based on pre-defined regions, which are determined from a crude estimate of a vocal tract resonance vector.
    Type: Application
    Filed: November 26, 2003
    Publication date: May 26, 2005
    Applicant: Microsoft Corporation
    Inventors: Li Deng, Hagai Attias, Alejandro Acero, Leo Lee
  • Publication number: 20040267536
    Abstract: A system and method facilitating speech detection and/or enhancement utilizing audio/video fusion is provided. The present invention fuses audio and video in a probabilistic generative model that implements cross-model, self-supervised learning, enabling rapid adaptation to audio visual data. The system can learn to detect and enhance speech in noise given only a short (e.g., 30 second) sequence of audio-visual data. In addition, it automatically learns to track the lips as they move around in the video.
    Type: Application
    Filed: June 27, 2003
    Publication date: December 30, 2004
    Inventors: John R. Hershey, Trausti Thor Kristjansson, Hagai Attias, Nebojsa Jojic
  • Publication number: 20040260548
    Abstract: A system and method that facilitate modeling unobserved speech dynamics based upon a hidden dynamic speech model in the form of segmental switching state space model that employs model parameters including those describing the unobserved speech dynamics and those describing the relationship between the unobserved speech dynamic vector and the observed acoustic feature vector is provided. The model parameters are modified based, at least in part, upon, a variational learning technique. In accordance with an aspect of the present invention, novel and powerful variational expectation maximization (EM) algorithm(s) for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production, are provided. For example, modification of model parameters can be based upon an approximate mixture of Gaussian (MOG) posterior and/or based upon an approximate hidden Markov model (HMM) posterior using a variational technique.
    Type: Application
    Filed: June 20, 2003
    Publication date: December 23, 2004
    Inventors: Hagai Attias, Li Deng, Leo J. Lee
  • Publication number: 20040199386
    Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.
    Type: Application
    Filed: April 1, 2003
    Publication date: October 7, 2004
    Applicant: Microsoft Corporation
    Inventors: Hagai Attias, Leo Jingyu Lee, Li Deng
  • Publication number: 20040167753
    Abstract: A model-based system and method for global optimization that utilizes quantum mechanics in order to approximate the global minimum of a given problem (e.g., mathematical function). A quantum mechanical particle with a sufficiently large mass has a ground state solution to the Schrödinger Equation which is localized to the global minimum of the energy field, or potential, it experiences. A given function is modeled as a potential, and a quantum mechanical particle with a sufficiently large mass is placed in the potential. The ground state of the particle is determined, and the probability density function of the ground state of the particle is calculated. The peak of the probability density function is localized to the global minimum of the potential.
    Type: Application
    Filed: February 21, 2003
    Publication date: August 26, 2004
    Inventors: Oliver B. Downs, Hagai Attias, Christopher J.C. Burges, Robert L. Rounthwaite
  • Publication number: 20040019483
    Abstract: A method of speech recognition is provided that identifies a production-related dynamics value by performing a linear interpolation between a production-related dynamics value at a previous time and a production-related target using a time-dependent interpolation weight. The hidden production-related dynamics value is used to compute a predicted value that is compared to an observed value of acoustics to determine the likelihood of the observed acoustics given a sequence of hidden phonological units. In some embodiments, the production-related dynamics value at the previous time is selected from a set of continuous values. In addition, the likelihood of the observed acoustics given a sequence of hidden phonological units is combined with a score associated with a discrete class of production-related dynamic values at the previous time to determine a score for a current phonological state.
    Type: Application
    Filed: October 9, 2002
    Publication date: January 29, 2004
    Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J.R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
  • Publication number: 20040002858
    Abstract: A system and method facilitating signal enhancement utilizing mixture models is provided. The invention includes a signal enhancement adaptive system having a speech model, a noise model and a plurality of adaptive filter parameters. The signal enhancement adaptive system employs probabilistic modeling to perform signal enhancement of a plurality of windowed frequency transformed input signals received, for example, for an array of microphones. The signal enhancement adaptive system incorporates information about the statistical structure of speech signals. The signal enhancement adaptive system can be embedded in an overall enhancement system which also includes components of signal windowing and frequency transformation.
    Type: Application
    Filed: June 27, 2002
    Publication date: January 1, 2004
    Inventors: Hagai Attias, Li Deng
  • Publication number: 20040002935
    Abstract: A system and method for generating responsibility vectors associated with multi-media files (e.g., audio and/or video files) is provided. The responsibility vectors are based upon responsibility of mixture components fitted to a mixture model for frames of the files. The responsibility vectors can be grouped based upon clustering related to extracted identifiable features of frames of the multi-media files. Once generated, responsibility vectors can be searched by a multi-media searching system.
    Type: Application
    Filed: June 27, 2002
    Publication date: January 1, 2004
    Inventor: Hagai Attias
  • Publication number: 20040001143
    Abstract: A system and method facilitating object tracking is provided. The invention includes an audio model that receives at least two audio input signals and a video model that receives a video input. The audio model and the video model employ probabilistic generative models which are combined to facilitate object tracking. Expectation maximization can be employed to modify trainable parameters of the audio model and the video model.
    Type: Application
    Filed: June 27, 2002
    Publication date: January 1, 2004
    Inventors: Matthew James Beal, Nebojsa Jojic, Hagai Attias
  • Publication number: 20030093269
    Abstract: A probability distribution for speech model parameters, such as auto-regression parameters, is used to identify a distribution of denoised values from a noisy signal. Under one embodiment, the probability distributions of the speech model parameters and the denoised values are adjusted to improve a variational inference so that the variational inference better approximates the joint probability of the speech model parameters and the denoised values given a noisy signal. In some embodiments, this improvement is performed during an expectation step in an expectation-maximization algorithm. The statistical model can also be used to identify an average spectrum for the clean signal and this average spectrum may be provided to a speech recognizer instead of the estimate of the clean signal.
    Type: Application
    Filed: November 15, 2001
    Publication date: May 15, 2003
    Inventors: Hagai Attias, John Carlton Platt, Li Deng, Alejandro Acero