Patents by Inventor Hagai Attias

Hagai Attias has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Managing media objects in a database

Publication number: 20060074973

Abstract: A method and apparatus are provided for organizing media objects in a database using contextual information for a media object and known media objects, categories, indexes and searches, to arrive at an inference for cataloging the media object in a database. The media object may then be cataloged in the database according to the inference. A method and apparatus are provided for clustering media objects by forming groups of unlabeled data and applying a distance metric to said group.

Type: Application

Filed: November 21, 2005

Publication date: April 6, 2006

Applicant: Microsoft Corporation

Inventors: John Platt, Jonathan Kagle, Hagai Attias, Victoria Milton
Method and apparatus for denoising and deverberation using variational inference and strong speech models

Patent number: 6990447

Abstract: A probability distribution for speech model parameters, such as auto-regression parameters, is used to identify a distribution of denoised values from a noisy signal. Under one embodiment, the probability distributions of the speech model parameters and the denoised values are adjusted to improve a variational inference so that the variational inference better approximates the joint probability of the speech model parameters and the denoised values given a noisy signal. In some embodiments, this improvement is performed during an expectation step in an expectation-maximization algorithm. The statistical model can also be used to identify an average spectrum for the clean signal and this average spectrum may be provided to a speech recognizer instead of the estimate of the clean signal.

Type: Grant

Filed: November 15, 2001

Date of Patent: January 24, 2006

Assignee: Microsoft Corporation

Inventors: Hagai Attias, John Carlton Platt, Li Deng, Alejandro Acero
Searching multimedia databases using multimedia queries

Publication number: 20050262068

Abstract: A system and method for generating responsibility vectors associated with multi-media files (e.g., audio and/or video files) is provided. The responsibility vectors are based upon responsibility of mixture components fitted to a mixture model for frames of the files. The responsibility vectors can be grouped based upon clustering related to extracted identifiable features of frames of the multi-media files. Once generated, responsibility vectors can be searched by a multi-media searching system. Also provided is a system for multi-media searching based, at least in part, upon responsibility vectors associated with a query segment and multi-media files. The system can generate a query profile based, at least in part, upon responsibility vectors of frames of the query segment. The system can further generate segment profiles of segments of the multi-media files.

Type: Application

Filed: July 20, 2005

Publication date: November 24, 2005

Applicant: Microsoft Corporation

Inventor: Hagai Attias
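
The abstract above refers to the "responsibility" of mixture components for each frame. As a rough illustration only, and not the patented method, the sketch below computes standard posterior responsibilities under a diagonal-covariance Gaussian mixture. The function name `responsibilities`, the diagonal-covariance assumption, and the toy data are all assumptions introduced here; the listing does not specify the mixture family or the feature representation.

```python
import numpy as np

def responsibilities(frames, weights, means, variances):
    """Posterior probability of each mixture component for each frame.

    Assumes a Gaussian mixture with diagonal covariances (an illustrative
    choice, not taken from the patent).
    frames: (T, D), weights: (K,), means: (K, D), variances: (K, D).
    Returns an array of shape (T, K) whose rows sum to 1.
    """
    diff = frames[:, None, :] - means[None, :, :]            # (T, K, D)
    log_pdf = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)    # normalizer, (K,)
    )
    log_joint = np.log(weights) + log_pdf                    # (T, K)
    # Subtract the row maximum before exponentiating for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    r = np.exp(log_joint)
    return r / r.sum(axis=1, keepdims=True)

# Toy example: three 2-D frames and two well-separated components.
frames = np.array([[0.1, -0.2], [4.9, 5.2], [0.0, 0.3]])
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))
resp = responsibilities(frames, weights, means, variances)
```

Each row of `resp` is one frame's responsibility vector. A query profile of the kind the abstract mentions could plausibly be built by aggregating such rows over a segment, though the aggregation rule is not given in this listing.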
Searching multi-media databases using multi-media queries

Patent number: 6957226

Abstract: A system and method for generating responsibility vectors associated with multi-media files (e.g., audio and/or video files) is provided. The responsibility vectors are based upon responsibility of mixture components fitted to a mixture model for frames of the files. The responsibility vectors can be grouped based upon clustering related to extracted identifiable features of frames of the multi-media files. Once generated, responsibility vectors can be searched by a multi-media searching system. Also provided is a system for multi-media searching based, at least in part, upon responsibility vectors associated with a query segment and multi-media files. The system can generate a query profile based, at least in part, upon responsibility vectors of frames of the query segment. The system can further generate segment profiles of segments of the multi-media files.

Type: Grant

Filed: June 27, 2002

Date of Patent: October 18, 2005

Assignee: Microsoft Corporation

Inventor: Hagai Attias
Speaker detection and tracking using audiovisual data

Patent number: 6940540

Abstract: A system and method facilitating object tracking is provided. The system includes an audio model that receives at least two audio input signals and a video model that receives a video input. The audio model and the video model employ probabilistic generative models which are combined to facilitate object tracking. Expectation maximization can be employed to modify trainable parameters of the audio model and the video model.

Type: Grant

Filed: June 27, 2002

Date of Patent: September 6, 2005

Assignee: Microsoft Corporation

Inventors: Matthew James Beal, Nebojsa Jojic, Hagai Attias
Method and apparatus for scene learning and three-dimensional tracking using stereo video cameras

Publication number: 20050185834

Abstract: A method and apparatus are provided for learning a model for the appearance of an object while tracking the position of the object in three dimensions. Under embodiments of the present invention, this is achieved by combining a particle filtering technique for tracking the object's position with an expectation-maximization technique for learning the appearance of the object. Two stereo cameras are used to generate data for the learning and tracking.

Type: Application

Filed: February 20, 2004

Publication date: August 25, 2005

Applicant: Microsoft Corporation

Inventors: Trausti Kristjansson, Hagai Attias, John Hershey
Method of speech recognition using variational inference with switching state space models

Patent number: 6931374

Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.

Type: Grant

Filed: April 1, 2003

Date of Patent: August 16, 2005

Assignee: Microsoft Corporation

Inventors: Hagai Attias, Leo Jingyu Lee, Li Deng
Speaker detection and tracking using audiovisual data

Publication number: 20050171971

Abstract: A system and method facilitating object tracking is provided. The invention includes an audio model that receives at least two audio input signals and a video model that receives a video input. The audio model and the video model employ probabilistic generative models which are combined to facilitate object tracking. Expectation maximization can be employed to modify trainable parameters of the audio model and the video model.

Type: Application

Filed: March 31, 2005

Publication date: August 4, 2005

Applicant: Microsoft Corporation

Inventors: Matthew Beal, Nebojsa Jojic, Hagai Attias
Method of speech recognition using multimodal variational inference with switching state space models

Publication number: 20050159951

Abstract: A method of efficiently setting posterior probability parameters for a switching state space model begins by defining a window containing at least two but fewer than all of the frames. A separate posterior probability parameter is determined for each frame in the window. The window is then shifted sequentially from left to right in time so that it includes one or more subsequent frames in the sequence of frames. A separate posterior probability parameter is then determined for each frame in the shifted window. This method closely approximates a more rigorous solution but saves computational cost by two to three orders of magnitude. Further, a method of determining the optimal discrete state sequence in the switching state space model is invented that directly exploits the observation vector on a frame-by-frame basis and operates from left to right in time.

Type: Application

Filed: January 20, 2004

Publication date: July 21, 2005

Applicant: Microsoft Corporation

Inventors: Hagai Attias, Li Deng, Leo Lee
Method of speech recognition using variational inference with switching state space models

Publication number: 20050119887

Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.

Type: Application

Filed: November 9, 2004

Publication date: June 2, 2005

Applicant: Microsoft Corporation

Inventors: Hagai Attias, Leo Lee, Li Deng
Method and apparatus for continuous valued vocal tract resonance tracking using piecewise linear approximations

Publication number: 20050114134

Abstract: A method and apparatus tracks vocal tract resonance components, including both frequencies and bandwidths, in a speech signal. The components are tracked by defining a state equation that is linear with respect to a past vocal tract resonance vector and that predicts a current vocal tract resonance vector. An observation equation is also defined that is linear with respect to a current vocal tract resonance vector and that predicts at least one component of an observation vector. The state equation, the observation equation, and a sequence of observation vectors are used to identify a sequence of vocal tract resonance vectors using a Kalman filter algorithm. Under one embodiment, the observation equation is defined based on a piecewise linear approximation to a non-linear function. The parameters of the linear approximation are selected based on pre-defined regions, which are determined from a crude estimate of a vocal tract resonance vector.

Type: Application

Filed: November 26, 2003

Publication date: May 26, 2005

Applicant: Microsoft Corporation

Inventors: Li Deng, Hagai Attias, Alejandro Acero, Leo Lee
Speech detection and enhancement using audio/video fusion

Publication number: 20040267536

Abstract: A system and method facilitating speech detection and/or enhancement utilizing audio/video fusion is provided. The present invention fuses audio and video in a probabilistic generative model that implements cross-model, self-supervised learning, enabling rapid adaptation to audio visual data. The system can learn to detect and enhance speech in noise given only a short (e.g., 30 second) sequence of audio-visual data. In addition, it automatically learns to track the lips as they move around in the video.

Type: Application

Filed: June 27, 2003

Publication date: December 30, 2004

Inventors: John R. Hershey, Trausti Thor Kristjansson, Hagai Attias, Nebojsa Jojic
Variational inference and learning for segmental switching state space models of hidden speech dynamics

Publication number: 20040260548

Abstract: A system and method that facilitate modeling unobserved speech dynamics based upon a hidden dynamic speech model in the form of a segmental switching state space model that employs model parameters including those describing the unobserved speech dynamics and those describing the relationship between the unobserved speech dynamic vector and the observed acoustic feature vector is provided. The model parameters are modified based, at least in part, upon a variational learning technique. In accordance with an aspect of the present invention, novel and powerful variational expectation maximization (EM) algorithm(s) for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production, are provided. For example, modification of model parameters can be based upon an approximate mixture of Gaussian (MOG) posterior and/or based upon an approximate hidden Markov model (HMM) posterior using a variational technique.

Type: Application

Filed: June 20, 2003

Publication date: December 23, 2004

Inventors: Hagai Attias, Li Deng, Leo J. Lee
Method of speech recognition using variational inference with switching state space models

Publication number: 20040199386

Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.

Type: Application

Filed: April 1, 2003

Publication date: October 7, 2004

Applicant: Microsoft Corporation

Inventors: Hagai Attias, Leo Jingyu Lee, Li Deng
Quantum mechanical model-based system and method for global optimization

Publication number: 20040167753

Abstract: A model-based system and method for global optimization that utilizes quantum mechanics in order to approximate the global minimum of a given problem (e.g., mathematical function). A quantum mechanical particle with a sufficiently large mass has a ground state solution to the Schrödinger Equation which is localized to the global minimum of the energy field, or potential, it experiences. A given function is modeled as a potential, and a quantum mechanical particle with a sufficiently large mass is placed in the potential. The ground state of the particle is determined, and the probability density function of the ground state of the particle is calculated. The peak of the probability density function is localized to the global minimum of the potential.

Type: Application

Filed: February 21, 2003

Publication date: August 26, 2004

Inventors: Oliver B. Downs, Hagai Attias, Christopher J.C. Burges, Robert L. Rounthwaite
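
The abstract above rests on the claim that a sufficiently heavy quantum particle has a ground state concentrated at the global minimum of its potential. The sketch below checks that claim numerically in one dimension; it is an illustration under stated assumptions, not the patented system. The finite-difference discretization, the units with ħ = 1, the tilted double-well test function, and the names `ground_state_density` and `tilted_double_well` are all assumptions made here.

```python
import numpy as np

def ground_state_density(potential, x, mass):
    """Probability density of the lowest-energy state on a uniform 1-D grid.

    Discretizes H = -(1 / (2 * mass)) d^2/dx^2 + V(x) with a three-point
    finite-difference Laplacian (hbar = 1, wavefunction zero outside the grid).
    """
    dx = x[1] - x[0]
    n = x.size
    k = 1.0 / (2.0 * mass * dx ** 2)
    hamiltonian = (
        np.diag(potential(x) + 2.0 * k)
        - k * np.eye(n, k=1)
        - k * np.eye(n, k=-1)
    )
    # eigh returns eigenvalues in ascending order, so column 0 is the ground state.
    _, states = np.linalg.eigh(hamiltonian)
    psi0 = states[:, 0]
    return psi0 ** 2 / (np.sum(psi0 ** 2) * dx)

def tilted_double_well(x):
    # Two wells near x = -1 and x = +1; the linear tilt makes the left one deeper.
    return (x ** 2 - 1.0) ** 2 + 0.3 * x

x = np.linspace(-2.0, 2.0, 401)
density = ground_state_density(tilted_double_well, x, mass=200.0)
x_peak = x[np.argmax(density)]                  # where the density is largest
x_min = x[np.argmin(tilted_double_well(x))]     # grid location of the global minimum
```

With this mass the density peak `x_peak` falls in the deeper (left) well, close to `x_min`. A dense eigendecomposition like this only scales to small grids, so it demonstrates the principle rather than a practical optimizer.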
Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Publication number: 20040019483

Abstract: A method of speech recognition is provided that identifies a production-related dynamics value by performing a linear interpolation between a production-related dynamics value at a previous time and a production-related target using a time-dependent interpolation weight. The hidden production-related dynamics value is used to compute a predicted value that is compared to an observed value of acoustics to determine the likelihood of the observed acoustics given a sequence of hidden phonological units. In some embodiments, the production-related dynamics value at the previous time is selected from a set of continuous values. In addition, the likelihood of the observed acoustics given a sequence of hidden phonological units is combined with a score associated with a discrete class of production-related dynamic values at the previous time to determine a score for a current phonological state.

Type: Application

Filed: October 9, 2002

Publication date: January 29, 2004

Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J.R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
Microphone array signal enhancement using mixture models

Publication number: 20040002858

Abstract: A system and method facilitating signal enhancement utilizing mixture models is provided. The invention includes a signal enhancement adaptive system having a speech model, a noise model and a plurality of adaptive filter parameters. The signal enhancement adaptive system employs probabilistic modeling to perform signal enhancement of a plurality of windowed frequency transformed input signals received, for example, from an array of microphones. The signal enhancement adaptive system incorporates information about the statistical structure of speech signals. The signal enhancement adaptive system can be embedded in an overall enhancement system which also includes components of signal windowing and frequency transformation.

Type: Application

Filed: June 27, 2002

Publication date: January 1, 2004

Inventors: Hagai Attias, Li Deng
Searching multi-media databases using multi-media queries

Publication number: 20040002935

Abstract: A system and method for generating responsibility vectors associated with multi-media files (e.g., audio and/or video files) is provided. The responsibility vectors are based upon responsibility of mixture components fitted to a mixture model for frames of the files. The responsibility vectors can be grouped based upon clustering related to extracted identifiable features of frames of the multi-media files. Once generated, responsibility vectors can be searched by a multi-media searching system.

Type: Application

Filed: June 27, 2002

Publication date: January 1, 2004

Inventor: Hagai Attias
Speaker detection and tracking using audiovisual data

Publication number: 20040001143

Abstract: A system and method facilitating object tracking is provided. The invention includes an audio model that receives at least two audio input signals and a video model that receives a video input. The audio model and the video model employ probabilistic generative models which are combined to facilitate object tracking. Expectation maximization can be employed to modify trainable parameters of the audio model and the video model.

Type: Application

Filed: June 27, 2002

Publication date: January 1, 2004

Inventors: Matthew James Beal, Nebojsa Jojic, Hagai Attias
Method and apparatus for denoising and deverberation using variational inference and strong speech models

Publication number: 20030093269

Abstract: A probability distribution for speech model parameters, such as auto-regression parameters, is used to identify a distribution of denoised values from a noisy signal. Under one embodiment, the probability distributions of the speech model parameters and the denoised values are adjusted to improve a variational inference so that the variational inference better approximates the joint probability of the speech model parameters and the denoised values given a noisy signal. In some embodiments, this improvement is performed during an expectation step in an expectation-maximization algorithm. The statistical model can also be used to identify an average spectrum for the clean signal and this average spectrum may be provided to a speech recognizer instead of the estimate of the clean signal.

Type: Application

Filed: November 15, 2001

Publication date: May 15, 2003

Inventors: Hagai Attias, John Carlton Platt, Li Deng, Alejandro Acero
