Patents by Inventor Hagai Attias

Hagai Attias has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150007708
    Abstract: A beat analysis module is described for determining beat information associated with an audio item. The beat analysis module uses an Expectation-Maximization (EM) approach to determine an average beat period, where correlation is performed over diverse representations of the audio item. The beat analysis module can determine the beat information in a relatively short period of time. As such, the beat analysis module can perform its analysis together with another application task (such as a game application task) without disrupting the real-time performance of that application task. In one application, a user may select his or her own audio items to be used in conjunction with the application task.
    Type: Application
    Filed: September 26, 2014
    Publication date: January 8, 2015
    Applicant: Microsoft Corporation
    Inventors: Hagai Attias, Darko Kirovski
  • Patent number: 8842177
    Abstract: Object tracking includes an audio model that receives at least two audio input signals and a video model that receives a video input. The audio model and the video model employ probabilistic generative models which are combined to facilitate object tracking. Expectation maximization can be employed to modify trainable parameters of the audio model and the video model.
    Type: Grant
    Filed: March 31, 2010
    Date of Patent: September 23, 2014
    Assignee: Microsoft Corporation
    Inventors: Matthew James Beal, Nebojsa Jojic, Hagai Attias
  • Publication number: 20100300271
    Abstract: A beat analysis module is described for determining beat information associated with an audio item. The beat analysis module uses an Expectation-Maximization (EM) approach to determine an average beat period, where correlation is performed over diverse representations of the audio item. The beat analysis module can determine the beat information in a relatively short period of time. As such, the beat analysis module can perform its analysis together with another application task (such as a game application task) without disrupting the real-time performance of that application task. In one application, a user may select his or her own audio items to be used in conjunction with the application task.
    Type: Application
    Filed: May 27, 2009
    Publication date: December 2, 2010
    Applicant: Microsoft Corporation
    Inventors: Hagai Attias, Darko Kirovski
  • Publication number: 20100194881
    Abstract: Object tracking includes an audio model that receives at least two audio input signals and a video model that receives a video input. The audio model and the video model employ probabilistic generative models which are combined to facilitate object tracking. Expectation maximization can be employed to modify trainable parameters of the audio model and the video model.
    Type: Application
    Filed: March 31, 2010
    Publication date: August 5, 2010
    Applicant: Microsoft Corporation
    Inventors: Matthew James Beal, Nebojsa Jojic, Hagai Attias
  • Patent number: 7692685
    Abstract: A system and method facilitating object tracking is provided. The invention includes an audio model that receives at least two audio input signals and a video model that receives a video input. The audio model and the video model employ probabilistic generative models which are combined to facilitate object tracking. Expectation maximization can be employed to modify trainable parameters of the audio model and the video model.
    Type: Grant
    Filed: March 31, 2005
    Date of Patent: April 6, 2010
    Assignee: Microsoft Corporation
    Inventors: Matthew James Beal, Nebojsa Jojic, Hagai Attias
  • Patent number: 7689413
    Abstract: A system and method facilitating speech detection and/or enhancement utilizing audio/video fusion is provided. The present invention fuses audio and video in a probabilistic generative model that implements cross-model, self-supervised learning, enabling rapid adaptation to audio visual data. The system can learn to detect and enhance speech in noise given only a short (e.g., 30 second) sequence of audio-visual data. In addition, it automatically learns to track the lips as they move around in the video.
    Type: Grant
    Filed: September 10, 2007
    Date of Patent: March 30, 2010
    Assignee: Microsoft Corporation
    Inventors: John R. Hershey, Trausti Thor Kristjansson, Hagai Attias, Nebojsa Jojic
  • Patent number: 7486815
    Abstract: A method and apparatus are provided for learning a model for the appearance of an object while tracking the position of the object in three dimensions. Under embodiments of the present invention, this is achieved by combining a particle filtering technique for tracking the object's position with an expectation-maximization technique for learning the appearance of the object. Two stereo cameras are used to generate data for the learning and tracking.
    Type: Grant
    Filed: February 20, 2004
    Date of Patent: February 3, 2009
    Assignee: Microsoft Corporation
    Inventors: Trausti Kristjansson, Hagai Attias, John R. Hershey
  • Patent number: 7487087
    Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.
    Type: Grant
    Filed: November 9, 2004
    Date of Patent: February 3, 2009
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, Leo Jingyu Lee, Li Deng
  • Patent number: 7480615
    Abstract: A method of efficiently setting posterior probability parameters for a switching state space model begins by defining a window containing at least two but fewer than all of the frames. A separate posterior probability parameter is determined for each frame in the window. The window is then shifted sequentially from left to right in time so that it includes one or more subsequent frames in the sequence of frames. A separate posterior probability parameter is then determined for each frame in the shifted window. This method closely approximates a more rigorous solution but saves computational cost by two to three orders of magnitude. Further, a method of determining the optimal discrete state sequence in the switching state space model is invented that directly exploits the observation vector on a frame-by-frame basis and operates from left to right in time.
    Type: Grant
    Filed: January 20, 2004
    Date of Patent: January 20, 2009
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, Li Deng, Leo Lee
  • Patent number: 7454336
    Abstract: A system and method that facilitate modeling unobserved speech dynamics based upon a hidden dynamic speech model in the form of a segmental switching state space model that employs model parameters including those describing the unobserved speech dynamics and those describing the relationship between the unobserved speech dynamic vector and the observed acoustic feature vector is provided. The model parameters are modified based, at least in part, upon a variational learning technique. In accordance with an aspect of the present invention, novel and powerful variational expectation maximization (EM) algorithm(s) for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production, are provided. For example, modification of model parameters can be based upon an approximate mixture of Gaussian (MOG) posterior and/or based upon an approximate hidden Markov model (HMM) posterior using a variational technique.
    Type: Grant
    Filed: June 20, 2003
    Date of Patent: November 18, 2008
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, Li Deng, Leo J. Lee
  • Patent number: 7398162
    Abstract: A model-based system and method for global optimization that utilizes quantum mechanics in order to approximate the global minimum of a given problem (e.g., mathematical function). A quantum mechanical particle with a sufficiently large mass has a ground state solution to the Schrödinger Equation which is localized to the global minimum of the energy field, or potential, it experiences. A given function is modeled as a potential, and a quantum mechanical particle with a sufficiently large mass is placed in the potential. The ground state of the particle is determined, and the probability density function of the ground state of the particle is calculated. The peak of the probability density function is localized to the global minimum of the potential.
    Type: Grant
    Filed: February 21, 2003
    Date of Patent: July 8, 2008
    Assignee: Microsoft Corporation
    Inventors: Oliver B. Downs, Hagai Attias, Christopher J. C. Burges, Robert L. Rounthwaite
  • Publication number: 20080059174
    Abstract: A system and method facilitating speech detection and/or enhancement utilizing audio/video fusion is provided. The present invention fuses audio and video in a probabilistic generative model that implements cross-model, self-supervised learning, enabling rapid adaptation to audio visual data. The system can learn to detect and enhance speech in noise given only a short (e.g., 30 second) sequence of audio-visual data. In addition, it automatically learns to track the lips as they move around in the video.
    Type: Application
    Filed: September 10, 2007
    Publication date: March 6, 2008
    Applicant: Microsoft Corporation
    Inventors: John Hershey, Trausti Kristjansson, Hagai Attias, Nebojsa Jojic
  • Patent number: 7325008
    Abstract: A system and method for generating responsibility vectors associated with multi-media files (e.g., audio and/or video files) is provided. The responsibility vectors are based upon responsibility of mixture components fitted to a mixture model for frames of the files. The responsibility vectors can be grouped based upon clustering related to extracted identifiable features of frames of the multi-media files. Once generated, responsibility vectors can be searched by a multi-media searching system. Also provided is a system for multi-media searching based, at least in part, upon responsibility vectors associated with a query segment and multi-media files. The system can generate a query profile based, at least in part, upon responsibility vectors of frames of the query segment. The system can further generate segment profiles of segments of the multi-media files.
    Type: Grant
    Filed: July 20, 2005
    Date of Patent: January 29, 2008
    Assignee: Microsoft Corporation
    Inventor: Hagai Attias
  • Patent number: 7269560
    Abstract: A system and method facilitating speech detection and/or enhancement utilizing audio/video fusion is provided. The present invention fuses audio and video in a probabilistic generative model that implements cross-model, self-supervised learning, enabling rapid adaptation to audio visual data. The system can learn to detect and enhance speech in noise given only a short (e.g., 30 second) sequence of audio-visual data. In addition, it automatically learns to track the lips as they move around in the video.
    Type: Grant
    Filed: June 27, 2003
    Date of Patent: September 11, 2007
    Assignee: Microsoft Corporation
    Inventors: John R. Hershey, Trausti Thor Kristjansson, Hagai Attias, Nebojsa Jojic
  • Publication number: 20070154033
    Abstract: Improved audio source separation is provided by providing an audio dictionary for each source to be separated. Thus the invention can be regarded as providing “partially blind” source separation as opposed to the more commonly considered “blind” source separation problem, where no prior information about the sources is given. The audio dictionaries are probabilistic source models, and can be derived from training data from the sources to be separated, or from similar sources. Thus a library of audio dictionaries can be developed to aid in source separation. An unmixing and deconvolutive transformation can be inferred by maximum likelihood (ML) given the received signals and the selected audio dictionaries as input to the ML calculation. Optionally, frequency-domain filtering of the separated signal estimates can be performed prior to reconstructing the time-domain separated signal estimates. Such filtering can be regarded as providing an “audio skin” for a recovered signal.
    Type: Application
    Filed: December 1, 2006
    Publication date: July 5, 2007
    Inventor: Hagai Attias
  • Patent number: 7206741
    Abstract: A speech signal is decoded by determining a production-related value for a current state based on an optimal production-related value at the end of a preceding state, the optimal production-related value being selected from a set of continuous values. The production-related value is used to determine a likelihood of a phone being represented by a set of observation vectors that are aligned with a path between the preceding state and the current state. The likelihood of the phone is combined with a score from the preceding state to determine a score for the current state, the score from the preceding state being associated with a discrete class of production-related values wherein the class matches the class of the optimal production-related value.
    Type: Grant
    Filed: December 6, 2005
    Date of Patent: April 17, 2007
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J. R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
  • Patent number: 7103541
    Abstract: A system and method facilitating signal enhancement utilizing mixture models is provided. The invention includes a signal enhancement adaptive system having a speech model, a noise model and a plurality of adaptive filter parameters. The signal enhancement adaptive system employs probabilistic modeling to perform signal enhancement of a plurality of windowed frequency transformed input signals received, for example, for an array of microphones. The signal enhancement adaptive system incorporates information about the statistical structure of speech signals. The signal enhancement adaptive system can be embedded in an overall enhancement system which also includes components of signal windowing and frequency transformation.
    Type: Grant
    Filed: June 27, 2002
    Date of Patent: September 5, 2006
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, Li Deng
  • Patent number: 7076503
    Abstract: A method and apparatus are provided for organizing media objects in a database using contextual information for a media object and known media objects, categories, indexes and searches, to arrive at an inference for cataloging the media object in a database. The media object may then be cataloged in the database according to the inference. A method and apparatus are provided for clustering media objects by forming groups of unlabeled data and applying a distance metric to said group.
    Type: Grant
    Filed: December 19, 2001
    Date of Patent: July 11, 2006
    Assignee: Microsoft Corporation
    Inventors: John Carlton Platt, Jonathan Kagle, Hagai Attias, Victoria Elizabeth Milton
  • Patent number: 7050975
    Abstract: A method of speech recognition is provided that identifies a production-related dynamics value by performing a linear interpolation between a production-related dynamics value at a previous time and a production-related target using a time-dependent interpolation weight. The hidden production-related dynamics value is used to compute a predicted value that is compared to an observed value of acoustics to determine the likelihood of the observed acoustics given a sequence of hidden phonological units. In some embodiments, the production-related dynamics value at the previous time is selected from a set of continuous values. In addition, the likelihood of the observed acoustics given a sequence of hidden phonological units is combined with a score associated with a discrete class of production-related dynamic values at the previous time to determine a score for a current phonological state.
    Type: Grant
    Filed: October 9, 2002
    Date of Patent: May 23, 2006
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J. R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
  • Publication number: 20060085191
    Abstract: A speech signal is decoded by determining a production-related value for a current state based on an optimal production-related value at the end of a preceding state, the optimal production-related value being selected from a set of continuous values. The production-related value is used to determine a likelihood of a phone being represented by a set of observation vectors that are aligned with a path between the preceding state and the current state. The likelihood of the phone is combined with a score from the preceding state to determine a score for the current state, the score from the preceding state being associated with a discrete class of production-related values wherein the class matches the class of the optimal production-related value.
    Type: Application
    Filed: December 6, 2005
    Publication date: April 20, 2006
    Applicant: Microsoft Corporation
    Inventors: Li Deng, Jian-lai Zhou, Frank Seide, Asela Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
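
Illustrative note on patent 7398162 (global optimization via a quantum mechanical ground state): the abstract's idea can be sketched numerically in one dimension. The code below is only a rough illustration under stated assumptions, not the patented implementation. It treats a test function as the potential, discretizes the Hamiltonian H = -(1/(2m)) d²/dx² + V(x) on a grid (taking ħ = 1 and zero boundary values), and finds the ground state by inverse power iteration. The test function `objective`, the mass of 50, the grid size, the iteration count, and all function names are assumptions chosen for the example.

```python
import math


def solve_tridiagonal(diag, off, rhs):
    """Thomas algorithm for a symmetric tridiagonal system with constant off-diagonal."""
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = off / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def ground_state_density(potential, lo, hi, mass, n=400, iterations=100):
    """Return grid points and |psi_0|^2 for H = -(1/(2*mass)) d^2/dx^2 + V(x), hbar = 1."""
    h = (hi - lo) / (n + 1)
    xs = [lo + (i + 1) * h for i in range(n)]  # interior points; psi = 0 at lo and hi
    v = [potential(x) for x in xs]
    t = 1.0 / (2.0 * mass * h * h)  # kinetic coefficient of the finite-difference Laplacian
    shift = min(v)  # H - shift*I stays positive definite, so the solve is stable
    diag = [2.0 * t + vi - shift for vi in v]
    psi = [1.0] * n
    for _ in range(iterations):
        # Inverse iteration amplifies the eigenvector with the lowest eigenvalue.
        psi = solve_tridiagonal(diag, -t, psi)
        norm = math.sqrt(sum(p * p for p in psi))
        psi = [p / norm for p in psi]
    return xs, [p * p for p in psi]


def objective(x):
    """Assumed test function: tilted double well, global minimum near x = -1.04."""
    return (x * x - 1.0) ** 2 + 0.3 * x


xs, density = ground_state_density(objective, -2.0, 2.0, mass=50.0)
x_star = xs[max(range(len(xs)), key=lambda i: density[i])]
print("density peak at x =", x_star)
```

In this sketch the peak of the ground-state probability density falls in the deeper of the two wells, which is the behavior the abstract describes for a sufficiently large mass. A smaller mass spreads the density out and can move its peak away from the minimum, so the mass acts as a tuning parameter here.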