Patents by Inventor Michiel Bacchiani

Michiel Bacchiani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11308963
    Abstract: Systems and methods are described for improving endpoint detection of a voice query submitted by a user. In some implementations, a synchronized video data and audio data is received. A sequence of frames of the video data that includes images corresponding to lip movement on a face is determined. The audio data is endpointed based on first audio data that corresponds to a first frame of the sequence of frames and second audio data that corresponds to a last frame of the sequence of frames. A transcription of the endpointed audio data is generated by an automated speech recognizer. The generated transcription is then provided for output.
    Type: Grant
    Filed: July 23, 2020
    Date of Patent: April 19, 2022
    Assignee: GOOGLE LLC
    Inventors: Chanwoo Kim, Rajeev Nongpiur, Michiel Bacchiani
  • Publication number: 20200357401
    Abstract: Systems and methods are described for improving endpoint detection of a voice query submitted by a user. In some implementations, a synchronized video data and audio data is received. A sequence of frames of the video data that includes images corresponding to lip movement on a face is determined. The audio data is endpointed based on first audio data that corresponds to a first frame of the sequence of frames and second audio data that corresponds to a last frame of the sequence of frames. A transcription of the endpointed audio data is generated by an automated speech recognizer. The generated transcription is then provided for output.
    Type: Application
    Filed: July 23, 2020
    Publication date: November 12, 2020
    Inventors: Chanwoo Kim, Rajeev Nongpiur, Michiel Bacchiani
  • Patent number: 10339929
    Abstract: An example method includes receiving, by a computing system, an indication of one or more audible sounds that are detected by a first sensing device, the one or more audible sounds originating from a user; determining, by the computing system and based at least in part on an indication of one or more signals detected by a second sensing device, a distance between the user and the second sensing device; determining, by the computing system and based at least in part on the indication of the one or more audible sounds, one or more acoustic features that are associated with the one or more audible sounds; and determining, by the computing system, and based at least in part on the one or more acoustic features and the distance between the user and the second sensing device, one or more words that correspond to the audible sounds.
    Type: Grant
    Filed: June 27, 2017
    Date of Patent: July 2, 2019
    Assignee: GOOGLE LLC
    Inventors: Chan Woo Kim, Rajeev Conrad Nongpiur, Vijayaditya Peddinti, Michiel Bacchiani
  • Publication number: 20180374477
    Abstract: An example method includes receiving, by a computing system, an indication of one or more audible sounds that are detected by a first sensing device, the one or more audible sounds originating from a user; determining, by the computing system and based at least in part on an indication of one or more signals detected by a second sensing device, a distance between the user and the second sensing device; determining, by the computing system and based at least in part on the indication of the one or more audible sounds, one or more acoustic features that are associated with the one or more audible sounds; and determining, by the computing system, and based at least in part on the one or more acoustic features and the distance between the user and the second sensing device, one or more words that correspond to the audible sounds.
    Type: Application
    Filed: June 27, 2017
    Publication date: December 27, 2018
    Inventors: Chan Woo Kim, Rajeev Conrad Nongpiur, Vijayaditya Peddinti, Michiel Bacchiani
  • Publication number: 20050096908
    Abstract: Systems and methods relate to generating a language model for use in, for example, a spoken dialog system or some other application. The method comprises building a class-based language model, generating at least one sequence network and replacing class labels in the class-based language model with the at least one sequence network. In this manner, placeholders or tokens associated with classes can be inserted into the models at training time and word/phone networks can be built based on meta-data information at test time. Finally, the placeholder token can be replaced with the word/phone networks at run time to improve recognition of difficult words such as proper names.
    Type: Application
    Filed: October 29, 2004
    Publication date: May 5, 2005
    Applicant: AT&T Corp.
    Inventors: Michiel Bacchiani, Sameer Maskey, Brian Roark, Richard Sproat
  • Publication number: 20050096907
    Abstract: Disclosed are systems and methods for providing a spoken dialog system using meta-data to build language models to improve speech processing. Meta-data is generally defined as data outside received speech; for example, meta-data may be a customer profile having a name, address and purchase history of a caller to a spoken dialog system. The method comprises building tree clusters from meta-data and estimating a language model using the built tree clusters. The language model may be used by various modules in the spoken dialog system, such as the automatic speech recognition module and/or the dialog management module. Building the tree clusters from the meta-data may involve generating projections from the meta-data and further may comprise computing counts as a result of unigram tree clustering and then building both unigram trees and higher-order trees from the meta-data as well as computing node distances within the built trees that are used for estimating the language model.
    Type: Application
    Filed: October 29, 2004
    Publication date: May 5, 2005
    Applicant: AT&T Corp.
    Inventors: Michiel Bacchiani, Brian Roark
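
As an illustration only, the lip-movement endpointing described in the abstract for patent 11308963 might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function names `longest_lip_movement_run` and `endpoint_audio` are hypothetical, the per-frame lip-movement flags are assumed to come from some face/lip detector that is not shown, and picking the longest run of lip-movement frames is an assumption (the abstract says only that "a sequence of frames" is determined). The frame rate and sample rate are assumed to be integers so that frames map to sample offsets exactly.

```python
from typing import List, Optional, Sequence, Tuple


def longest_lip_movement_run(lip_moving: Sequence[bool]) -> Optional[Tuple[int, int]]:
    """Return (first, last) frame indices of the longest run of True flags.

    Returns None when no frame shows lip movement. Ties keep the earliest run.
    """
    best: Optional[Tuple[int, int]] = None
    start: Optional[int] = None
    last_index = len(lip_moving) - 1
    for i, moving in enumerate(lip_moving):
        if moving and start is None:
            start = i  # a new run of lip movement begins here
        if start is not None and (not moving or i == last_index):
            end = i if moving else i - 1  # the run ended on this or the previous frame
            if best is None or end - start > best[1] - best[0]:
                best = (start, end)
            start = None
    return best


def endpoint_audio(
    samples: Sequence[float],
    lip_moving: Sequence[bool],
    fps: int,
    sample_rate: int,
) -> List[float]:
    """Trim synchronized audio to the span covered by the lip-movement frames.

    Assumption: video frame k covers audio samples
    [k * sample_rate // fps, (k + 1) * sample_rate // fps).
    """
    run = longest_lip_movement_run(lip_moving)
    if run is None:
        return []  # no lip movement detected, so nothing to transcribe
    first, last = run
    start_sample = first * sample_rate // fps      # audio at the first frame
    end_sample = (last + 1) * sample_rate // fps   # audio through the last frame
    return list(samples[start_sample:end_sample])
```

In this sketch the trimmed samples would then be handed to a speech recognizer for transcription; that step is omitted because the listing does not specify the recognizer's interface.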