Patents by Inventor Michiel Bacchiani

Michiel Bacchiani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Query endpointing based on lip detection

Patent number: 11308963

Abstract: Systems and methods are described for improving endpoint detection of a voice query submitted by a user. In some implementations, a synchronized video data and audio data is received. A sequence of frames of the video data that includes images corresponding to lip movement on a face is determined. The audio data is endpointed based on first audio data that corresponds to a first frame of the sequence of frames and second audio data that corresponds to a last frame of the sequence of frames. A transcription of the endpointed audio data is generated by an automated speech recognizer. The generated transcription is then provided for output.
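
The endpointing step in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each video frame already carries a timestamp on the same clock as the audio and a lip-movement flag from some detector; the names Frame, lip_movement_span, and endpoint_audio are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Frame:
    """One video frame (hypothetical structure, not from the patent)."""
    time_s: float      # timestamp on the same clock as the audio
    lips_moving: bool  # assumed output of a lip-movement detector


def lip_movement_span(frames: List[Frame]) -> Optional[Tuple[float, float]]:
    """Return (start, end) times of the first and last frames with lip movement."""
    times = [f.time_s for f in frames if f.lips_moving]
    if not times:
        return None
    return min(times), max(times)


def endpoint_audio(samples: List[float], sample_rate: int,
                   frames: List[Frame]) -> List[float]:
    """Keep only the audio between the first and last lip-movement frames."""
    span = lip_movement_span(frames)
    if span is None:
        return []  # no lip movement detected, so nothing to transcribe
    start_s, end_s = span
    first = round(start_s * sample_rate)
    last = round(end_s * sample_rate)
    return samples[first:last + 1]


# Toy input: 3 seconds of audio at 10 samples per second, one frame every 0.5 s.
audio = [float(i) for i in range(30)]
video = [Frame(0.0, False), Frame(0.5, False), Frame(1.0, True),
         Frame(1.5, True), Frame(2.0, True), Frame(2.5, False)]
trimmed = endpoint_audio(audio, 10, video)
```

In this toy input the lips move from 1.0 s to 2.0 s, so the trimmed audio covers samples 10 through 20. A real system would pass the trimmed audio to a speech recognizer, which is outside the scope of this sketch.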

Type: Grant

Filed: July 23, 2020

Date of Patent: April 19, 2022

Assignee: GOOGLE LLC

Inventors: Chanwoo Kim, Rajeev Nongpiur, Michiel Bacchiani

QUERY ENDPOINTING BASED ON LIP DETECTION

Publication number: 20200357401

Abstract: Systems and methods are described for improving endpoint detection of a voice query submitted by a user. In some implementations, a synchronized video data and audio data is received. A sequence of frames of the video data that includes images corresponding to lip movement on a face is determined. The audio data is endpointed based on first audio data that corresponds to a first frame of the sequence of frames and second audio data that corresponds to a last frame of the sequence of frames. A transcription of the endpointed audio data is generated by an automated speech recognizer. The generated transcription is then provided for output.

Type: Application

Filed: July 23, 2020

Publication date: November 12, 2020

Inventors: Chanwoo Kim, Rajeev Nongpiur, Michiel Bacchiani

Speech recognition using acoustic features in conjunction with distance information

Patent number: 10339929

Abstract: An example method includes receiving, by a computing system, an indication of one or more audible sounds that are detected by a first sensing device, the one or more audible sounds originating from a user; determining, by the computing system and based at least in part on an indication of one or more signals detected by a second sensing device, a distance between the user and the second sensing device; determining, by the computing system and based at least in part on the indication of the one or more audible sounds, one or more acoustic features that are associated with the one or more audible sounds; and determining, by the computing system, and based at least in part on the one or more acoustic features and the distance between the user and the second sensing device, one or more words that correspond to the audible sounds.
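
One simple way to let a recognizer use both kinds of information is to append the estimated distance to every frame of acoustic features. The sketch below shows only that step. The abstract does not say how the distance and the features are combined, so this concatenation is an assumption, and the function name augment_with_distance is hypothetical.

```python
from typing import List


def augment_with_distance(features: List[List[float]],
                          distance_m: float) -> List[List[float]]:
    """Append the user-to-sensor distance to each acoustic feature frame.

    Assumption: the recognizer accepts the distance as one extra input
    dimension per frame. The patent abstract does not specify this.
    """
    return [frame + [distance_m] for frame in features]


# Two frames of toy two-dimensional acoustic features, user 2.5 m away.
frames = [[0.1, 0.2], [0.3, 0.4]]
augmented = augment_with_distance(frames, 2.5)
```

Each frame grows by one dimension, and the original feature list is left unmodified.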

Type: Grant

Filed: June 27, 2017

Date of Patent: July 2, 2019

Assignee: GOOGLE LLC

Inventors: Chan Woo Kim, Rajeev Conrad Nongpiur, Vijayaditya Peddinti, Michiel Bacchiani

SPEECH RECOGNITION USING ACOUSTIC FEATURES IN CONJUNCTION WITH DISTANCE INFORMATION

Publication number: 20180374477

Abstract: An example method includes receiving, by a computing system, an indication of one or more audible sounds that are detected by a first sensing device, the one or more audible sounds originating from a user; determining, by the computing system and based at least in part on an indication of one or more signals detected by a second sensing device, a distance between the user and the second sensing device; determining, by the computing system and based at least in part on the indication of the one or more audible sounds, one or more acoustic features that are associated with the one or more audible sounds; and determining, by the computing system, and based at least in part on the one or more acoustic features and the distance between the user and the second sensing device, one or more words that correspond to the audible sounds.

Type: Application

Filed: June 27, 2017

Publication date: December 27, 2018

Inventors: Chan Woo Kim, Rajeev Conrad Nongpiur, Vijayaditya Peddinti, Michiel Bacchiani

System and method of using meta-data in speech processing

Publication number: 20050096908

Abstract: Systems and methods relate to generating a language model for use in, for example, a spoken dialog system or some other application. The method comprises building a class-based language model, generating at least one sequence network and replacing class labels in the class-based language model with the at least one sequence network. In this manner, placeholders or tokens associated with classes can be inserted into the models at training time and word/phone networks can be built based on meta-data information at test time. Finally, the placeholder token can be replaced with the word/phone networks at run time to improve recognition of difficult words such as proper names.
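
The placeholder replacement described in the abstract above can be illustrated with a toy example. The sketch simplifies heavily: a real system would splice weighted word or phone networks into the language model, whereas this code only enumerates the word sequences that result from substituting each class label. The label "<NAME>", the function expand_class_labels, and the sample meta-data are hypothetical.

```python
from itertools import product
from typing import Dict, List


def expand_class_labels(template: List[str],
                        networks: Dict[str, List[List[str]]]) -> List[List[str]]:
    """Replace each class label in the template with every word sequence
    listed for it; tokens without an entry in `networks` are kept as-is."""
    slots = [networks.get(token, [[token]]) for token in template]
    return [[word for seq in choice for word in seq]
            for choice in product(*slots)]


# The template comes from training time; the name variants are built
# from caller meta-data at run time.
template = ["call", "<NAME>", "now"]
networks = {"<NAME>": [["john", "smith"], ["j", "smith"]]}
expanded = expand_class_labels(template, networks)
```

Here the single placeholder expands into two candidate word sequences, one per name variant drawn from the meta-data.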

Type: Application

Filed: October 29, 2004

Publication date: May 5, 2005

Applicant: AT&T Corp.

Inventors: Michiel Bacchiani, Sameer Maskey, Brian Roark, Richard Sproat

System and method for using meta-data dependent language modeling for automatic speech recognition

Publication number: 20050096907

Abstract: Disclosed are systems and methods for providing a spoken dialog system using meta-data to build language models to improve speech processing. Meta-data is generally defined as data outside received speech; for example, meta-data may be a customer profile having a name, address and purchase history of a caller to a spoken dialog system. The method comprises building tree clusters from meta-data and estimating a language model using the built tree clusters. The language model may be used by various modules in the spoken dialog system, such as the automatic speech recognition module and/or the dialog management module. Building the tree clusters from the meta-data may involve generating projections from the meta-data and further may comprise computing counts as a result of unigram tree clustering and then building both unigram trees and higher-order trees from the meta-data as well as computing node distances within the built trees that are used for estimating the language model.

Type: Application

Filed: October 29, 2004

Publication date: May 5, 2005

Applicant: AT&T Corp.

Inventors: Michiel Bacchiani, Brian Roark
