Patents by Inventor Ehsan Variani

Ehsan Variani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

COMPLEX LINEAR PROJECTION FOR ACOUSTIC MODELING

Publication number: 20180174575

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.

Type: Application

Filed: December 21, 2016

Publication date: June 21, 2018

Inventors: Samuel Bengio, Mirko Visontai, Christopher Walter George Thornton, Michiel A.U. Bacchiani, Tara N. Sainath, Ehsan Variani, Izhak Shafran
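
As a rough illustration of the processing chain in the abstract above (publication 20180174575), the following Python sketch frames the audio, converts each frame to the frequency domain, and applies a complex-valued linear projection. The abstract does not specify the framing, the projection size, or any step after the projection; the function names, the 512-sample frame, the 160-sample hop, and the log-magnitude compression are all assumptions made for illustration. The neural network acoustic model that would consume these features is not shown.

```python
import numpy as np


def frame_signal(audio, frame_len, hop):
    """Split a 1-D signal into overlapping frames (hypothetical helper)."""
    n_frames = 1 + (len(audio) - frame_len) // hop
    return np.stack([audio[i * hop : i * hop + frame_len] for i in range(n_frames)])


def complex_linear_projection(audio, weights, frame_len=512, hop=160):
    """Project complex spectra with a complex weight matrix.

    weights: complex array of shape (frame_len // 2 + 1, n_outputs).
    The log-magnitude step is an assumption, not taken from the abstract.
    """
    frames = frame_signal(audio, frame_len, hop)
    spectrum = np.fft.rfft(frames, axis=1)   # frequency domain data (complex)
    projected = spectrum @ weights           # complex linear projection
    return np.log(np.abs(projected) + 1e-6)  # real-valued features for a network
```

In a trained system the projection weights would be learned jointly with the acoustic model; random weights are enough to check shapes.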

ENHANCED MULTI-CHANNEL ACOUSTIC MODELS

Publication number: 20180068675

Abstract: This specification describes computer-implemented methods and systems. One method includes receiving, by a neural network of a speech recognition system, first data representing a first raw audio signal and second data representing a second raw audio signal. The first raw audio signal and the second raw audio signal describe audio occurring at a same period of time. The method further includes generating, by a spatial filtering layer of the neural network, a spatial filtered output using the first data and the second data, and generating, by a spectral filtering layer of the neural network, a spectral filtered output using the spatial filtered output. Generating the spectral filtered output comprises processing frequency-domain data representing the spatial filtered output. The method still further includes processing, by one or more additional layers of the neural network, the spectral filtered output to predict sub-word units encoded in both the first raw audio signal and the second raw audio signal.

Type: Application

Filed: November 14, 2016

Publication date: March 8, 2018

Inventors: Ehsan Variani, Kevin William Wilson, Ron J. Weiss, Tara N. Sainath, Arun Narayanan
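
The two-stage structure in the abstract above (publication 20180068675), spatial filtering of two simultaneous raw signals followed by spectral filtering in the frequency domain, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the spatial stage is modeled as a filter-and-sum over the two channels, the spectral stage as a filterbank applied to power spectra, and the function names, frame sizes, and log compression are hypothetical. In the described system both stages are layers of the neural network, so their parameters would be learned rather than supplied by hand.

```python
import numpy as np


def spatial_filter(ch1, ch2, h1, h2):
    """Filter each raw channel with its own FIR taps and sum the results.

    Filter-and-sum is an assumed form of the spatial filtering layer.
    """
    return np.convolve(ch1, h1, mode="same") + np.convolve(ch2, h2, mode="same")


def spectral_filter(signal, filterbank, frame_len=400, hop=160):
    """Apply a filterbank to frequency-domain frames of the spatial output.

    filterbank: non-negative array of shape (frame_len // 2 + 1, n_filters).
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # frequency-domain data
    return np.log(power @ filterbank + 1e-6)          # spectral filtered output
```

The output of `spectral_filter` stands in for the features that the additional network layers would use to predict sub-word units.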

Speaker verification using neural networks

Patent number: 9401148

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for inputting speech data that corresponds to a particular utterance to a neural network; determining an evaluation vector based on output at a hidden layer of the neural network; comparing the evaluation vector with a reference vector that corresponds to a past utterance of a particular speaker; and based on comparing the evaluation vector and the reference vector, determining whether the particular utterance was likely spoken by the particular speaker.

Type: Grant

Filed: March 28, 2014

Date of Patent: July 26, 2016

Assignee: Google Inc.

Inventors: Xin Lei, Erik McDermott, Ehsan Variani, Ignacio L. Moreno
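
The verification flow in the abstract above (patent 9401148) can be illustrated with a small Python sketch: speech frames pass through a network, an evaluation vector is taken from a hidden layer, and that vector is compared with a stored reference vector. The abstract does not say how frame-level outputs are combined or how vectors are compared, so the two-layer network, the averaging over frames, the cosine similarity, and the 0.5 threshold are all assumptions, and every function name here is hypothetical.

```python
import numpy as np


def hidden_activations(frames, w1, b1, w2, b2):
    """Run frames through two ReLU layers and return the last hidden output."""
    h1 = np.maximum(0.0, frames @ w1 + b1)
    return np.maximum(0.0, h1 @ w2 + b2)


def utterance_vector(frames, params):
    """Average hidden-layer outputs over frames and normalize to unit length.

    Averaging and normalization are assumed choices for this sketch.
    """
    vec = hidden_activations(frames, *params).mean(axis=0)
    return vec / (np.linalg.norm(vec) + 1e-12)


def is_same_speaker(evaluation, reference, threshold=0.5):
    """Compare two unit vectors by cosine similarity against a threshold."""
    score = float(np.dot(evaluation, reference))
    return score >= threshold, score
```

The reference vector would be computed the same way from a past utterance of the enrolled speaker, and the threshold would be tuned on held-out data.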

SPEAKER VERIFICATION USING NEURAL NETWORKS

Publication number: 20150127336

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for inputting speech data that corresponds to a particular utterance to a neural network; determining an evaluation vector based on output at a hidden layer of the neural network; comparing the evaluation vector with a reference vector that corresponds to a past utterance of a particular speaker; and based on comparing the evaluation vector and the reference vector, determining whether the particular utterance was likely spoken by the particular speaker.

Type: Application

Filed: March 28, 2014

Publication date: May 7, 2015

Applicant: Google Inc.

Inventors: Xin Lei, Erik McDermott, Ehsan Variani, Ignacio L. Moreno
