Patents by Inventor Daniel Povey

Daniel Povey has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Multistream acoustic models with dilations

Patent number: 11862146

Abstract: Audio signals of speech may be processed using an acoustic model. An acoustic model may be implemented with multiple streams of processing where different streams perform processing using different dilation rates. For example, a first stream may process features of the audio signal with one or more convolutional neural network layers having a first dilation rate, and a second stream may process features of the audio signal with one or more convolutional neural network layers having a second dilation rate. Each stream may compute a stream vector, and the stream vectors may be combined to a vector of speech unit scores, where the vector of speech unit scores provides information about the acoustic content of the audio signal. The vector of speech unit scores may be used for any appropriate application of speech, such as automatic speech recognition.
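
Illustrative sketch (not taken from the patent): the abstract above describes streams that apply convolutional layers with different dilation rates and then combine per-stream vectors into speech unit scores. The NumPy code below is one minimal way to picture that flow. Everything specific in it is an assumption made for illustration: the function names `dilated_conv1d` and `multistream_scores`, a single layer per stream, kernel width 3, ReLU activations, concatenation as the combining step, and a linear layer plus softmax to produce scores.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """One dilated 1-D convolution over time with zero 'same' padding and ReLU.

    x: (T, D_in) frame features; w: (K, D_in, D_out) kernel with odd width K.
    Returns an array of shape (T, D_out).
    """
    K, T = w.shape[0], x.shape[0]
    assert K % 2 == 1, "this sketch assumes an odd kernel width"
    pad = dilation * (K - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((T, w.shape[2]))
    for k in range(K):
        # Tap k reads the input shifted by k * dilation frames.
        out += xp[k * dilation : k * dilation + T] @ w[k]
    return np.maximum(out, 0.0)

def multistream_scores(features, stream_kernels, dilations, w_out):
    """Run one stream per dilation rate, then combine the stream vectors.

    Combination by concatenation + linear layer + softmax is an assumption;
    the abstract only says the stream vectors "may be combined".
    """
    stream_vectors = [
        dilated_conv1d(features, w, d) for w, d in zip(stream_kernels, dilations)
    ]
    combined = np.concatenate(stream_vectors, axis=1)
    logits = combined @ w_out
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Hypothetical sizes: 50 frames, 40-dim features, 2 streams, 10 speech units.
rng = np.random.default_rng(0)
features = rng.standard_normal((50, 40))
kernels = [rng.standard_normal((3, 40, 16)) * 0.1 for _ in range(2)]
w_out = rng.standard_normal((32, 10)) * 0.1
scores = multistream_scores(features, kernels, dilations=[1, 3], w_out=w_out)
print(scores.shape)  # (50, 10)
```

With dilation rates 1 and 3, the two streams in this sketch see 3-frame and 7-frame temporal contexts respectively, which is the kind of difference between streams the abstract refers to.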

Type: Grant

Filed: July 2, 2020

Date of Patent: January 2, 2024

Assignee: ASAPP, INC.

Inventors: Kyu Jeong Han, Tao Ma, Daniel Povey

MULTISTREAM ACOUSTIC MODELS WITH DILATIONS

Publication number: 20210005182

Abstract: Audio signals of speech may be processed using an acoustic model. An acoustic model may be implemented with multiple streams of processing where different streams perform processing using different dilation rates. For example, a first stream may process features of the audio signal with one or more convolutional neural network layers having a first dilation rate, and a second stream may process features of the audio signal with one or more convolutional neural network layers having a second dilation rate. Each stream may compute a stream vector, and the stream vectors may be combined to a vector of speech unit scores, where the vector of speech unit scores provides information about the acoustic content of the audio signal. The vector of speech unit scores may be used for any appropriate application of speech, such as automatic speech recognition.

Type: Application

Filed: July 2, 2020

Publication date: January 7, 2021

Inventors: Kyu Jeong Han, Tao Ma, Daniel Povey

Subspace speech adaptation

Patent number: 8700400

Abstract: Subspace speech adaptation may be utilized for facilitating the recognition of speech containing short utterances. Speech training data may be received in a speech model by a computer. A first matrix may be determined for preconditioning speech statistics based on the speech training data. A second matrix may be determined for representing a basis for the speech to be recognized. A set of basis matrices may then be determined from the first matrix and the second matrix. Speech test data including a short utterance may then be received by the computer. The computer may then apply the set of basis matrices to the speech test data to produce a transcription. The transcription may represent speech recognition of the short utterance.

Type: Grant

Filed: December 30, 2010

Date of Patent: April 15, 2014

Assignee: Microsoft Corporation

Inventors: Daniel Povey, Kaisheng Yao, Yifan Gong

Subspace Speech Adaptation

Publication number: 20120173240

Abstract: Subspace speech adaptation may be utilized for facilitating the recognition of speech containing short utterances. Speech training data may be received in a speech model by a computer. A first matrix may be determined for preconditioning speech statistics based on the speech training data. A second matrix may be determined for representing a basis for the speech to be recognized. A set of basis matrices may then be determined from the first matrix and the second matrix. Speech test data including a short utterance may then be received by the computer. The computer may then apply the set of basis matrices to the speech test data to produce a transcription. The transcription may represent speech recognition of the short utterance.

Type: Application

Filed: December 30, 2010

Publication date: July 5, 2012

Applicant: MICROSOFT CORPORATION

Inventors: Daniel Povey, Kaisheng Yao, Yifan Gong

System and method for optimizing pattern recognition of non-gaussian parameters

Patent number: 8185480

Abstract: A method of optimizing a function of a parameter includes associating, with an objective function for initial value of parameters, an auxiliary function of parameters that could be optimized computationally more efficiently than an original objective function, obtaining parameters that are optimum for the auxiliary function, obtaining updated parameters by taking a weighted sum of the optimum of the auxiliary function and initial model parameters.
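
Illustrative sketch (not taken from the patent): the abstract above outlines three steps: build an auxiliary function at the current parameters, find its optimum, and form the update as a weighted sum of that optimum and the current parameters. The NumPy code below shows the pattern on a toy problem. The objective, the quadratic auxiliary function, the curvature bound, the weight of 0.5, and all function names are assumptions chosen for illustration, not the patent's formulation.

```python
import numpy as np

def objective(theta, target):
    """Toy objective to maximize (an assumption for this sketch)."""
    return -np.sum(np.log(np.cosh(theta - target)))

def gradient(theta, target):
    """Gradient of the toy objective."""
    return -np.tanh(theta - target)

def auxiliary_optimum(theta0, target, curvature=1.0):
    """Maximizer of a quadratic auxiliary function built at theta0:

        q(theta) = f(theta0) + g.(theta - theta0) - (c/2) * |theta - theta0|^2

    For this toy objective the second derivative is never below -1, so with
    c = 1 the quadratic touches f at theta0 and lies below it elsewhere.
    Its maximizer has the closed form theta0 + g / c.
    """
    return theta0 + gradient(theta0, target) / curvature

def weighted_update(theta0, target, weight=0.5):
    """Weighted sum of the auxiliary optimum and the initial parameters."""
    theta_aux = auxiliary_optimum(theta0, target)
    return weight * theta_aux + (1.0 - weight) * theta0

target = np.array([1.0, -2.0])
theta = np.zeros(2)
history = [objective(theta, target)]
for _ in range(50):
    theta = weighted_update(theta, target)
    history.append(objective(theta, target))
```

In this toy setting, any weight between 0 and 1 keeps the update on the segment between the initial parameters and the auxiliary optimum, so the objective never decreases from one iteration to the next.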

Type: Grant

Filed: April 2, 2008

Date of Patent: May 22, 2012

Assignee: International Business Machines Corporation

Inventors: Dimitri Kanevsky, David Nahamoo, Daniel Povey, Bhuvana Ramabhadran

SYSTEM AND METHOD FOR OPTIMIZING PATTERN RECOGNITION OF NON-GAUSSIAN PARAMETERS

Publication number: 20090254496

Abstract: A method of optimizing a function of a parameter includes associating, with an objective function for initial value of parameters, an auxiliary function of parameters that could be optimized computationally more efficiently than an original objective function, obtaining parameters that are optimum for the auxiliary function, obtaining updated parameters by taking a weighted sum of the optimum of the auxiliary function and initial model parameters.

Type: Application

Filed: April 2, 2008

Publication date: October 8, 2009

Applicant: International Business Machines Corporation

Inventors: Dimitri Kanevsky, David Nahamoo, Daniel Povey, Bhuvana Ramabhadran
