Patents Examined by Athar N Pasha

Minimum word error rate training for attention-based sequence-to-sequence models

Patent number: 11107463

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses N-best lists of decoded hypotheses, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.

Type: Grant

Filed: August 1, 2019

Date of Patent: August 31, 2021

Assignee: Google LLC

Inventors: Rohit Prakash Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick An Phu Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Patricia Kannan

Method for embedding and executing audio semantics

Patent number: 11056127

Abstract: Aspects of the subject disclosure may include, for example, a device that includes a processing system having a processor and a memory that stores executable instructions that, when executed by the processing system, facilitate performance of operations, where the operations include determining parameters for adapting audio in the content to the device, wherein the device renders the content, and wherein the parameters are based on semantic metadata embedded in the content, adapting the audio in the content based on the parameters, and rendering the content, as adapted by the parameters, to represent a semantic in the semantic metadata. Other embodiments are disclosed.

Type: Grant

Filed: April 30, 2019

Date of Patent: July 6, 2021

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Eric Zavesky, Jason Decuir, Robert Gratz

