Patents by Inventor POVEY DANIEL

POVEY DANIEL has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

METHOD OF TRAINING SPEECH RECOGNITION MODEL, ELECTRONIC DEVICE AND STORAGE MEDIUM

Publication number: 20230386448

Abstract: A method of training a speech recognition model is provided. The method includes: inputting speech data of each of a plurality of training samples into a teacher model and into a to-be-trained speech recognition model separately; obtaining an embedding output by the teacher model and encoded data output by the to-be-trained speech recognition model; obtaining quantized codebook data by performing multi-codebook quantization on the embedding; calculating a loss based on the encoded data, the quantized codebook data, and text data in the training sample; and obtaining a trained speech recognition model by stopping training of the to-be-trained speech recognition model when the loss is less than or equal to a preset loss threshold and/or the number of training iterations is greater than a preset number of training iterations.
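
The stopping rule at the end of this abstract can be illustrated with a minimal Python sketch. This is not the applicant's implementation: the names `should_stop`, `train_until_stop`, and `train_step` are hypothetical, the loss is an arbitrary number returned by a stand-in callback rather than a real combination of encoded data, codebook data, and text, and "and/or" is read here as a logical OR.

```python
from typing import Callable, Tuple


def should_stop(loss: float, steps_done: int,
                loss_threshold: float, max_steps: int) -> bool:
    """Stop when the loss is at or below the threshold, or when the
    number of completed training steps exceeds the preset count."""
    return loss <= loss_threshold or steps_done > max_steps


def train_until_stop(train_step: Callable[[int], float],
                     loss_threshold: float,
                     max_steps: int) -> Tuple[int, float]:
    """Run a hypothetical train_step callback until should_stop fires.

    train_step takes the 1-based step index and returns that step's loss.
    Returns the number of steps run and the last loss.
    """
    steps_done = 0
    while True:
        steps_done += 1
        loss = train_step(steps_done)
        if should_stop(loss, steps_done, loss_threshold, max_steps):
            return steps_done, loss


# Toy usage: a made-up loss curve 1/step stands in for a real model.
steps, final_loss = train_until_stop(lambda step: 1.0 / step,
                                     loss_threshold=0.25, max_steps=100)
print(steps, final_loss)
```

With a strict "greater than" comparison on the step count, a run that never reaches the loss threshold performs one step more than the preset count before stopping.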

Type: Application

Filed: December 9, 2022

Publication date: November 30, 2023

Applicant: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.

Inventors: Zengwei YAO, Liyong GUO, POVEY DANIEL, Long LIN, Fangjun KUANG, Wei KANG, Mingshuang LUO, Quandong WANG, Yuxiang KONG

METHOD AND APPARATUS FOR AUDIO PROCESSING, ELECTRONIC DEVICE AND STORAGE MEDIUM

Publication number: 20230386483

Abstract: A method and apparatus for audio processing, an electronic device and a storage medium are provided. The method includes: obtaining an audio encoding result, wherein each element in the audio encoding result has a coordinate in an audio frame number dimension and a coordinate in a text label sequence dimension; in response to an output result of an ith frame in a decoding path being a non-null character, respectively increasing the coordinate in the audio frame number dimension and the coordinate in the text label sequence dimension corresponding to an output position of the ith frame by 1 to obtain an output position of an (i+1)th frame in the decoding path; and determining an output result corresponding to the output position of the (i+1)th frame according to the output result of the ith frame in the decoding path and an element of the (i+1)th frame in the audio encoding result.
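
The position update described in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed method: `walk_decoding_path` and `emit` are hypothetical names, `None` stands in for the null character, and the handling of a null output (advance only the frame coordinate) is an assumption, since the abstract specifies only the non-null case.

```python
from typing import Callable, List, Optional, Tuple

Symbol = Optional[str]  # None stands in for the null (blank) character


def walk_decoding_path(
    num_frames: int,
    emit: Callable[[int, int, Symbol], Symbol],
) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Follow one decoding path over (frame, label) coordinates.

    emit(t, u, prev) is a hypothetical callback that returns the output
    at position (t, u) given the previous frame's output.
    """
    t, u = 0, 0
    prev: Symbol = None
    outputs: List[str] = []
    positions: List[Tuple[int, int]] = [(t, u)]
    for _ in range(num_frames):
        symbol = emit(t, u, prev)
        if symbol is not None:
            # Non-null output: the label coordinate also advances by 1.
            outputs.append(symbol)
            u += 1
        # Assumption: a null output advances only the frame coordinate.
        t += 1
        positions.append((t, u))
        prev = symbol
    return outputs, positions


# Toy usage: a fixed per-frame table replaces a real audio encoding result.
table = {0: "a", 1: None, 2: "b"}
outputs, positions = walk_decoding_path(3, lambda t, u, prev: table[t])
print(outputs, positions)
```

Because the frame coordinate advances on every step, the path emits at most one symbol per frame and the loop runs exactly `num_frames` times.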

Type: Application

Filed: December 9, 2022

Publication date: November 30, 2023

Applicant: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.

Inventors: Mingshuang LUO, Fangjun KUANG, Liyong GUO, Long LIN, Wei KANG, Zengwei YAO, POVEY DANIEL
