Patents by Inventor Mingxin LIANG

Mingxin LIANG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and apparatus for speech recognition, and storage medium

Patent number: 11756529

Abstract: Proposed are a method and apparatus for speech recognition, and a storage medium. The specific solution includes: obtaining audio data to be recognized; decoding the audio data to obtain a first syllable of a to-be-converted word, in which the first syllable is a combination of at least one phoneme corresponding to the to-be-converted word; obtaining a sentence to which the to-be-converted word belongs and a converted word in the sentence, and obtaining a second syllable of the converted word; encoding the first syllable and the second syllable to generate first encoding information of the first syllable; and decoding the first encoding information to obtain a text corresponding to the to-be-converted word.

Type: Grant

Filed: December 16, 2020

Date of Patent: September 12, 2023

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Liao Zhang, Xiaoyin Fu, Zhengxiang Jiang, Mingxin Liang, Junyao Shao, Qi Zhang, Zhijie Chen, Qiguang Zang

SAMPLE GENERATION METHOD, MODEL TRAINING METHOD, TRAJECTORY RECOGNITION METHOD, DEVICE, AND MEDIUM

Publication number: 20230195998

Abstract: Disclosed are a sample generation method, a model training method, a trajectory recognition method, a device, and a medium. The method is: determining a code result of a training Chinese character according to a preset code library, where the preset code library is generated based on code characters in a five-stroke code corpus; taking the code result as a training label of the training Chinese character; and generating a training sample according to both a writing trajectory and the training label of the training Chinese character. The amount of information carried in the training sample is enriched.

Type: Application

Filed: September 26, 2022

Publication date: June 22, 2023

Inventors: Yunze GAO, Xiaoping WANG, Penghao RAO, Fenfen SHENG, Mingxin LIANG

SPEECH RECOGNITION AND CODEC METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM

Publication number: 20230090590

Abstract: The present disclosure provides speech recognition and codec methods and apparatuses, an electronic device and a storage medium, and relates to the field of artificial intelligence such as intelligent speech, deep learning and natural language processing. The speech recognition method may include: acquiring an audio feature of to-be-recognized speech; encoding the audio feature to obtain an encoding feature; truncating the encoding feature to obtain continuous N feature fragments, N being a positive integer greater than one; and acquiring, for any one of the feature segments, corresponding historical feature abstraction information, encoding the feature segment in combination with the historical feature abstraction information, and decoding an encoding result to obtain a recognition result corresponding to the feature segment, wherein the historical feature abstraction information is information obtained by feature abstraction of recognized historical feature fragments.

Type: Application

Filed: May 6, 2022

Publication date: March 23, 2023

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Xiaoyin FU, Zhijie CHEN, Mingxin LIANG, Mingshun YANG, Lei JIA, Haifeng WANG

METHOD FOR TRAINING SPEECH RECOGNITION MODEL, DEVICE AND STORAGE MEDIUM

Publication number: 20220310064

Abstract: A method for training a speech recognition model, a device and a storage medium, which relate to the field of computer technologies, and particularly to the fields of speech recognition technologies, deep learning technologies, or the like, are disclosed. The method for training a speech recognition model includes: obtaining a fusion probability of each of at least one candidate text corresponding to a speech based on an acoustic decoding model and a language model; selecting a preset number of one or more candidate texts based on the fusion probability of each of the at least one candidate text, and determining a predicted text based on the preset number of one or more candidate texts; and obtaining a loss function based on the predicted text and a standard text corresponding to the speech, and training the speech recognition model based on the loss function.
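The candidate-selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how the fusion probability is computed, so the log-linear combination in `fuse` and its `lm_weight` parameter are assumptions, and the function names and sample candidates are hypothetical.

```python
import math


def fuse(acoustic_prob, lm_prob, lm_weight=0.3):
    """Combine acoustic and language-model probabilities in the log domain.

    Assumption: log-linear interpolation; the abstract does not specify
    the fusion formula.
    """
    return math.log(acoustic_prob) + lm_weight * math.log(lm_prob)


def select_candidates(candidates, preset_number, lm_weight=0.3):
    """Return the `preset_number` candidate texts with the highest fused score.

    `candidates` is a list of (text, acoustic_prob, lm_prob) tuples.
    """
    scored = [(fuse(a, l, lm_weight), text) for text, a, l in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:preset_number]]


# Hypothetical candidates for one utterance.
candidates = [
    ("recognize speech", 0.5, 0.4),
    ("wreck a nice beach", 0.6, 0.01),
    ("recognise peach", 0.1, 0.1),
]
top = select_candidates(candidates, preset_number=2)
```

In a training loop, the selected candidates would then be used to determine the predicted text and compare it with the standard text; that loss computation is left out here because the abstract does not define it.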

Type: Application

Filed: January 10, 2022

Publication date: September 29, 2022

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Junyao SHAO, Xiaoyin FU, Qiguang ZANG, Zhijie CHEN, Mingxin LIANG, Huanxin ZHENG, Sheng QIAN

METHOD OF RECOGNIZING SPEECH OFFLINE, ELECTRONIC DEVICE, AND STORAGE MEDIUM

Publication number: 20220108684

Abstract: The present disclosure provides a method of recognizing speech offline, electronic device, and a storage medium, relating to a field of artificial intelligence such as speech recognition, natural language processing, and deep learning. The method may include: decoding speech data to be recognized into a syllable recognition result; transforming the syllable recognition result into a corresponding text as a speech recognition result of the speech data.

Type: Application

Filed: December 16, 2021

Publication date: April 7, 2022

Inventors: Xiaoyin FU, Mingxin LIANG, Zhijie CHEN, Qiguang ZANG, Zhengxiang JIANG, Liao ZHANG, Qi ZHANG, Lei JIA

Method, device and storage medium for predicting punctuation in text

Patent number: 11216615

Abstract: The disclosure provides a method, a device and a storage medium for predicting a punctuation in a text. The method includes: inputting a text to be predicted into a sequence tagging model to obtain at least one prediction result and a corresponding first score of each character in the text to be predicted; generating a text to be inputted corresponding to each of the at least one prediction result; obtaining a second score corresponding to each of the at least one prediction result; determining a punctuation existence situation of the corresponding character based on the first score and the second score corresponding to each of the at least one prediction result; and performing punctuation processing on the text to be predicted based on the punctuation existence situation of each character in the text to be predicted to obtain a punctuated text corresponding to the text to be predicted.
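The two-score decision described in this abstract can be sketched roughly as below. The abstract does not state how the first and second scores are combined, so the weighted sum and the `weight` parameter are assumptions made for illustration, and all names and sample scores are hypothetical.

```python
def choose_punctuation(predictions, weight=0.5):
    """Pick the punctuation mark with the best combined score for one character.

    `predictions` is a list of (mark, first_score, second_score) tuples, where
    an empty mark means "no punctuation after this character".
    Assumption: the scores are combined as a weighted sum.
    """
    best = max(predictions, key=lambda p: weight * p[1] + (1 - weight) * p[2])
    return best[0]


def punctuate(text, marks):
    """Append the chosen mark (possibly empty) after each character of `text`."""
    return "".join(ch + mark for ch, mark in zip(text, marks))


# Hypothetical scores: the tagging model slightly prefers no punctuation,
# but the second score strongly favours a comma.
mark = choose_punctuation([("", 0.6, 0.2), (",", 0.4, 0.9)])
result = punctuate("ab", ["", "."])
```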

Type: Grant

Filed: September 29, 2020

Date of Patent: January 4, 2022

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventors: Mingxin Liang, Xiaoyin Fu

METHOD AND APPARATUS FOR SPEECH RECOGNITION, AND STORAGE MEDIUM

Publication number: 20210375264

Abstract: Proposed are a method and apparatus for speech recognition, and a storage medium. The specific solution includes: obtaining audio data to be recognized; decoding the audio data to obtain a first syllable of a to-be-converted word, in which the first syllable is a combination of at least one phoneme corresponding to the to-be-converted word; obtaining a sentence to which the to-be-converted word belongs and a converted word in the sentence, and obtaining a second syllable of the converted word; encoding the first syllable and the second syllable to generate first encoding information of the first syllable; and decoding the first encoding information to obtain a text corresponding to the to-be-converted word.

Type: Application

Filed: December 16, 2020

Publication date: December 2, 2021

Inventors: Liao ZHANG, Xiaoyin FU, Zhengxiang JIANG, Mingxin LIANG, Junyao SHAO, Qi ZHANG, Zhijie CHEN, Qiguang ZANG

METHOD, DEVICE AND STORAGE MEDIUM FOR PREDICTING PUNCTUATION IN TEXT

Publication number: 20210224480

Abstract: The disclosure provides a method, a device and a storage medium for predicting a punctuation in a text. The method includes: inputting a text to be predicted into a sequence tagging model to obtain at least one prediction result and a corresponding first score of each character in the text to be predicted; generating a text to be inputted corresponding to each of the at least one prediction result; obtaining a second score corresponding to each of the at least one prediction result; determining a punctuation existence situation of the corresponding character based on the first score and the second score corresponding to each of the at least one prediction result; and performing punctuation processing on the text to be predicted based on the punctuation existence situation of each character in the text to be predicted to obtain a punctuated text corresponding to the text to be predicted.

Type: Application

Filed: September 29, 2020

Publication date: July 22, 2021

Inventors: Mingxin Liang, Xiaoyin Fu

Methods, devices and computer-readable storage media for real-time speech recognition

Patent number: 10854193

Abstract: Methods, apparatuses, devices and computer-readable storage media for real-time speech recognition are provided. The method includes: based on an input speech signal, obtaining truncating information for truncating a sequence of features of the speech signal; based on the truncating information, truncating the sequence of features into a plurality of subsequences; and for each subsequence in the plurality of subsequences, obtaining a real-time recognition result through attention mechanism.
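The truncate-then-attend flow in this abstract can be illustrated with a small sketch. It assumes the truncating information is already available as a list of frame indices (`cut_points`, a hypothetical name); the abstract does not say how that information is derived from the speech signal. The attention shown is generic dot-product attention, used here as a stand-in for the unspecified attention mechanism.

```python
import math


def truncate(features, cut_points):
    """Split a feature sequence into consecutive subsequences.

    Each value in `cut_points` is the exclusive end index of one subsequence;
    any frames left after the last cut point form a final subsequence.
    """
    subsequences, start = [], 0
    for end in cut_points:
        subsequences.append(features[start:end])
        start = end
    if start < len(features):
        subsequences.append(features[start:])
    return subsequences


def attend(query, subsequence):
    """Dot-product attention: softmax the scores, then average the frames."""
    scores = [sum(q * x for q, x in zip(query, frame)) for frame in subsequence]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [
        sum(w * frame[i] for w, frame in zip(weights, subsequence)) / total
        for i in range(len(query))
    ]


# Hypothetical two-dimensional feature frames and cut points.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
chunks = truncate(features, cut_points=[2, 4])
contexts = [attend([0.0, 0.0], chunk) for chunk in chunks]
```

Because each subsequence is attended to independently, a context vector for a chunk can be produced as soon as that chunk is complete, which is what makes this arrangement suitable for streaming recognition.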

Type: Grant

Filed: February 6, 2019

Date of Patent: December 1, 2020

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventors: Xiaoyin Fu, Jinfeng Bai, Zhijie Chen, Mingxin Liang, Xu Chen, Lei Jia

METHODS, DEVICES AND COMPUTER-READABLE STORAGE MEDIA FOR REAL-TIME SPEECH RECOGNITION

Publication number: 20200219486

Abstract: Methods, apparatuses, devices and computer-readable storage media for real-time speech recognition are provided. The method includes: based on an input speech signal, obtaining truncating information for truncating a sequence of features of the speech signal; based on the truncating information, truncating the sequence of features into a plurality of subsequences; and for each subsequence in the plurality of subsequences, obtaining a real-time recognition result through attention mechanism.

Type: Application

Filed: February 6, 2019

Publication date: July 9, 2020

Inventors: Xiaoyin FU, Jinfeng BAI, Zhijie CHEN, Mingxin LIANG, Xu CHEN, Lei JIA