Patents by Inventor Junyao SHAO

Junyao SHAO has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Speech control method, electronic device, and storage medium

Patent number: 11893988

Abstract: The disclosure provides a speech control method, a speech control apparatus, an electronic device, and a storage medium. The method includes: acquiring target audio data sent by a client, the target audio data including audio data collected by the client within a target duration before wake-up and audio data collected by the client after wake-up; performing speech recognition on the target audio data; and controlling the client based on an instruction recognized from a second audio segment of the target audio data in response to recognizing a wake-up word from a first audio segment at beginning of the target audio data; in which, the second audio segment is later than the first audio segment or has an overlapping portion with the first audio segment.

Type: Grant

Filed: June 24, 2021

Date of Patent: February 6, 2024

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Song Yang, Saisai Zou, Jieyi Cao, Junyao Shao
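
The control flow described in the abstract for patent 11893988 can be sketched roughly as follows. This is an illustration only, not the patented implementation: it assumes speech recognition has already produced time-stamped words, and the names `Word`, `WAKE_WORD`, and `extract_instruction` are hypothetical. It covers only the case where the second audio segment follows the first, not the overlapping case.

```python
from dataclasses import dataclass
from typing import List, Optional

WAKE_WORD = "hello device"  # hypothetical wake-up word, not from the patent


@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the target audio data
    end: float


def extract_instruction(words: List[Word], first_segment_end: float) -> Optional[str]:
    """Return the instruction text if the wake-up word is recognized in the
    first segment of the target audio; otherwise return None."""
    # First segment: words that finish before the assumed segment boundary.
    first = [w for w in words if w.end <= first_segment_end]
    # Second segment: words that start at or after the boundary.
    second = [w for w in words if w.start >= first_segment_end]
    if WAKE_WORD not in " ".join(w.text for w in first):
        return None  # no wake-up word at the beginning, so do not control the client
    return " ".join(w.text for w in second) or None
```

Because the target audio includes data buffered before wake-up, the recognizer can confirm the wake-up word and read the instruction from one continuous recording.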

Method and apparatus for speech recognition, and storage medium

Patent number: 11756529

Abstract: Proposed are a method and apparatus for speech recognition, and a storage medium. The specific solution includes: obtaining audio data to be recognized; decoding the audio data to obtain a first syllable of a to-be-converted word, in which the first syllable is a combination of at least one phoneme corresponding to the to-be-converted word; obtaining a sentence to which the to-be-converted word belongs and a converted word in the sentence, and obtaining a second syllable of the converted word; encoding the first syllable and the second syllable to generate first encoding information of the first syllable; and decoding the first encoding information to obtain a text corresponding to the to-be-converted word.

Type: Grant

Filed: December 16, 2020

Date of Patent: September 12, 2023

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Liao Zhang, Xiaoyin Fu, Zhengxiang Jiang, Mingxin Liang, Junyao Shao, Qi Zhang, Zhijie Chen, Qiguang Zang

METHOD FOR TRAINING SPEECH RECOGNITION MODEL, DEVICE AND STORAGE MEDIUM

Publication number: 20220310064

Abstract: A method for training a speech recognition model, a device and a storage medium, which relate to the field of computer technologies, and particularly to the fields of speech recognition technologies, deep learning technologies, or the like, are disclosed. The method for training a speech recognition model includes: obtaining a fusion probability of each of at least one candidate text corresponding to a speech based on an acoustic decoding model and a language model; selecting a preset number of one or more candidate texts based on the fusion probability of each of the at least one candidate text, and determining a predicted text based on the preset number of one or more candidate texts; and obtaining a loss function based on the predicted text and a standard text corresponding to the speech, and training the speech recognition model based on the loss function.

Type: Application

Filed: January 10, 2022

Publication date: September 29, 2022

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Junyao SHAO, Xiaoyin FU, Qiguang ZANG, Zhijie CHEN, Mingxin LIANG, Huanxin ZHENG, Sheng QIAN
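
The training steps in the abstract for publication 20220310064 (fuse scores, keep a preset number of candidates, compute a loss against the standard text) can be sketched as below. The abstract does not give the fusion formula or the loss, so the log-linear fusion, the `lm_weight` parameter, and the expected-edit-distance loss used here are assumptions chosen for illustration; all function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def fuse(acoustic_logp: Dict[str, float], lm_logp: Dict[str, float],
         lm_weight: float = 0.3) -> Dict[str, float]:
    """Assumed log-linear fusion of acoustic and language model log-probabilities."""
    return {t: acoustic_logp[t] + lm_weight * lm_logp[t] for t in acoustic_logp}


def top_n(fused: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Keep the preset number of candidate texts with the highest fused score."""
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)[:n]


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def expected_error(candidates: List[Tuple[str, float]], reference: str) -> float:
    """Assumed loss: edit distance to the reference, weighted by the fused
    scores renormalized over the retained candidates."""
    m = max(score for _, score in candidates)
    weights = [math.exp(score - m) for _, score in candidates]
    total = sum(weights)
    ref = reference.split()
    return sum(w / total * edit_distance(text.split(), ref)
               for (text, _), w in zip(candidates, weights))
```

In a real training loop the loss would be computed on differentiable model outputs; this plain-Python version only shows how the three steps fit together.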

Method and apparatus for speech recognition

Patent number: 11393458

Abstract: Embodiments of the present disclosure relate to a method and apparatus for speech recognition. The method includes: determining, based on an acoustic score of a speech frame in a speech signal, a non-silence frame in the speech signal; determining a buffer frame between adjacent non-silence frames based on the acoustic score of the speech frame, a modeling unit corresponding to the buffer frame characterizing a beginning or end of a sentence; and decoding a speech frame after removing the buffer frame from the speech signal, to obtain a speech recognition result.

Type: Grant

Filed: December 3, 2019

Date of Patent: July 19, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Junyao Shao, Sheng Qian
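
The frame filtering described in the abstract for patent 11393458 can be sketched as follows. This is a simplified reading, not the patented method: it assumes each frame carries a dictionary of acoustic scores per modeling unit, that the top-scoring unit labels the frame, and that the labels `SILENCE` and `BOUNDARY` (a unit marking the beginning or end of a sentence) exist under these hypothetical names.

```python
from typing import Dict, List

SILENCE = "sil"    # hypothetical label for the silence unit
BOUNDARY = "<s/>"  # hypothetical label for the sentence begin/end unit


def top_unit(scores: Dict[str, float]) -> str:
    """Return the modeling unit with the highest acoustic score for a frame."""
    return max(scores, key=scores.get)


def frames_to_decode(frames: List[Dict[str, float]]) -> List[int]:
    """Return indices of frames kept for decoding after dropping buffer frames,
    i.e. boundary-labeled frames lying between non-silence frames."""
    labels = [top_unit(f) for f in frames]
    speech = [i for i, u in enumerate(labels) if u not in (SILENCE, BOUNDARY)]
    if not speech:
        return list(range(len(frames)))  # nothing to bracket, keep everything
    first, last = speech[0], speech[-1]
    return [i for i, u in enumerate(labels)
            if not (u == BOUNDARY and first < i < last)]
```

Dropping these frames before decoding keeps a short pause inside an utterance from being treated as a sentence break.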

Method, apparatus, device and computer readable storage medium for recognizing and decoding voice based on streaming attention model

Patent number: 11355113

Abstract: A method, apparatus, device, and computer readable storage medium for recognizing and decoding a voice based on a streaming attention model are provided. The method may include generating a plurality of acoustic paths for decoding the voice using the streaming attention model, and then merging acoustic paths with identical last syllables of the plurality of acoustic paths to obtain a plurality of merged acoustic paths. The method may further include selecting a preset number of acoustic paths from the plurality of merged acoustic paths as retained candidate acoustic paths. Embodiments of the present disclosure present a concept that acoustic score calculating of a current voice fragment is only affected by its last voice fragment and has nothing to do with earlier voice history, and merge acoustic paths with the identical last syllables of the plurality of candidate acoustic paths.

Type: Grant

Filed: March 9, 2020

Date of Patent: June 7, 2022

Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.

Inventors: Junyao Shao, Sheng Qian, Lei Jia
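
The path merging and pruning described in the abstract for patent 11355113 can be sketched as below. The abstract does not say how paths with the same last syllable are merged; keeping the highest-scoring path per last syllable is an assumption made here for illustration, and the `Path` type and `merge_and_prune` name are hypothetical.

```python
from typing import Dict, List, Tuple

# (syllable sequence, accumulated log score); sequences are assumed non-empty
Path = Tuple[Tuple[str, ...], float]


def merge_and_prune(paths: List[Path], beam: int) -> List[Path]:
    """Merge acoustic paths that end in the same syllable, then retain the
    `beam` best merged paths as candidates."""
    best: Dict[str, Path] = {}
    for syllables, score in paths:
        key = syllables[-1]  # only the last syllable distinguishes paths
        if key not in best or score > best[key][1]:
            best[key] = (syllables, score)
    return sorted(best.values(), key=lambda p: p[1], reverse=True)[:beam]
```

Merging on the last syllable alone reflects the abstract's premise that the acoustic score of the current voice fragment depends only on the fragment immediately before it.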

METHOD FOR DISPLAYING STREAMING SPEECH RECOGNITION RESULT, ELECTRONIC DEVICE, AND STORAGE MEDIUM

Publication number: 20220068265

Abstract: The disclosure discloses a method for displaying a streaming speech recognition result, relates to a field of speech technologies, deep learning technologies and natural language processing technologies. The method includes: obtaining a plurality of continuous speech segments of an input audio stream, and simulating an end of a target speech segment in the plurality of continuous speech segments as a sentence ending, performing feature extraction on a current speech segment to be recognized based on a first feature extraction mode when the current speech segment is the target speech segment; performing feature extraction on the current speech segment based on a second feature extraction mode when the current speech segment is not the target speech segment; and obtaining a real-time recognition result by inputting a feature sequence extracted from the current speech segment into a streaming multi-layer truncated attention model, and displaying the real-time recognition result.

Type: Application

Filed: November 8, 2021

Publication date: March 3, 2022

Inventors: Junyao SHAO, Sheng QIAN

METHOD AND APPARATUS FOR SPEECH RECOGNITION, AND STORAGE MEDIUM

Publication number: 20210375264

Abstract: Proposed are a method and apparatus for speech recognition, and a storage medium. The specific solution includes: obtaining audio data to be recognized; decoding the audio data to obtain a first syllable of a to-be-converted word, in which the first syllable is a combination of at least one phoneme corresponding to the to-be-converted word; obtaining a sentence to which the to-be-converted word belongs and a converted word in the sentence, and obtaining a second syllable of the converted word; encoding the first syllable and the second syllable to generate first encoding information of the first syllable; and decoding the first encoding information to obtain a text corresponding to the to-be-converted word.

Type: Application

Filed: December 16, 2020

Publication date: December 2, 2021

Inventors: Liao ZHANG, Xiaoyin FU, Zhengxiang JIANG, Mingxin LIANG, Junyao SHAO, Qi ZHANG, Zhijie CHEN, Qiguang ZANG

SPEECH CONTROL METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM

Publication number: 20210319795

Abstract: The disclosure provides a speech control method, a speech control apparatus, an electronic device, and a storage medium. The method includes: acquiring target audio data sent by a client, the target audio data including audio data collected by the client within a target duration before wake-up and audio data collected by the client after wake-up; performing speech recognition on the target audio data; and controlling the client based on an instruction recognized from a second audio segment of the target audio data in response to recognizing a wake-up word from a first audio segment at beginning of the target audio data; in which, the second audio segment is later than the first audio segment or has an overlapping portion with the first audio segment.

Type: Application

Filed: June 24, 2021

Publication date: October 14, 2021

Inventors: Song YANG, Saisai ZOU, Jieyi CAO, Junyao SHAO

Method and apparatus for voice identification, device and computer readable storage medium

Patent number: 11145314

Abstract: Embodiments of the present disclosure provide a method and apparatus for voice identification, a device and a computer readable storage medium. The method may include: for an inputted voice signal, obtaining a first piece of decoded acoustic information by a first acoustic model and obtaining a second piece of decoded acoustic information by a second acoustic model, where the second acoustic model being generated by joint modeling of acoustic model and language model. The method may further include determining a first group of candidate identification results based on the first piece of decoded acoustic information, determining a second group of candidate identification results based on the second piece of decoded acoustic information, and then determining a final identification result for the voice signal based on the first group of candidate identification results and the second group of candidate identification results.

Type: Grant

Filed: March 6, 2020

Date of Patent: October 12, 2021

Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.

Inventors: Xingyuan Peng, Junyao Shao, Lei Jia

METHOD AND APPARATUS FOR VOICE IDENTIFICATION, DEVICE AND COMPUTER READABLE STORAGE MEDIUM

Publication number: 20210056975

Abstract: Embodiments of the present disclosure provide a method and apparatus for voice identification, a device and a computer readable storage medium. The method may include: for an inputted voice signal, obtaining a first piece of decoded acoustic information by a first acoustic model and obtaining a second piece of decoded acoustic information by a second acoustic model, where the second acoustic model being generated by joint modeling of acoustic model and language model. The method may further include determining a first group of candidate identification results based on the first piece of decoded acoustic information, determining a second group of candidate identification results based on the second piece of decoded acoustic information, and then determining a final identification result for the voice signal based on the first group of candidate identification results and the second group of candidate identification results.

Type: Application

Filed: March 6, 2020

Publication date: February 25, 2021

Inventors: Xingyuan PENG, Junyao SHAO, Lei JIA

METHOD, APPARATUS, DEVICE AND COMPUTER READABLE STORAGE MEDIUM FOR RECOGNIZING AND DECODING VOICE BASED ON STREAMING ATTENTION MODEL

Publication number: 20210020175

Abstract: A method, apparatus, device, and computer readable storage medium for recognizing and decoding a voice based on a streaming attention model are provided. The method may include generating a plurality of acoustic paths for decoding the voice using the streaming attention model, and then merging acoustic paths with identical last syllables of the plurality of acoustic paths to obtain a plurality of merged acoustic paths. The method may further include selecting a preset number of acoustic paths from the plurality of merged acoustic paths as retained candidate acoustic paths. Embodiments of the present disclosure present a concept that acoustic score calculating of a current voice fragment is only affected by its last voice fragment and has nothing to do with earlier voice history, and merge acoustic paths with the identical last syllables of the plurality of candidate acoustic paths.

Type: Application

Filed: March 9, 2020

Publication date: January 21, 2021

Inventors: Junyao SHAO, Sheng QIAN, Lei JIA

METHOD AND APPARATUS FOR SPEECH RECOGNITION

Publication number: 20200365144

Abstract: Embodiments of the present disclosure relate to a method and apparatus for speech recognition. The method includes: determining, based on an acoustic score of a speech frame in a speech signal, a non-silence frame in the speech signal; determining a buffer frame between adjacent non-silence frames based on the acoustic score of the speech frame, a modeling unit corresponding to the buffer frame characterizing a beginning or end of a sentence; and decoding a speech frame after removing the buffer frame from the speech signal, to obtain a speech recognition result.

Type: Application

Filed: December 3, 2019

Publication date: November 19, 2020

Inventors: Junyao SHAO, Sheng QIAN