Patents by Inventor Jinfeng BAI
Jinfeng BAI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12118989
Abstract: The present disclosure provides a speech processing method, and a method for generating a speech processing model, related to a field of signal processing technologies. The speech processing method includes: obtaining M speech signals to be processed and N reference signals; performing sub-band decomposition on each of the M speech signals and each of the N reference signals to obtain frequency-band components in each speech signal and each reference signal; processing the frequency-band components in each speech signal and each reference signal by using an echo cancellation model, to obtain an ideal ratio mask corresponding to the N reference signals in each frequency band of each speech signal; and performing echo cancellation on each frequency-band component of each speech signal based on the ideal ratio mask corresponding to the N reference signals in each frequency band of each speech signal, to obtain M echo-cancelled speech signals.
Type: Grant
Filed: October 21, 2021
Date of Patent: October 15, 2024
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Xu Chen, Jinfeng Bai, Runqiang Han, Lei Jia
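The final masking step described in this abstract can be illustrated with a minimal sketch. It assumes the mask is applied by element-wise multiplication, which is the usual way a ratio mask is used. The function name `apply_ratio_mask` and all numeric values are hypothetical; in the patent the mask comes from the echo cancellation model, which is not reproduced here.

```python
def apply_ratio_mask(band_components, masks):
    """Scale each frequency-band component by its mask value in [0, 1]."""
    if len(band_components) != len(masks):
        raise ValueError("one mask value is needed per frequency band")
    return [m * x for x, m in zip(band_components, masks)]


# Toy band magnitudes for M = 2 speech signals, 4 bands each (made-up values).
speech_bands = [
    [1.0, 0.8, 0.5, 0.2],
    [0.6, 0.4, 2.0, 1.0],
]
# Hypothetical masks, one per band per signal; a real system would predict these.
ratio_masks = [
    [1.0, 0.5, 0.0, 0.5],
    [0.5, 1.0, 0.25, 0.0],
]

# One echo-cancelled band list per input speech signal.
echo_cancelled = [
    apply_ratio_mask(bands, mask)
    for bands, mask in zip(speech_bands, ratio_masks)
]
```

A mask value near 1 keeps a band largely unchanged, while a value near 0 suppresses a band dominated by echo.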
-
Patent number: 12112746
Abstract: The present disclosure provides a method and a device for processing voice interaction, an electronic device and a storage medium. The method includes: determining a first integrity of a voice instruction from a user by using a pre-trained integrity detection model in response to detecting that the voice instruction from the user is not a high-frequency instruction; determining a waiting duration for the voice instruction based on the first integrity and a preset integrity threshold, wherein the waiting duration for the voice instruction indicates a length of period between a time when a voice interaction device determines that receiving the voice instruction is completed and a time when the voice interaction device performs an operation in response to the voice instruction of the user; and controlling the voice interaction device to respond to the voice instruction of the user based on the waiting duration.
Type: Grant
Filed: September 15, 2021
Date of Patent: October 8, 2024
Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.
Inventors: Jinfeng Bai, Zhijian Wang, Cong Gao
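The control flow in this abstract can be sketched roughly as follows. Everything concrete here is an assumption made for illustration: the instruction set, the threshold, the two durations, the rule that a more complete instruction gets a shorter wait, the immediate response for high-frequency instructions, and the toy stand-in for the pre-trained integrity detection model. None of these values come from the patent.

```python
# Hypothetical set of high-frequency instructions.
HIGH_FREQUENCY_INSTRUCTIONS = {"play", "pause", "next"}


def waiting_duration(integrity, threshold, short_wait=0.2, long_wait=1.0):
    """Return seconds to wait after end of speech before responding.

    Assumption: an instruction judged complete gets the shorter wait.
    """
    return short_wait if integrity >= threshold else long_wait


def plan_response(text, integrity_model, threshold=0.5):
    """Choose a waiting duration for a recognized voice instruction."""
    if text in HIGH_FREQUENCY_INSTRUCTIONS:
        return 0.0  # assumption: high-frequency instructions need no wait
    return waiting_duration(integrity_model(text), threshold)


def toy_integrity_model(text):
    """Stand-in for the pre-trained model: a dangling last word looks incomplete."""
    last_word = text.split()[-1]
    return 0.1 if last_word in {"to", "the", "a", "of"} else 0.9
```

For example, `plan_response("I want to listen to", toy_integrity_model)` selects the longer wait, which gives the user time to finish the sentence.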
-
Patent number: 11830482
Abstract: Embodiments of the present disclosure relate to a method and an apparatus for speech interaction, and a computer readable storage medium. The method may include determining text information corresponding to a received speech signal. The method also includes obtaining label information of the text information by labeling elements in the text information. In addition, the method further includes determining first intention information of the text information based on the label information. The method further includes determining a semantic of the text information based on the first intention information and the label information.
Type: Grant
Filed: June 8, 2020
Date of Patent: November 28, 2023
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD
Inventors: Zhen Wu, Yufang Wu, Hua Liang, Jiaxiang Ge, Xingyuan Peng, Jinfeng Bai, Lei Jia
-
Patent number: 11823662
Abstract: The present disclosure discloses a control method and a control apparatus for speech interaction. The detailed implementation solution of the control method for the speech interaction includes: collecting an audio signal; detecting a wake-up word in the audio signal to obtain a wake-up word result; and playing a prompt tone and/or executing a speech instruction in the audio signal based on the wake-up word result.
Type: Grant
Filed: January 26, 2021
Date of Patent: November 21, 2023
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Cong Gao, Saisai Zou, Jinfeng Bai, Lei Jia
-
Patent number: 11735168
Abstract: A method and an apparatus for recognizing a voice are provided. The method may include: inputting a target voice into a pre-trained voice recognition model to obtain an initial text output by at least one recognition network in the voice recognition model, the recognition network including a plurality of preset types of processing layers, and at least one type of processing layer of the recognition network being obtained by training based on a voice sample in a preset direction interval; and determining a voice recognition result of the target voice, based on the initial text.
Type: Grant
Filed: March 23, 2021
Date of Patent: August 22, 2023
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Xin Li, Bin Huang, Ce Zhang, Jinfeng Bai, Lei Jia
-
Patent number: 11620983
Abstract: The disclosure provides a speech recognition method, a device and a computer-readable storage medium. The method includes obtaining a first voice signal collected from a first microphone in a microphone array and a second voice signal collected from a second microphone in the microphone array, the microphone array including at least two microphones, such as two, three or six microphones. The method further includes extracting enhanced features associated with the first voice signal and the second voice signal through a neural network, and obtaining a speech recognition result based on the enhanced features extracted.
Type: Grant
Filed: August 10, 2020
Date of Patent: April 4, 2023
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD
Inventors: Ce Zhang, Bin Huang, Xin Li, Jinfeng Bai, Xu Chen, Lei Jia
-
Patent number: 11615784
Abstract: The present disclosure discloses a control method and a control apparatus for speech interaction. The detailed implementation solution of the control method for the speech interaction includes: collecting an audio signal; detecting a wake-up word in the audio signal to obtain a wake-up word result; and playing a prompt tone and/or executing a speech instruction in the audio signal based on the wake-up word result.
Type: Grant
Filed: December 11, 2020
Date of Patent: March 28, 2023
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Cong Gao, Saisai Zou, Jinfeng Bai, Lei Jia
-
Patent number: 11503155
Abstract: The present disclosure discloses an interactive voice-control method and apparatus, a device and a medium. The method includes: obtaining a sound signal at a voice interaction device and recognized information that is recognized from the sound signal; determining an interaction confidence of the sound signal based at least on at least one of an acoustic feature representation of the sound signal and a semantic feature representation associated with the recognized information; determining a matching status between the recognized information and the sound signal; and providing the interaction confidence and the matching status for controlling a response of the voice interaction device to the sound signal.
Type: Grant
Filed: September 24, 2020
Date of Patent: November 15, 2022
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Jinfeng Bai, Chuanlei Zhai, Xu Chen, Tao Chen, Xiaokong Ma, Ce Zhang, Zhen Wu, Xingyuan Peng, Zhijian Wang, Sheng Qian, Guibin Wang, Lei Jia
-
Patent number: 11488577
Abstract: The present application discloses a training method and an apparatus for a speech synthesis model, an electronic device, and a storage medium. The method includes: taking a syllable input sequence, a phoneme input sequence and a Chinese character input sequence of a current sample as inputs of an encoder of a model to be trained, to obtain encoded representations of these three sequences at an output end of the encoder; fusing the encoded representations of these three sequences, to obtain a weighted combination of these three sequences; taking the weighted combination as an input of an attention module, to obtain a weighted average of the weighted combination at each moment at an output end of the attention module; and taking the weighted average as an input of a decoder of the model to be trained, to obtain a speech Mel spectrum of the current sample at an output end of the decoder.
Type: Grant
Filed: June 19, 2020
Date of Patent: November 1, 2022
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Zhipeng Chen, Jinfeng Bai, Lei Jia
-
Patent number: 11393490
Abstract: According to embodiments of the present disclosure, a method, apparatus, device, and computer readable storage medium for voice interaction are provided. The method includes: determining a text corresponding to the voice signal based on a voice feature of a received voice signal. The method further includes: determining, based on the voice feature and the text, a matching degree between a reference voice feature of an element in the text and a target voice feature of the element. The method further includes: determining a first possibility that the voice signal is an executable command based on the text. The method further includes: determining a second possibility that the voice signal is the executable command based on the voice feature.
Type: Grant
Filed: June 8, 2020
Date of Patent: July 19, 2022
Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
Inventors: Zhijian Wang, Jinfeng Bai, Sheng Qian, Lei Jia
-
Patent number: 11322151
Abstract: According to embodiments of the disclosure, a method and an apparatus for processing a speech signal, and a computer-readable storage medium are provided. The method includes obtaining a set of speech feature representations of a received speech signal. The method also includes generating a set of source text feature representations based on a text recognized from the speech signal, each source text feature representation corresponding to an element in the text. The method also includes generating a set of target text feature representations based on the set of speech feature representations and the set of source text feature representations. The method also includes determining a match degree between the set of target text feature representations and a set of reference text feature representations predefined for the text, the match degree indicating an accuracy of recognizing the text.
Type: Grant
Filed: June 22, 2020
Date of Patent: May 3, 2022
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD
Inventors: Chuanlei Zhai, Xu Chen, Jinfeng Bai, Lei Jia
-
Patent number: 11250854
Abstract: A method, apparatus, device, and storage medium for voice interaction. A specific embodiment of the method includes: extracting an acoustic feature from received voice data, the acoustic feature indicating a short-term amplitude spectrum characteristic of the voice data; applying the acoustic feature to a type recognition model to determine an intention type of the voice data, the intention type being one of an interaction intention type and a non-interaction intention type, and the type recognition model being constructed based on the acoustic feature of training voice data; and performing an interaction operation indicated by the voice data, based on determining that the intention type is the interaction intention type.
Type: Grant
Filed: June 8, 2020
Date of Patent: February 15, 2022
Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
Inventors: Xiaokong Ma, Ce Zhang, Jinfeng Bai, Lei Jia
-
Publication number: 20220044678
Abstract: The present disclosure provides a speech processing method, and a method for generating a speech processing model, related to a field of signal processing technologies. The speech processing method includes: obtaining M speech signals to be processed and N reference signals; performing sub-band decomposition on each of the M speech signals and each of the N reference signals to obtain frequency-band components in each speech signal and each reference signal; processing the frequency-band components in each speech signal and each reference signal by using an echo cancellation model, to obtain an ideal ratio mask corresponding to the N reference signals in each frequency band of each speech signal; and performing echo cancellation on each frequency-band component of each speech signal based on the ideal ratio mask corresponding to the N reference signals in each frequency band of each speech signal, to obtain M echo-cancelled speech signals.
Type: Application
Filed: October 21, 2021
Publication date: February 10, 2022
Inventors: Xu CHEN, Jinfeng BAI, Runqiang HAN, Lei JIA
-
Publication number: 20220005474
Abstract: The present disclosure provides a method and a device for processing voice interaction, an electronic device and a storage medium. The method includes: determining a first integrity of a voice instruction from a user by using a pre-trained integrity detection model in response to detecting that the voice instruction from the user is not a high-frequency instruction; determining a waiting duration for the voice instruction based on the first integrity and a preset integrity threshold, wherein the waiting duration for the voice instruction indicates a length of period between a time when a voice interaction device determines that receiving the voice instruction is completed and a time when the voice interaction device performs an operation in response to the voice instruction of the user; and controlling the voice interaction device to respond to the voice instruction of the user based on the waiting duration.
Type: Application
Filed: September 15, 2021
Publication date: January 6, 2022
Inventors: Jinfeng BAI, Zhijian WANG, Cong GAO
-
Publication number: 20210407494
Abstract: The present disclosure discloses a control method and a control apparatus for speech interaction. The detailed implementation solution of the control method for the speech interaction includes: collecting an audio signal; detecting a wake-up word in the audio signal to obtain a wake-up word result; and playing a prompt tone and/or executing a speech instruction in the audio signal based on the wake-up word result.
Type: Application
Filed: December 11, 2020
Publication date: December 30, 2021
Inventors: Cong GAO, Saisai ZOU, Jinfeng BAI, Lei JIA
-
Publication number: 20210407496
Abstract: The present disclosure discloses a control method and a control apparatus for speech interaction. The detailed implementation solution of the control method for the speech interaction includes: collecting an audio signal; detecting a wake-up word in the audio signal to obtain a wake-up word result; and playing a prompt tone and/or executing a speech instruction in the audio signal based on the wake-up word result.
Type: Application
Filed: January 26, 2021
Publication date: December 30, 2021
Inventors: Cong GAO, Saisai ZOU, Jinfeng BAI, Lei JIA
-
Publication number: 20210319802
Abstract: The disclosure provides a method for processing a speech signal, an electronic device and a storage medium. The method includes: obtaining a speech signal to be processed and a reference speech signal; obtaining a frequency-domain speech signal to be processed and a reference frequency-domain speech signal by respectively preprocessing the speech signal to be processed and the reference speech signal; obtaining a frequency-domain speech signal ratio by inputting the frequency-domain speech signal to be processed and the reference frequency-domain speech signal into a complex neural network model; and obtaining a target frequency-domain speech signal based on the frequency-domain speech signal ratio and the frequency-domain speech signal to be processed, and obtaining a target speech signal by processing the target frequency-domain speech signal.
Type: Application
Filed: June 8, 2021
Publication date: October 14, 2021
Inventor: Jinfeng BAI
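As a rough illustration of the step that combines the frequency-domain ratio with the frequency-domain signal to be processed, the sketch below assumes the combination is a per-bin complex multiplication. The abstract does not state the exact operation, so this is an assumption. The function name `apply_complex_ratio` and all values are hypothetical, and the preprocessing into the frequency domain, the complex neural network, and the conversion back to a time-domain signal are omitted.

```python
def apply_complex_ratio(ratio, spectrum):
    """Multiply each frequency bin by its complex ratio (assumed operation)."""
    if len(ratio) != len(spectrum):
        raise ValueError("one ratio value is needed per frequency bin")
    return [r * x for r, x in zip(ratio, spectrum)]


# Made-up frequency-domain bins of the speech signal to be processed.
mixture = [2 + 2j, 1 + 0j, 0 - 4j]
# Made-up ratios; in the publication these come from a complex neural network.
ratio = [0.5 + 0j, 0 + 1j, 0.25 + 0j]

target = apply_complex_ratio(ratio, mixture)
```

Unlike a real-valued mask, a complex ratio can change the phase of a bin as well as its magnitude, as the second bin shows (it is rotated by 90 degrees).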
-
Publication number: 20210233518
Abstract: A method and an apparatus for recognizing a voice are provided. The method may include: inputting a target voice into a pre-trained voice recognition model to obtain an initial text output by at least one recognition network in the voice recognition model, the recognition network including a plurality of preset types of processing layers, and at least one type of processing layer of the recognition network being obtained by training based on a voice sample in a preset direction interval; and determining a voice recognition result of the target voice, based on the initial text.
Type: Application
Filed: March 23, 2021
Publication date: July 29, 2021
Inventors: Xin LI, Bin HUANG, Ce ZHANG, Jinfeng BAI, Lei JIA
-
Publication number: 20210210113
Abstract: The present disclosure provides a method and apparatus for detecting a voice, and relates to the fields of voice processing and deep learning technology. The method may include: acquiring a target voice; and inputting the target voice into a pre-trained deep neural network to obtain whether the target voice has a sub-voice in each of a plurality of preset direction intervals, the deep neural network being used to predict whether the voice has a sub-voice in each of the plurality of direction intervals.
Type: Application
Filed: March 22, 2021
Publication date: July 8, 2021
Inventors: Xin Li, Bin Huang, Ce Zhang, Jinfeng Bai, Lei Jia
-
Publication number: 20210158799
Abstract: The disclosure provides a speech recognition method, a device and a computer-readable storage medium. The method includes obtaining a first voice signal collected from a first microphone in a microphone array and a second voice signal collected from a second microphone in the microphone array, the microphone array including at least two microphones, such as two, three or six microphones. The method further includes extracting enhanced features associated with the first voice signal and the second voice signal through a neural network, and obtaining a speech recognition result based on the enhanced features extracted.
Type: Application
Filed: August 10, 2020
Publication date: May 27, 2021
Inventors: Ce Zhang, Bin Huang, Xin Li, Jinfeng Bai, Xu Chen, Lei Jia