Patents by Inventor Haolei Yuan
Haolei Yuan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11482237
Abstract: The present disclosure discloses a method performed at a terminal for reconstructing a speech signal, and a computer storage medium, and relates to the field of speech recognition. The method includes: collecting, by the terminal, a plurality of sound signals through a plurality of sensors of a microphone array; determining, by the terminal, a first speech signal in the plurality of sound signals; performing, by the terminal, signal separation on the first speech signal to obtain a second speech signal; and performing, by the terminal, reconstruction on the second speech signal through a distortion recovery model to obtain a reconstructed speech signal; the distortion recovery model being obtained by training based on a clean speech signal and a distorted speech signal. The embodiments of the present disclosure improve accuracy of speech recognition results.
Type: Grant
Filed: April 23, 2020
Date of Patent: October 25, 2022
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventor: Haolei Yuan
-
Publication number: 20220254134
Abstract: This application discloses a region recognition method, apparatus, and device, and a readable storage medium, and relates to the field of artificial intelligence.
Type: Application
Filed: April 27, 2022
Publication date: August 11, 2022
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Yuqiang REN, Xingjia PAN, Weiming DONG, Xudong ZHU, Haolei YUAN, Xiaowei GUO, Changsheng XU
-
Patent number: 10924614
Abstract: A speech signal processing method is performed at a terminal device, including: obtaining a recorded signal and a to-be-output speech signal, the recorded signal including a noise signal and an echo signal; calculating a loop transfer function according to the recorded signal and the speech signal; calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function; calculating a frequency weighted coefficient according to the two power spectra of the echo signal and the noise signal; adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient; and outputting the adjusted speech signal to a speaker electrically coupled to the terminal device. As such, the frequency amplitude of the speech signal is automatically adjusted according to the relative frequency distribution of a noise signal and the speech signal.
Type: Grant
Filed: January 28, 2020
Date of Patent: February 16, 2021
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventor: Haolei Yuan
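Illustrative sketch (not taken from the patent): the abstract does not say how the frequency weighted coefficient is computed, so the rule below is purely an assumption. It raises the amplitude of frequency bins in which the noise power exceeds the echo power and leaves the other bins unchanged. The function names, the square-root ratio, and the `max_gain` and `eps` parameters are all hypothetical.

```python
import math


def frequency_weights(echo_power, noise_power, max_gain=4.0, eps=1e-12):
    """Return one amplitude gain per frequency bin.

    Assumed rule (hypothetical, not from the patent): the gain is the
    square root of the noise-to-echo power ratio, clamped to the range
    [1.0, max_gain], so a bin is boosted only where noise dominates.
    """
    weights = []
    for echo, noise in zip(echo_power, noise_power):
        ratio = math.sqrt(noise / (echo + eps))  # eps avoids division by zero
        weights.append(min(max(ratio, 1.0), max_gain))
    return weights


def apply_weights(speech_amplitude, weights):
    """Scale each bin of the to-be-output speech spectrum by its gain."""
    return [amp * w for amp, w in zip(speech_amplitude, weights)]
```

Clamping the gain from below keeps quiet bins untouched, and the upper bound keeps the adjusted signal from growing without limit in very noisy bins. Both bounds are design choices of this sketch only.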
-
Patent number: 10878803
Abstract: A method, device, and storage medium for converting text to speech are described. The method includes obtaining target text; synthesizing first machine speech corresponding to the target text; and selecting an asynchronous machine speech whose prosodic feature matches a prosodic feature of the first machine speech from an asynchronous machine speech library. The method also includes searching a synchronous machine speech library for a first synchronous machine speech corresponding to the asynchronous machine speech; synthesizing, based on a prosodic feature of the first synchronous machine speech, second machine speech corresponding to the target text; and selecting a second synchronous machine speech matching an acoustic feature of the second machine speech from the synchronous machine speech library. The method further includes splicing speaker speech units corresponding to the synchronous machine speech unit in a speaker speech library, to obtain a target speaker speech.
Type: Grant
Filed: March 22, 2019
Date of Patent: December 29, 2020
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Haolei Yuan, Xiao Mei
-
Patent number: 10832652
Abstract: A method is performed by at least one processor, and includes acquiring training speech data by concatenating speech segments having a lowest target cost among candidate concatenation solutions, and extracting training speech segments of a first annotation type, from the training speech data, the first annotation type being used for annotating that a speech continuity of a respective one of the training speech segments is superior to a preset condition.
Type: Grant
Filed: August 14, 2017
Date of Patent: November 10, 2020
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Haolei Yuan, Fuzhang Wu, Binghua Qian
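Illustrative sketch (not taken from the patent): the two steps named in the abstract, choosing the candidate concatenation solution with the lowest target cost and then keeping only the segments annotated as having good continuity, can be pictured as follows. The `Segment` type, its fields, the summed-cost definition, and the boolean annotation are hypothetical stand-ins; the abstract defines none of them.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Segment:
    unit: str                # label of the speech unit (hypothetical)
    target_cost: float       # per-segment target cost (hypothetical)
    good_continuity: bool    # stand-in for the "first annotation type"


def total_target_cost(solution: Sequence[Segment]) -> float:
    """Assumed cost of a solution: the sum of its segments' target costs."""
    return sum(seg.target_cost for seg in solution)


def lowest_cost_solution(candidates: Sequence[Sequence[Segment]]) -> Sequence[Segment]:
    """Pick the candidate concatenation solution with the lowest total cost."""
    return min(candidates, key=total_target_cost)


def extract_first_type(solution: Sequence[Segment]) -> List[Segment]:
    """Keep only segments annotated as having good speech continuity."""
    return [seg for seg in solution if seg.good_continuity]
```

A possible use, under the same assumptions: call `lowest_cost_solution` on the candidate list to obtain the training speech data, then pass the result to `extract_first_type` to obtain the training speech segments.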
-
Publication number: 20200251124
Abstract: The present disclosure discloses a method performed at a terminal for reconstructing a speech signal, and a computer storage medium, and relates to the field of speech recognition. The method includes: collecting, by the terminal, a plurality of sound signals through a plurality of sensors of a microphone array; determining, by the terminal, a first speech signal in the plurality of sound signals; performing, by the terminal, signal separation on the first speech signal to obtain a second speech signal; and performing, by the terminal, reconstruction on the second speech signal through a distortion recovery model to obtain a reconstructed speech signal; the distortion recovery model being obtained by training based on a clean speech signal and a distorted speech signal. The embodiments of the present disclosure improve accuracy of speech recognition results.
Type: Application
Filed: April 23, 2020
Publication date: August 6, 2020
Inventor: Haolei YUAN
-
Publication number: 20200168237
Abstract: A speech signal processing method is performed at a terminal device, including: obtaining a recorded signal and a to-be-output speech signal, the recorded signal including a noise signal and an echo signal; calculating a loop transfer function according to the recorded signal and the speech signal; calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function; calculating a frequency weighted coefficient according to the two power spectra of the echo signal and the noise signal; adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient; and outputting the adjusted speech signal to a speaker electrically coupled to the terminal device. As such, the frequency amplitude of the speech signal is automatically adjusted according to the relative frequency distribution of a noise signal and the speech signal.
Type: Application
Filed: January 28, 2020
Publication date: May 28, 2020
Inventor: Haolei YUAN
-
Patent number: 10586551
Abstract: A speech signal processing method is performed at a terminal device, including: obtaining a recorded signal and a to-be-output speech signal, the recorded signal including a noise signal and an echo signal; calculating a loop transfer function according to the recorded signal and the speech signal; calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function; calculating a frequency weighted coefficient according to the two power spectra of the echo signal and the noise signal; adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient; and outputting the adjusted speech signal to a speaker electrically coupled to the terminal device. As such, the frequency amplitude of the speech signal is automatically adjusted according to the relative frequency distribution of a noise signal and the speech signal.
Type: Grant
Filed: August 30, 2017
Date of Patent: March 10, 2020
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventor: Haolei Yuan
-
Publication number: 20190221201
Abstract: A method, device, and storage medium for converting text to speech are described. The method includes obtaining target text; synthesizing first machine speech corresponding to the target text; and selecting an asynchronous machine speech whose prosodic feature matches a prosodic feature of the first machine speech from an asynchronous machine speech library. The method also includes searching a synchronous machine speech library for a first synchronous machine speech corresponding to the asynchronous machine speech; synthesizing, based on a prosodic feature of the first synchronous machine speech, second machine speech corresponding to the target text; and selecting a second synchronous machine speech matching an acoustic feature of the second machine speech from the synchronous machine speech library. The method further includes splicing speaker speech units corresponding to the synchronous machine speech unit in a speaker speech library, to obtain a target speaker speech.
Type: Application
Filed: March 22, 2019
Publication date: July 18, 2019
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Haolei YUAN, Xiao MEI
-
Publication number: 20190189109
Abstract: A method is performed by at least one processor, and includes acquiring training speech data by concatenating speech segments having a lowest target cost among candidate concatenation solutions, and extracting training speech segments of a first annotation type, from the training speech data, the first annotation type being used for annotating that a speech continuity of a respective one of the training speech segments is superior to a preset condition.
Type: Application
Filed: August 14, 2017
Publication date: June 20, 2019
Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Haolei YUAN, Fuzhang WU, Binghua QIAN
-
Publication number: 20170365270
Abstract: A speech signal processing method is performed at a terminal device, including: obtaining a recorded signal and a to-be-output speech signal, the recorded signal including a noise signal and an echo signal; calculating a loop transfer function according to the recorded signal and the speech signal; calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function; calculating a frequency weighted coefficient according to the two power spectra of the echo signal and the noise signal; adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient; and outputting the adjusted speech signal to a speaker electrically coupled to the terminal device. As such, the frequency amplitude of the speech signal is automatically adjusted according to the relative frequency distribution of a noise signal and the speech signal.
Type: Application
Filed: August 30, 2017
Publication date: December 21, 2017
Inventor: Haolei Yuan