Patents by Inventor Haolei Yuan

Haolei Yuan has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11482237
    Abstract: The present disclosure discloses a method performed at a terminal for reconstructing a speech signal, and a computer storage medium, and relates to the field of speech recognition. The method includes: collecting, by the terminal, a plurality of sound signals through a plurality of sensors of a microphone array; determining, by the terminal, a first speech signal in the plurality of sound signals; performing, by the terminal, signal separation on the first speech signal to obtain a second speech signal; and performing, by the terminal, reconstruction on the second speech signal through a distortion recovery model to obtain a reconstructed speech signal; the distortion recovery model being obtained by training based on a clean speech signal and a distorted speech signal. The embodiments of the present disclosure improve accuracy of speech recognition results.
    Type: Grant
    Filed: April 23, 2020
    Date of Patent: October 25, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Haolei Yuan
  • Publication number: 20220254134
    Abstract: This application discloses a region recognition method, apparatus, and device, and a readable storage medium, and relates to the field of artificial intelligence.
    Type: Application
    Filed: April 27, 2022
    Publication date: August 11, 2022
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Yuqiang REN, Xingjia PAN, Weiming DONG, Xudong ZHU, Haolei YUAN, Xiaowei GUO, Changsheng XU
  • Patent number: 10924614
    Abstract: A speech signal processing method is performed at a terminal device, including: obtaining a recorded signal and a to-be-output speech signal, the recorded signal including a noise signal and an echo signal; calculating a loop transfer function according to the recorded signal and the speech signal; calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function; calculating a frequency weighted coefficient according to the two power spectra of the echo signal and the noise signal; adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient; and outputting the adjusted speech signal to a speaker electrically coupled to the terminal device. As such, the frequency amplitude of the speech signal is automatically adjusted according to the relative frequency distribution of a noise signal and the speech signal.
    Type: Grant
    Filed: January 28, 2020
    Date of Patent: February 16, 2021
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Haolei Yuan
  • Patent number: 10878803
    Abstract: A method, device, and storage medium for converting text to speech are described. The method includes obtaining target text; synthesizing first machine speech corresponding to the target text; and selecting an asynchronous machine speech whose prosodic feature matches a prosodic feature of the first machine speech from an asynchronous machine speech library. The method also includes searching a synchronous machine speech library for a first synchronous machine speech corresponding to the asynchronous machine speech; synthesizing, based on a prosodic feature of the first synchronous machine speech, second machine speech corresponding to the target text; and selecting a second synchronous machine speech matching an acoustic feature of the second machine speech from the synchronous machine speech library. The method further includes splicing speaker speech units corresponding to the synchronous machine speech unit in a speaker speech library, to obtain a target speaker speech.
    Type: Grant
    Filed: March 22, 2019
    Date of Patent: December 29, 2020
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Haolei Yuan, Xiao Mei
  • Patent number: 10832652
    Abstract: A method is performed by at least one processor, and includes acquiring training speech data by concatenating speech segments having a lowest target cost among candidate concatenation solutions, and extracting training speech segments of a first annotation type, from the training speech data, the first annotation type being used for annotating that a speech continuity of a respective one of the training speech segments is superior to a preset condition.
    Type: Grant
    Filed: August 14, 2017
    Date of Patent: November 10, 2020
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Haolei Yuan, Fuzhang Wu, Binghua Qian
  • Publication number: 20200251124
    Abstract: The present disclosure discloses a method performed at a terminal for reconstructing a speech signal, and a computer storage medium, and relates to the field of speech recognition. The method includes: collecting, by the terminal, a plurality of sound signals through a plurality of sensors of a microphone array; determining, by the terminal, a first speech signal in the plurality of sound signals; performing, by the terminal, signal separation on the first speech signal to obtain a second speech signal; and performing, by the terminal, reconstruction on the second speech signal through a distortion recovery model to obtain a reconstructed speech signal; the distortion recovery model being obtained by training based on a clean speech signal and a distorted speech signal. The embodiments of the present disclosure improve accuracy of speech recognition results.
    Type: Application
    Filed: April 23, 2020
    Publication date: August 6, 2020
    Inventor: Haolei YUAN
  • Publication number: 20200168237
    Abstract: A speech signal processing method is performed at a terminal device, including: obtaining a recorded signal and a to-be-output speech signal, the recorded signal including a noise signal and an echo signal; calculating a loop transfer function according to the recorded signal and the speech signal; calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function; calculating a frequency weighted coefficient according to the two power spectra of the echo signal and the noise signal; adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient; and outputting the adjusted speech signal to a speaker electrically coupled to the terminal device. As such, the frequency amplitude of the speech signal is automatically adjusted according to the relative frequency distribution of a noise signal and the speech signal.
    Type: Application
    Filed: January 28, 2020
    Publication date: May 28, 2020
    Inventor: Haolei YUAN
  • Patent number: 10586551
    Abstract: A speech signal processing method is performed at a terminal device, including: obtaining a recorded signal and a to-be-output speech signal, the recorded signal including a noise signal and an echo signal; calculating a loop transfer function according to the recorded signal and the speech signal; calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function; calculating a frequency weighted coefficient according to the two power spectra of the echo signal and the noise signal; adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient; and outputting the adjusted speech signal to a speaker electrically coupled to the terminal device. As such, the frequency amplitude of the speech signal is automatically adjusted according to the relative frequency distribution of a noise signal and the speech signal.
    Type: Grant
    Filed: August 30, 2017
    Date of Patent: March 10, 2020
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Haolei Yuan
  • Publication number: 20190221201
    Abstract: A method, device, and storage medium for converting text to speech are described. The method includes obtaining target text; synthesizing first machine speech corresponding to the target text; and selecting an asynchronous machine speech whose prosodic feature matches a prosodic feature of the first machine speech from an asynchronous machine speech library. The method also includes searching a synchronous machine speech library for a first synchronous machine speech corresponding to the asynchronous machine speech; synthesizing, based on a prosodic feature of the first synchronous machine speech, second machine speech corresponding to the target text; and selecting a second synchronous machine speech matching an acoustic feature of the second machine speech from the synchronous machine speech library. The method further includes splicing speaker speech units corresponding to the synchronous machine speech unit in a speaker speech library, to obtain a target speaker speech.
    Type: Application
    Filed: March 22, 2019
    Publication date: July 18, 2019
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Haolei YUAN, Xiao MEI
  • Publication number: 20190189109
    Abstract: A method is performed by at least one processor, and includes acquiring training speech data by concatenating speech segments having a lowest target cost among candidate concatenation solutions, and extracting training speech segments of a first annotation type, from the training speech data, the first annotation type being used for annotating that a speech continuity of a respective one of the training speech segments is superior to a preset condition.
    Type: Application
    Filed: August 14, 2017
    Publication date: June 20, 2019
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Haolei YUAN, Fuzhang WU, Binghua QIAN
  • Publication number: 20170365270
    Abstract: A speech signal processing method is performed at a terminal device, including: obtaining a recorded signal and a to-be-output speech signal, the recorded signal including a noise signal and an echo signal; calculating a loop transfer function according to the recorded signal and the speech signal; calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function; calculating a frequency weighted coefficient according to the two power spectra of the echo signal and the noise signal; adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient; and outputting the adjusted speech signal to a speaker electrically coupled to the terminal device. As such, the frequency amplitude of the speech signal is automatically adjusted according to the relative frequency distribution of a noise signal and the speech signal.
    Type: Application
    Filed: August 30, 2017
    Publication date: December 21, 2017
    Inventor: Haolei Yuan