Patents by Inventor Eryu Wang

Eryu Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Data parallel processing method and apparatus based on multiple graphic processing units

Patent number: 10282809

Abstract: A parallel data processing method based on multiple graphic processing units (GPUs) is provided, including: creating, in a central processing unit (CPU), a plurality of worker threads for controlling a plurality of worker groups respectively, the worker groups including one or more GPUs; binding each worker thread to a corresponding GPU; loading a plurality of batches of training data from a nonvolatile memory to GPU video memories in the plurality of worker groups; and controlling the plurality of GPUs to perform data processing in parallel through the worker threads. The method can enhance efficiency of multi-GPU parallel data processing. In addition, a parallel data processing apparatus is further provided.

Type: Grant

Filed: July 14, 2016

Date of Patent: May 7, 2019

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Xing Jin, Yi Li, Yongqiang Zou, Zhimao Guo, Eryu Wang, Wei Xue, Bo Chen, Yong Li, Chunjian Bao, Lei Xiao

Systems and methods for audio command recognition with speaker authentication

Patent number: 10013985

Abstract: The present application discloses a method, an electronic system and a non-transitory computer readable storage medium for recognizing audio commands in an electronic device. The electronic device obtains audio data based on an audio signal provided by a user and extracts characteristic audio fingerprint features from the audio data. The electronic device further determines whether the corresponding audio signal is generated by an authorized user by comparing the characteristic audio fingerprint features with an audio fingerprint model for the authorized user and with a universal background model that represents user-independent audio fingerprint features, respectively. When the corresponding audio signal is generated by the authorized user of the electronic device, an audio command is extracted from the audio data, and an operation is performed according to the audio command.

Type: Grant

Filed: December 3, 2015

Date of Patent: July 3, 2018

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Shuai Yue, Xiang Zhang, Li Lu, Feng Rao, Eryu Wang, Haibo Liu, Bo Chen, Jian Liu, Lu Li
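The comparison described in the abstract above (user-specific model versus universal background model) can be illustrated with a minimal sketch. It assumes each model is a single diagonal Gaussian and that the decision is an average log-likelihood ratio against a threshold; the abstract specifies neither, and the names `diag_gauss_loglik`, `is_authorized`, `user_model`, and `ubm` are hypothetical.

```python
import math

def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of feature vector x under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def is_authorized(frames, user_model, ubm, threshold=0.0):
    """Average per-frame log-likelihood ratio of the user model vs. the UBM.

    Assumption: a ratio above `threshold` means the authorized user spoke.
    """
    llr = sum(
        diag_gauss_loglik(f, *user_model) - diag_gauss_loglik(f, *ubm)
        for f in frames
    ) / len(frames)
    return llr > threshold, llr

# Toy (mean, variance) pairs standing in for trained models.
user_model = ([1.0, 1.0], [0.5, 0.5])
ubm = ([0.0, 0.0], [2.0, 2.0])

accepted, score = is_authorized([[0.9, 1.1], [1.2, 0.8]], user_model, ubm)
rejected, _ = is_authorized([[-1.0, -1.0]], user_model, ubm)
```

Only when `accepted` is true would the device go on to extract and execute the audio command.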

Method and device for voiceprint recognition

Patent number: 9940935

Abstract: A method is performed at a device having one or more processors and memory. The device establishes a first-level Deep Neural Network (DNN) model based on unlabeled speech data, the unlabeled speech data containing no speaker labels and the first-level DNN model specifying a plurality of basic voiceprint features for the unlabeled speech data. The device establishes a second-level DNN model by tuning the first-level DNN model based on labeled speech data, the labeled speech data containing speech samples with respective speaker labels, wherein the second-level DNN model specifies a plurality of high-level voiceprint features. Using the second-level DNN model, the device registers a first high-level voiceprint feature sequence for a user based on a registration speech sample received from the user. The device performs speaker verification for the user based on the first high-level voiceprint feature sequence registered for the user.

Type: Grant

Filed: August 18, 2016

Date of Patent: April 10, 2018

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Eryu Wang, Li Lu, Xiang Zhang, Haibo Liu, Lou Li, Feng Rao, Duling Lu, Shuai Yue, Bo Chen

Method and system of adding punctuation and establishing language model using a punctuation weighting applied to Chinese speech recognized text

Patent number: 9811517

Abstract: A method of processing information content based on a Chinese language model is performed at a computer, the method including: identifying a plurality of expressions in the information content extracted from a speech input through speech recognition that is queued to be processed; dividing the expressions into a plurality of characteristic units according to semantic features and predetermined characteristics associated with each characteristic unit, each including a subset of the expressions and the predetermined characteristics at least including a respective integer number of expressions that are included in the characteristic unit; extracting, from the Chinese language model, a plurality of probabilities for punctuation marks associated with each characteristic unit; and in accordance with the probabilities, associating a respective punctuation mark with each characteristic unit included in the information content.

Type: Grant

Filed: January 6, 2014

Date of Patent: November 7, 2017

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Haibo Liu, Eryu Wang, Xiang Zhang, Li Lu, Shuai Yue, Qiuge Liu, Bo Chen, Jian Liu, Lu Li

Systems and methods for adding punctuations by detecting silences in a voice using plurality of aggregate weights which obey a linear relationship

Patent number: 9779728

Abstract: Systems and methods are provided for adding punctuations. For example, one or more first feature units are identified in a voice file taken as a whole; the voice file is divided into multiple segments by detecting silences in the voice file; one or more second feature units are identified in the voice file; a first aggregate weight of first punctuation states of the voice file and a second aggregate weight of second punctuation states of the voice file are determined, using a language model established based on word separation and third semantic features; a weighted calculation is performed to generate a third aggregate weight based on a linear combination associated with the first aggregate weight and the second aggregate weight; and one or more final punctuations are added to the voice file based on at least information associated with the third aggregate weight.

Type: Grant

Filed: January 22, 2014

Date of Patent: October 3, 2017

Assignee: Tencent Technology (Shenzhen) Company Limited

Inventors: Haibo Liu, Eryu Wang, Xiang Zhang, Shuai Yue, Lu Li, Li Lu, Jian Liu, Bo Chen
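The weighted calculation in the abstract above, where a third aggregate weight is a linear combination of the first and second, might look like the following sketch. The coefficient `alpha`, the per-mark score dictionaries, and the function name `combine_weights` are assumptions made for illustration; the abstract does not give the coefficients or the data layout.

```python
def combine_weights(whole_file, segmented, alpha=0.5):
    """Linear combination of two aggregate weights per candidate punctuation.

    whole_file: weights from feature units found in the file taken as a whole.
    segmented:  weights from feature units found after silence-based splitting.
    alpha:      hypothetical mixing coefficient in [0, 1].
    """
    return {
        mark: alpha * whole_file[mark] + (1.0 - alpha) * segmented.get(mark, 0.0)
        for mark in whole_file
    }

# Toy weights for one position; "" means "no punctuation here".
whole_file = {",": 0.25, ".": 0.5, "": 0.25}
segmented = {",": 0.75, ".": 0.25, "": 0.0}

combined = combine_weights(whole_file, segmented, alpha=0.5)
best = max(combined, key=combined.get)
```

With these toy numbers the comma wins, because the silence-based segmentation favors it strongly enough to outweigh the whole-file preference for a period.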

Method and system for building a topic specific language model for use in automatic speech recognition

Patent number: 9697821

Abstract: An automatic speech recognition method includes at a computer having one or more processors and memory for storing one or more programs to be executed by the processors, obtaining a plurality of speech corpus categories through classifying and calculating raw speech corpus; obtaining a plurality of classified language models that respectively correspond to the plurality of speech corpus categories through a language model training applied on each speech corpus category; obtaining an interpolation language model through implementing a weighted interpolation on each classified language model and merging the interpolated plurality of classified language models; constructing a decoding resource in accordance with an acoustic model and the interpolation language model; and decoding input speech using the decoding resource, and outputting a character string with a highest probability as a recognition result of the input speech.

Type: Grant

Filed: December 16, 2013

Date of Patent: July 4, 2017

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Feng Rao, Li Lu, Bo Chen, Shuai Yue, Xiang Zhang, Eryu Wang, Dadong Xie, Lou Li, Duling Lu
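The weighted interpolation step in the abstract above can be sketched on toy data. This example assumes unigram probability tables and fixed weights that sum to 1, which is a simplification; the patent abstract does not state the model order or how the weights are chosen, and `interpolate`, `sports`, and `finance` are hypothetical names.

```python
def interpolate(models, weights):
    """Weighted interpolation of several word-probability tables.

    models:  list of dicts mapping word -> probability (one per category).
    weights: interpolation weights, assumed to sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    vocab = set().union(*models)  # union of all words seen in any model
    return {
        w: sum(lam * m.get(w, 0.0) for lam, m in zip(weights, models))
        for w in vocab
    }

# Two toy category-specific models.
sports = {"goal": 0.5, "match": 0.5}
finance = {"stock": 0.75, "match": 0.25}

merged = interpolate([sports, finance], [0.5, 0.5])
```

Because each input table sums to 1 and the weights sum to 1, the merged table is again a valid probability distribution over the combined vocabulary.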

METHOD AND DEVICE FOR VOICEPRINT RECOGNITION

Publication number: 20160358610

Abstract: A method is performed at a device having one or more processors and memory. The device establishes a first-level Deep Neural Network (DNN) model based on unlabeled speech data, the unlabeled speech data containing no speaker labels and the first-level DNN model specifying a plurality of basic voiceprint features for the unlabeled speech data. The device establishes a second-level DNN model by tuning the first-level DNN model based on labeled speech data, the labeled speech data containing speech samples with respective speaker labels, wherein the second-level DNN model specifies a plurality of high-level voiceprint features. Using the second-level DNN model, the device registers a first high-level voiceprint feature sequence for a user based on a registration speech sample received from the user. The device performs speaker verification for the user based on the first high-level voiceprint feature sequence registered for the user.

Type: Application

Filed: August 18, 2016

Publication date: December 8, 2016

Inventors: Eryu WANG, Li LU, Xiang ZHANG, Haibo LIU, Lou LI, Feng RAO, Duling LU, Shuai YUE, Bo CHEN

Method and device for parallel processing in model training

Patent number: 9508347

Abstract: A method and a device for training a DNN model includes: at a device including one or more processors and memory: establishing an initial DNN model; dividing a training data corpus into a plurality of disjoint data subsets; for each of the plurality of disjoint data subsets, providing the data subset to a respective training processing unit of a plurality of training processing units operating in parallel, wherein the respective training processing unit applies a Stochastic Gradient Descent (SGD) process to update the initial DNN model to generate a respective DNN sub-model based on the data subset; and merging the respective DNN sub-models generated by the plurality of training processing units to obtain an intermediate DNN model, wherein the intermediate DNN model is established as either the initial DNN model for a next training iteration or a final DNN model in accordance with a preset convergence condition.

Type: Grant

Filed: December 16, 2013

Date of Patent: November 29, 2016

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Eryu Wang, Li Lu, Xiang Zhang, Haibo Liu, Feng Rao, Lou Li, Shuai Yue, Bo Chen
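The training loop in the abstract above (disjoint subsets, parallel SGD, merging, convergence check) can be sketched as follows. This is a toy under stated assumptions: the "model" is a single weight of a 1-D linear regression rather than a DNN, merging is done by simple averaging, and threads stand in for the training processing units. None of these choices is confirmed by the abstract, and `sgd_worker` and `train` are hypothetical names.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def sgd_worker(w, subset, lr=0.05, epochs=20):
    """Run plain SGD on one data subset for the model y = w * x."""
    for _ in range(epochs):
        for x, y in subset:
            grad = 2 * (w * x - y) * x  # gradient of the squared error
            w -= lr * grad
    return w

def train(data, n_workers=4, rounds=5, tol=1e-6):
    w = 0.0  # initial model
    # Strided slices give disjoint subsets that together cover the data.
    subsets = [data[i::n_workers] for i in range(n_workers)]
    for _ in range(rounds):
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            sub_models = list(pool.map(lambda s: sgd_worker(w, s), subsets))
        merged = sum(sub_models) / len(sub_models)  # assumed merge: average
        converged = abs(merged - w) < tol           # assumed convergence test
        w = merged  # intermediate model seeds the next iteration
        if converged:
            break
    return w

random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(200)]
data = [(x, 3.0 * x) for x in xs]  # noiseless target weight is 3.0
w = train(data)
```

Each round starts every worker from the same merged model, so the sub-models stay close enough for averaging to be meaningful in this convex toy problem; that property is not guaranteed for a real DNN.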

Method and device for voiceprint recognition

Patent number: 9502038

Abstract: A method and device for voiceprint recognition, include: establishing a first-level Deep Neural Network (DNN) model based on unlabeled speech data, the unlabeled speech data containing no speaker labels and the first-level DNN model specifying a plurality of basic voiceprint features for the unlabeled speech data; obtaining a plurality of high-level voiceprint features by tuning the first-level DNN model based on labeled speech data, the labeled speech data containing speech samples with respective speaker labels, and the tuning producing a second-level DNN model specifying the plurality of high-level voiceprint features; based on the second-level DNN model, registering a respective high-level voiceprint feature sequence for a user based on a registration speech sample received from the user; and performing speaker verification for the user based on the respective high-level voiceprint feature sequence registered for the user.

Type: Grant

Filed: December 12, 2013

Date of Patent: November 22, 2016

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Eryu Wang, Li Lu, Xiang Zhang, Haibo Liu, Lou Li, Feng Rao, Duling Lu, Shuai Yue, Bo Chen

DATA PARALLEL PROCESSING METHOD AND APPARATUS BASED ON MULTIPLE GRAPHIC PROCESSING UNITS

Publication number: 20160321777

Abstract: A parallel data processing method based on multiple graphic processing units (GPUs) is provided, including: creating, in a central processing unit (CPU), a plurality of worker threads for controlling a plurality of worker groups respectively, the worker groups including one or more GPUs; binding each worker thread to a corresponding GPU; loading a plurality of batches of training data from a nonvolatile memory to GPU video memories in the plurality of worker groups; and controlling the plurality of GPUs to perform data processing in parallel through the worker threads. The method can enhance efficiency of multi-GPU parallel data processing. In addition, a parallel data processing apparatus is further provided.

Type: Application

Filed: July 14, 2016

Publication date: November 3, 2016

Inventors: Xing Jin, Yi Li, Yongqiang Zou, Zhimao Guo, Eryu Wang, Wei Xue, Bo Chen, Yong Li, Chunjian Bao, Lei Xiao

Keyword detection with international phonetic alphabet by foreground model and background model

Patent number: 9466289

Abstract: An electronic device with one or more processors and memory trains an acoustic model with an international phonetic alphabet (IPA) phoneme mapping collection and audio samples in different languages, where the acoustic model includes: a foreground model; and a background model. The device generates a phone decoder based on the trained acoustic model. The device collects keyword audio samples, decodes the keyword audio samples with the phone decoder to generate phoneme sequence candidates, and selects a keyword phoneme sequence from the phoneme sequence candidates. After obtaining the keyword phoneme sequence, the device detects one or more keywords in an input audio signal with the trained acoustic model, including: matching phonemic keyword portions of the input audio signal with phonemes in the keyword phoneme sequence with the foreground model; and filtering out phonemic non-keyword portions of the input audio signal with the background model.

Type: Grant

Filed: December 11, 2013

Date of Patent: October 11, 2016

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Li Lu, Xiang Zhang, Shuai Yue, Feng Rao, Eryu Wang, Lu Li

Method and system for adding punctuation to voice files

Patent number: 9442910

Abstract: A method and system for adding punctuation to a voice file are disclosed. The method includes: utilizing silence or pause duration detection to divide a voice file into a plurality of speech segments for processing, where the voice file includes a plurality of feature units; identifying all feature units that appear in the voice file according to every term or expression, and the semantic features of every term or expression, that form each of the plurality of speech segments; using a linguistic model to determine a sum of weights of various punctuation modes in the voice file according to all the feature units, where the linguistic model is built upon semantic features of various terms or expressions parsed out from a body text of a spoken sentence according to a language library; and adding punctuation to the voice file based on the determined sum of weights of the various punctuation modes.

Type: Grant

Filed: March 19, 2014

Date of Patent: September 13, 2016

Assignee: Tencent Technology (Shenzhen) Co., Ltd.

Inventors: Haibo Liu, Eryu Wang, Xiang Zhang, Li Lu, Shuai Yue, Bo Chen, Lou Li, Jian Liu

Method and device for acoustic language model training

Patent number: 9396723

Abstract: A method and a device for training an acoustic language model, include: conducting word segmentation for training samples in a training corpus using an initial language model containing no word class labels, to obtain initial word segmentation data containing no word class labels; performing word class replacement for the initial word segmentation data containing no word class labels, to obtain first word segmentation data containing word class labels; using the first word segmentation data containing word class labels to train a first language model containing word class labels; using the first language model containing word class labels to conduct word segmentation for the training samples in the training corpus, to obtain second word segmentation data containing word class labels; and in accordance with the second word segmentation data meeting one or more predetermined criteria, using the second word segmentation data containing word class labels to train the acoustic language model.

Type: Grant

Filed: December 17, 2013

Date of Patent: July 19, 2016

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Duling Lu, Lu Li, Feng Rao, Bo Chen, Li Lu, Xiang Zhang, Eryu Wang, Shuai Yue

Method and apparatus for performing speech keyword retrieval

Patent number: 9355637

Abstract: A method and an apparatus are provided for retrieving keywords. The apparatus configures at least two types of language models in a model file, where each type of language model includes a recognition model and a corresponding decoding model; extracts a speech feature from the to-be-processed speech data; performs language matching on the extracted speech feature by using the recognition models in the model file one by one, and determines a recognition model based on a language matching rate; determines a decoding model corresponding to the recognition model; decodes the extracted speech feature by using the determined decoding model, and obtains a word recognition result after the decoding; and matches a keyword in a keyword dictionary against the word recognition result, and outputs a matched keyword.

Type: Grant

Filed: February 11, 2015

Date of Patent: May 31, 2016

Assignee: Tencent Technology (Shenzhen) Company Limited

Inventors: Jianxiong Ma, Lu Li, Li Lu, Xiang Zhang, Shuai Yue, Feng Rao, Eryu Wang, Linghui Kong

SYSTEMS AND METHODS FOR AUDIO COMMAND RECOGNITION

Publication number: 20160086609

Abstract: The present application discloses a method, an electronic system and a non-transitory computer readable storage medium for recognizing audio commands in an electronic device. The electronic device obtains audio data based on an audio signal provided by a user and extracts characteristic audio fingerprint features from the audio data. The electronic device further determines whether the corresponding audio signal is generated by an authorized user by comparing the characteristic audio fingerprint features with an audio fingerprint model for the authorized user and with a universal background model that represents user-independent audio fingerprint features, respectively. When the corresponding audio signal is generated by the authorized user of the electronic device, an audio command is extracted from the audio data, and an operation is performed according to the audio command.

Type: Application

Filed: December 3, 2015

Publication date: March 24, 2016

Inventors: Shuai Yue, Xiang Zhang, Li Lu, Feng Rao, Eryu Wang, Haibo Liu, Bo Chen, Jian Liu, Lu Li

Method and apparatus for performing speech keyword retrieval

Patent number: 9257118

Abstract: A method and an apparatus are provided for retrieving keywords. The apparatus configures at least two types of language models in a model file, where each type of language model includes a recognition model and a corresponding decoding model; extracts a speech feature from the to-be-processed speech data; performs language matching on the extracted speech feature by using the recognition models in the model file one by one, and determines a recognition model based on a language matching rate; determines a decoding model corresponding to the recognition model; decodes the extracted speech feature by using the determined decoding model, and obtains a word recognition result after the decoding; and matches a keyword in a keyword dictionary against the word recognition result, and outputs a matched keyword.

Type: Grant

Filed: February 11, 2015

Date of Patent: February 9, 2016

Assignee: Tencent Technology (Shenzhen) Company Limited

Inventors: Jianxiong Ma, Lu Li, Li Lu, Xiang Zhang, Shuai Yue, Feng Rao, Eryu Wang, Linghui Kong

Keyword detection for speech recognition

Patent number: 9230541

Abstract: This application discloses a method of recognizing a keyword in speech that includes a sequence of audio frames, the sequence including a current frame and a subsequent frame. A candidate keyword is determined for the current frame using a decoding network that includes keywords and filler words of multiple languages, and used to determine a confidence score for the audio frame sequence. A word option is also determined for the subsequent frame based on the decoding network, and when the candidate keyword and the word option are associated with two distinct types of languages, the confidence score of the audio frame sequence is updated at least based on a penalty factor associated with the two distinct types of languages. The audio frame sequence is then determined to include both the candidate keyword and the word option by evaluating the updated confidence score according to a keyword determination criterion.

Type: Grant

Filed: December 11, 2014

Date of Patent: January 5, 2016

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Lu Li, Li Lu, Jianxiong Ma, Linghui Kong, Feng Rao, Shuai Yue, Xiang Zhang, Haibo Liu, Eryu Wang, Bo Chen

User authentication method and apparatus based on audio and video data

Patent number: 9177131

Abstract: A computer-implemented method is performed at a server having one or more processors and memory storing programs executed by the one or more processors for authenticating a user from video and audio data. The method includes: receiving a login request from a mobile device, the login request including video data and audio data; extracting a group of facial features from the video data; extracting a group of audio features from the audio data and recognizing a sequence of words in the audio data; identifying a first user account whose respective facial features match the group of facial features and a second user account whose respective audio features match the group of audio features. If the first user account is the same as the second user account, the method retrieves the sequence of words associated with the user account and compares the sequences of words for authentication purposes.

Type: Grant

Filed: April 25, 2014

Date of Patent: November 3, 2015

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Xiang Zhang, Li Lu, Eryu Wang, Shuai Yue, Feng Rao, Haibo Liu, Lou Li, Duling Lu, Bo Chen
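The decision logic at the end of the abstract above can be sketched as a small function. It assumes the face and voice matchers have already returned an account identifier (or `None` for no match) and that the word comparison is an exact match; the abstract does not specify the matching rule, and `authenticate` and `enrolled` are hypothetical names.

```python
def authenticate(face_account, voice_account, spoken_words, enrolled_words):
    """Accept only if both modalities agree on the account and the words match.

    face_account / voice_account: account ids from the (not shown) face and
    voice matchers, or None when no account matched.
    enrolled_words: dict mapping account id -> enrolled word sequence.
    """
    if face_account is None or face_account != voice_account:
        return False
    expected = enrolled_words.get(face_account)
    # Assumption: exact sequence equality; a real system may tolerate errors.
    return expected is not None and spoken_words == expected

enrolled = {"alice": ["open", "sesame"]}

ok = authenticate("alice", "alice", ["open", "sesame"], enrolled)
mismatch = authenticate("alice", "bob", ["open", "sesame"], enrolled)
wrong_words = authenticate("alice", "alice", ["open", "door"], enrolled)
```

Requiring the two matchers to agree before the word check means a spoofed face or a replayed voice alone is not sufficient to pass.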

Method and Apparatus For Performing Speech Keyword Retrieval

Publication number: 20150154955

Abstract: A method and an apparatus are provided for retrieving keywords. The apparatus configures at least two types of language models in a model file, where each type of language model includes a recognition model and a corresponding decoding model; extracts a speech feature from the to-be-processed speech data; performs language matching on the extracted speech feature by using the recognition models in the model file one by one, and determines a recognition model based on a language matching rate; determines a decoding model corresponding to the recognition model; decodes the extracted speech feature by using the determined decoding model, and obtains a word recognition result after the decoding; and matches a keyword in a keyword dictionary against the word recognition result, and outputs a matched keyword.

Type: Application

Filed: February 11, 2015

Publication date: June 4, 2015

Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Jianxiong MA, Lu LI, Li LU, Xiang ZHANG, Shuai YUE, Feng RAO, Eryu WANG, Linghui KONG

Keyword Detection For Speech Recognition

Publication number: 20150095032

Abstract: This application discloses a method of recognizing a keyword in speech that includes a sequence of audio frames, the sequence including a current frame and a subsequent frame. A candidate keyword is determined for the current frame using a decoding network that includes keywords and filler words of multiple languages, and used to determine a confidence score for the audio frame sequence. A word option is also determined for the subsequent frame based on the decoding network, and when the candidate keyword and the word option are associated with two distinct types of languages, the confidence score of the audio frame sequence is updated at least based on a penalty factor associated with the two distinct types of languages. The audio frame sequence is then determined to include both the candidate keyword and the word option by evaluating the updated confidence score according to a keyword determination criterion.

Type: Application

Filed: December 11, 2014

Publication date: April 2, 2015

Inventors: Lu LI, Li Lu, Jianxiong Ma, Linghui Kong, Feng Rao, Shuai Yue, Xiang Zhang, Haibo Liu, Eryu Wang, Bo Chen
