Patents by Inventor Shuai Yue

Shuai Yue has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20140350918
    Abstract: A method and system for adding punctuation to a voice file is disclosed. The method includes: utilizing silence or pause duration detection to divide a voice file into a plurality of speech segments for processing, the voice file including a plurality of feature units; identifying all feature units that appear in the voice file according to every term or expression, and the semantic features of every term or expression, that form each of the plurality of speech segments; using a linguistic model to determine a sum of weights of various punctuation modes in the voice file according to all the feature units, the linguistic model being built upon semantic features of various terms or expressions parsed out from the body text of a spoken sentence according to a language library; and adding punctuation to the voice file based on the determined sum of weights of the various punctuation modes.
    Type: Application
    Filed: March 19, 2014
    Publication date: November 27, 2014
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) CO., LTD.
    Inventors: Haibo LIU, Eryu WANG, Xiang ZHANG, Li LU, Shuai YUE, Bo CHEN, Lou LI, Jian LIU
  • Publication number: 20140350934
    Abstract: Systems and methods are provided for voice identification. For example, audio characteristics are extracted from acquired voice signals; a syllable confusion network is identified based on at least information associated with the audio characteristics; a word lattice is generated based on at least information associated with the syllable confusion network and a predetermined phonetic dictionary; and an optimal character sequence is calculated in the word lattice as an identification result.
    Type: Application
    Filed: May 30, 2014
    Publication date: November 27, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Lou Li, Li Lu, Xiang Zhang, Feng Rao, Shuai Yue, Bo Chen, Jianxiong Ma, Haibo Liu
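
The final step of publication 20140350934, calculating an optimal character sequence in the word lattice, can be illustrated with a Viterbi-style dynamic program over a lattice of scored word edges. This is a minimal sketch under assumed data structures (the function name and lattice encoding are hypothetical, not from the patent):

```python
import math

def best_path(lattice, start, end):
    """Find the highest-probability word sequence through a word lattice.

    `lattice` maps a node id to a list of (next_node, word, log_prob) edges.
    Assumption: node ids are numbered so every edge goes from a lower id to
    a higher id (i.e., sorting ids gives a topological order).
    """
    best = {start: (0.0, None, None)}  # node -> (score, prev_node, word)
    for node in sorted(lattice):
        if node not in best:
            continue  # unreachable from start
        score, _, _ = best[node]
        for nxt, word, lp in lattice[node]:
            cand = score + lp
            # Keep the better of the incoming hypotheses at each node.
            if nxt not in best or cand > best[nxt][0]:
                best[nxt] = (cand, node, word)
    # Trace back-pointers from the end node to recover the word sequence.
    words, node = [], end
    while best[node][1] is not None:
        _, prev, word = best[node]
        words.append(word)
        node = prev
    return list(reversed(words))
```

For example, a two-arc lattice where "hi" outscores "high" yields `["hi", "there"]` as the identification result.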
  • Publication number: 20140350939
    Abstract: Systems and methods are provided for adding punctuation. For example, one or more first feature units are identified in a voice file taken as a whole; the voice file is divided into multiple segments; one or more second feature units are identified in the voice file; a first aggregate weight of first punctuation states of the voice file and a second aggregate weight of second punctuation states of the voice file are determined, using a language model established based on word separation and third semantic features; a weighted calculation is performed to generate a third aggregate weight based on at least information associated with the first aggregate weight and the second aggregate weight; and one or more final punctuation marks are added to the voice file based on at least information associated with the third aggregate weight.
    Type: Application
    Filed: January 22, 2014
    Publication date: November 27, 2014
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Haibo Liu, Eryu Wang, Xiang Zhang, Shuai Yue, Lu Li, Li Lu, Jian Liu, Bo Chen
  • Publication number: 20140324426
    Abstract: The present invention, pertaining to the field of speech recognition, discloses a reminder setting method and apparatus. The method includes: acquiring speech signals; acquiring time information from the speech signals by using keyword recognition, and determining a reminder time for reminder setting according to the time information; acquiring a text sequence corresponding to the speech signals by using continuous speech recognition, and determining reminder content for reminder setting according to the time information and the text sequence; and setting a reminder according to the reminder time and the reminder content.
    Type: Application
    Filed: May 28, 2013
    Publication date: October 30, 2014
    Inventors: Li LU, Feng RAO, Song LIU, Zongyao TANG, Xiang ZHANG, Shuai YUE, Bo CHEN
  • Publication number: 20140236591
    Abstract: A method of recognizing speech is provided that includes generating a decoding network that includes a primary sub-network and a classification sub-network. The primary sub-network includes a classification node corresponding to the classification sub-network. The classification sub-network corresponds to a group of uncommon words. A speech input is received and decoded by instantiating a token in the primary sub-network and passing the token through the primary network. When the token reaches the classification node, the method includes transferring the token to the classification sub-network and passing the token through the classification sub-network. When the token reaches an accept node of the classification sub-network, the method includes returning a result of the token passing through the classification sub-network to the primary sub-network. The result includes one or more words in the group of uncommon words. A string corresponding to the speech input is output that includes the one or more words.
    Type: Application
    Filed: April 28, 2014
    Publication date: August 21, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Shuai YUE, Li Lu, Xiang Zhang, Dadong Xie, Bo Chen, Feng Rao
  • Publication number: 20140236600
    Abstract: An electronic device with one or more processors and memory trains an acoustic model with an international phonetic alphabet (IPA) phoneme mapping collection and audio samples in different languages, where the acoustic model includes: a foreground model; and a background model. The device generates a phone decoder based on the trained acoustic model. The device collects keyword audio samples, decodes the keyword audio samples with the phone decoder to generate phoneme sequence candidates, and selects a keyword phoneme sequence from the phoneme sequence candidates. After obtaining the keyword phoneme sequence, the device detects one or more keywords in an input audio signal with the trained acoustic model, including: matching phonemic keyword portions of the input audio signal with phonemes in the keyword phoneme sequence with the foreground model; and filtering out phonemic non-keyword portions of the input audio signal with the background model.
    Type: Application
    Filed: December 11, 2013
    Publication date: August 21, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Li LU, Xiang ZHANG, Shuai YUE, Feng RAO, Eryu WANG, Lu LI
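
In publication 20140236600, the device decodes keyword audio samples into phoneme sequence candidates and then selects a keyword phoneme sequence from them. The patent does not specify the selection rule; one simple assumption is a majority vote over the decoded samples (the function name and voting strategy below are illustrative, not from the patent):

```python
from collections import Counter

def select_keyword_sequence(candidates):
    """Pick one keyword phoneme sequence from decoder candidates.

    `candidates` is a list of phoneme sequences (lists of phoneme strings),
    one per decoded keyword audio sample. Assumed rule: majority vote,
    i.e. the most frequently decoded sequence wins.
    """
    counts = Counter(tuple(seq) for seq in candidates)
    best, _ = counts.most_common(1)[0]
    return list(best)
```

With three decoded samples of which two agree, the agreed-upon sequence is returned as the keyword phoneme sequence.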
  • Publication number: 20140237576
    Abstract: A computer-implemented method is performed at a server having one or more processors and memory storing programs executed by the one or more processors for authenticating a user from video and audio data. The method includes: receiving a login request from a mobile device, the login request including video data and audio data; extracting a group of facial features from the video data; extracting a group of audio features from the audio data and recognizing a sequence of words in the audio data; and identifying a first user account whose respective facial features match the group of facial features and a second user account whose respective audio features match the group of audio features. If the first user account is the same as the second user account, the sequence of words associated with the user account is retrieved and compared with the recognized sequence of words for authentication purposes.
    Type: Application
    Filed: April 25, 2014
    Publication date: August 21, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Xiang ZHANG, Li LU, Eryu WANG, Shuai YUE, Feng RAO, Haibo LIU, Lou LI, Duling LU, Bo CHEN
  • Publication number: 20140222417
    Abstract: A method and a device for training an acoustic language model include: conducting word segmentation for training samples in a training corpus using an initial language model containing no word class labels, to obtain initial word segmentation data containing no word class labels; performing word class replacement for the initial word segmentation data containing no word class labels, to obtain first word segmentation data containing word class labels; using the first word segmentation data containing word class labels to train a first language model containing word class labels; using the first language model containing word class labels to conduct word segmentation for the training samples in the training corpus, to obtain second word segmentation data containing word class labels; and, in accordance with the second word segmentation data meeting one or more predetermined criteria, using the second word segmentation data containing word class labels to train the acoustic language model.
    Type: Application
    Filed: December 17, 2013
    Publication date: August 7, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Duling LU, Lu LI, Feng RAO, Bo CHEN, Li LU, Xiang ZHANG, Eryu WANG, Shuai YUE
  • Publication number: 20140214401
    Abstract: A computer-implemented method is performed at a device having one or more processors and memory storing programs executed by the one or more processors. The method comprises: selecting a target word in a target sentence; from the target sentence, acquiring a first sequence of words that precede the target word and a second sequence of words that succeed the target word; from a sentence database, searching for and acquiring a group of candidate words, each of which separates the first sequence of words from the second sequence of words in a sentence; creating a candidate sentence for each candidate word by replacing the target word in the target sentence with that candidate word; determining the fittest sentence among the candidate sentences according to a linguistic model; and suggesting the candidate word within the fittest sentence as a correction.
    Type: Application
    Filed: December 13, 2013
    Publication date: July 31, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Lou LI, Qiang CHENG, Feng RAO, Li LU, Xiang ZHANG, Shuai YUE, Bo CHEN
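
The correction loop of publication 20140214401, substituting each candidate word into the target sentence and keeping the one whose sentence a language model scores highest, can be sketched as follows. This is an illustrative sketch, not the claimed implementation; the function name and the pluggable `score_fn` are assumptions:

```python
def suggest_correction(sentence, target_idx, candidates, score_fn):
    """Suggest a correction for the word at `target_idx` in `sentence`.

    `sentence` is a list of words; `candidates` are words acquired from a
    sentence database; `score_fn` stands in for the linguistic model and
    returns a fitness score for a candidate sentence (higher is fitter).
    """
    best_word, best_score = None, float("-inf")
    for cand in candidates:
        # Build the candidate sentence by swapping in the candidate word.
        trial = sentence[:target_idx] + [cand] + sentence[target_idx + 1:]
        s = score_fn(trial)
        if s > best_score:
            best_word, best_score = cand, s
    return best_word
```

With a toy scorer that counts known bigrams, replacing "red" in "I red a book" with the candidates ("read", "red") would suggest "read".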
  • Publication number: 20140214417
    Abstract: A method and device for voiceprint recognition, include: establishing a first-level Deep Neural Network (DNN) model based on unlabeled speech data, the unlabeled speech data containing no speaker labels and the first-level DNN model specifying a plurality of basic voiceprint features for the unlabeled speech data; obtaining a plurality of high-level voiceprint features by tuning the first-level DNN model based on labeled speech data, the labeled speech data containing speech samples with respective speaker labels, and the tuning producing a second-level DNN model specifying the plurality of high-level voiceprint features; based on the second-level DNN model, registering a respective high-level voiceprint feature sequence for a user based on a registration speech sample received from the user; and performing speaker verification for the user based on the respective high-level voiceprint feature sequence registered for the user.
    Type: Application
    Filed: December 12, 2013
    Publication date: July 31, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Eryu WANG, Li LU, Xiang ZHANG, Haibo LIU, Lou LI, Feng RAO, Duling LU, Shuai YUE, Bo CHEN
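
The last step of publication 20140214417, speaker verification against a registered high-level voiceprint feature sequence, is commonly done by thresholding a similarity score. The patent does not name the comparison; the cosine-similarity rule below is an assumed stand-in, and both function names are hypothetical:

```python
def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def verify_speaker(registered, test, threshold=0.8):
    """Accept the speaker when the test voiceprint features are
    sufficiently similar to the registered ones (assumed rule)."""
    return cosine(registered, test) >= threshold
```

Identical feature vectors score 1.0 and are accepted; orthogonal ones score 0.0 and are rejected.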
  • Publication number: 20140214419
    Abstract: An automatic speech recognition method includes, at a computer having one or more processors and memory storing one or more programs to be executed by the processors: obtaining a plurality of speech corpus categories by classifying a raw speech corpus; obtaining a plurality of classified language models that respectively correspond to the plurality of speech corpus categories through language model training applied to each speech corpus category; obtaining an interpolation language model by applying a weighted interpolation to each classified language model and merging the interpolated classified language models; constructing a decoding resource in accordance with an acoustic model and the interpolation language model; and decoding input speech using the decoding resource, outputting the character string with the highest probability as the recognition result of the input speech.
    Type: Application
    Filed: December 16, 2013
    Publication date: July 31, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Feng Rao, Li Lu, Bo Chen, Shuai Yue, Xiang Zhang, Eryu Wang, Dadong Xie, Lou Li, Duling Lu
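
The merging step of publication 20140214419, weighted interpolation of the classified language models into one interpolation language model, amounts to a weighted average of each model's probabilities. A minimal sketch, assuming each model is a dict from n-gram to probability and the weights sum to 1 (representation and function name are assumptions):

```python
def interpolate_lms(models, weights):
    """Merge classified language models by weighted interpolation.

    `models` is a list of dicts mapping an n-gram to its probability;
    `weights` are the interpolation coefficients, one per model.
    Returns one merged probability table.
    """
    merged = {}
    for model, w in zip(models, weights):
        for ngram, p in model.items():
            # Accumulate each model's contribution, scaled by its weight.
            merged[ngram] = merged.get(ngram, 0.0) + w * p
    return merged
```

Two models assigning "a" probabilities 0.5 and 0.9, interpolated with equal weights, give a merged probability of 0.7.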
  • Publication number: 20140214416
    Abstract: A method of recognizing speech commands includes generating a background acoustic model for a sound using a first sound sample, the background acoustic model characterized by a first precision metric. A foreground acoustic model is generated for the sound using a second sound sample, the foreground acoustic model characterized by a second precision metric. A third sound sample is received and decoded by assigning a weight to the third sound sample corresponding to a probability that the sound sample originated in a foreground using the foreground acoustic model and the background acoustic model. The method further includes determining if the weight meets predefined criteria for assigning the third sound sample to the foreground and, when the weight meets the predefined criteria, interpreting the third sound sample as a portion of a speech command. Otherwise, recognition of the third sound sample as a portion of a speech command is forgone.
    Type: Application
    Filed: December 13, 2013
    Publication date: July 31, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Shuai YUE, Li LU, Xiang ZHANG, Dadong XIE, Haibo LIU, Bo CHEN, Jian LIU
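
The decision rule of publication 20140214416, assigning a sound sample a weight that it originated in the foreground and interpreting it as part of a speech command only when that weight meets predefined criteria, can be sketched as a two-way softmax over the foreground and background model log-likelihoods. This is an illustrative sketch under an equal-priors assumption; the function names are hypothetical:

```python
import math

def foreground_weight(log_p_fg, log_p_bg):
    """Weight that a sample came from the foreground, given its
    log-likelihoods under the foreground and background acoustic models
    (equal priors assumed)."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(log_p_fg, log_p_bg)
    e_fg = math.exp(log_p_fg - m)
    e_bg = math.exp(log_p_bg - m)
    return e_fg / (e_fg + e_bg)

def is_command_portion(log_p_fg, log_p_bg, threshold=0.5):
    """Interpret the sample as part of a speech command only when the
    foreground weight meets the threshold; otherwise forgo recognition."""
    return foreground_weight(log_p_fg, log_p_bg) >= threshold
```

A sample the foreground model likes better than the background model passes the default threshold; the reverse case is forgone.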
  • Publication number: 20140214406
    Abstract: A method of processing information content based on a language model is performed at a computer, the method including the following steps: identifying a plurality of expressions in the information content that is queued to be processed; dividing the plurality of expressions into a plurality of characteristic units according to semantic features and predetermined characteristics associated with each of the plurality of characteristic units, each characteristic unit including a subset of the plurality of expressions and the predetermined characteristics at least including a respective integer number of expressions that are included in the characteristic unit; extracting, from the language model, a plurality of probabilities for a plurality of punctuation marks associated with each of the plurality of characteristic units; and in accordance with the extracted probabilities, associating a respective punctuation mark with each of the plurality of characteristic units included in the information content.
    Type: Application
    Filed: January 6, 2014
    Publication date: July 31, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Haibo LIU, Eryu WANG, Xiang ZHANG, Li LU, Shuai YUE, Qiuge LIU, Bo CHEN, Jian LIU, Lu LI
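
The final step shared by the punctuation patents above (e.g. publication 20140214406), associating each characteristic unit with a punctuation mark according to extracted probabilities, reduces to an argmax per unit. A minimal sketch, assuming the probabilities are already extracted into a per-unit table (the function name and data layout are assumptions):

```python
def add_punctuation(units, punct_probs):
    """Append to each characteristic unit the punctuation mark with the
    highest extracted probability, and join the result into text.

    `units` is the ordered list of characteristic units (strings);
    `punct_probs` maps a unit to a {mark: probability} dict.
    """
    out = []
    for unit in units:
        probs = punct_probs[unit]
        mark = max(probs, key=probs.get)  # most probable mark for this unit
        out.append(unit + mark)
    return "".join(out)
```

For two units whose most probable marks are a comma and a question mark, the output is the punctuated text with those marks attached.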
  • Publication number: 20130162532
    Abstract: A method and system for gesture-based human-machine interaction, and a computer-readable medium, are provided. The system includes a capturing module, a positioning module, and a transforming module. The method includes the steps of: capturing images from a user's video streams; positioning the coordinates of three or more predetermined color blocks in the foreground; simulating movements of a mouse according to the coordinates of the first color block; and simulating click actions of the mouse according to the coordinates of the other color blocks. The embodiments of the current disclosure position the coordinates of a plurality of color blocks by processing the captured video streams, and simulate mouse actions according to those coordinates. Processing apparatuses such as computers can thereby support gesture-based human-machine interaction in a very simple way, and a touch-sensitive interaction effect can be simulated without the presence of a touch screen.
    Type: Application
    Filed: August 16, 2011
    Publication date: June 27, 2013
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Tong Cheng, Shuai Yue
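
The mouse-movement simulation in publication 20130162532 requires mapping a tracked color block's position in the camera frame to screen coordinates. A minimal sketch of that coordinate transform (the function name and the mirroring default are assumptions; click simulation from the other color blocks is not shown):

```python
def camera_to_screen(cx, cy, cam_size, screen_size, mirror=True):
    """Map a color block's centroid (cx, cy) in the camera frame to
    screen coordinates for cursor simulation.

    The x axis is mirrored by default so the cursor follows the user's
    hand as in a mirror.
    """
    cam_w, cam_h = cam_size
    scr_w, scr_h = screen_size
    if mirror:
        cx = cam_w - 1 - cx  # flip horizontally within the frame
    x = int(cx * scr_w / cam_w)
    y = int(cy * scr_h / cam_h)
    return x, y
```

A centroid at the center of a 640x480 frame maps to the center of a 1920x1080 screen.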