Patents by Inventor Guangsen WANG

Guangsen WANG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11972754
    Abstract: Methods and apparatuses are provided for performing sequence-to-sequence (Seq2Seq) speech recognition training by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training (see the sketch after this entry).
    Type: Grant
    Filed: December 22, 2021
    Date of Patent: April 30, 2024
    Assignee: TENCENT AMERICA LLC
    Inventors: Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
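A minimal sketch of the joint objective the abstract above describes: a shared encoder feeds both a CTC branch and an attention branch, and the two independently computed losses are interpolated. All module sizes, the interpolation weight `w`, and the reuse of label id 0 as a start symbol are illustrative assumptions, not details from the patent.

```python
import torch
import torch.nn as nn

class JointCTCAttention(nn.Module):
    """Shared encoder with a CTC branch and an attention branch."""

    def __init__(self, n_feats=80, d=256, n_labels=32, blank=0):
        super().__init__()
        self.encoder = nn.LSTM(n_feats, d, batch_first=True)
        self.ctc_head = nn.Linear(d, n_labels)              # CTC branch
        self.embed = nn.Embedding(n_labels, d)               # attention branch
        self.cross_att = nn.MultiheadAttention(d, 4, batch_first=True)
        self.att_head = nn.Linear(d, n_labels)
        self.ctc_loss = nn.CTCLoss(blank=blank)
        self.ce_loss = nn.CrossEntropyLoss()

    def forward(self, feats, feat_lens, targets, tgt_lens, w=0.3):
        # Encode the input frames into a sequence of hidden states.
        enc, _ = self.encoder(feats)                         # (B, T, d)
        # CTC training on the shared hidden states.
        logp = self.ctc_head(enc).log_softmax(-1)            # (B, T, V)
        l_ctc = self.ctc_loss(logp.transpose(0, 1), targets,
                              feat_lens, tgt_lens)
        # Attention training: teacher-forced label queries attend over
        # the same hidden states (label id 0 doubles as a start symbol).
        sos = targets.new_zeros(targets.size(0), 1)
        q = self.embed(torch.cat([sos, targets[:, :-1]], dim=1))
        ctx, _ = self.cross_att(q, enc, enc)                 # (B, L, d)
        logits = self.att_head(ctx)
        l_att = self.ce_loss(logits.reshape(-1, logits.size(-1)),
                             targets.reshape(-1))
        # Interpolate the two independently trained objectives.
        return w * l_ctc + (1 - w) * l_att

# Toy training step on random features and label sequences.
model = JointCTCAttention()
feats, feat_lens = torch.randn(2, 100, 80), torch.tensor([100, 100])
targets, tgt_lens = torch.randint(1, 32, (2, 12)), torch.tensor([12, 12])
model(feats, feat_lens, targets, tgt_lens).backward()
```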
  • Patent number: 11803618
    Abstract: A method and apparatus are provided for analyzing sequence-to-sequence data, such as sequence-to-sequence speech data or sequence-to-sequence machine translation data, by minimum Bayes risk (MBR) training of a sequence-to-sequence model, with softmax smoothing applied to the N-best generation used in the MBR training (see the sketch after this entry).
    Type: Grant
    Filed: November 17, 2022
    Date of Patent: October 31, 2023
    Assignee: TENCENT AMERICA LLC
    Inventors: Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
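A minimal sketch of the MBR objective over an N-best list, assuming an external beam search has already produced per-hypothesis sequence log-likelihoods and risks (e.g., edit distances against the reference). The single exponent `beta` stands in for the softmax smoothing applied during N-best generation; all names and shapes are illustrative assumptions.

```python
import torch

def mbr_loss(hyp_logliks, risks, beta=0.8):
    """Expected risk of an N-best list under a smoothed hypothesis posterior.

    hyp_logliks: (B, N) model log-likelihoods of the N-best hypotheses.
    risks:       (B, N) task risk per hypothesis, e.g. edit distance
                 against the reference transcript.
    beta:        smoothing exponent; beta < 1 flattens the posterior so
                 that more hypotheses in the list contribute gradient.
    """
    # Renormalize the smoothed scores over the N-best list only.
    posterior = torch.softmax(beta * hyp_logliks, dim=-1)
    # Differentiable expected risk; minimizing it is MBR training.
    return (posterior * risks).sum(-1).mean()

# Toy call: a batch of 4 N-best lists with 8 hypotheses each.
scores = torch.randn(4, 8, requires_grad=True)
risks = torch.randint(0, 10, (4, 8)).float()
mbr_loss(scores, risks).backward()
```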
  • Patent number: 11798534
    Abstract: Embodiments described herein provide an Adapt-and-Adjust (A2) mechanism for multilingual speech recognition models that combines adaptation and adjustment methods in an integrated end-to-end training procedure to improve the models' generalization and mitigate the long-tail issue. Specifically, the multilingual language model mBERT is utilized and converted into an autoregressive transformer decoder. In addition, a cross-attention module is added on top of mBERT's self-attention layer in order to explore the acoustic space in addition to the text space. The joint training of the encoder and mBERT decoder can bridge the semantic gap between speech and text (see the sketch after this entry).
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: October 24, 2023
    Assignee: salesforce.com, inc.
    Inventors: Guangsen Wang, Chu Hong Hoi, Genta Indra Winata
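A minimal sketch of a decoder layer with a cross-attention module added on top of a self-attention layer, letting text-space queries read acoustic encoder states as the abstract describes. Treating `self_attn` as if initialized from a pretrained mBERT layer, and all dimensions, are assumptions for illustration.

```python
import torch
import torch.nn as nn

class AdaptedDecoderLayer(nn.Module):
    """Pretrained self-attention plus a newly added cross-attention module."""

    def __init__(self, d=768, n_heads=12):
        super().__init__()
        # Self-attention: in the described setup this would be initialized
        # from a pretrained mBERT layer (text space).
        self.self_attn = nn.MultiheadAttention(d, n_heads, batch_first=True)
        # Cross-attention: the added module that reads acoustic states.
        self.cross_attn = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(d) for _ in range(3))
        self.ffn = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(),
                                 nn.Linear(4 * d, d))

    def forward(self, tgt, enc, tgt_mask=None):
        # Causal self-attention keeps the converted decoder autoregressive.
        x = self.norm1(tgt + self.self_attn(tgt, tgt, tgt,
                                            attn_mask=tgt_mask)[0])
        # Text-space queries attend over acoustic encoder states.
        x = self.norm2(x + self.cross_attn(x, enc, enc)[0])
        return self.norm3(x + self.ffn(x))

# Toy forward pass: 10 text tokens attending over 50 acoustic frames.
layer = AdaptedDecoderLayer()
txt = torch.randn(2, 10, 768)            # embedded text tokens
aud = torch.randn(2, 50, 768)            # acoustic encoder hidden states
causal = torch.triu(torch.ones(10, 10, dtype=torch.bool), diagonal=1)
out = layer(txt, aud, tgt_mask=causal)   # -> (2, 10, 768)
```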
  • Publication number: 20230237275
    Abstract: Embodiments provide a software framework for evaluating and troubleshooting real-world task-oriented bot systems. Specifically, the evaluation framework includes a generator that infers dialog acts and entities from bot definitions and generates test cases for the system via model-based paraphrasing. The framework may also include a simulator for task-oriented dialog user simulation that supports both regression testing and end-to-end evaluation, and a remediator to analyze and visualize the simulation results, remedy some of the identified issues, and provide actionable suggestions for improving the task-oriented dialog system (see the sketch after this entry).
    Type: Application
    Filed: June 2, 2022
    Publication date: July 27, 2023
    Inventors: Guangsen Wang, Samson Min Rong Tan, Shafiq Rayhan Joty, Gang Wu, Chu Hong Hoi, Ka Chun Au
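A minimal sketch of the generator, simulator, and remediator stages wired into one pipeline. Every class and function name here is a hypothetical placeholder, not the framework's actual API; the paraphrasing model and the real bot interface are elided.

```python
from dataclasses import dataclass, field

@dataclass
class TestCase:
    dialog_act: str
    entities: dict
    utterance: str            # paraphrased user turn exercising the act

@dataclass
class SimulationResult:
    case: TestCase
    success: bool
    transcript: list = field(default_factory=list)

def generate_cases(bot_definition):
    """Infer dialog acts/entities from the bot definition and paraphrase
    them into test utterances (paraphrasing model elided here)."""
    for act, entities in bot_definition.items():
        yield TestCase(act, entities, f"user utterance exercising {act}")

def simulate(bot, cases):
    """Drive the bot with each test case and record the outcome."""
    for case in cases:
        reply = bot(case.utterance)
        yield SimulationResult(case, success=case.dialog_act in reply,
                               transcript=[case.utterance, reply])

def remediate(results):
    """Aggregate failures into actionable suggestions."""
    failures = [r for r in results if not r.success]
    return [f"act '{r.case.dialog_act}' unhandled; review its intent "
            f"training data" for r in failures]

# Toy end-to-end run with a trivial stand-in bot.
bot_def = {"book_flight": {"city": "SFO"}, "cancel": {}}
toy_bot = lambda u: "book_flight confirmed" if "book_flight" in u else "sorry"
print(remediate(list(simulate(toy_bot, generate_cases(bot_def)))))
```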
  • Publication number: 20230092440
    Abstract: A method and apparatus are provided for analyzing sequence-to-sequence data, such as sequence-to-sequence speech data or sequence-to-sequence machine translation data, by minimum Bayes risk (MBR) training of a sequence-to-sequence model, with softmax smoothing applied to the N-best generation used in the MBR training.
    Type: Application
    Filed: November 17, 2022
    Publication date: March 23, 2023
    Applicant: TENCENT AMERICA LLC
    Inventors: Chao WENG, Jia CUI, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
  • Patent number: 11551136
    Abstract: A method and apparatus are provided for analyzing sequence-to-sequence data, such as sequence-to-sequence speech data or sequence-to-sequence machine translation data, by minimum Bayes risk (MBR) training of a sequence-to-sequence model, with softmax smoothing applied to the N-best generation used in the MBR training.
    Type: Grant
    Filed: November 14, 2018
    Date of Patent: January 10, 2023
    Assignee: TENCENT AMERICA LLC
    Inventors: Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
  • Publication number: 20220115005
    Abstract: Methods and apparatuses are provided for performing sequence-to-sequence (Seq2Seq) speech recognition training by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
    Type: Application
    Filed: December 22, 2021
    Publication date: April 14, 2022
    Applicant: TENCENT AMERICA LLC
    Inventors: Jia CUI, Chao WENG, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
  • Publication number: 20220108688
    Abstract: Embodiments described herein provide an Adapt-and-Adjust (A2) mechanism for multilingual speech recognition models that combines adaptation and adjustment methods in an integrated end-to-end training procedure to improve the models' generalization and mitigate the long-tail issue. Specifically, the multilingual language model mBERT is utilized and converted into an autoregressive transformer decoder. In addition, a cross-attention module is added on top of mBERT's self-attention layer in order to explore the acoustic space in addition to the text space. The joint training of the encoder and mBERT decoder can bridge the semantic gap between speech and text.
    Type: Application
    Filed: January 29, 2021
    Publication date: April 7, 2022
    Inventors: Guangsen Wang, Chu Hong Hoi, Genta Indra Winata
  • Patent number: 11257481
    Abstract: Methods and apparatuses are provided for performing sequence-to-sequence (Seq2Seq) speech recognition training by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
    Type: Grant
    Filed: October 24, 2018
    Date of Patent: February 22, 2022
    Assignee: TENCENT AMERICA LLC
    Inventors: Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
  • Patent number: 10672382
    Abstract: Methods and apparatuses are provided for performing end-to-end speech recognition training by at least one processor. The method includes receiving, by the at least one processor, one or more input speech frames; generating, by the at least one processor, a sequence of encoder hidden states by transforming the input speech frames; computing, by the at least one processor, attention weights based on each of the sequence of encoder hidden states and a current decoder hidden state; performing, by the at least one processor, a decoding operation based on previous embedded label prediction information and previous attentional hidden state information generated based on the attention weights; and generating current embedded label prediction information based on a result of the decoding operation and the attention weights (see the sketch after this entry).
    Type: Grant
    Filed: October 15, 2018
    Date of Patent: June 2, 2020
    Assignee: TENCENT AMERICA LLC
    Inventors: Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
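A minimal sketch of one attention-based decoding step as the abstract describes it, in the style of input-feeding attention: the previous embedded label prediction and the previous attentional hidden state drive the recurrence, and attention weights are computed from the encoder hidden states and the current decoder hidden state. All dimensions and the dot-product scoring are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionDecoderStep(nn.Module):
    """One decoding step: previous label and previous attentional state in,
    attention weights and a new label prediction out."""

    def __init__(self, d=256, n_labels=32):
        super().__init__()
        self.embed = nn.Embedding(n_labels, d)
        # Recurrence over [previous label embedding; previous attentional state].
        self.cell = nn.LSTMCell(2 * d, d)
        self.att_proj = nn.Linear(2 * d, d)   # fuse decoder state and context
        self.out = nn.Linear(d, n_labels)

    def forward(self, enc, prev_label, prev_att, state):
        # Update the decoder state from the previous embedded label
        # prediction and the previous attentional hidden state.
        h, c = self.cell(torch.cat([self.embed(prev_label), prev_att], -1),
                         state)
        # Attention weights from each encoder hidden state and the
        # current decoder hidden state (dot-product scoring).
        weights = torch.bmm(enc, h.unsqueeze(-1)).squeeze(-1).softmax(-1)
        context = torch.bmm(weights.unsqueeze(1), enc).squeeze(1)
        # New attentional hidden state and current label prediction.
        att = torch.tanh(self.att_proj(torch.cat([h, context], -1)))
        return self.out(att), att, (h, c)

# Toy single step over 50 encoder hidden states.
step = AttentionDecoderStep()
enc = torch.randn(2, 50, 256)
state = (torch.zeros(2, 256), torch.zeros(2, 256))
logits, att, state = step(enc, torch.zeros(2, dtype=torch.long),
                          torch.zeros(2, 256), state)
```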
  • Publication number: 20200151623
    Abstract: A method and apparatus are provided for analyzing sequence-to-sequence data, such as sequence-to-sequence speech data or sequence-to-sequence machine translation data, by minimum Bayes risk (MBR) training of a sequence-to-sequence model, with softmax smoothing applied to the N-best generation used in the MBR training.
    Type: Application
    Filed: November 14, 2018
    Publication date: May 14, 2020
    Applicant: TENCENT AMERICA LLC
    Inventors: Chao WENG, Jia CUI, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
  • Publication number: 20200135174
    Abstract: Methods and apparatuses are provided for performing sequence-to-sequence (Seq2Seq) speech recognition training by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
    Type: Application
    Filed: October 24, 2018
    Publication date: April 30, 2020
    Applicant: TENCENT AMERICA LLC
    Inventors: Jia CUI, Chao WENG, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
  • Publication number: 20200118547
    Abstract: Methods and apparatuses are provided for performing end-to-end speech recognition training by at least one processor. The method includes receiving, by the at least one processor, one or more input speech frames; generating, by the at least one processor, a sequence of encoder hidden states by transforming the input speech frames; computing, by the at least one processor, attention weights based on each of the sequence of encoder hidden states and a current decoder hidden state; performing, by the at least one processor, a decoding operation based on previous embedded label prediction information and previous attentional hidden state information generated based on the attention weights; and generating current embedded label prediction information based on a result of the decoding operation and the attention weights.
    Type: Application
    Filed: October 15, 2018
    Publication date: April 16, 2020
    Applicant: TENCENT AMERICA LLC
    Inventors: Chao WENG, Jia CUI, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU