Patents by Inventor Dan Su

Dan Su has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210021925
    Abstract: A far-field pickup device including a device body and a microphone pickup unit is provided. The microphone pickup unit is configured to collect user speech and an echo of a first sound signal output by the device body, and transmit, to the device body, a signal obtained through digital conversion of the collected user speech and the echo. The device body includes a signal playback source, a synchronizing signal generator, a horn, a delay determining unit, and an echo cancellation unit configured to perform echo cancellation on the signal transmitted by the microphone pickup unit to obtain a collected human voice signal.
    Type: Application
    Filed: September 25, 2020
    Publication date: January 21, 2021
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LTD
    Inventors: Ji Meng ZHENG, Meng YU, Dan SU
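The echo-cancellation step described above can be sketched with a normalized LMS (NLMS) adaptive filter, one standard way to subtract the loudspeaker echo from the microphone signal; the function name, tap count, and step size below are illustrative, not from the patent.

```python
def nlms_echo_cancel(mic, ref, taps=4, mu=0.5, eps=1e-8):
    """Minimal NLMS sketch: adaptively filter the reference (playback)
    signal and subtract the estimate from the microphone signal, so the
    residual approximates the collected human voice."""
    w = [0.0] * taps                        # adaptive filter coefficients
    out = []
    for n in range(len(mic)):
        x = [ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        echo_est = sum(wi * xi for wi, xi in zip(w, x))
        e = mic[n] - echo_est               # residual = voice estimate
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        out.append(e)
    return out
```

On a pure-echo input (microphone = scaled playback, no near-end voice) the residual decays toward zero as the filter converges.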
  • Publication number: 20210005216
    Abstract: A multi-person speech separation method is provided for a terminal. The method includes extracting a hybrid speech feature from a hybrid speech signal requiring separation, N human voices being mixed in the hybrid speech signal, N being a positive integer greater than or equal to 2; extracting a masking coefficient of the hybrid speech feature by using a generative adversarial network (GAN) model, to obtain a masking matrix corresponding to the N human voices, wherein the GAN model comprises a generative network model and an adversarial network model; and performing a speech separation on the masking matrix corresponding to the N human voices and the hybrid speech signal by using the GAN model, and outputting N separated speech signals corresponding to the N human voices.
    Type: Application
    Filed: September 17, 2020
    Publication date: January 7, 2021
    Inventors: Lianwu CHEN, Meng YU, Yanmin QIAN, Dan SU, Dong YU
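The final masking step above reduces to an element-wise product between each speaker's mask and the mixture representation; a minimal sketch (shapes and names assumed, and the masks here are hand-picked rather than GAN-estimated):

```python
def apply_masks(mixture, masks):
    """mixture: T frames x F bins of magnitudes; masks: N arrays of the
    same shape. Each separated source is mask * mixture, element-wise."""
    return [[[m[t][f] * mixture[t][f] for f in range(len(mixture[t]))]
             for t in range(len(mixture))]
            for m in masks]
```

When the N masks sum to 1 in every time-frequency bin, the separated sources sum back to the mixture.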
  • Publication number: 20200393896
    Abstract: A method for gaze estimation. The method includes processing training data and determining one or more local-learning-based gaze estimation models from the training data. The model(s) can be used to determine 2D gaze points in a scene image, 3D gaze points in scene camera coordinates, or both.
    Type: Application
    Filed: June 17, 2019
    Publication date: December 17, 2020
    Inventors: Youfu Li, Dan Su
  • Publication number: 20200380949
    Abstract: This application relates to a speech synthesis method and apparatus, a model training method and apparatus, and a computer device. The method includes: obtaining to-be-processed linguistic data; encoding the linguistic data, to obtain encoded linguistic data; obtaining an embedded vector for speech feature conversion, the embedded vector being generated according to a residual between synthesized reference speech data and reference speech data that correspond to the same reference linguistic data; and decoding the encoded linguistic data according to the embedded vector, to obtain target synthesized speech data on which the speech feature conversion is performed. The solution provided in this application can prevent quality of a synthesized speech from being affected by a semantic feature in a mel-frequency cepstrum.
    Type: Application
    Filed: August 21, 2020
    Publication date: December 3, 2020
    Inventors: Xixin WU, Mu WANG, Shiyin KANG, Dan SU, Dong YU
  • Publication number: 20200372905
    Abstract: A mixed speech recognition method, a mixed speech recognition apparatus, and a computer-readable storage medium are provided. The mixed speech recognition method includes: monitoring an input of speech input and detecting an enrollment speech and a mixed speech; acquiring speech features of a target speaker based on the enrollment speech; and determining speech belonging to the target speaker in the mixed speech based on the speech features of the target speaker. The enrollment speech includes preset speech information, and the mixed speech is non-enrollment speech inputted after the enrollment speech.
    Type: Application
    Filed: August 10, 2020
    Publication date: November 26, 2020
    Inventors: Jun WANG, Jie CHEN, Dan SU, Dong YU
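One way to picture the enrollment-then-extraction flow: derive a speaker signature from the enrollment speech, then keep the mixed-speech frames most similar to it. Everything below (mean-pooled embedding, cosine score, fixed threshold) is an illustrative simplification, not the patent's actual model.

```python
def extract_target(enroll_frames, mixed_frames, threshold=0.5):
    # Speaker "signature": mean of the enrollment feature frames.
    emb = [sum(col) / len(enroll_frames) for col in zip(*enroll_frames)]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb + 1e-8)

    # Keep the frames of the mixture that match the target speaker.
    return [f for f in mixed_frames if cosine(f, emb) > threshold]
```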
  • Publication number: 20200357386
    Abstract: A method for detecting a keyword, applied to a terminal, includes: extracting a speech eigenvector of a speech signal; obtaining, according to the speech eigenvector, a posterior probability of each target character being a key character in any keyword in an acquisition time period of the speech signal; obtaining confidences of at least two target character combinations according to the posterior probability of each target character; and determining that the speech signal includes the keyword upon determining that all the confidences of the at least two target character combinations meet a preset condition. The target character is a character in the speech signal whose pronunciation matches a pronunciation of the key character. Each target character combination includes at least one target character, and a confidence of a target character combination represents a probability of the target character combination being the keyword or a part of the keyword.
    Type: Application
    Filed: July 20, 2020
    Publication date: November 12, 2020
    Inventors: Yi GAO, Meng YU, Dan SU, Jie CHEN, Min LUO
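The confidence computation above can be sketched as scoring each target character combination from the per-character posteriors; the geometric-mean scoring rule and the threshold value are assumptions made for illustration, not taken from the patent.

```python
def keyword_detected(posteriors, combos, threshold=0.4):
    """posteriors: posterior probability per target character.
    combos: character combinations to score. The keyword is detected
    only if every combination's confidence meets the threshold."""
    confs = []
    for combo in combos:
        p = 1.0
        for ch in combo:
            p *= posteriors[ch]
        confs.append(p ** (1.0 / len(combo)))   # geometric mean
    return all(c >= threshold for c in confs), confs
```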
  • Publication number: 20200357389
    Abstract: A data processing method based on simultaneous interpretation, applied to a server in a simultaneous interpretation system, including: obtaining audio transmitted by a simultaneous interpretation device; processing the audio by using a simultaneous interpretation model to obtain an initial text; transmitting the initial text to a user terminal; receiving a modified text fed back by the user terminal, the modified text being obtained after the user terminal modifies the initial text; and updating the simultaneous interpretation model according to the initial text and the modified text.
    Type: Application
    Filed: July 28, 2020
    Publication date: November 12, 2020
    Inventors: Jingliang BAI, Caisheng OUYANG, Haikang LIU, Lianwu CHEN, Qi CHEN, Yulu ZHANG, Min LUO, Dan SU
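The feedback loop above (initial text out, human-corrected text back, model updated) can be caricatured as a correction store consulted at decode time; a real system would retrain or fine-tune the interpretation model, and the dict-based "model" here is purely hypothetical.

```python
def update_model(model, initial_text, modified_text):
    """Record a user correction so later output can be post-edited."""
    if initial_text != modified_text:
        model.setdefault("corrections", {})[initial_text] = modified_text
    return model

def postprocess(model, text):
    """Apply any stored correction to freshly decoded text."""
    return model.get("corrections", {}).get(text, text)
```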
  • Publication number: 20200286465
    Abstract: A speech keyword recognition method includes: obtaining first speech segments based on a to-be-recognized speech signal; and obtaining first probabilities respectively corresponding to the first speech segments by using a preset first classification model. The first probability of a first speech segment is derived from the probabilities of that segment corresponding to predetermined word segmentation units of a predetermined keyword.
    Type: Application
    Filed: May 27, 2020
    Publication date: September 10, 2020
    Inventors: Jun WANG, Dan SU, Dong YU
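Obtaining the first speech segments is, at heart, a sliding-window split of the signal before each segment is scored by the first classification model; the window and hop sizes below are arbitrary examples.

```python
def sliding_segments(frames, win, hop):
    """Cut a frame sequence into overlapping segments of length `win`,
    advancing by `hop` frames each time."""
    return [frames[i:i + win] for i in range(0, len(frames) - win + 1, hop)]
```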
  • Publication number: 20200224483
    Abstract: To address deficiencies in the prior art, the present invention provides a laser sensor for human body safety protection in a revolving door, comprising a laser scan range calculation section and an application analysis section. The laser scan range calculation section comprises a laser emitting device, a laser deflecting device, an optical signal receiving device, and an analyzing and processing device. The laser emitting device emits laser signals to the laser deflecting device; the laser deflecting device deflects the laser signals by a preset angle to form at least one laser scan area; and the optical signal receiving device receives the returned laser signals and transmits them to the analyzing and processing device. The analyzing and processing device comprises a trigger point distance analyzing module, which derives trigger point distance information from the signals sent by the optical signal receiving device.
    Type: Application
    Filed: January 29, 2018
    Publication date: July 16, 2020
    Applicant: B.E.A ELECTRONICS (BEIJING) CO., LTD.
    Inventors: Dan Su, Jixiang Liu, Yuhe Xu
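The trigger point distance analysis rests on laser time of flight: distance = c * t / 2 for a round-trip time t. A one-line sketch (the module's real signal processing is of course more involved):

```python
def trigger_distance(round_trip_seconds, c=299_792_458.0):
    """Distance to the trigger point from the laser round-trip time."""
    return c * round_trip_seconds / 2.0
```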
  • Patent number: 10672382
    Abstract: Methods and apparatuses are provided for performing end-to-end speech recognition training performed by at least one processor. The method includes receiving, by the at least one processor, one or more input speech frames, generating, by the at least one processor, a sequence of encoder hidden states by transforming the input speech frames, computing, by the at least one processor, attention weights based on each of the sequence of encoder hidden states and a current decoder hidden state, performing, by the at least one processor, a decoding operation based on a previous embedded label prediction information and a previous attentional hidden state information generated based on the attention weights; and generating a current embedded label prediction information based on a result of the decoding operation and the attention weights.
    Type: Grant
    Filed: October 15, 2018
    Date of Patent: June 2, 2020
    Assignee: TENCENT AMERICA LLC
    Inventors: Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
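The attention-weight computation described above can be sketched with dot-product scoring (one standard choice; the abstract does not commit to a particular scoring function): score every encoder hidden state against the current decoder hidden state, then normalize with a softmax.

```python
import math

def attention_weights(encoder_states, decoder_state):
    """Softmax-normalized dot-product scores over encoder hidden states."""
    scores = [sum(e * d for e, d in zip(h, decoder_state))
              for h in encoder_states]
    m = max(scores)                        # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]
```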
  • Publication number: 20200151623
    Abstract: A method and apparatus are provided for analyzing sequence-to-sequence data, such as sequence-to-sequence speech data or sequence-to-sequence machine translation data, by training a sequence-to-sequence model with minimum Bayes risk (MBR) training and applying softmax smoothing to the N-best generation stage of the MBR training.
    Type: Application
    Filed: November 14, 2018
    Publication date: May 14, 2020
    Applicant: TENCENT America LLC
    Inventors: Chao WENG, Jia CUI, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
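Softmax smoothing, as applied during N-best generation in MBR training, raises the softmax temperature so the model's output distribution is flattened and more diverse hypotheses survive into the N-best list; the beta value below is an illustrative default, not a value from the patent.

```python
import math

def smoothed_softmax(logits, beta=0.8):
    """Softmax with smoothing exponent beta < 1; beta = 1 recovers the
    ordinary softmax, while smaller beta flattens the distribution."""
    m = max(logits)
    exps = [math.exp(beta * (x - m)) for x in logits]
    z = sum(exps)
    return [x / z for x in exps]
```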
  • Publication number: 20200135174
    Abstract: Methods and apparatuses are provided for performing sequence to sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
    Type: Application
    Filed: October 24, 2018
    Publication date: April 30, 2020
    Applicant: TENCENT AMERICA LLC
    Inventors: Jia CUI, Chao WENG, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
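Training CTC and attention models on the same encoder hidden states is usually realized as a weighted sum of the two losses back-propagated through the shared encoder; the interpolation weight below is an assumed example, not a value from the patent.

```python
def joint_loss(ctc_loss, attention_loss, lam=0.3):
    """Interpolate the CTC and attention losses; gradients from both
    objectives then flow into the shared encoder."""
    return lam * ctc_loss + (1.0 - lam) * attention_loss
```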
  • Publication number: 20200118547
    Abstract: Methods and apparatuses are provided for performing end-to-end speech recognition training performed by at least one processor. The method includes receiving, by the at least one processor, one or more input speech frames, generating, by the at least one processor, a sequence of encoder hidden states by transforming the input speech frames, computing, by the at least one processor, attention weights based on each of the sequence of encoder hidden states and a current decoder hidden state, performing, by the at least one processor, a decoding operation based on a previous embedded label prediction information and a previous attentional hidden state information generated based on the attention weights; and generating a current embedded label prediction information based on a result of the decoding operation and the attention weights.
    Type: Application
    Filed: October 15, 2018
    Publication date: April 16, 2020
    Applicant: TENCENT AMERICA LLC
    Inventors: Chao WENG, Jia Cui, Guangsen WANG, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
  • Publication number: 20200051549
    Abstract: Embodiments of the present invention provide a speech signal processing model training method, an electronic device and a storage medium. The method determines a target training loss function based on a training loss function of each of one or more speech signal processing tasks; inputs a task input feature of each speech signal processing task into a starting multi-task neural network; and updates model parameters of a shared layer and of each task layer of the starting multi-task neural network corresponding to the one or more speech signal processing tasks, minimizing the target training loss function as the training objective, until the network converges, thereby obtaining a speech signal processing model.
    Type: Application
    Filed: October 17, 2019
    Publication date: February 13, 2020
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Lianwu Chen, Meng Yu, Min Luo, Dan Su
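The target training loss described above combines the per-task losses into one objective; a minimal sketch assuming a simple weighted sum (the weighting scheme is not specified in the abstract):

```python
def target_training_loss(task_losses, weights=None):
    """Weighted sum of per-task losses; the shared layer is trained by
    minimizing this single objective across all tasks."""
    if weights is None:
        weights = [1.0 / len(task_losses)] * len(task_losses)
    return sum(w * l for w, l in zip(weights, task_losses))
```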
  • Patent number: 10551712
    Abstract: Disclosed is a display apparatus. The display apparatus comprises a display and a field-induced visibility-controlling layer provided on the light-outgoing side of the display. The field-induced visibility-controlling layer can be switched between a transparent state and a mirror state by adjusting the applied voltage, such that when the layer is in the transparent state, the display is visible through it, and when the layer is in the mirror state, a mirror shielding the display is formed.
    Type: Grant
    Filed: October 12, 2016
    Date of Patent: February 4, 2020
    Assignees: BOE TECHNOLOGY GROUP CO., LTD., BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD.
    Inventors: Zhipeng Feng, Dan Su, Zongze He, Shuo Li, Jianguang Yang, Liang Zhang
  • Publication number: 20190214020
    Abstract: The present disclosure relates to a method, apparatus, and system for speaker verification. The method includes: acquiring an audio recording; extracting speech signals from the audio recording; extracting features of the extracted speech signals; and determining whether the extracted speech signals represent speech by a predetermined speaker based on the extracted features and a speaker model trained with reference voice data of the predetermined speaker.
    Type: Application
    Filed: March 14, 2019
    Publication date: July 11, 2019
    Applicant: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventors: Jie CHEN, Dan SU, Tianxiao FU, Na HU
  • Patent number: 10276167
    Abstract: The present disclosure relates to a method, apparatus, and system for speaker verification. The method includes: acquiring an audio recording; extracting speech signals from the audio recording; extracting features of the extracted speech signals; and determining whether the extracted speech signals represent speech by a predetermined speaker based on the extracted features and a speaker model trained with reference voice data of the predetermined speaker.
    Type: Grant
    Filed: January 17, 2018
    Date of Patent: April 30, 2019
    Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventors: Jie Chen, Dan Su, Tianxiao Fu, Na Hu
  • Patent number: 10277589
    Abstract: The present invention discloses a voiceprint verification method, apparatus, storage medium and device.
    Type: Grant
    Filed: November 3, 2015
    Date of Patent: April 30, 2019
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Dan Su, Yong Guan
  • Patent number: 10210817
    Abstract: The present application discloses a circuit for driving a light emitting diode light source assembly having a plurality of light emitting diode groups, each group having at least one light emitting diode. The circuit includes a processor configured to determine a set brightness level, calculate the number of light emitting diode groups required to be on to achieve the set brightness level, and select that number of light emitting diode groups to be turned on at allocated positions in the light emitting diode light source assembly; and a driving sub-circuit configured to turn on the selected light emitting diode groups at the allocated positions. The number of light emitting diode groups to be turned on is a positive integer N, where N is less than the total number of light emitting diode groups.
    Type: Grant
    Filed: November 14, 2016
    Date of Patent: February 19, 2019
    Assignees: BOE TECHNOLOGY GROUP CO., LTD., BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD.
    Inventors: Liang Zhang, Shuo Li, Zhipeng Feng, Jianguang Yang, Dan Su, Zongze He, Jieqiong Wang
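The group-count calculation reduces to proportional scaling of the set brightness level, assuming each LED group contributes equal brightness (an illustrative simplification; the patent's processor may use a different mapping):

```python
def groups_to_turn_on(set_brightness, max_brightness, total_groups):
    """Positive integer N of LED groups needed for the set brightness,
    capped at the total number of groups and never below one."""
    n = round(set_brightness / max_brightness * total_groups)
    return max(1, min(n, total_groups))
```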
  • Publication number: 20180358020
    Abstract: The present disclosure relates to a method, apparatus, and system for speaker verification. The method includes: acquiring an audio recording; extracting speech signals from the audio recording; extracting features of the extracted speech signals; and determining whether the extracted speech signals represent speech by a predetermined speaker based on the extracted features and a speaker model trained with reference voice data of the predetermined speaker.
    Type: Application
    Filed: January 17, 2018
    Publication date: December 13, 2018
    Applicant: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventors: Jie CHEN, Dan SU, Tianxiao FU, Na HU