Patents by Inventor Dan Su

Dan Su has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210021925
    Abstract: A far-field pickup device including a device body and a microphone pickup unit is provided. The microphone pickup unit is configured to collect user speech and an echo of a first sound signal output by the device body, and transmit, to the device body, a signal obtained through digital conversion of the collected user speech and the echo. The device body includes a signal playback source, a synchronizing signal generator, a horn, a delay determining unit, and an echo cancellation unit configured to perform echo cancellation on the signal transmitted by the microphone pickup unit to obtain a collected human voice signal.
    Type: Application
    Filed: September 25, 2020
    Publication date: January 21, 2021
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LTD
    Inventors: Ji Meng ZHENG, Meng YU, Dan SU
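The echo-cancellation step described above can be sketched with a normalized LMS (NLMS) adaptive filter, one standard way to subtract the loudspeaker echo from the microphone signal; the function name, tap count, and step size below are illustrative, not from the patent.

```python
def nlms_echo_cancel(mic, ref, taps=4, mu=0.5, eps=1e-8):
    """Minimal NLMS sketch: adaptively filter the reference (playback)
    signal and subtract the estimate from the microphone signal, so the
    residual approximates the collected human voice."""
    w = [0.0] * taps                        # adaptive filter coefficients
    out = []
    for n in range(len(mic)):
        x = [ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        echo_est = sum(wi * xi for wi, xi in zip(w, x))
        e = mic[n] - echo_est               # residual = voice estimate
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        out.append(e)
    return out
```

On a pure-echo input (microphone = scaled playback, no near-end voice) the residual decays toward zero as the filter converges.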
  • Publication number: 20210005216
    Abstract: A multi-person speech separation method is provided for a terminal. The method includes extracting a hybrid speech feature from a hybrid speech signal requiring separation, N human voices being mixed in the hybrid speech signal, N being a positive integer greater than or equal to 2; extracting a masking coefficient of the hybrid speech feature by using a generative adversarial network (GAN) model, to obtain a masking matrix corresponding to the N human voices, wherein the GAN model comprises a generative network model and an adversarial network model; and performing a speech separation on the masking matrix corresponding to the N human voices and the hybrid speech signal by using the GAN model, and outputting N separated speech signals corresponding to the N human voices.
    Type: Application
    Filed: September 17, 2020
    Publication date: January 7, 2021
    Inventors: Lianwu CHEN, Meng YU, Yanmin QIAN, Dan SU, Dong YU
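The final masking step above reduces to an element-wise product between each speaker's mask and the mixture representation; a minimal sketch (shapes and names assumed, and the masks here are hand-picked rather than GAN-estimated):

```python
def apply_masks(mixture, masks):
    """mixture: T frames x F bins of magnitudes; masks: N arrays of the
    same shape. Each separated source is mask * mixture, element-wise."""
    return [[[m[t][f] * mixture[t][f] for f in range(len(mixture[t]))]
             for t in range(len(mixture))]
            for m in masks]
```

When the N masks sum to 1 in every time-frequency bin, the separated sources sum back to the mixture.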
  • Publication number: 20200393896
    Abstract: A method for gaze estimation. The method includes processing training data and determining one or more local-learning-based gaze estimation models from the training data. The model(s) can be used to determine 2D gaze points in a scene image, 3D gaze points in scene camera coordinates, or both.
    Type: Application
    Filed: June 17, 2019
    Publication date: December 17, 2020
    Inventors: Youfu Li, Dan Su
  • Publication number: 20200380949
    Abstract: This application relates to a speech synthesis method and apparatus, a model training method and apparatus, and a computer device. The method includes: obtaining to-be-processed linguistic data; encoding the linguistic data, to obtain encoded linguistic data; obtaining an embedded vector for speech feature conversion, the embedded vector being generated according to a residual between synthesized reference speech data and reference speech data that correspond to the same reference linguistic data; and decoding the encoded linguistic data according to the embedded vector, to obtain target synthesized speech data on which the speech feature conversion is performed. The solution provided in this application can prevent quality of a synthesized speech from being affected by a semantic feature in a mel-frequency cepstrum.
    Type: Application
    Filed: August 21, 2020
    Publication date: December 3, 2020
    Inventors: Xixin WU, Mu WANG, Shiyin KANG, Dan SU, Dong YU
  • Publication number: 20200372905
    Abstract: A mixed speech recognition method, a mixed speech recognition apparatus, and a computer-readable storage medium are provided. The mixed speech recognition method includes: monitoring an input of speech input and detecting an enrollment speech and a mixed speech; acquiring speech features of a target speaker based on the enrollment speech; and determining speech belonging to the target speaker in the mixed speech based on the speech features of the target speaker. The enrollment speech includes preset speech information, and the mixed speech is non-enrollment speech inputted after the enrollment speech.
    Type: Application
    Filed: August 10, 2020
    Publication date: November 26, 2020
    Inventors: Jun WANG, Jie CHEN, Dan SU, Dong YU
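One way to picture the enrollment-then-extraction flow: derive a speaker signature from the enrollment speech, then keep the mixed-speech frames most similar to it. Everything below (mean-pooled embedding, cosine score, fixed threshold) is an illustrative simplification, not the patent's actual model.

```python
def extract_target(enroll_frames, mixed_frames, threshold=0.5):
    # Speaker "signature": mean of the enrollment feature frames.
    emb = [sum(col) / len(enroll_frames) for col in zip(*enroll_frames)]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb + 1e-8)

    # Keep the frames of the mixture that match the target speaker.
    return [f for f in mixed_frames if cosine(f, emb) > threshold]
```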
  • Publication number: 20200357386
    Abstract: A method for detecting a keyword, applied to a terminal, includes: extracting a speech eigenvector of a speech signal; obtaining, according to the speech eigenvector, a posterior probability of each target character being a key character in any keyword in an acquisition time period of the speech signal; obtaining confidences of at least two target character combinations according to the posterior probability of each target character; and determining that the speech signal includes the keyword upon determining that all the confidences of the at least two target character combinations meet a preset condition. The target character is a character in the speech signal whose pronunciation matches a pronunciation of the key character. Each target character combination includes at least one target character, and a confidence of a target character combination represents a probability of the target character combination being the keyword or a part of the keyword.
    Type: Application
    Filed: July 20, 2020
    Publication date: November 12, 2020
    Inventors: Yi GAO, Meng YU, Dan SU, Jie CHEN, Min LUO
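The confidence computation above can be sketched as scoring each target character combination from the per-character posteriors; the geometric-mean scoring rule and the threshold value are assumptions made for illustration, not taken from the patent.

```python
def keyword_detected(posteriors, combos, threshold=0.4):
    """posteriors: posterior probability per target character.
    combos: character combinations to score. The keyword is detected
    only if every combination's confidence meets the threshold."""
    confs = []
    for combo in combos:
        p = 1.0
        for ch in combo:
            p *= posteriors[ch]
        confs.append(p ** (1.0 / len(combo)))   # geometric mean
    return all(c >= threshold for c in confs), confs
```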
  • Publication number: 20200357389
    Abstract: A data processing method based on simultaneous interpretation, applied to a server in a simultaneous interpretation system, including: obtaining audio transmitted by a simultaneous interpretation device; processing the audio by using a simultaneous interpretation model to obtain an initial text; transmitting the initial text to a user terminal; receiving a modified text fed back by the user terminal, the modified text being obtained after the user terminal modifies the initial text; and updating the simultaneous interpretation model according to the initial text and the modified text.
    Type: Application
    Filed: July 28, 2020
    Publication date: November 12, 2020
    Inventors: Jingliang BAI, Caisheng OUYANG, Haikang LIU, Lianwu CHEN, Qi CHEN, Yulu ZHANG, Min LUO, Dan SU
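The feedback loop above (initial text out, human-corrected text back, model updated) can be caricatured as a correction store consulted at decode time; a real system would retrain or fine-tune the interpretation model, and the dict-based "model" here is purely hypothetical.

```python
def update_model(model, initial_text, modified_text):
    """Record a user correction so later output can be post-edited."""
    if initial_text != modified_text:
        model.setdefault("corrections", {})[initial_text] = modified_text
    return model

def postprocess(model, text):
    """Apply any stored correction to freshly decoded text."""
    return model.get("corrections", {}).get(text, text)
```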
  • Publication number: 20200286465
    Abstract: A speech keyword recognition method includes: obtaining first speech segments based on a to-be-recognized speech signal; and obtaining first probabilities respectively corresponding to the first speech segments by using a preset first classification model. The first probability of a first speech segment is derived from the probabilities of that segment corresponding to predetermined word segmentation units of a predetermined keyword.
    Type: Application
    Filed: May 27, 2020
    Publication date: September 10, 2020
    Inventors: Jun WANG, Dan SU, Dong YU
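Obtaining the first speech segments is, at heart, a sliding-window split of the signal before each segment is scored by the first classification model; the window and hop sizes below are arbitrary examples.

```python
def sliding_segments(frames, win, hop):
    """Cut a frame sequence into overlapping segments of length `win`,
    advancing by `hop` frames each time."""
    return [frames[i:i + win] for i in range(0, len(frames) - win + 1, hop)]
```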
  • Publication number: 20200224483
    Abstract: To address deficiencies in the prior art, the present invention provides a laser sensor for human body safety protection in a revolving door, comprising a laser scan range calculation section and an application analysis section. The laser scan range calculation section comprises a laser emitting device, a laser deflecting device, an optical signal receiving device, and an analyzing and processing device. The laser emitting device emits laser signals to the laser deflecting device; the laser deflecting device deflects the laser signals by a preset angle to form at least one laser scan area; and the optical signal receiving device receives the returned laser signals and transmits them to the analyzing and processing device. The analyzing and processing device comprises a trigger point distance analyzing module, which derives trigger point distance information from the signals sent by the optical signal receiving device.
    Type: Application
    Filed: January 29, 2018
    Publication date: July 16, 2020
    Applicant: B.E.A ELECTRONICS (BEIJING) CO., LTD.
    Inventors: Dan Su, Jixiang Liu, Yuhe Xu
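The trigger point distance analysis rests on laser time of flight: distance = c * t / 2 for a round-trip time t. A one-line sketch (the module's real signal processing is of course more involved):

```python
def trigger_distance(round_trip_seconds, c=299_792_458.0):
    """Distance to the trigger point from the laser round-trip time."""
    return c * round_trip_seconds / 2.0
```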
  • Patent number: 10672382
    Abstract: Methods and apparatuses are provided for performing end-to-end speech recognition training performed by at least one processor. The method includes receiving, by the at least one processor, one or more input speech frames, generating, by the at least one processor, a sequence of encoder hidden states by transforming the input speech frames, computing, by the at least one processor, attention weights based on each of the sequence of encoder hidden states and a current decoder hidden state, performing, by the at least one processor, a decoding operation based on a previous embedded label prediction information and a previous attentional hidden state information generated based on the attention weights; and generating a current embedded label prediction information based on a result of the decoding operation and the attention weights.
    Type: Grant
    Filed: October 15, 2018
    Date of Patent: June 2, 2020
    Assignee: TENCENT AMERICA LLC
    Inventors: Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
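The attention-weight computation described above can be sketched with dot-product scoring (one standard choice; the abstract does not commit to a particular scoring function): score every encoder hidden state against the current decoder hidden state, then normalize with a softmax.

```python
import math

def attention_weights(encoder_states, decoder_state):
    """Softmax-normalized dot-product scores over encoder hidden states."""
    scores = [sum(e * d for e, d in zip(h, decoder_state))
              for h in encoder_states]
    m = max(scores)                        # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]
```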
  • Publication number: 20200151623
    Abstract: A method and apparatus are provided for analyzing sequence-to-sequence data, such as sequence-to-sequence speech data or sequence-to-sequence machine translation data, by training a sequence-to-sequence model with minimum Bayes risk (MBR) training and applying softmax smoothing to the N-best generation stage of the MBR training.
    Type: Application
    Filed: November 14, 2018
    Publication date: May 14, 2020
    Applicant: TENCENT America LLC
    Inventors: Chao WENG, Jia CUI, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
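Softmax smoothing, as applied during N-best generation in MBR training, raises the softmax temperature so the model's output distribution is flattened and more diverse hypotheses survive into the N-best list; the beta value below is an illustrative default, not a value from the patent.

```python
import math

def smoothed_softmax(logits, beta=0.8):
    """Softmax with smoothing exponent beta < 1; beta = 1 recovers the
    ordinary softmax, while smaller beta flattens the distribution."""
    m = max(logits)
    exps = [math.exp(beta * (x - m)) for x in logits]
    z = sum(exps)
    return [x / z for x in exps]
```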
  • Publication number: 20200135174
    Abstract: Methods and apparatuses are provided for performing sequence to sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
    Type: Application
    Filed: October 24, 2018
    Publication date: April 30, 2020
    Applicant: TENCENT AMERICA LLC
    Inventors: Jia CUI, Chao WENG, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
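Training CTC and attention models on the same encoder hidden states is usually realized as a weighted sum of the two losses back-propagated through the shared encoder; the interpolation weight below is an assumed example, not a value from the patent.

```python
def joint_loss(ctc_loss, attention_loss, lam=0.3):
    """Interpolate the CTC and attention losses; gradients from both
    objectives then flow into the shared encoder."""
    return lam * ctc_loss + (1.0 - lam) * attention_loss
```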
  • Publication number: 20200118547
    Abstract: Methods and apparatuses are provided for performing end-to-end speech recognition training performed by at least one processor. The method includes receiving, by the at least one processor, one or more input speech frames, generating, by the at least one processor, a sequence of encoder hidden states by transforming the input speech frames, computing, by the at least one processor, attention weights based on each of the sequence of encoder hidden states and a current decoder hidden state, performing, by the at least one processor, a decoding operation based on a previous embedded label prediction information and a previous attentional hidden state information generated based on the attention weights; and generating a current embedded label prediction information based on a result of the decoding operation and the attention weights.
    Type: Application
    Filed: October 15, 2018
    Publication date: April 16, 2020
    Applicant: TENCENT AMERICA LLC
    Inventors: Chao WENG, Jia Cui, Guangsen WANG, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
  • Publication number: 20200051549
    Abstract: Embodiments of the present invention provide a speech signal processing model training method, an electronic device and a storage medium. The method determines a target training loss function based on a training loss function of each of one or more speech signal processing tasks; inputs a task input feature of each speech signal processing task into a starting multi-task neural network; and updates model parameters of a shared layer and of each task layer of the starting multi-task neural network corresponding to the one or more speech signal processing tasks, minimizing the target training loss function as the training objective, until the network converges, thereby obtaining a speech signal processing model.
    Type: Application
    Filed: October 17, 2019
    Publication date: February 13, 2020
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Lianwu Chen, Meng Yu, Min Luo, Dan Su
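The target training loss described above combines the per-task losses into one objective; a minimal sketch assuming a simple weighted sum (the weighting scheme is not specified in the abstract):

```python
def target_training_loss(task_losses, weights=None):
    """Weighted sum of per-task losses; the shared layer is trained by
    minimizing this single objective across all tasks."""
    if weights is None:
        weights = [1.0 / len(task_losses)] * len(task_losses)
    return sum(w * l for w, l in zip(weights, task_losses))
```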
  • Patent number: 10551712
    Abstract: Disclosed is a display apparatus. The display apparatus comprises a display and a field-induced visibility-controlling layer provided on the light-outgoing side of the display. The field-induced visibility-controlling layer can be switched between a transparent state and a mirror state by adjusting the applied voltage, such that when the layer is in the transparent state, the display is visible through it, and when the layer is in the mirror state, a mirror shielding the display is formed.
    Type: Grant
    Filed: October 12, 2016
    Date of Patent: February 4, 2020
    Assignees: BOE TECHNOLOGY GROUP CO., LTD., BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD.
    Inventors: Zhipeng Feng, Dan Su, Zongze He, Shuo Li, Jianguang Yang, Liang Zhang
  • Publication number: 20190214020
    Abstract: The present disclosure relates to a method, apparatus, and system for speaker verification. The method includes: acquiring an audio recording; extracting speech signals from the audio recording; extracting features of the extracted speech signals; and determining whether the extracted speech signals represent speech by a predetermined speaker based on the extracted features and a speaker model trained with reference voice data of the predetermined speaker.
    Type: Application
    Filed: March 14, 2019
    Publication date: July 11, 2019
    Applicant: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventors: Jie CHEN, Dan SU, Tianxiao FU, Na HU
  • Patent number: 10276167
    Abstract: The present disclosure relates to a method, apparatus, and system for speaker verification. The method includes: acquiring an audio recording; extracting speech signals from the audio recording; extracting features of the extracted speech signals; and determining whether the extracted speech signals represent speech by a predetermined speaker based on the extracted features and a speaker model trained with reference voice data of the predetermined speaker.
    Type: Grant
    Filed: January 17, 2018
    Date of Patent: April 30, 2019
    Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventors: Jie Chen, Dan Su, Tianxiao Fu, Na Hu
  • Patent number: 10277589
    Abstract: The present invention discloses a voiceprint verification method, apparatus, storage medium and device.
    Type: Grant
    Filed: November 3, 2015
    Date of Patent: April 30, 2019
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Dan Su, Yong Guan
  • Patent number: 10210817
    Abstract: The present application discloses a circuit for driving a light emitting diode light source assembly having a plurality of light emitting diode groups, each group having at least one light emitting diode. The circuit includes a processor configured to determine a set brightness level, calculate the number of light emitting diode groups required to be on to achieve the set brightness level, and select that number of light emitting diode groups to be turned on at allocated positions in the light emitting diode light source assembly; and a driving sub-circuit configured to turn on the selected light emitting diode groups at the allocated positions. The number of light emitting diode groups to be turned on is a positive integer N, where N is less than the total number of light emitting diode groups.
    Type: Grant
    Filed: November 14, 2016
    Date of Patent: February 19, 2019
    Assignees: BOE TECHNOLOGY GROUP CO., LTD., BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD.
    Inventors: Liang Zhang, Shuo Li, Zhipeng Feng, Jianguang Yang, Dan Su, Zongze He, Jieqiong Wang
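The group-count calculation reduces to proportional scaling of the set brightness level, assuming each LED group contributes equal brightness (an illustrative simplification; the patent's processor may use a different mapping):

```python
def groups_to_turn_on(set_brightness, max_brightness, total_groups):
    """Positive integer N of LED groups needed for the set brightness,
    capped at the total number of groups and never below one."""
    n = round(set_brightness / max_brightness * total_groups)
    return max(1, min(n, total_groups))
```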
  • Publication number: 20180358020
    Abstract: The present disclosure relates to a method, apparatus, and system for speaker verification. The method includes: acquiring an audio recording; extracting speech signals from the audio recording; extracting features of the extracted speech signals; and determining whether the extracted speech signals represent speech by a predetermined speaker based on the extracted features and a speaker model trained with reference voice data of the predetermined speaker.
    Type: Application
    Filed: January 17, 2018
    Publication date: December 13, 2018
    Applicant: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventors: Jie CHEN, Dan SU, Tianxiao FU, Na HU