Patents by Inventor Zhengkun Tian

Zhengkun Tian has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Hierarchical generated audio detection system

Patent number: 11763836

Abstract: Disclosed is a hierarchical generated audio detection system, comprising an audio preprocessing module, a CQCC feature extraction module, a LFCC feature extraction module, a first-stage lightweight coarse-level detection model and a second-stage fine-level deep identification model; the audio preprocessing module preprocesses collected audio or video data to obtain an audio clip with a length not exceeding the limit; inputting the audio clip into CQCC feature extraction module and LFCC feature extraction module respectively to obtain CQCC feature and LFCC feature; inputting CQCC feature or LFCC feature into the first-stage lightweight coarse-level detection model for first-stage screening to screen out the first-stage real audio and the first-stage generated audio; inputting the CQCC feature or LFCC feature of the first-stage generated audio into the second-stage fine-level deep identification model to identify the second-stage real audio and the second-stage generated audio, and the second-stage generated au

Type: Grant

Filed: February 17, 2022

Date of Patent: September 19, 2023

Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES

Inventors: Jianhua Tao, Zhengkun Tian, Jiangyan Yi
Method for training speech recognition model, method and system for speech recognition

Patent number: 11580957

Abstract: Disclosed are a method for training speech recognition model, a method and a system for speech recognition. The disclosure relates to field of speech recognition and includes: inputting an audio training sample into the acoustic encoder to represent acoustic features of the audio training sample in an encoded way and determine an acoustic encoded state vector; inputting a preset vocabulary into the language predictor to determine text prediction vector; inputting the text prediction vector into the text mapping layer to obtain a text output probability distribution; calculating a first loss function according to a target text sequence corresponding to the audio training sample and the text output probability distribution; inputting the text prediction vector and the acoustic encoded state vector into the joint network to calculate a second loss function, and performing iterative optimization according to the first loss function and the second loss function.

Type: Grant

Filed: June 9, 2022

Date of Patent: February 14, 2023

Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES

Inventors: Jianhua Tao, Zhengkun Tian, Jiangyan Yi
HIERARCHICAL GENERATED AUDIO DETECTION SYSTEM

Publication number: 20230027645

Abstract: Disclosed is a hierarchical generated audio detection system, comprising an audio preprocessing module, a CQCC feature extraction module, a LFCC feature extraction module, a first-stage lightweight coarse-level detection model and a second-stage fine-level deep identification model; the audio preprocessing module preprocesses collected audio or video data to obtain an audio clip with a length not exceeding the limit; inputting the audio clip into CQCC feature extraction module and LFCC feature extraction module respectively to obtain CQCC feature and LFCC feature; inputting CQCC feature or LFCC feature into the first-stage lightweight coarse-level detection model for first-stage screening to screen out the first-stage real audio and the first-stage generated audio; inputting the CQCC feature or LFCC feature of the first-stage generated audio into the second-stage fine-level deep identification model to identify the second-stage real audio and the second-stage generated audio, and the second-stage generated au

Type: Application

Filed: February 17, 2022

Publication date: January 26, 2023

Applicant: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES

Inventors: Jianhua TAO, Zhengkun TIAN, Jiangyan YI
Method, system for speech recognition, electronic device and storage medium

Patent number: 11501759

Abstract: Disclosed are a method and a system for speech recognition, an electronic device and a storage medium, which relates to the technical field of speech recognition. Embodiments of the application comprise performing encoded representation on an audio to be recognized to obtain an acoustic encoded state vector sequence of the audio to be recognized; performing sparse encoding on the acoustic encoded state vector sequence of the audio to be recognized to obtain an acoustic encoded sparse vector; determining a text prediction vector of each label in a preset vocabulary; recognizing the audio to be recognized and determining a text content corresponding to the audio to be recognized according to the acoustic encoded sparse vector and the text prediction vector. The acoustic encoded sparse vector of the audio to be recognized is obtained by performing sparse encoding on the acoustic encoded state vector of the audio to be recognized.

Type: Grant

Filed: July 19, 2022

Date of Patent: November 15, 2022

Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES

Inventors: Jianhua Tao, Zhengkun Tian, Jiangyan Yi

Hierarchical generated audio detection system

Method for training speech recognition model, method and system for speech recognition

HIERARCHICAL GENERATED AUDIO DETECTION SYSTEM

Method, system for speech recognition, electronic device and storage medium