Patents by Inventor Zhengkun Tian

Zhengkun Tian has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11763836
    Abstract: Disclosed is a hierarchical generated audio detection system, comprising an audio preprocessing module, a CQCC feature extraction module, a LFCC feature extraction module, a first-stage lightweight coarse-level detection model and a second-stage fine-level deep identification model; the audio preprocessing module preprocesses collected audio or video data to obtain an audio clip with a length not exceeding the limit; inputting the audio clip into CQCC feature extraction module and LFCC feature extraction module respectively to obtain CQCC feature and LFCC feature; inputting CQCC feature or LFCC feature into the first-stage lightweight coarse-level detection model for first-stage screening to screen out the first-stage real audio and the first-stage generated audio; inputting the CQCC feature or LFCC feature of the first-stage generated audio into the second-stage fine-level deep identification model to identify the second-stage real audio and the second-stage generated audio, and the second-stage generated au
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: September 19, 2023
    Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
    Inventors: Jianhua Tao, Zhengkun Tian, Jiangyan Yi
  • Patent number: 11580957
    Abstract: Disclosed are a method for training speech recognition model, a method and a system for speech recognition. The disclosure relates to field of speech recognition and includes: inputting an audio training sample into the acoustic encoder to represent acoustic features of the audio training sample in an encoded way and determine an acoustic encoded state vector; inputting a preset vocabulary into the language predictor to determine text prediction vector; inputting the text prediction vector into the text mapping layer to obtain a text output probability distribution; calculating a first loss function according to a target text sequence corresponding to the audio training sample and the text output probability distribution; inputting the text prediction vector and the acoustic encoded state vector into the joint network to calculate a second loss function, and performing iterative optimization according to the first loss function and the second loss function.
    Type: Grant
    Filed: June 9, 2022
    Date of Patent: February 14, 2023
    Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
    Inventors: Jianhua Tao, Zhengkun Tian, Jiangyan Yi
  • Publication number: 20230027645
    Abstract: Disclosed is a hierarchical generated audio detection system, comprising an audio preprocessing module, a CQCC feature extraction module, a LFCC feature extraction module, a first-stage lightweight coarse-level detection model and a second-stage fine-level deep identification model; the audio preprocessing module preprocesses collected audio or video data to obtain an audio clip with a length not exceeding the limit; inputting the audio clip into CQCC feature extraction module and LFCC feature extraction module respectively to obtain CQCC feature and LFCC feature; inputting CQCC feature or LFCC feature into the first-stage lightweight coarse-level detection model for first-stage screening to screen out the first-stage real audio and the first-stage generated audio; inputting the CQCC feature or LFCC feature of the first-stage generated audio into the second-stage fine-level deep identification model to identify the second-stage real audio and the second-stage generated audio, and the second-stage generated au
    Type: Application
    Filed: February 17, 2022
    Publication date: January 26, 2023
    Applicant: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
    Inventors: Jianhua TAO, Zhengkun TIAN, Jiangyan YI
  • Patent number: 11501759
    Abstract: Disclosed are a method and a system for speech recognition, an electronic device and a storage medium, which relates to the technical field of speech recognition. Embodiments of the application comprise performing encoded representation on an audio to be recognized to obtain an acoustic encoded state vector sequence of the audio to be recognized; performing sparse encoding on the acoustic encoded state vector sequence of the audio to be recognized to obtain an acoustic encoded sparse vector; determining a text prediction vector of each label in a preset vocabulary; recognizing the audio to be recognized and determining a text content corresponding to the audio to be recognized according to the acoustic encoded sparse vector and the text prediction vector. The acoustic encoded sparse vector of the audio to be recognized is obtained by performing sparse encoding on the acoustic encoded state vector of the audio to be recognized.
    Type: Grant
    Filed: July 19, 2022
    Date of Patent: November 15, 2022
    Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
    Inventors: Jianhua Tao, Zhengkun Tian, Jiangyan Yi