Patents by Inventor Zejun Ma

Zejun Ma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240169988
    Abstract: The present disclosure discloses a method and device of generating acoustic features, speech model training, and speech recognition. By acquiring the acoustic information vector of the current speech frame and the information weight of the current speech frame, and according to the accumulated information weight corresponding to the previous speech frame, the retention rate corresponding to the current speech frame, and the information weight of the current speech frame, the accumulated information weight corresponding to the current speech frame can be obtained. The retention rate is the difference between 1 and a leakage rate.
    Type: Application
    Filed: January 30, 2024
    Publication date: May 23, 2024
    Inventors: Linhao DONG, Zejun MA
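
The weight update described in the abstract above can be illustrated with a minimal sketch. The abstract does not state how the three quantities are combined, so the multiplicative form `retention * previous + weight` is an assumption, as are the function name `accumulate_weights` and the use of a single constant leakage rate for every frame.

```python
from typing import List


def accumulate_weights(weights: List[float], leakage_rate: float) -> List[float]:
    """Return the accumulated information weight after each frame.

    Assumed update rule (not taken from the patent text):
        accumulated[t] = (1 - leakage_rate) * accumulated[t - 1] + weights[t]
    """
    if not 0.0 <= leakage_rate <= 1.0:
        raise ValueError("leakage_rate must lie in [0, 1]")

    # The retention rate is the difference between 1 and the leakage rate.
    retention_rate = 1.0 - leakage_rate

    accumulated: List[float] = []
    previous = 0.0  # no information has been accumulated before the first frame
    for weight in weights:
        current = retention_rate * previous + weight
        accumulated.append(current)
        previous = current
    return accumulated
```

With a leakage rate of 0 the result is a plain running sum; a larger leakage rate makes older frames contribute less to the accumulated weight.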
  • Publication number: 20240135933
    Abstract: A method, apparatus, device, and storage medium for speaker change point detection, the method including: acquiring target voice data to be detected; extracting an acoustic feature characterizing acoustic information of the target voice data from the target voice data; encoding the acoustic feature to obtain speaker characterization vectors at a voice frame level of the target voice data; integrating and firing the speaker characterization vectors at the voice frame level of the target voice data based on a continuous integrate-and-fire (CIF) mechanism, to obtain a sequence of speaker characterizations bounded by speaker change points in the target voice data; and determining a timestamp corresponding to the speaker change points, according to the sequence of the speaker characterizations bounded by the speaker change points in the target voice data.
    Type: Application
    Filed: December 22, 2023
    Publication date: April 25, 2024
    Inventors: Linhao DONG, Zhiyun FAN, Zejun MA
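
The integrate-and-fire step in the abstract above can be sketched as follows. This is a generic illustration of a continuous integrate-and-fire loop, not the claimed implementation: the function name `cif_fire`, the threshold of 1.0, the per-frame weights, and the frame shift used to convert frame indices to timestamps are all assumptions, and the encoder that would produce the vectors and weights is left out.

```python
from typing import List, Sequence, Tuple


def cif_fire(
    vectors: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 1.0,
    frame_shift_s: float = 0.01,
) -> Tuple[List[List[float]], List[float]]:
    """Integrate frame-level vectors and fire whenever the weight sum hits the threshold.

    Returns the fired (integrated) vectors and the timestamp of each firing frame.
    Assumes every weight lies in [0, threshold], so a frame fires at most once.
    """
    if len(vectors) != len(weights):
        raise ValueError("vectors and weights must have the same length")
    if any(w < 0.0 or w > threshold for w in weights):
        raise ValueError("each weight must lie in [0, threshold]")

    dim = len(vectors[0]) if vectors else 0
    fired: List[List[float]] = []
    timestamps: List[float] = []
    acc_weight = 0.0
    acc_vector = [0.0] * dim

    for t, (vector, weight) in enumerate(zip(vectors, weights)):
        if acc_weight + weight < threshold:
            # Integrate: keep accumulating the weighted vector.
            acc_weight += weight
            acc_vector = [a + weight * x for a, x in zip(acc_vector, vector)]
        else:
            # Fire: use just enough of this frame's weight to reach the threshold.
            used = threshold - acc_weight
            fired.append([a + used * x for a, x in zip(acc_vector, vector)])
            timestamps.append(t * frame_shift_s)
            # The leftover weight starts the next integration window.
            remainder = weight - used
            acc_weight = remainder
            acc_vector = [remainder * x for x in vector]

    return fired, timestamps
```

In this sketch each firing frame marks a boundary, so multiplying its index by the frame shift gives the timestamp of the corresponding change point.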
  • Publication number: 20240127795
    Abstract: A model training method, a speech recognition method and apparatus, a medium, and a device are provided. The speech recognition model includes an encoder, a CIF prediction sub-model and a CTC prediction sub-model. The model training method includes: encoding training speech data based on the encoder to obtain an acoustic vector sequence corresponding to the training speech data; obtaining an information amount sequence corresponding to the training speech data based on the acoustic vector sequence and the CIF prediction sub-model; obtaining a target probability sequence based on the acoustic vector sequence and the CTC prediction sub-model; determining a target loss of the speech recognition model based on the information amount sequence and the target probability sequence; and updating, in response to an updating condition being satisfied, a model parameter of the speech recognition model based on the target loss.
    Type: Application
    Filed: May 7, 2022
    Publication date: April 18, 2024
    Applicant: Beijing Youzhuju Network Technology Co., Ltd.
    Inventors: Linhao DONG, Zejun MA
  • Publication number: 20240095451
    Abstract: Provided are an electronic device and a computer readable storage medium. The method includes: acquiring a text to be analyzed; performing token conversion on words in the text to be analyzed to obtain a token sequence to be analyzed, where tokens in token sequences to be analyzed corresponding to texts to be analyzed in different languages belong to a same type; and performing feature extraction on the token sequence to be analyzed, and processing a target task based on the extracted feature, to determine an analysis result for the text to be analyzed.
    Type: Application
    Filed: September 18, 2023
    Publication date: March 21, 2024
    Inventors: Yuxiang ZOU, Zejun MA
  • Publication number: 20240046921
    Abstract: Embodiments of the present disclosure provide a method, apparatus, electronic device, and medium for speech processing. The method comprises generating a token-level semantic feature of target speech data based on a frame-level acoustic feature of the target speech data. The method further comprises generating a token-level voiceprint feature of the target speech data based on the frame-level acoustic feature. The method further comprises determining a token in the target speech data where speaker change occurs based on the token-level semantic feature and the token-level voiceprint feature. According to embodiments of the present disclosure, speaker change in speech data is detected at the token level in conjunction with the speaker's acoustic features and speech contents, and speaker-based speech recognition results are output directly without post-processing, simplifying the speech recognition process.
    Type: Application
    Filed: August 4, 2023
    Publication date: February 8, 2024
    Inventors: Linhao DONG, Zhenlin LIANG, Zhiyun FAN, Yi LIU, Zejun MA
  • Publication number: 20230402031
    Abstract: A speech processing method is provided. The method includes: receiving a speech block to be identified as a current speech block, where the speech block includes a past frame, a current frame and a future frame; performing a speech identification process based on the current speech block, where the speech identification process includes: performing speech identification based on the current speech block to obtain a speech identification result of the current frame and a speech identification result of the future frame; determining whether a previous speech block for the current speech block exists; in a case that the previous speech block for the current speech block exists, updating a target identification result based on the speech identification result of the current frame of the current speech block; and outputting the speech identification result of the future frame of the current speech block.
    Type: Application
    Filed: April 6, 2022
    Publication date: December 14, 2023
    Inventors: Linhao DONG, Meng CAI, Zejun MA
  • Patent number: 10373613
    Abstract: A dual-mode voice control method is disclosed. The method may comprise determining whether a user has executed an operation of activating an operate-to-speak stop determination mode in a voice input interface. The method may further comprise, in response to determining that the user has executed the operation of activating the operate-to-speak stop determination mode, determining whether a microphone is in a busy state. The method may further comprise, in response to determining that the microphone is in the busy state, switching a voice mode from a directly-speak automatic stop determination mode to the operate-to-speak stop determination mode. Before the user executes the operation of activating the operate-to-speak stop determination mode, the voice mode is in the directly-speak automatic stop determination mode if the microphone is in the busy state.
    Type: Grant
    Filed: November 29, 2016
    Date of Patent: August 6, 2019
    Assignee: Guangzhou Shenma Mobile Information Technology Co., Ltd.
    Inventors: Yajun Wang, Tuwenchang Si, Na Wang, Yi Peng, Sishou Zheng, Xiaoli Fu, Chao Li, Wei Kang, Yining Chen, Zejun Ma
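
The mode switch described in the abstract above amounts to a small state transition, sketched below. The enum `VoiceMode`, the function `next_mode`, and its boolean parameters are hypothetical names chosen for illustration; the patent text does not define an API.

```python
from enum import Enum


class VoiceMode(Enum):
    DIRECTLY_SPEAK_AUTO_STOP = "directly-speak automatic stop determination"
    OPERATE_TO_SPEAK_STOP = "operate-to-speak stop determination"


def next_mode(
    current: VoiceMode,
    user_activated_operate_to_speak: bool,
    microphone_busy: bool,
) -> VoiceMode:
    """Return the voice mode after one check of the two conditions in the abstract."""
    # Step 1: has the user activated the operate-to-speak mode in the voice input interface?
    if not user_activated_operate_to_speak:
        return current
    # Step 2: the switch happens only while the microphone is in the busy state.
    if microphone_busy and current is VoiceMode.DIRECTLY_SPEAK_AUTO_STOP:
        return VoiceMode.OPERATE_TO_SPEAK_STOP
    return current
```

For example, `next_mode(VoiceMode.DIRECTLY_SPEAK_AUTO_STOP, True, True)` returns `VoiceMode.OPERATE_TO_SPEAK_STOP`, while the mode is left unchanged if either condition is false.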
  • Publication number: 20170162196
    Abstract: A dual-mode voice control method is disclosed. The method may comprise determining whether a user has executed an operation of activating an operate-to-speak stop determination mode in a voice input interface. The method may further comprise, in response to determining that the user has executed the operation of activating the operate-to-speak stop determination mode, determining whether a microphone is in a busy state. The method may further comprise, in response to determining that the microphone is in the busy state, switching a voice mode from a directly-speak automatic stop determination mode to the operate-to-speak stop determination mode. Before the user executes the operation of activating the operate-to-speak stop determination mode, the voice mode is in the directly-speak automatic stop determination mode if the microphone is in the busy state.
    Type: Application
    Filed: November 29, 2016
    Publication date: June 8, 2017
    Inventors: Yajun Wang, Tuwenchang Si, Na Wang, Yi Peng, Sishou Zheng, Xiaoli Fu, Chao Li, Wei Kang, Yining Chen, Zejun Ma