Patents by Inventor Xiaozhuo Cheng

Xiaozhuo Cheng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11138979
    Abstract: An apparatus includes processor(s) to: divide a speech data set into multiple data chunks that each represent a chunk of speech audio; derive a threshold amplitude based on at least one peak amplitude of the speech audio; designate each data chunk with a peak amplitude below the threshold amplitude a pause data chunk; within a set of temporally consecutive data chunks of the multiple data chunks, identify a longest subset of temporally consecutive pause data chunks; within the set of temporally consecutive data chunks, designate the longest subset of temporally consecutive pause data chunks as a likely sentence pause of a candidate set of likely sentence pauses; based on at least the candidate set, divide the speech data set into multiple data segments that each represent a speech segment of the speech audio; and perform speech-to-text conversion, to identify a sentence spoken in each speech segment.
    Type: Grant
    Filed: December 30, 2020
    Date of Patent: October 5, 2021
    Assignee: SAS Institute Inc.
    Inventors: Xiaozhuo Cheng, Xu Yang, Xiaolong Li
  • Publication number: 20210295845
    Abstract: An apparatus includes processor(s) to: divide a speech data set into multiple data chunks that each represent a chunk of speech audio; derive a threshold amplitude based on at least one peak amplitude of the speech audio; designate each data chunk with a peak amplitude below the threshold amplitude a pause data chunk; within a set of temporally consecutive data chunks of the multiple data chunks, identify a longest subset of temporally consecutive pause data chunks; within the set of temporally consecutive data chunks, designate the longest subset of temporally consecutive pause data chunks as a likely sentence pause of a candidate set of likely sentence pauses; based on at least the candidate set, divide the speech data set into multiple data segments that each represent a speech segment of the speech audio; and perform speech-to-text conversion, to identify a sentence spoken in each speech segment.
    Type: Application
    Filed: December 30, 2020
    Publication date: September 23, 2021
    Applicant: SAS Institute Inc.
    Inventors: Xiaozhuo Cheng, Xu Yang, Xiaolong Li
  • Patent number: 11049502
    Abstract: An apparatus includes processor(s) to: divide a speech data set into multiple data chunks that each represent a chunk of speech audio; configure a neural network to implement an acoustic model that includes a CTC output; provide each data chunk to the neural network and monitor the CTC output for a string of blank symbols; designate each string of blank symbols from the CTC output that is at least as long as a predetermined blank threshold length as a likely sentence pause of a candidate set of likely sentence pauses; based on at least the candidate set, divide the speech data set into multiple data segments that each represent a speech segment of the speech audio; and perform speech-to-text conversion, to identify a sentence spoken in a selected language in each speech segment.
    Type: Grant
    Filed: December 30, 2020
    Date of Patent: June 29, 2021
    Assignee: SAS Institute Inc.
    Inventors: Xiaozhuo Cheng, Xu Yang, Xiaolong Li
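
The two pause-detection ideas described in the abstracts above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (chunk_peaks, longest_pause, blank_runs), the fixed chunk size, the ratio used to derive the threshold from the overall peak amplitude, and the "_" blank symbol are all assumptions made for illustration. The abstracts say only that the threshold is based on at least one peak amplitude and that blank strings must reach a predetermined length.

```python
from itertools import groupby


def chunk_peaks(samples, chunk_size):
    """Split samples into fixed-size chunks; return each chunk's peak absolute amplitude."""
    return [
        max(abs(s) for s in samples[i:i + chunk_size])
        for i in range(0, len(samples), chunk_size)
    ]


def longest_pause(peaks, ratio=0.1):
    """Return (start_chunk, length) of the longest run of pause chunks, or None.

    A chunk counts as a pause when its peak is below the threshold.
    Assumption: the threshold is `ratio` times the largest chunk peak.
    """
    if not peaks:
        return None
    threshold = ratio * max(peaks)
    best = None
    index = 0
    for is_pause, run in groupby(peaks, key=lambda p: p < threshold):
        length = len(list(run))
        # Keep the first run found among runs of equal maximal length.
        if is_pause and (best is None or length > best[1]):
            best = (index, length)
        index += length
    return best


def blank_runs(symbols, min_length, blank="_"):
    """Return (start, length) for every run of blank symbols at least min_length long.

    `symbols` stands in for per-frame CTC output; "_" as the blank is an assumption.
    """
    runs = []
    index = 0
    for is_blank, run in groupby(symbols, key=lambda s: s == blank):
        length = len(list(run))
        if is_blank and length >= min_length:
            runs.append((index, length))
        index += length
    return runs


if __name__ == "__main__":
    samples = [5, -8, 0, 1, 0, 0, 1, 0, 7, 9, 0, 0, 6, 5]
    peaks = chunk_peaks(samples, chunk_size=2)
    print(peaks)
    print(longest_pause(peaks, ratio=0.2))
    print(blank_runs(list("hh___e_ll____o"), min_length=3))
```

In both cases the returned positions are only candidate sentence pauses. Dividing the audio into segments and running speech-to-text on each segment, as the abstracts describe, is outside this sketch.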