Patents by Inventor Samuel Norris Henderson

Samuel Norris Henderson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Multithreaded speech data preprocessing

Patent number: 11862171

Abstract: An apparatus includes a processor to: receive, from a requesting device, a request to perform speech-to-text conversion of a speech data set; within a first thread of a thread pool, perform a first pause detection technique to identify a first set of likely sentence pauses; within a second thread of the thread pool, perform a second pause detection technique to identify a second set of likely sentence pauses; perform a speaker diarization technique to identify a set of likely speaker changes; divide the speech data set into data segments representing speech segments based on a combination of at least the first set of likely sentence pauses, the second set of likely sentence pauses, and the set of likely speaker changes; use at least an acoustic model with each data segment to identify likely speech sounds; and generate a transcript based, at least in part, on the identified likely speech sounds.

Type: Grant

Filed: November 23, 2022

Date of Patent: January 2, 2024

Assignee: SAS Institute Inc.

Inventors: Xiaolong Li, Xiaozhuo Cheng, Samuel Norris Henderson, Xu Yang

Multithreaded speech-to-text processing

Patent number: 11776545

Abstract: An apparatus includes a processor to: receive a request to perform speech-to-text conversion of a speech data set; perform pause detection to identify a set of likely sentence pauses and/or a speaker diarization technique to identify a set of likely speaker changes; based on the set of likely sentence pauses and/or the set of likely speaker changes, divide the speech data set into data segments representing speech segments; use an acoustic model with the data segments to derive sets of probabilities of speech sounds uttered; store the sets of probabilities in temporal order within a buffer queue; distribute the sets of probabilities from the buffer queue in temporal order among threads of a thread pool; and within each thread, and based on set(s) of probabilities, derive one candidate word and select either the candidate word or an alternate candidate word derived from a language model as the next word most likely spoken.

Type: Grant

Filed: November 28, 2022

Date of Patent: October 3, 2023

Assignee: SAS Institute Inc.

Inventors: Xiaolong Li, Xiaozhuo Cheng, Samuel Norris Henderson, Xu Yang
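
The queue-and-thread-pool step in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `pick_word` and `decode` are hypothetical, and as a simplifying assumption each set of probabilities is represented as a dictionary of word scores and each language-model alternate as a (word, score) pair.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue


def pick_word(probabilities, lm_alternate):
    """Choose between the best acoustic candidate and a language-model alternate.

    probabilities: dict mapping candidate word -> acoustic score (assumed shape).
    lm_alternate: (word, score) pair proposed by a language model (assumed shape).
    """
    candidate = max(probabilities, key=probabilities.get)
    alt_word, alt_score = lm_alternate
    # Keep the acoustic candidate unless the alternate scores strictly higher.
    return alt_word if alt_score > probabilities[candidate] else candidate


def decode(prob_sets, lm_alternates, workers=4):
    """Buffer the probability sets in temporal order, then decode them in a thread pool."""
    buffer = Queue()  # FIFO queue, so temporal order is preserved
    for item in zip(prob_sets, lm_alternates):
        buffer.put(item)

    items = []
    while not buffer.empty():
        items.append(buffer.get())

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, whichever thread finishes first.
        return list(pool.map(lambda it: pick_word(*it), items))


prob_sets = [{"the": 0.6, "a": 0.3}, {"cat": 0.4, "cap": 0.5}]
lm_alternates = [("the", 0.2), ("cat", 0.7)]
words = decode(prob_sets, lm_alternates)
```

In this toy input the second acoustic candidate ("cap") is replaced by the higher-scoring language-model alternate ("cat"), while the first candidate is kept.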

Multithreaded Speech Data Preprocessing

Publication number: 20230107312

Abstract: An apparatus includes a processor to: receive, from a requesting device, a request to perform speech-to-text conversion of a speech data set; within a first thread of a thread pool, perform a first pause detection technique to identify a first set of likely sentence pauses; within a second thread of the thread pool, perform a second pause detection technique to identify a second set of likely sentence pauses; perform a speaker diarization technique to identify a set of likely speaker changes; divide the speech data set into data segments representing speech segments based on a combination of at least the first set of likely sentence pauses, the second set of likely sentence pauses, and the set of likely speaker changes; use at least an acoustic model with each data segment to identify likely speech sounds; and generate a transcript based, at least in part, on the identified likely speech sounds.

Type: Application

Filed: November 23, 2022

Publication date: April 6, 2023

Applicant: SAS Institute Inc.

Inventors: Xiaolong Li, Xiaozhuo Cheng, Samuel Norris Henderson, Xu Yang

Multithreaded Speech-to-Text Processing

Publication number: 20230098063

Abstract: An apparatus includes a processor to: receive a request to perform speech-to-text conversion of a speech data set; perform pause detection to identify a set of likely sentence pauses and/or a speaker diarization technique to identify a set of likely speaker changes; based on the set of likely sentence pauses and/or the set of likely speaker changes, divide the speech data set into data segments representing speech segments; use an acoustic model with the data segments to derive sets of probabilities of speech sounds uttered; store the sets of probabilities in temporal order within a buffer queue; distribute the sets of probabilities from the buffer queue in temporal order among threads of a thread pool; and within each thread, and based on set(s) of probabilities, derive one candidate word and select either the candidate word or an alternate candidate word derived from a language model as the next word most likely spoken.

Type: Application

Filed: November 28, 2022

Publication date: March 30, 2023

Applicant: SAS Institute Inc.

Inventors: Xiaolong Li, Xiaozhuo Cheng, Samuel Norris Henderson, Xu Yang

Speech segmentation based on combination of pause detection and speaker diarization

Patent number: 11538481

Abstract: An apparatus includes at least one processor to, in response to a request to perform speech-to-text conversion: perform a pause detection technique including analyzing speech audio to identify pauses, and analyzing lengths of the pauses to identify likely sentence pauses; perform a speaker diarization technique including dividing the speech audio into fragments, analyzing vocal characteristics of speech sounds of each fragment to identify a speaker of a set of speakers, and identifying instances of a change in speakers between each temporally consecutive pair of fragments to identify likely speaker changes; and perform speech-to-text operations including dividing the speech audio into segments based on at least the likely sentence pauses and likely speaker changes, using at least an acoustic model with each segment to identify likely speech sounds in the speech audio, and generating a transcript of the speech audio based at least on the likely speech sounds.

Type: Grant

Filed: June 28, 2022

Date of Patent: December 27, 2022

Assignee: SAS Institute Inc.

Inventors: Xiaolong Li, Samuel Norris Henderson, Xiaozhuo Cheng, Xu Yang
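
The segmentation idea in the abstract above (long pauses plus speaker changes as segment boundaries) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented method: the function names, the 0.5-second pause threshold, the 0.25-second merge tolerance, and the input shapes (pause intervals and labeled fragments already extracted from the audio) are all hypothetical.

```python
from typing import List, Tuple


def likely_sentence_pauses(pauses: List[Tuple[float, float]],
                           min_len: float = 0.5) -> List[float]:
    """Return the midpoints of pauses at least min_len seconds long."""
    return [(start + end) / 2 for start, end in pauses if end - start >= min_len]


def likely_speaker_changes(fragments: List[Tuple[float, str]]) -> List[float]:
    """fragments: (start_time, speaker_label) pairs in temporal order.

    Returns the start time of each fragment whose speaker differs from
    the speaker of the fragment immediately before it.
    """
    return [t for (_, prev), (t, cur) in zip(fragments, fragments[1:]) if cur != prev]


def segment(duration: float, pause_points: List[float],
            change_points: List[float],
            tolerance: float = 0.25) -> List[Tuple[float, float]]:
    """Merge both boundary sets and split [0, duration] into segments.

    Boundaries closer than `tolerance` seconds to the previous kept
    boundary are dropped so near-duplicates do not create tiny segments.
    """
    boundaries: List[float] = []
    for t in sorted(set(pause_points) | set(change_points)):
        if 0 < t < duration and (not boundaries or t - boundaries[-1] > tolerance):
            boundaries.append(t)
    edges = [0.0] + boundaries + [duration]
    return list(zip(edges, edges[1:]))


pauses = [(1.0, 1.25), (3.0, 4.0), (6.0, 7.0)]
fragments = [(0.0, "A"), (2.0, "A"), (3.5, "B"), (5.0, "B"), (8.0, "A")]
pause_points = likely_sentence_pauses(pauses)
change_points = likely_speaker_changes(fragments)
segments = segment(10.0, pause_points, change_points)
```

Here the short 0.25-second pause is ignored, the pause at 3.5 s coincides with a speaker change, and the result is four segments covering the full ten seconds.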

Speech Segmentation Based on Combination of Pause Detection and Speaker Diarization

Publication number: 20220335947

Abstract: An apparatus includes at least one processor to, in response to a request to perform speech-to-text conversion: perform a pause detection technique including analyzing speech audio to identify pauses, and analyzing lengths of the pauses to identify likely sentence pauses; perform a speaker diarization technique including dividing the speech audio into fragments, analyzing vocal characteristics of speech sounds of each fragment to identify a speaker of a set of speakers, and identifying instances of a change in speakers between each temporally consecutive pair of fragments to identify likely speaker changes; and perform speech-to-text operations including dividing the speech audio into segments based on at least the likely sentence pauses and likely speaker changes, using at least an acoustic model with each segment to identify likely speech sounds in the speech audio, and generating a transcript of the speech audio based at least on the likely speech sounds.

Type: Application

Filed: June 28, 2022

Publication date: October 20, 2022

Applicant: SAS Institute Inc.

Inventors: Xiaolong Li, Samuel Norris Henderson, Xiaozhuo Cheng, Xu Yang
