Patents by Inventor Harry Bratt

Harry Bratt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method, system and apparatus for understanding and generating human conversational cues

Patent number: 12586563

Abstract: A voice-based digital assistant (VDA) uses a conversation intelligence (CI) manager module having a rule-based engine on conversational intelligence to process information from one or more modules to make determinations on both i) understanding the human conversational cues and ii) generating the human conversational cues, including at least understanding and generating a backchannel utterance, in a flow and exchange of human communication in order to at least one of grab or yield a conversational floor between a user and the VDA. The CI manager module uses the rule-based engine to analyze and make a determination on a conversational cue of, at least, prosody in a user's flow of speech to generate the backchannel utterance to signal any of i) an understanding, ii) a correction, iii) a confirmation, and iv) a questioning of verbal communications conveyed by the user in the flow of speech during a time frame when the user still holds the conversational floor.

Type: Grant

Filed: May 7, 2020

Date of Patent: March 24, 2026

Assignee: SRI International

Inventors: Harry Bratt, Kristin Precoda, Dimitra Vergyri

Hybrid human-assisted dialogue system

Patent number: 12499887

Abstract: An automated interactive voice dialogue system using a human-in-the-loop design may enable human-supported interventions, where the automated system still conducts most of the interaction but enables a human agent to assume control of the dialogue and assist, if deemed necessary, so that the user may continue the interaction with little interruption or frustration. In some examples, the user of the dialogue system of this disclosure may not realize that there was a problem, and that the interaction is being or has switched from an automated dialogue system to a human. In some examples, the automated dialogue system of this disclosure may also automatically switch back to machine interaction when the human agent has resolved the situation.

Type: Grant

Filed: December 20, 2022

Date of Patent: December 16, 2025

Assignee: SRI International

Inventors: Dimitra Vergyri, Harry Bratt

Controllable, natural paralinguistics for text to speech synthesis

Patent number: 12361925

Abstract: A speech recognition module receives training data of speech and creates a representation for individual words, non-words, phonemes, and any combination. A set of speech processing detectors analyze the training data of speech from humans communicating. The set of speech processing detectors detect speech parameters that are indicative of paralinguistic effects on top of enunciated words, phonemes, and non-words in the audio stream. One or more machine learning models undergo supervised machine learning on their neural network to train on how to associate one or more mark-up markers with a textual representation, for each individual word, individual non-word, individual phoneme, and any combinations of these, that was enunciated with a particular paralinguistic effect. Each mark-up marker can correspond to its own paralinguistic effect.

Type: Grant

Filed: December 29, 2020

Date of Patent: July 15, 2025

Assignee: SRI International

Inventors: Harry Bratt, Colleen Richey, Maneesh Yadav

SPEECH MODIFICATION USING ACCENT EMBEDDINGS

Publication number: 20240304175

Abstract: Techniques for a machine learning system configured to obtain a dataset of a plurality of sample speech clips; generate a plurality of sequence embeddings; initialize a plurality of speaker embeddings and a plurality of accent embeddings; update the plurality of speaker embeddings; update the plurality of accent embeddings; generate a plurality of augmented embeddings based on the plurality of sequence embeddings, the plurality of speaker embeddings, and the plurality of accent embeddings; and generate a plurality of synthetic speech clips based on the plurality of augmented embeddings. The machine learning system may further be configured to obtain an audio waveform; decompose the audio waveform into first magnitude spectral slices and an original phase; process the first magnitude spectral slices to map the first magnitude spectral slices to second magnitude spectral slices; and generate a modified audio waveform in part by combining the second magnitude spectral slices and the original phase.

Type: Application

Filed: March 7, 2024

Publication date: September 12, 2024

Inventors: Alexander Erdmann, Sarah Bakst, Harry Bratt, Dimitra Vergyri, Horacio Franco

METHOD AND SYSTEM FOR CREATING A PROSODIC SCRIPT

Publication number: 20240257801

Abstract: A method, apparatus, and system for creating a script for rendering audio and/or video streams include identifying at least one prosodic speech feature in a received audio stream and/or a received language model, creating a respective prosodic speech symbol for each of the at least one identified prosodic speech features, converting the received audio stream and/or the received language model into a text stream, temporally inserting the created at least one prosodic speech symbol into the text stream, identifying in a received video stream at least one prosodic gesture of at least a portion of a body of a speaker of the received audio stream, creating at least one respective gesture symbol for each of the at least one identified prosodic gestures, and temporally inserting the created at least one gesture symbol into the text stream along with the at least one prosodic speech symbol to create a prosodic script.

Type: Application

Filed: December 21, 2023

Publication date: August 1, 2024

Inventors: Jeffrey LUBIN, Alexander ERDMANN, James BERGEN, Harry BRATT, Jihua HUANG, Sarah BAKST, Michael LOMNITZ, Zachary DANIELS, John CADIGAN, Ali CHAUDHRY, Zhiwei ZHU, Joshua CHATTIN, Girish ACHARYA

HYBRID HUMAN-ASSISTED DIALOGUE SYSTEM

Publication number: 20230260510

Abstract: An automated interactive voice dialogue system using a human-in-the-loop design may enable human-supported interventions, where the automated system still conducts most of the interaction but enables a human agent to assume control of the dialogue and assist, if deemed necessary, so that the user may continue the interaction with little interruption or frustration. In some examples, the user of the dialogue system of this disclosure may not realize that there was a problem, and that the interaction is being or has switched from an automated dialogue system to a human. In some examples, the automated dialogue system of this disclosure may also automatically switch back to machine interaction when the human agent has resolved the situation.

Type: Application

Filed: December 20, 2022

Publication date: August 17, 2023

Inventors: Dimitra Vergyri, Harry Bratt

CONTROLLABLE, NATURAL PARALINGUISTICS FOR TEXT TO SPEECH SYNTHESIS

Publication number: 20220406292

Abstract: A speech recognition module receives training data of speech and creates a representation for individual words, non-words, phonemes, and any combination. A set of speech processing detectors analyze the training data of speech from humans communicating. The set of speech processing detectors detect speech parameters that are indicative of paralinguistic effects on top of enunciated words, phonemes, and non-words in the audio stream. One or more machine learning models undergo supervised machine learning on their neural network to train on how to associate one or more mark-up markers with a textual representation, for each individual word, individual non-word, individual phoneme, and any combinations of these, that was enunciated with a particular paralinguistic effect. Each mark-up marker can correspond to its own paralinguistic effect.

Type: Application

Filed: December 29, 2020

Publication date: December 22, 2022

Inventors: Harry Bratt, Colleen Richey, Maneesh Yadav

Method, System and Apparatus for Understanding and Generating Human Conversational Cues

Publication number: 20220115001

Abstract: A voice-based digital assistant (VDA) uses a conversation intelligence (CI) manager module having a rule-based engine on conversational intelligence to process information from one or more modules to make determinations on both i) understanding the human conversational cues and ii) generating the human conversational cues, including at least understanding and generating a backchannel utterance, in a flow and exchange of human communication in order to at least one of grab or yield a conversational floor between a user and the VDA. The CI manager module uses the rule-based engine to analyze and make a determination on a conversational cue of, at least, prosody in a user's flow of speech to generate the backchannel utterance to signal any of i) an understanding, ii) a correction, iii) a confirmation, and iv) a questioning of verbal communications conveyed by the user in the flow of speech during a time frame when the user still holds the conversational floor.

Type: Application

Filed: May 7, 2020

Publication date: April 14, 2022

Inventors: Harry Bratt, Kristin Precoda, Dimitra Vergyri

Autonomous intelligent radio

Patent number: 11152016

Abstract: Embodiments of the disclosed technologies include finding content of interest in an RF spectrum by automatically scanning the RF spectrum; detecting, in a range of frequencies of the RF spectrum that includes one or more undefined channels, a candidate RF segment; where the candidate RF segment includes a frequency-bound time segment of electromagnetic energy; executing a machine learning-based process to determine, for the candidate RF segment, signal characterization data indicative of one or more of: a frequency range, a modulation type, a timestamp; using the signal characterization data to determine whether audio contained in the candidate RF segment corresponds to a search criterion; in response to determining that the candidate RF segment corresponds to the search criterion, outputting, through an electronic device, data indicative of the candidate RF segment; where the data indicative of the candidate RF segment is output in a real-time time interval after the candidate RF segment is detected.

Type: Grant

Filed: May 8, 2019

Date of Patent: October 19, 2021

Assignee: SRI INTERNATIONAL

Inventors: Aaron D. Lawson, Harry Bratt, Mitchell L. McLaren, Martin Graciarena

Real-time class recognition for an audio stream

Patent number: 11024291

Abstract: In an embodiment, the disclosed technologies include automatically recognizing speech content of an audio stream that may contain multiple different classes of speech content, by receiving, by an audio capture device, an audio stream; outputting, by one or more classifiers, in response to an inputting to the one or more classifiers of digital data that has been extracted from the audio stream, score data; where a score of the score data indicates a likelihood that a particular time segment of the audio stream contains speech of a particular class; where the one or more classifiers use one or more machine-learned models that have been trained to recognize audio of one or more particular classes to determine the score data; using a sliding time window process, selecting particular scores from the score data; using the selected particular scores, determining and outputting one or more decisions as to whether one or more particular time segments of the audio stream contain speech of one or more particular classes.

Type: Grant

Filed: March 27, 2019

Date of Patent: June 1, 2021

Assignee: SRI INTERNATIONAL

Inventors: Diego Castan Lavilla, Harry Bratt, Mitchell Leigh McLaren

AUTONOMOUS INTELLIGENT RADIO

Publication number: 20200184997

Abstract: Embodiments of the disclosed technologies include finding content of interest in an RF spectrum by automatically scanning the RF spectrum; detecting, in a range of frequencies of the RF spectrum that includes one or more undefined channels, a candidate RF segment; where the candidate RF segment includes a frequency-bound time segment of electromagnetic energy; executing a machine learning-based process to determine, for the candidate RF segment, signal characterization data indicative of one or more of: a frequency range, a modulation type, a timestamp; using the signal characterization data to determine whether audio contained in the candidate RF segment corresponds to a search criterion; in response to determining that the candidate RF segment corresponds to the search criterion, outputting, through an electronic device, data indicative of the candidate RF segment; where the data indicative of the candidate RF segment is output in a real-time time interval after the candidate RF segment is detected.

Type: Application

Filed: May 8, 2019

Publication date: June 11, 2020

Inventors: Aaron D. Lawson, Harry Bratt, Mitchell L. McLaren, Martin Graciarena

REAL-TIME CLASS RECOGNITION FOR AN AUDIO STREAM

Publication number: 20200160845

Abstract: In an embodiment, the disclosed technologies include automatically recognizing speech content of an audio stream that may contain multiple different classes of speech content, by receiving, by an audio capture device, an audio stream; outputting, by one or more classifiers, in response to an inputting to the one or more classifiers of digital data that has been extracted from the audio stream, score data; where a score of the score data indicates a likelihood that a particular time segment of the audio stream contains speech of a particular class; where the one or more classifiers use one or more machine-learned models that have been trained to recognize audio of one or more particular classes to determine the score data; using a sliding time window process, selecting particular scores from the score data; using the selected particular scores, determining and outputting one or more decisions as to whether one or more particular time segments of the audio stream contain speech of one or more particular classes.

Type: Application

Filed: March 27, 2019

Publication date: May 21, 2020

Inventors: Diego Castan Lavilla, Harry Bratt, Mitchell Leigh McLaren

Semi-supervised speaker diarization

Patent number: 10133538

Abstract: An audio file analyzer computing system includes technologies to, among other things, localize audio events of interest (such as speakers of interest) within an audio file that includes multiple different classes (e.g., different speakers) of audio. The illustrative audio file analyzer computing system uses a seed segment to perform a semi-supervised diarization of the audio file. The seed segment is pre-selected, such as by a human person using an interactive graphical user interface.

Type: Grant

Filed: March 27, 2015

Date of Patent: November 20, 2018

Assignee: SRI International

Inventors: Mitchell Leigh McLaren, Aaron Dennis Lawson, Harry Bratt

Method and apparatus for classifying lexical stress

Patent number: 9928832

Abstract: A method for classifying lexical stress in an utterance includes generating a feature vector representing stress characteristics of a syllable occurring in the utterance, wherein the feature vector includes a plurality of features based on prosodic information and spectral information, computing a plurality of scores, wherein each of the plurality of scores is related to a probability of a given class of lexical stress, and classifying the lexical stress of the syllable based on the plurality of scores.

Type: Grant

Filed: June 30, 2014

Date of Patent: March 27, 2018

Assignee: SRI INTERNATIONAL

Inventors: Horacio E. Franco, Luciana Ferrer, Harry Bratt, Colleen Richey, Kristin Precoda, Victor Abrash

Vehicle personal assistant that interprets spoken natural language input based upon vehicle context

Patent number: 9798799

Abstract: A vehicle personal assistant to engage a user in a conversational dialog about vehicle-related topics, such as those commonly found in a vehicle owner's manual, includes modules to interpret spoken natural language input, search a vehicle knowledge base and/or other data sources for pertinent information, and respond to the user's input in a conversational fashion. The dialog may be initiated by the user or more proactively by the vehicle personal assistant based on events that may be currently happening in relation to the vehicle. The vehicle personal assistant may use real-time inputs obtained from the vehicle and/or non-verbal inputs from the user to enhance its understanding of the dialog and assist the user in a variety of ways.

Type: Grant

Filed: November 15, 2012

Date of Patent: October 24, 2017

Assignee: SRI INTERNATIONAL

Inventors: Michael J. Wolverton, William S. Mark, Harry Bratt, Douglas A. Bercow

METHOD AND APPARATUS FOR TAILORING THE OUTPUT OF AN INTELLIGENT AUTOMATED ASSISTANT TO A USER

Publication number: 20170061316

Abstract: The present invention relates to a method and apparatus for tailoring the output of an intelligent automated assistant. One embodiment of a method for conducting an interaction with a human user includes collecting data about the user using a multimodal set of sensors positioned in a vicinity of the user, making a set of inferences about the user in accordance with the data, and tailoring an output to be delivered to the user in accordance with the set of inferences.

Type: Application

Filed: November 16, 2016

Publication date: March 2, 2017

Inventors: Gokhan Tur, Horacio E. Franco, Elizabeth Shriberg, Gregory K. Myers, William S. Mark, Norman D. Winarsky, Andreas Stolcke, Bart Peintner, Michael J. Wolverton, Luciana Ferrer, Martin Graciarena, Neil Yorke-Smith, Harry Bratt

Method and apparatus for tailoring the output of an intelligent automated assistant to a user

Patent number: 9501743

Abstract: The present invention relates to a method and apparatus for tailoring the output of an intelligent automated assistant. One embodiment of a method for conducting an interaction with a human user includes collecting data about the user using a multimodal set of sensors positioned in a vicinity of the user, making a set of inferences about the user in accordance with the data, and tailoring an output to be delivered to the user in accordance with the set of inferences.

Type: Grant

Filed: December 2, 2015

Date of Patent: November 22, 2016

Assignee: SRI INTERNATIONAL

Inventors: Gokhan Tur, Horacio E. Franco, Elizabeth Shriberg, Gregory K. Myers, William S. Mark, Norman D. Winarsky, Andreas Stolcke, Bart Peintner, Michael J. Wolverton, Luciana Ferrer, Martin Graciarena, Neil Yorke-Smith, Harry Bratt

SEMI-SUPERVISED SPEAKER DIARIZATION

Publication number: 20160283185

Abstract: An audio file analyzer computing system includes technologies to, among other things, localize audio events of interest (such as speakers of interest) within an audio file that includes multiple different classes (e.g., different speakers) of audio. The illustrative audio file analyzer computing system uses a seed segment to perform a semi-supervised diarization of the audio file. The seed segment is pre-selected, such as by a human person using an interactive graphical user interface.

Type: Application

Filed: March 27, 2015

Publication date: September 29, 2016

Inventors: Mitchell Leigh McLaren, Aaron Dennis Lawson, Harry Bratt

METHOD AND APPARATUS FOR TAILORING THE OUTPUT OF AN INTELLIGENT AUTOMATED ASSISTANT TO A USER

Publication number: 20160086090

Abstract: The present invention relates to a method and apparatus for tailoring the output of an intelligent automated assistant. One embodiment of a method for conducting an interaction with a human user includes collecting data about the user using a multimodal set of sensors positioned in a vicinity of the user, making a set of inferences about the user in accordance with the data, and tailoring an output to be delivered to the user in accordance with the set of inferences.

Type: Application

Filed: December 2, 2015

Publication date: March 24, 2016

Inventors: Gokhan Tur, Horacio E. Franco, Elizabeth Shriberg, Gregory K. Myers, William S. Mark, Norman D. Winarsky, Andreas Stolcke, Bart Peintner, Michael J. Wolverton, Luciana Ferrer, Martin Graciarena, Neil Yorke-Smith, Harry Bratt

Method and apparatus for tailoring the output of an intelligent automated assistant to a user

Patent number: 9213558

Abstract: The present invention relates to a method and apparatus for tailoring the output of an intelligent automated assistant. One embodiment of a method for conducting an interaction with a human user includes collecting data about the user using a multimodal set of sensors positioned in a vicinity of the user, making a set of inferences about the user in accordance with the data, and tailoring an output to be delivered to the user in accordance with the set of inferences.

Type: Grant

Filed: September 1, 2010

Date of Patent: December 15, 2015

Assignee: SRI INTERNATIONAL

Inventors: Gokhan Tur, Horacio E. Franco, Elizabeth Shriberg, Gregory K. Myers, William S. Mark, Norman D. Winarsky, Andreas Stolcke, Bart Peintner, Michael J. Wolverton, Luciana Ferrer, Martin Graciarena, Harry Bratt, Neil Yorke-Smith
