Patents by Inventor Brian Roark

Brian Roark has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

LANGUAGE-AGNOSTIC MULTILINGUAL MODELING USING EFFECTIVE SCRIPT NORMALIZATION

Publication number: 20260120679

Abstract: A method includes obtaining a plurality of training data sets each associated with a respective native language and including a plurality of respective training data samples. For each respective training data sample of each training data set in the respective native language, the method includes transliterating the corresponding transcription in the respective native script into corresponding transliterated text representing the respective native language of the corresponding audio in a target script and associating the corresponding transliterated text in the target script with the corresponding audio in the respective native language to generate a respective normalized training data sample.

Type: Application

Filed: December 22, 2025

Publication date: April 30, 2026

Applicant: Google LLC

Inventors: Arindrima Datta, Bhuvana Ramabhadran, Jesse Emond, Brian Roark
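
The abstract above describes a data-preparation step: each transcription is transliterated from its native script into a shared target script and then re-paired with the original audio. A minimal Python sketch of that pairing follows. It is an illustration only, not the patented implementation: the `TrainingSample` type, the `TOY_TABLES` character maps, and the function names are hypothetical, and a real system would use a proper transliteration model rather than a per-character lookup.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TrainingSample:
    audio_id: str       # stands in for the audio recorded in the native language
    transcription: str  # text of that audio, in some script


# Hypothetical toy tables mapping native-script characters to Latin,
# which plays the role of the target script in this sketch.
TOY_TABLES: Dict[str, Dict[str, str]] = {
    "ru": {"п": "p", "р": "r", "и": "i", "в": "v", "е": "e", "т": "t"},
    "el": {"γ": "g", "ε": "e", "ι": "i", "α": "a"},
}


def transliterate(text: str, language: str) -> str:
    """Map each character into the target script; unknown characters pass through."""
    table = TOY_TABLES[language]
    return "".join(table.get(ch, ch) for ch in text)


def normalize_data_sets(
    data_sets: Dict[str, List[TrainingSample]]
) -> List[TrainingSample]:
    """Pair each audio with its transliterated text, pooling all languages."""
    normalized = []
    for language, samples in data_sets.items():
        for sample in samples:
            target_text = transliterate(sample.transcription, language)
            # The audio is unchanged; only the script of the text is normalized.
            normalized.append(TrainingSample(sample.audio_id, target_text))
    return normalized
```

For example, passing one Russian sample with the transcription "привет" and one Greek sample with "γεια" yields a single pooled list whose transcriptions are all in Latin script, which is the sense in which the training data becomes script-normalized.
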

Language-agnostic multilingual modeling using effective script normalization

Patent number: 12536989

Abstract: A method includes obtaining a plurality of training data sets each associated with a respective native language and including a plurality of respective training data samples. For each respective training data sample of each training data set in the respective native language, the method includes transliterating the corresponding transcription in the respective native script into corresponding transliterated text representing the respective native language of the corresponding audio in a target script and associating the corresponding transliterated text in the target script with the corresponding audio in the respective native language to generate a respective normalized training data sample.

Type: Grant

Filed: March 21, 2023

Date of Patent: January 27, 2026

Assignee: Google LLC

Inventors: Arindrima Datta, Bhuvana Ramabhadran, Jesse Emond, Brian Roark

Language-agnostic Multilingual Modeling Using Effective Script Normalization

Publication number: 20230223009

Abstract: A method includes obtaining a plurality of training data sets each associated with a respective native language and including a plurality of respective training data samples. For each respective training data sample of each training data set in the respective native language, the method includes transliterating the corresponding transcription in the respective native script into corresponding transliterated text representing the respective native language of the corresponding audio in a target script and associating the corresponding transliterated text in the target script with the corresponding audio in the respective native language to generate a respective normalized training data sample.

Type: Application

Filed: March 21, 2023

Publication date: July 13, 2023

Applicant: Google LLC

Inventors: Arindrima Datta, Bhuvana Ramabhadran, Jesse Emond, Brian Roark

Language-agnostic multilingual modeling using effective script normalization

Patent number: 11615779

Abstract: A method includes obtaining a plurality of training data sets each associated with a respective native language and including a plurality of respective training data samples. For each respective training data sample of each training data set in the respective native language, the method includes transliterating the corresponding transcription in the respective native script into corresponding transliterated text representing the respective native language of the corresponding audio in a target script and associating the corresponding transliterated text in the target script with the corresponding audio in the respective native language to generate a respective normalized training data sample.

Type: Grant

Filed: January 19, 2021

Date of Patent: March 28, 2023

Assignee: Google LLC

Inventors: Arindrima Datta, Bhuvana Ramabhadran, Jesse Emond, Brian Roark

Generating output for presentation in response to user interface input, where the input and/or the output include chatspeak

Patent number: 11238242

Abstract: Some implementations are directed to translating chatspeak to a normalized form, where the chatspeak is included in natural language input formulated by a user via a user interface input device of a computing device—such as input provided by the user to an automated assistant. The normalized form of the chatspeak may be utilized by the automated assistant in determining reply content that is responsive to the natural language input, and that reply content may be presented to the user via one or more user interface output devices of the computing device of the user. Some implementations are additionally and/or alternatively directed to providing, for presentation to a user, natural language output that includes chatspeak in lieu of a normalized form of the chatspeak, based at least in part on a “chatspeak measure” that is determined based on past usage of chatspeak by the user and/or by additional users.

Type: Grant

Filed: March 21, 2019

Date of Patent: February 1, 2022

Assignee: Google LLC

Inventors: Wan Fen Nicole Quah, Bryan Horling, Maryam Garrett, Brian Roark, Richard Sproat
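
The abstract above names two ideas: translating chatspeak in user input to a normalized form, and deciding whether to reply with chatspeak based on a "chatspeak measure" derived from past usage. The Python sketch below illustrates both under simplifying assumptions. The lookup table, the token-fraction definition of the measure, the 0.3 threshold, and all function names are hypothetical and are not taken from the patent.

```python
from typing import Dict, List

# Hypothetical chatspeak-to-normalized lookup; a real system would be far richer.
CHATSPEAK_TO_NORMALIZED: Dict[str, str] = {
    "c": "see",
    "u": "you",
    "l8r": "later",
    "thx": "thanks",
}
NORMALIZED_TO_CHATSPEAK = {v: k for k, v in CHATSPEAK_TO_NORMALIZED.items()}


def normalize_chatspeak(text: str) -> str:
    """Replace each known chatspeak token with its normalized form."""
    tokens = text.lower().split()
    return " ".join(CHATSPEAK_TO_NORMALIZED.get(tok, tok) for tok in tokens)


def chatspeak_measure(past_messages: List[str]) -> float:
    """Assumed measure: fraction of past tokens that are known chatspeak."""
    tokens = [tok for msg in past_messages for tok in msg.lower().split()]
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in CHATSPEAK_TO_NORMALIZED)
    return hits / len(tokens)


def render_reply(normalized_reply: str, measure: float, threshold: float = 0.3) -> str:
    """Use chatspeak in the reply only when the measure meets the threshold."""
    if measure < threshold:
        return normalized_reply
    tokens = normalized_reply.lower().split()
    return " ".join(NORMALIZED_TO_CHATSPEAK.get(tok, tok) for tok in tokens)
```

In this sketch, the normalized form is what a downstream component would interpret, while `render_reply` only affects how the response is worded for a particular user.
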

GENERATING OUTPUT FOR PRESENTATION IN RESPONSE TO USER INTERFACE INPUT, WHERE THE INPUT AND/OR THE OUTPUT INCLUDE CHATSPEAK

Publication number: 20190220519

Abstract: Some implementations are directed to translating chatspeak to a normalized form, where the chatspeak is included in natural language input formulated by a user via a user interface input device of a computing device—such as input provided by the user to an automated assistant. The normalized form of the chatspeak may be utilized by the automated assistant in determining reply content that is responsive to the natural language input, and that reply content may be presented to the user via one or more user interface output devices of the computing device of the user. Some implementations are additionally and/or alternatively directed to providing, for presentation to a user, natural language output that includes chatspeak in lieu of a normalized form of the chatspeak, based at least in part on a “chatspeak measure” that is determined based on past usage of chatspeak by the user and/or by additional users.

Type: Application

Filed: March 21, 2019

Publication date: July 18, 2019

Inventors: Wan Fen Nicole Quah, Bryan Horling, Maryam Garrett, Brian Roark, Richard Sproat

Generating output for presentation in response to user interface input, where the input and/or the output include chatspeak

Patent number: 10268683

Abstract: Some implementations are directed to translating chatspeak to a normalized form, where the chatspeak is included in natural language input formulated by a user via a user interface input device of a computing device—such as input provided by the user to an automated assistant. The normalized form of the chatspeak may be utilized by the automated assistant in determining reply content that is responsive to the natural language input, and that reply content may be presented to the user via one or more user interface output devices of the computing device of the user. Some implementations are additionally and/or alternatively directed to providing, for presentation to a user, natural language output that includes chatspeak in lieu of a normalized form of the chatspeak, based at least in part on a “chatspeak measure” that is determined based on past usage of chatspeak by the user and/or by additional users.

Type: Grant

Filed: May 17, 2016

Date of Patent: April 23, 2019

Assignee: Google LLC

Inventors: Wan Fen Nicole Quah, Bryan Horling, Maryam Garrett, Brian Roark, Richard Sproat

GENERATING OUTPUT FOR PRESENTATION IN RESPONSE TO USER INTERFACE INPUT, WHERE THE INPUT AND/OR THE OUTPUT INCLUDE CHATSPEAK

Publication number: 20170337184

Abstract: Some implementations are directed to translating chatspeak to a normalized form, where the chatspeak is included in natural language input formulated by a user via a user interface input device of a computing device—such as input provided by the user to an automated assistant. The normalized form of the chatspeak may be utilized by the automated assistant in determining reply content that is responsive to the natural language input, and that reply content may be presented to the user via one or more user interface output devices of the computing device of the user. Some implementations are additionally and/or alternatively directed to providing, for presentation to a user, natural language output that includes chatspeak in lieu of a normalized form of the chatspeak, based at least in part on a “chatspeak measure” that is determined based on past usage of chatspeak by the user and/or by additional users.

Type: Application

Filed: May 17, 2016

Publication date: November 23, 2017

Inventors: Wan Fen Nicole Quah, Bryan Horling, Maryam Garrett, Brian Roark, Richard Sproat

RAPID SERIAL PRESENTATION COMMUNICATION SYSTEMS AND METHODS

Publication number: 20100280403

Abstract: Embodiments of the disclosed technology provide reliable and fast communication of a human through a direct brain interface which detects the intent of the user. An embodiment of the disclosed technology comprises a system and method in which at least one sequence of a plurality of stimuli is presented to an individual (using appropriate sensory modalities), and the time course of at least one measurable response to the sequence(s) is used to select at least one stimulus from the sequence(s). In an embodiment, the sequence(s) may be dynamically altered based on previously selected stimuli and/or on estimated probability distributions over the stimuli. In an embodiment, such dynamic alteration may be based on predictive models of appropriate sequence generation mechanisms, such as an adaptive or static sequence model.

Type: Application

Filed: January 12, 2009

Publication date: November 4, 2010

Inventors: Deniz Erdogmus, Brian Roark, Melanie Fried-Oken, Jan Van Santen, Michael Pavel
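
The abstract above says that a stimulus is selected from a measured response and that sequences may be altered using estimated probability distributions over the stimuli. One way to picture this, offered only as an assumption and not as the method the application claims, is a Bayesian update: a prior over stimuli (for instance from a sequence model) is multiplied by per-stimulus evidence scores derived from the response, and the result drives both selection and the next presentation order. All names, the evidence format, and the threshold below are hypothetical.

```python
from typing import Dict, List, Optional


def presentation_order(probabilities: Dict[str, float]) -> List[str]:
    """Order stimuli from most to least probable (ties broken alphabetically)."""
    return sorted(probabilities, key=lambda s: (-probabilities[s], s))


def update_posterior(
    prior: Dict[str, float], evidence: Dict[str, float]
) -> Dict[str, float]:
    """Multiply the prior by evidence scores and renormalize.

    `evidence` stands in for likelihood-style scores computed from the
    measured response; stimuli without a score are left unweighted (1.0).
    """
    unnormalized = {s: p * evidence.get(s, 1.0) for s, p in prior.items()}
    total = sum(unnormalized.values())
    if total <= 0:
        raise ValueError("evidence rules out every stimulus")
    return {s: v / total for s, v in unnormalized.items()}


def select_stimulus(
    posterior: Dict[str, float], threshold: float = 0.8
) -> Optional[str]:
    """Return the most probable stimulus if it is confident enough, else None."""
    best = max(posterior, key=posterior.get)
    return best if posterior[best] >= threshold else None
```

When `select_stimulus` returns `None`, a caller could present another sequence ordered by `presentation_order(posterior)` and update again with the new evidence.
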

System and method of using meta-data in speech processing

Publication number: 20050096908

Abstract: Systems and methods relate to generating a language model for use in, for example, a spoken dialog system or some other application. The method comprises building a class-based language model, generating at least one sequence network and replacing class labels in the class-based language model with the at least one sequence network. In this manner, placeholders or tokens associated with classes can be inserted into the models at training time and word/phone networks can be built based on meta-data information at test time. Finally, the placeholder token can be replaced with the word/phone networks at run time to improve recognition of difficult words such as proper names.

Type: Application

Filed: October 29, 2004

Publication date: May 5, 2005

Applicant: AT&T Corp.

Inventors: Michiel Bacchiani, Sameer Maskey, Brian Roark, Richard Sproat
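
The abstract above describes replacing class labels in a class-based language model with sequence networks built from meta-data. The Python sketch below shows the substitution idea on a much simpler representation: a model is a token template, a class label is a placeholder token, and a "network" is just a list of alternative word sequences. This is an assumed simplification for illustration; the `<NAME>` label, the example names, and the function name are hypothetical, and the application itself refers to word/phone networks rather than plain lists.

```python
from itertools import product
from typing import Dict, List, Sequence


def expand_class_labels(
    template: Sequence[str], networks: Dict[str, List[List[str]]]
) -> List[List[str]]:
    """Replace each class label in the template with every alternative sequence.

    Tokens that are not class labels are kept as single-choice slots, so the
    result enumerates all word sequences the expanded template accepts.
    """
    slots = [networks[tok] if tok in networks else [[tok]] for tok in template]
    return [
        [word for part in combination for word in part]
        for combination in product(*slots)
    ]


# Usage: the placeholder is filled from (hypothetical) caller meta-data.
template = ["call", "<NAME>", "now"]
networks = {"<NAME>": [["jane", "doe"], ["john", "smith"]]}
expanded = expand_class_labels(template, networks)
```

Because the placeholder is filled late, the same template can be reused with a different name list for each caller, which is the benefit the abstract points to for difficult words such as proper names.
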

System and method for using meta-data dependent language modeling for automatic speech recognition

Publication number: 20050096907

Abstract: Disclosed are systems and methods for providing a spoken dialog system using meta-data to build language models to improve speech processing. Meta-data is generally defined as data outside received speech; for example, meta-data may be a customer profile having a name, address and purchase history of a caller to a spoken dialog system. The method comprises building tree clusters from meta-data and estimating a language model using the built tree clusters. The language model may be used by various modules in the spoken dialog system, such as the automatic speech recognition module and/or the dialog management module. Building the tree clusters from the meta-data may involve generating projections from the meta-data and further may comprise computing counts as a result of unigram tree clustering and then building both unigram trees and higher-order trees from the meta-data as well as computing node distances within the built trees that are used for estimating the language model.

Type: Application

Filed: October 29, 2004

Publication date: May 5, 2005

Applicant: AT&T Corp.

Inventors: Michiel Bacchiani, Brian Roark