Patents by Inventor Mark Edward Epstein

Mark Edward Epstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Classification of offensive words

Patent number: 10635750

Abstract: A computer-implemented method can include identifying a first set of text samples that include a particular potentially offensive term. Labels can be obtained for the first set of text samples that indicate whether the particular potentially offensive term is used in an offensive manner. A classifier can be trained based at least on the first set of text samples and the labels, the classifier being configured to use one or more signals associated with a text sample to generate a label that indicates whether a potentially offensive term in the text sample is used in an offensive manner in the text sample. The method can further include providing, to the classifier, a first text sample that includes the particular potentially offensive term, and in response, obtaining, from the classifier, a label that indicates whether the particular potentially offensive term is used in an offensive manner in the first text sample.

Type: Grant

Filed: April 17, 2018

Date of Patent: April 28, 2020

Assignee: Google LLC

Inventors: Mark Edward Epstein, Pedro J. Moreno Mengibar

Building language models for a user in a social network from linguistic information

Patent number: 9747895

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for building language models. One of the methods includes identifying a first group of one or more users associated with a user in a social network. The method includes identifying first linguistic information associated with the first group. The method includes generating a first language model based on the first linguistic information. The method includes identifying a second group of one or more users associated with the user. The method includes identifying second linguistic information associated with the second group. The method includes generating a second language model based on the second linguistic information. The method includes associating the first language model and the second language model with the user.

Type: Grant

Filed: July 8, 2013

Date of Patent: August 29, 2017

Assignee: Google Inc.

Inventors: Martin Jansche, Mark Edward Epstein

Clustering classes in language modeling

Patent number: 9529898

Abstract: This document describes, among other things, a computer-implemented method. The method can include obtaining a plurality of text samples that each include one or more terms belonging to a first class of terms. The plurality of text samples can be classified into a plurality of groups of text samples. Each group of text samples can correspond to a different sub-class of terms. For each of the groups of text samples, a sub-class context model can be generated based on the text samples in the respective group of text samples. Particular ones of the sub-class context models that are determined to be similar can be merged to generate a hierarchical set of context models. Further, the method can include selecting particular ones of the context models and generating a class-based language model based on the selected context models.

Type: Grant

Filed: March 12, 2015

Date of Patent: December 27, 2016

Assignee: Google Inc.

Inventors: Mark Edward Epstein, Vladislav Schogol

Using social networks to improve acoustic models

Patent number: 9460716

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for acoustic model generation. One of the methods includes identifying one or more demographic characteristics for a user of a social networking site. The method includes receiving speech data from the user, the speech data associated with a user device. The method includes storing the speech data associated with demographic characteristics of the user and the user device.

Type: Grant

Filed: July 9, 2013

Date of Patent: October 4, 2016

Assignee: Google Inc.

Inventors: Mark Edward Epstein, Martin Jansche

Generating language models

Patent number: 9437189

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating language models. In some implementations, data is accessed that indicates a set of classes corresponding to a concept. A first language model is generated in which a first class represents the concept. A second language model is generated in which second classes represent the concept. Output of the first language model and the second language model is obtained, and the outputs are evaluated. A class from the set of classes is selected based on evaluating the output of the first language model and the output of the second language model. In some implementations, the first class and the second class are selected from a parse tree or other data that indicates relationships among the classes in the set of classes.

Type: Grant

Filed: May 29, 2014

Date of Patent: September 6, 2016

Assignee: Google Inc.

Inventors: Mark Edward Epstein, Lucy Vasserman

Methods and systems for determining instructions for applications that are recognizable by a voice interface

Patent number: 9318128

Abstract: Methods and systems for facilitating development of voice-enabled applications are provided. The method may comprise receiving, at a computing device, a plurality of actions associated with a given application, parameters associated with each respective action, and example instructions responsive to respective actions. The method may also comprise determining candidate instructions based on the actions, parameters, and example instructions. Each candidate instruction may comprise one or more grammars recognizable by a voice interface for the given application. The method may further comprise the computing device receiving respective acceptance information for each candidate instruction, and comparing at least a portion of the respective acceptance information with a stored acceptance information log comprising predetermined acceptance information so as to determine a correlation.

Type: Grant

Filed: February 1, 2013

Date of Patent: April 19, 2016

Assignee: Google Inc.

Inventors: Mark Edward Epstein, Pedro J. Moreno Mengibar, Fadi Biadsy

Language modeling in speech recognition

Patent number: 9286892

Abstract: Some implementations include a computer-implemented method. The method can include providing a training set of text samples to a semantic parser that associates text samples with actions. The method can include obtaining, for each of one or more of the text samples of the training set, data that indicates one or more domains that the semantic parser has associated with the text sample. For each of one or more domains, a subset of the text samples of the training set can be generated that the semantic parser has associated with the domain. Using the subset of text samples associated with the domain, a language model can be generated for one or more of the domains. Speech recognition can be performed on an utterance using the one or more language models that are generated for the one or more of the domains.

Type: Grant

Filed: April 1, 2014

Date of Patent: March 15, 2016

Assignee: Google Inc.

Inventors: Pedro J. Moreno Mengibar, Mark Edward Epstein
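
The abstract above describes grouping training text by the domains a semantic parser assigns and then building one language model per domain. A minimal sketch of that idea follows, under loud assumptions: `toy_parser` is a hypothetical keyword rule standing in for the semantic parser, and a unigram relative-frequency table stands in for the language model. Neither is taken from the patent.

```python
from collections import Counter, defaultdict


def toy_parser(text):
    """Hypothetical stand-in for a semantic parser: returns a set of domains."""
    words = text.lower().split()
    domains = set()
    if "call" in words or "dial" in words:
        domains.add("phone")
    if "play" in words:
        domains.add("media")
    return domains


def build_domain_models(samples, parser):
    """Group samples by parser-assigned domain, then fit one unigram model each."""
    subsets = defaultdict(list)
    for sample in samples:
        for domain in parser(sample):
            subsets[domain].append(sample)

    models = {}
    for domain, texts in subsets.items():
        counts = Counter(w for t in texts for w in t.lower().split())
        total = sum(counts.values())
        # Relative frequency of each word within the domain's subset.
        models[domain] = {w: c / total for w, c in counts.items()}
    return models


samples = ["call mom", "play jazz", "call dad"]
models = build_domain_models(samples, toy_parser)
```

Here `models["phone"]` is estimated only from the two samples the toy parser tagged as phone commands. A real system would use a far richer model than unigram frequencies.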

Clustering Classes in Language Modeling

Publication number: 20160062985

Abstract: This document describes, among other things, a computer-implemented method. The method can include obtaining a plurality of text samples that each include one or more terms belonging to a first class of terms. The plurality of text samples can be classified into a plurality of groups of text samples. Each group of text samples can correspond to a different sub-class of terms. For each of the groups of text samples, a sub-class context model can be generated based on the text samples in the respective group of text samples. Particular ones of the sub-class context models that are determined to be similar can be merged to generate a hierarchical set of context models. Further, the method can include selecting particular ones of the context models and generating a class-based language model based on the selected context models.

Type: Application

Filed: March 12, 2015

Publication date: March 3, 2016

Inventors: Mark Edward Epstein, Vladislav Schogol

Generating Language Models

Publication number: 20150348541

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating language models. In some implementations, data is accessed that indicates a set of classes corresponding to a concept. A first language model is generated in which a first class represents the concept. A second language model is generated in which second classes represent the concept. Output of the first language model and the second language model is obtained, and the outputs are evaluated. A class from the set of classes is selected based on evaluating the output of the first language model and the output of the second language model. In some implementations, the first class and the second class are selected from a parse tree or other data that indicates relationships among the classes in the set of classes.

Type: Application

Filed: May 29, 2014

Publication date: December 3, 2015

Applicant: Google Inc.

Inventors: Mark Edward Epstein, Lucy Vasserman

Classification of Offensive Words

Publication number: 20150309987

Abstract: A computer-implemented method can include identifying a first set of text samples that include a particular potentially offensive term. Labels can be obtained for the first set of text samples that indicate whether the particular potentially offensive term is used in an offensive manner. A classifier can be trained based at least on the first set of text samples and the labels, the classifier being configured to use one or more signals associated with a text sample to generate a label that indicates whether a potentially offensive term in the text sample is used in an offensive manner in the text sample. The method can further include providing, to the classifier, a first text sample that includes the particular potentially offensive term, and in response, obtaining, from the classifier, a label that indicates whether the particular potentially offensive term is used in an offensive manner in the first text sample.

Type: Application

Filed: April 29, 2014

Publication date: October 29, 2015

Applicant: Google Inc.

Inventors: Mark Edward Epstein, Pedro J. Moreno Mengibar

SPEECH AND SEMANTIC PARSING FOR CONTENT SELECTION

Publication number: 20150287410

Abstract: Systems, apparatus and method for speech and semantic parsing for content selection. In an aspect, a method includes selecting, for each of a plurality of voice query analyzers, an analyzer output parameter; generating a voice query model for voice queries, the voice query model including analysis fields, wherein each analysis field in at least a first portion of the analysis fields corresponds to a corresponding analyzer output parameter; receiving, from a plurality of content item providers, voice query selection data that describes analyzer output parameter values for the voice query model that satisfy selection criteria for the content item provider; and persisting the voice query selection data for the content item providers to a computer memory device; wherein the voice query analyzers include a semantic analyzer and a biometric analyzer.

Type: Application

Filed: March 15, 2013

Publication date: October 8, 2015

Inventors: Pedro J. Moreno Mengibar, Mark Edward Epstein

LANGUAGE MODELING IN SPEECH RECOGNITION

Publication number: 20150279360

Abstract: Some implementations include a computer-implemented method. The method can include providing a training set of text samples to a semantic parser that associates text samples with actions. The method can include obtaining, for each of one or more of the text samples of the training set, data that indicates one or more domains that the semantic parser has associated with the text sample. For each of one or more domains, a subset of the text samples of the training set can be generated that the semantic parser has associated with the domain. Using the subset of text samples associated with the domain, a language model can be generated for one or more of the domains. Speech recognition can be performed on an utterance using the one or more language models that are generated for the one or more of the domains.

Type: Application

Filed: April 1, 2014

Publication date: October 1, 2015

Applicant: Google Inc.

Inventors: Pedro J. Moreno Mengibar, Mark Edward Epstein

Bootstrapping named entity canonicalizers from English using alignment models

Patent number: 9146919

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training recognition canonical representations corresponding to named-entity phrases in a second natural language based on translating a set of allowable expressions with canonical representations from a first natural language, which may be generated by expanding a context-free grammar for the allowable expressions for the first natural language.

Type: Grant

Filed: March 14, 2013

Date of Patent: September 29, 2015

Assignee: Google Inc.

Inventors: Mark Edward Epstein, Pedro J. Mengibar

Increasing semantic coverage with semantically irrelevant insertions

Patent number: 9129598

Abstract: A method includes accessing data specifying a set of actions, each action defining a user device operation and for each action: accessing a corresponding set of command sentences for the action, determining first n-grams in the set of command sentences that are semantically relevant for the action, determining second n-grams in the set of command sentences that are semantically irrelevant for the action, generating a training set of command sentences from the corresponding set of command sentences, the generating the training set of command sentences including removing each second n-gram from each sentence in the corresponding set of command sentences for the action, and generating a command model from the training set of command sentences configured to generate an action score for the action for an input sentence based on: first n-grams for the action, and second n-grams for the action that are also second n-grams for all other actions.

Type: Grant

Filed: March 27, 2015

Date of Patent: September 8, 2015

Assignee: Google Inc.

Inventors: Pedro J. Moreno Mengibar, Mark Edward Epstein, Fadi Biadsy
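
One step in the abstract above is generating a training set by removing semantically irrelevant n-grams from each command sentence. The sketch below illustrates only that step. The function name `remove_ngrams`, the whitespace tokenization, and the example n-grams are all assumptions made for illustration. How relevance is determined is not shown.

```python
def remove_ngrams(sentence, irrelevant):
    """Return the sentence with every listed n-gram removed (token-level match)."""
    tokens = sentence.lower().split()
    # Try longer n-grams first so "can you" is removed before any single word.
    grams = sorted((g.lower().split() for g in irrelevant), key=len, reverse=True)
    kept = []
    i = 0
    while i < len(tokens):
        for gram in grams:
            if gram and tokens[i:i + len(gram)] == gram:
                i += len(gram)  # skip the irrelevant n-gram
                break
        else:
            kept.append(tokens[i])
            i += 1
    return " ".join(kept)


# Hypothetical "second n-grams" judged irrelevant for a call action.
irrelevant = ["please", "can you"]
commands = ["Can you call mom please", "please call mom"]
training_set = [remove_ngrams(c, irrelevant) for c in commands]
```

Both example commands reduce to the same core sentence, which is the kind of normalization the abstract describes before a command model is trained.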

Mining data for natural language system

Patent number: 9047271

Abstract: A method iteratively processes data for a set of actions, including: for each action: accessing a corresponding set of command sentences for the action, determining first n-grams that are semantically relevant for the action and second n-grams that are semantically irrelevant for the action, and identifying, from a log of command sentences that includes command sentences not included in the corresponding set of command sentences, candidate command sentences that include one first n-gram and a third n-gram that has not yet been determined to be a first n-gram or a second n-gram; for each candidate command sentence, determining each third n-gram that is semantically relevant for an action to be a first n-gram, and determining each third n-gram that is semantically irrelevant for an action to be a second n-gram, and adjusting the corresponding set of command sentences for each action based on the first n-grams and the second n-grams.

Type: Grant

Filed: February 28, 2013

Date of Patent: June 2, 2015

Assignee: Google Inc.

Inventors: Pedro J. Moreno Mengibar, Mark Edward Epstein, Fadi Biadsy

Increasing semantic coverage with semantically irrelevant insertions

Patent number: 9020809

Abstract: A method includes accessing data specifying a set of actions, each action defining a user device operation and for each action: accessing a corresponding set of command sentences for the action, determining first n-grams in the set of command sentences that are semantically relevant for the action, determining second n-grams in the set of command sentences that are semantically irrelevant for the action, generating a training set of command sentences from the corresponding set of command sentences, the generating the training set of command sentences including removing each second n-gram from each sentence in the corresponding set of command sentences for the action, and generating a command model from the training set of command sentences configured to generate an action score for the action for an input sentence based on: first n-grams for the action, and second n-grams for the action that are also second n-grams for all other actions.

Type: Grant

Filed: February 28, 2013

Date of Patent: April 28, 2015

Assignee: Google Inc.

Inventors: Pedro J. Mengibar, Mark Edward Epstein, Fadi Biadsy

BOOTSTRAPPING NAMED ENTITY CANONICALIZERS FROM ENGLISH USING ALIGNMENT MODELS

Publication number: 20140200876

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training recognition canonical representations corresponding to named-entity phrases in a second natural language based on translating a set of allowable expressions with canonical representations from a first natural language, which may be generated by expanding a context-free grammar for the allowable expressions for the first natural language.

Type: Application

Filed: March 14, 2013

Publication date: July 17, 2014

Applicant: Google Inc.

Inventors: Mark Edward Epstein, Pedro J. Mengibar

PHONETIC PRONUNCIATION

Publication number: 20140074470

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improved pronunciation. One of the methods includes receiving data that represents an audible pronunciation of the name of an individual from a user device. The method includes identifying one or more other users that are members of a social circle of which the individual is a member. The method includes identifying one or more devices associated with the other users. The method also includes providing information that identifies the individual and the data representing the audible pronunciation to the one or more identified devices.

Type: Application

Filed: July 23, 2013

Publication date: March 13, 2014

Applicant: Google Inc.

Inventors: Martin Jansche, Mark Edward Epstein, Ciprian I. Chelba

Determining advertisements based on verbal inputs to applications on a computing device

Patent number: 8612226

Abstract: The present disclosure provides methods operable by a computing device having one or more applications configured to perform functions based on a received verbal input. The method may comprise receiving a verbal input, obtaining one or more textual phrases corresponding to the received verbal input, and providing the one or more textual phrases to an appropriate application on the computing device. The method may further comprise accumulating data on the one or more textual phrases. The data comprises at least a count of a number of times a particular textual phrase is obtained based on a given received verbal input. Based on the count exceeding a threshold, the method may further comprise providing a query corresponding to the textual phrase, where the query is usable to search an advertisement database for one or more advertisements relating to the textual phrase.

Type: Grant

Filed: January 28, 2013

Date of Patent: December 17, 2013

Assignee: Google Inc.

Inventors: Mark Edward Epstein, Pedro J. Moreno Mengibar
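
The count-and-threshold logic in the abstract above can be sketched in a few lines. This is an illustration only: the class name `PhraseCounter`, the lowercase normalization, and the choice to return the phrase itself as the query are assumptions, not details from the patent.

```python
from collections import Counter


class PhraseCounter:
    """Counts recognized phrases and yields a query once a count exceeds a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = Counter()

    def observe(self, phrase):
        """Record one recognized phrase; return a query string or None."""
        key = phrase.strip().lower()  # assumed normalization
        self.counts[key] += 1
        if self.counts[key] > self.threshold:
            return key  # assumed: the phrase itself serves as the ad query
        return None


counter = PhraseCounter(threshold=2)
heard = ["pizza near me", "Pizza near me", "weather", "pizza near me"]
results = [counter.observe(p) for p in heard]
```

With a threshold of 2, only the third occurrence of the same phrase produces a query. Earlier occurrences and the unrelated phrase return `None`.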

Method and apparatus for translating natural-language speech using multiple output phrases

Patent number: 6859778

Abstract: A multi-lingual translation system that provides multiple output sentences for a given word or phrase. Each output sentence for a given word or phrase reflects, for example, a different emotional emphasis, dialect, accents, loudness or rates of speech. A given output sentence could be selected automatically, or manually as desired, to create a desired effect. For example, the same output sentence for a given word or phrase can be recorded three times, to selectively reflect excitement, sadness or fear. The multi-lingual translation system includes a phrase-spotting mechanism, a translation mechanism, a speech output mechanism and optionally, a language understanding mechanism or an event measuring mechanism or both. The phrase-spotting mechanism identifies a spoken phrase from a restricted domain of phrases. The language understanding mechanism, if present, maps the identified phrase onto a small set of formal phrases.

Type: Grant

Filed: March 16, 2000

Date of Patent: February 22, 2005

Assignees: International Business Machines Corporation, OIPENN, Inc.

Inventors: Raimo Bakis, Mark Edward Epstein, William Stuart Meisel, Miroslav Novak, Michael Picheny, Ridley M. Whitaker
