Patents by Inventor Mazin Gilbert

Mazin Gilbert has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and method of extracting clauses for spoken language understanding

Patent number: 9703769

Abstract: A clausifier and method of extracting clauses for spoken language understanding are disclosed. The method relates to generating a set of clauses from speech utterance text and comprises inserting at least one boundary tag in speech utterance text related to sentence boundaries, inserting at least one edit tag indicating a portion of the speech utterance text to remove, and inserting at least one conjunction tag within the speech utterance text. The result is a set of clauses that may be identified within the speech utterance text according to the inserted at least one boundary tag, at least one edit tag and at least one conjunction tag. The disclosed clausifier comprises a sentence boundary classifier, an edit detector classifier, and a conjunction detector classifier. The clausifier may comprise a single classifier or a plurality of classifiers to perform the steps of identifying sentence boundaries, editing text, and identifying conjunctions within the text.

Type: Grant

Filed: October 7, 2015

Date of Patent: July 11, 2017

Assignee: Nuance Communications, Inc.

Inventors: Srinivas Bangalore, Narendra K. Gupta, Mazin Gilbert

System and method for rapid customization of speech recognition models

Patent number: 9679561

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating domain-specific speech recognition models for a domain of interest by combining and tuning existing speech recognition models when a speech recognizer does not have access to a speech recognition model for that domain of interest and when available domain-specific data is below a minimum desired threshold to create a new domain-specific speech recognition model. A system configured to practice the method identifies a speech recognition domain and combines a set of speech recognition models, each speech recognition model of the set of speech recognition models being from a respective speech recognition domain. The system receives an amount of data specific to the speech recognition domain, wherein the amount of data is less than a minimum threshold to create a new domain-specific model, and tunes the combined speech recognition model for the speech recognition domain based on the data.

Type: Grant

Filed: March 28, 2011

Date of Patent: June 13, 2017

Assignee: Nuance Communications, Inc.

Inventors: Srinivas Bangalore, Robert Bell, Diamantino Antonio Caseiro, Mazin Gilbert, Patrick Haffner

Learning from interactions for a spoken dialog system

Patent number: 9620117

Abstract: In one embodiment, a semantic classifier input and a corresponding label attributed to the semantic classifier input may be obtained. A determination may be made whether the corresponding label is correct based on logged interaction data. An entry of an adaptation corpus may be generated based on a result of the determination. Operation of the semantic classifier may be adapted based on the adaptation corpus.

Type: Grant

Filed: June 27, 2006

Date of Patent: April 11, 2017

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Mazin Gilbert, Esther Levin, Michael Lederman Littman, Robert E. Schapire

SYSTEM AND METHOD FOR USING SPEECH FOR DATA SEARCHING DURING PRESENTATIONS

Publication number: 20170075656

Abstract: There is provided for a system, method, and computer readable medium storing instructions related to controlling a presentation in a multimodal system. A method for the retrieval of information on the basis of its content for real-time incorporation into an electronic presentation is discussed. One method includes controlling a media presentation using a multimodal interface. The method involves receiving from a presenter a content-based request associated with a plurality of segments within a media presentation preprocessed for context-based searching; displaying the media presentation and displaying to the presenter results in response to the content-based request; receiving a selection from the presenter of at least one result; and displaying the selected result to an audience.

Type: Application

Filed: November 2, 2016

Publication date: March 16, 2017

Inventors: Patrick EHLEN, David Crawford GIBBON, Mazin GILBERT, Michael JOHNSTON, Zhu LIU, Behzad SHAHRARAY

SYSTEM AND METHOD FOR STORING ADVERTISING DATA

Publication number: 20170078763

Abstract: A computerized method is disclosed including but not limited to determining, on a processor, a tendency of an end user to respond to one of a plurality of advertising data types in a video data stream wherein the plurality of advertising data types comprise at least two of audio, video, text and image data types. A computer readable medium containing a data structure is disclosed providing a functional and structural interrelationship between a processor in the system and data in the data structure.

Type: Application

Filed: July 15, 2015

Publication date: March 16, 2017

Applicant: AT&T Intellectual Property I, LP

Inventors: Narendra K. Gupta, Mazin Gilbert

Method and Apparatus for Responding to an Inquiry

Publication number: 20170024648

Abstract: Disclosed is a method and apparatus for responding to an inquiry from a client via a network. The method and apparatus receive the inquiry from a client via a network. Based on the inquiry, question-answer pairs retrieved from the network are analyzed to determine a response to the inquiry. The QA pairs are not predefined. As a result, the QA pairs have to be analyzed in order to determine whether they are responsive to a particular inquiry. Questions of the QA pairs may be repetitive and similar to one another even for very different subjects, and without additional contextual and meta-level information, are not useful in determining whether their corresponding answer responds to an inquiry.

Type: Application

Filed: October 6, 2016

Publication date: January 26, 2017

Applicant: AT&T Intellectual Property II, L.P.

Inventors: Junlan Feng, Mazin Gilbert, Dilek Hakkani-Tur, Gokhan Tur

System and Method for Optimizing Speech Recognition and Natural Language Parameters with User Feedback

Publication number: 20160329045

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.

Type: Application

Filed: July 18, 2016

Publication date: November 10, 2016

Inventors: Andrej LJOLJE, Diamantino Antonio CASEIRO, Mazin GILBERT, Vincent GOFFIN, Taniya Mishra

System and method for using speech for data searching during presentations

Patent number: 9489432

Abstract: There is provided for a system, method, and computer readable medium storing instructions related to controlling a presentation in a multimodal system. A method for the retrieval of information on the basis of its content for real-time incorporation into an electronic presentation is discussed. One method includes controlling a media presentation using a multimodal interface. The method involves receiving from a presenter a content-based request associated with a plurality of segments within a media presentation preprocessed for context-based searching; displaying the media presentation and displaying to the presenter results in response to the content-based request; receiving a selection from the presenter of at least one result; and displaying the selected result to an audience.

Type: Grant

Filed: August 24, 2015

Date of Patent: November 8, 2016

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Patrick Ehlen, David Crawford Gibbon, Mazin Gilbert, Michael Johnston, Zhu Liu, Behzad Shahraray

Method and apparatus for responding to an inquiry

Patent number: 9489450

Abstract: Disclosed is a method and apparatus for responding to an inquiry from a client via a network. The method and apparatus receive the inquiry from a client via a network. Based on the inquiry, question-answer pairs retrieved from the network are analyzed to determine a response to the inquiry. The QA pairs are not predefined. As a result, the QA pairs have to be analyzed in order to determine whether they are responsive to a particular inquiry. Questions of the QA pairs may be repetitive and, without more, will not be useful in determining whether their corresponding answer responds to an inquiry.

Type: Grant

Filed: November 10, 2015

Date of Patent: November 8, 2016

Inventors: Junlan Feng, Mazin Gilbert, Dilek Hakkani-Tur, Gokhan Tur

System and method for discriminative pronunciation modeling for voice search

Patent number: 9484019

Abstract: Disclosed herein is a method for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by identifying word and phone alignments and corresponding likelihood scores, and discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information, maximum likelihood training, minimum classification error training, or other functions known to those of skill in the art.

Type: Grant

Filed: October 11, 2012

Date of Patent: November 1, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje

On-Demand Language Translation for Television Programs

Publication number: 20160295292

Abstract: A method, a system and a machine-readable medium are provided for an on demand translation service. A translation module including at least one language pair module for translating a source language to a target language may be made available for use by a subscriber. The subscriber may be charged a fee for use of the requested on demand translation service or may be provided use of the on demand translation service for free in exchange for displaying commercial messages to the subscriber. A video signal may be received including information in the source language, which may be obtained as text from the video signal and may be translated from the source language to the target language by use of the translation module. Translated information, based on the translated text, may be added into the received video signal.

Type: Application

Filed: June 20, 2016

Publication date: October 6, 2016

Inventors: Srinivas Bangalore, David Crawford Gibbon, Mazin Gilbert, Patrick Guy Haffner, Zhu Liu, Behzad Shahraray

METHOD AND APPARATUS FOR IDENTIFYING ACOUSTIC BACKGROUND ENVIRONMENTS BASED ON TIME AND SPEECH TO ENHANCE AUTOMATIC SPEECH RECOGNITION

Publication number: 20160275948

Abstract: Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model.

Type: Application

Filed: June 2, 2016

Publication date: September 22, 2016

Inventor: Mazin GILBERT

System and method for supplemental speech recognition by identified idle resources

Patent number: 9431005

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving automatic speech recognition performance. A system practicing the method identifies idle speech recognition resources and establishes a supplemental speech recognizer on the idle resources based on overall speech recognition demand. The supplemental speech recognizer can differ from a main speech recognizer, and, along with the main speech recognizer, can be associated with a particular speaker. The system performs speech recognition on speech received from the particular speaker in parallel with the main speech recognizer and the supplemental speech recognizer and combines results from the main and supplemental speech recognizer. The system recognizes the received speech based on the combined results. The system can use beam adjustment in place of or in combination with a supplemental speech recognizer.

Type: Grant

Filed: November 30, 2012

Date of Patent: August 30, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Mazin Gilbert

Multimodal Portable Communication Interface for Accessing Video Content

Publication number: 20160249107

Abstract: A portable communication device has a touch screen display that receives tactile input and a microphone that receives audio input. The portable communication device initiates a query for media based at least in part on tactile input and audio input. The touch screen display is a multi-touch screen. The portable communication device sends an initiated query and receives a text response indicative of a speech to text conversion of the query. The portable communication device then displays video in response to tactile input and audio input.

Type: Application

Filed: May 2, 2016

Publication date: August 25, 2016

Inventors: BEHZAD SHAHRARAY, DAVID CRAWFORD GIBBON, BERNARD S. RENGER, ZHU LIU, ANDREA BASSO, MAZIN GILBERT, MICHAEL J. JOHNSTON

System and method for optimizing speech recognition and natural language parameters with user feedback

Patent number: 9396725

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.

Type: Grant

Filed: May 27, 2014

Date of Patent: July 19, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra

On-demand language translation for television programs

Patent number: 9374612

Abstract: A method, a system and a machine-readable medium are provided for an on demand translation service. A translation module including at least one language pair module for translating a source language to a target language may be made available for use by a subscriber. The subscriber may be charged a fee for use of the requested on demand translation service or may be provided use of the on demand translation service for free in exchange for displaying commercial messages to the subscriber. A video signal may be received including information in the source language, which may be obtained as text from the video signal and may be translated from the source language to the target language by use of the translation module. Translated information, based on the translated text, may be added into the received video signal.

Type: Grant

Filed: October 24, 2013

Date of Patent: June 21, 2016

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Srinivas Bangalore, David Crawford Gibbon, Mazin Gilbert, Patrick Guy Haffner, Zhu Liu, Behzad Shahraray

Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition

Patent number: 9361881

Abstract: Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information based on a previously recorded time and speed of the caller, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model.

Type: Grant

Filed: June 23, 2014

Date of Patent: June 7, 2016

Assignee: AT&T Intellectual Property II, L.P.

Inventor: Mazin Gilbert

Multimodal portable communication interface for accessing video content

Patent number: 9348908

Abstract: A portable communication device has a touch screen display that receives tactile input and a microphone that receives audio input. The portable communication device initiates a query for media based at least in part on tactile input and audio input. The touch screen display is a multi-touch screen. The portable communication device sends an initiated query and receives a text response indicative of a speech to text conversion of the query. The portable communication device then displays video in response to tactile input and audio input.

Type: Grant

Filed: July 19, 2013

Date of Patent: May 24, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Behzad Shahraray, David Crawford Gibbon, Bernard S. Renger, Zhu Liu, Andrea Basso, Mazin Gilbert, Michael J. Johnston

SYSTEM AND METHOD FOR GENERATING CUSTOMIZED TEXT-TO-SPEECH VOICES

Publication number: 20160093287

Abstract: A system and method are disclosed for generating customized text-to-speech voices for a particular application. The method comprises generating a custom text-to-speech voice by selecting a voice for generating a custom text-to-speech voice associated with a domain, collecting text data associated with the domain from a pre-existing text data source and using the collected text data, generating an in-domain inventory of synthesis speech units by selecting speech units appropriate to the domain via a search of a pre-existing inventory of synthesis speech units, or by recording the minimal inventory for a selected level of synthesis quality. The text-to-speech custom voice for the domain is generated utilizing the in-domain inventory of synthesis speech units. Active learning techniques may also be employed to identify problem phrases wherein only a few minutes of recorded data is necessary to deliver a high quality TTS custom voice.

Type: Application

Filed: December 10, 2015

Publication date: March 31, 2016

Inventors: Srinivas BANGALORE, Junlan FENG, Mazin GILBERT, Juergen SCHROETER, Ann K. SYRDAL, David SCHULZ

Bootstrapping language models for spoken dialog systems using the world wide web

Patent number: 9299345

Abstract: A system, method and computer readable medium that generates a language model from data from a web domain is disclosed. The method may include filtering web data to remove unwanted data from the web domain data, extracting predicate/argument pairs from the filtered web data, generating conversational utterances by merging the extracted predicate/argument pairs into conversational templates, and generating a web data language model using the generated conversational utterances.

Type: Grant

Filed: June 20, 2006

Date of Patent: March 29, 2016

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Mazin Gilbert, Dilek Z. Hakkani-Tur