Patents by Inventor Mazin Gilbert

Mazin Gilbert has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9703769
    Abstract: A clausifier and method of extracting clauses for spoken language understanding are disclosed. The method relates to generating a set of clauses from speech utterance text and comprises inserting at least one boundary tag in speech utterance text related to sentence boundaries, inserting at least one edit tag indicating a portion of the speech utterance text to remove, and inserting at least one conjunction tag within the speech utterance text. The result is a set of clauses that may be identified within the speech utterance text according to the inserted at least one boundary tag, at least one edit tag and at least one conjunction tag. The disclosed clausifier comprises a sentence boundary classifier, an edit detector classifier, and a conjunction detector classifier. The clausifier may comprise a single classifier or a plurality of classifiers to perform the steps of identifying sentence boundaries, editing text, and identifying conjunctions within the text.
    Type: Grant
    Filed: October 7, 2015
    Date of Patent: July 11, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Srinivas Bangalore, Narendra K. Gupta, Mazin Gilbert
  • Patent number: 9679561
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating domain-specific speech recognition models for a domain of interest by combining and tuning existing speech recognition models when a speech recognizer does not have access to a speech recognition model for that domain of interest and when available domain-specific data is below a minimum desired threshold to create a new domain-specific speech recognition model. A system configured to practice the method identifies a speech recognition domain and combines a set of speech recognition models, each speech recognition model of the set of speech recognition models being from a respective speech recognition domain. The system receives an amount of data specific to the speech recognition domain, wherein the amount of data is less than a minimum threshold to create a new domain-specific model, and tunes the combined speech recognition model for the speech recognition domain based on the data.
    Type: Grant
    Filed: March 28, 2011
    Date of Patent: June 13, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Srinivas Bangalore, Robert Bell, Diamantino Antonio Caseiro, Mazin Gilbert, Patrick Haffner
  • Patent number: 9620117
    Abstract: In one embodiment, a semantic classifier input and a corresponding label attributed to the semantic classifier input may be obtained. A determination may be made whether the corresponding label is correct based on logged interaction data. An entry of an adaptation corpus may be generated based on a result of the determination. Operation of the semantic classifier may be adapted based on the adaptation corpus.
    Type: Grant
    Filed: June 27, 2006
    Date of Patent: April 11, 2017
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Mazin Gilbert, Esther Levin, Michael Lederman Littman, Robert E. Schapire
  • Publication number: 20170075656
    Abstract: There is provided for a system, method, and computer readable medium storing instructions related to controlling a presentation in a multimodal system. A method for the retrieval of information on the basis of its content for real-time incorporation into an electronic presentation is discussed. One method includes controlling a media presentation using a multimodal interface. The method involves receiving from a presenter a content-based request associated with a plurality of segments within a media presentation preprocessed for context-based searching; displaying the media presentation and displaying to the presenter results in response to the content-based request; receiving a selection from the presenter of at least one result; and displaying the selected result to an audience.
    Type: Application
    Filed: November 2, 2016
    Publication date: March 16, 2017
    Inventors: Patrick EHLEN, David Crawford GIBBON, Mazin GILBERT, Michael JOHNSTON, Zhu LIU, Behzad SHAHRARAY
  • Publication number: 20170078763
    Abstract: A computerized method is disclosed including but not limited to determining on a processor, a tendency of an end user's to respond to one of a plurality of advertising data types in a video data stream wherein the plurality of advertising data types comprise at least two of audio, video, text and image data types. A computer readable medium containing a data structure is disclosed providing a functional and structural interrelationship between a processor in the system and data in the data structure.
    Type: Application
    Filed: July 15, 2015
    Publication date: March 16, 2017
    Applicant: AT&T Intellectual Property I, LP
    Inventors: Narenda K. Gupta, Mazin Gilbert
  • Publication number: 20170024648
    Abstract: Disclosed is a method and apparatus for responding to an inquiry from a client via a network. The method and apparatus receive the inquiry from a client via a network. Based on the inquiry, question-answer pairs retrieved from the network are analyzed to determine a response to the inquiry. The QA pairs are not predefined. As a result, the QA pairs have to be analyzed in order to determine whether they are responsive to a particular inquiry. Questions of the QA pairs may be repetitive and similar to one another even for very different subjects, and without additional contextual and meta-level information, are not useful in determining whether their corresponding answer responds to an inquiry.
    Type: Application
    Filed: October 6, 2016
    Publication date: January 26, 2017
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Junlan Feng, Mazin Gilbert, Dilek Hakkani-Tur, Gokhan Tur
  • Publication number: 20160329045
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Application
    Filed: July 18, 2016
    Publication date: November 10, 2016
    Inventors: Andrej LJOLJE, Diamantino Antonio CASEIRO, Mazin GILBERT, Vincent GOFFIN, Taniya Mishra
  • Patent number: 9489432
    Abstract: There is provided for a system, method, and computer readable medium storing instructions related to controlling a presentation in a multimodal system. A method for the retrieval of information on the basis of its content for real-time incorporation into an electronic presentation is discussed. One method includes controlling a media presentation using a multimodal interface. The method involves receiving from a presenter a content-based request associated with a plurality of segments within a media presentation preprocessed for context-based searching; displaying the media presentation and displaying to the presenter results in response to the content-based request; receiving a selection from the presenter of at least one result; and displaying the selected result to an audience.
    Type: Grant
    Filed: August 24, 2015
    Date of Patent: November 8, 2016
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Patrick Ehlen, David Crawford Gibbon, Mazin Gilbert, Michael Johnston, Zhu Liu, Behzad Shahraray
  • Patent number: 9489450
    Abstract: Disclosed is a method and apparatus for responding to an inquiry from a client via a network. The method and apparatus receive the inquiry from a client via a network. Based on the inquiry, question-answer pairs retrieved from the network are analyzed to determine a response to the inquiry. The QA pairs are not predefined. As a result, the QA pairs have to be analyzed in order to determine whether they are responsive to a particular inquiry. Questions of the QA pairs may be repetitive and, without more, will not be useful in determining whether their corresponding answer responds to an inquiry.
    Type: Grant
    Filed: November 10, 2015
    Date of Patent: November 8, 2016
    Inventors: Junlan Feng, Mazin Gilbert, Dilek Hakkani-Tur, Gokhan Tur
  • Patent number: 9484019
    Abstract: Disclosed herein is a method for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by identifying word and phone alignments and corresponding likelihood scores, and discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information, maximum likelihood training, minimum classification error training, or other functions known to those of skill in the art.
    Type: Grant
    Filed: October 11, 2012
    Date of Patent: November 1, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
  • Publication number: 20160295292
    Abstract: A method, a system and a machine-readable medium are provided for an on demand translation service. A translation module including at least one language pair module for translating a source language to a target language may be made available for use by a subscriber. The subscriber may be charged a fee for use of the requested on demand translation service or may be provided use of the on demand translation service for free in exchange for displaying commercial messages to the subscriber. A video signal may be received including information in the source language, which may be obtained as text from the video signal and may be translated from the source language to the target language by use of the translation module. Translated information, based on the translated text, may be added into the received video signal.
    Type: Application
    Filed: June 20, 2016
    Publication date: October 6, 2016
    Inventors: Srinivas Bangalore, David Crawford Gibbon, Mazin Gilbert, Patrick Guy Haffner, Zhu Liu, Behzad Shahraray
  • Publication number: 20160275948
    Abstract: Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition as the received audio signal using the selected acoustic model.
    Type: Application
    Filed: June 2, 2016
    Publication date: September 22, 2016
    Inventor: Mazin GILBERT
  • Patent number: 9431005
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving automatic speech recognition performance. A system practicing the method identifies idle speech recognition resources and establishes a supplemental speech recognizer on the idle resources based on overall speech recognition demand. The supplemental speech recognizer can differ from a main speech recognizer, and, along with the main speech recognizer, can be associated with a particular speaker. The system performs speech recognition on speech received from the particular speaker in parallel with the main speech recognizer and the supplemental speech recognizer and combines results from the main and supplemental speech recognizer. The system recognizes the received speech based on the combined results. The system can use beam adjustment in place of or in combination with a supplemental speech recognizer.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: August 30, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Mazin Gilbert
  • Publication number: 20160249107
    Abstract: A portable communication device has a touch screen display that receives tactile input and a microphone that receives audio input. The portable communication device initiates a query for media based at least in part on tactile input and audio input. The touch screen display is a multi-touch screen. The portable communication device sends an initiated query and receives a text response indicative of a speech to text conversion of the query. The portable communication device then displays video in response to tactile input and audio input.
    Type: Application
    Filed: May 2, 2016
    Publication date: August 25, 2016
    Inventors: BEHZAD SHAHRARAY, DAVID CRAWFORD GIBBON, BERNARD S. RENGER, ZHU LIU, ANDREA BASSO, MAZIN GILBERT, MICHAEL J. JOHNSTON
  • Patent number: 9396725
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Grant
    Filed: May 27, 2014
    Date of Patent: July 19, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
  • Patent number: 9374612
    Abstract: A method, a system and a machine-readable medium are provided for an on demand translation service. A translation module including at least one language pair module for translating a source language to a target language may be made available for use by a subscriber. The subscriber may be charged a fee for use of the requested on demand translation service or may be provided use of the on demand translation service for free in exchange for displaying commercial messages to the subscriber. A video signal may be received including information in the source language, which may be obtained as text from the video signal and may be translated from the source language to the target language by use of the translation module. Translated information, based on the translated text, may be added into the received video signal.
    Type: Grant
    Filed: October 24, 2013
    Date of Patent: June 21, 2016
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas Bangalore, David Crawford Gibbon, Mazin Gilbert, Patrick Guy Haffner, Zhu Liu, Behzad Shahraray
  • Patent number: 9361881
    Abstract: Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information based on a previously recorded time and speed of the caller, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition as the received audio signal using the selected acoustic model.
    Type: Grant
    Filed: June 23, 2014
    Date of Patent: June 7, 2016
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Mazin Gilbert
  • Patent number: 9348908
    Abstract: A portable communication device has a touch screen display that receives tactile input and a microphone that receives audio input. The portable communication device initiates a query for media based at least in part on tactile input and audio input. The touch screen display is a multi-touch screen. The portable communication device sends an initiated query and receives a text response indicative of a speech to text conversion of the query. The portable communication device then displays video in response to tactile input and audio input.
    Type: Grant
    Filed: July 19, 2013
    Date of Patent: May 24, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Behzad Shahraray, David Crawford Gibbon, Bernard S. Renger, Zhu Liu, Andrea Basso, Mazin Gilbert, Michael J. Johnston
  • Publication number: 20160093287
    Abstract: A system and method are disclosed for generating customized text-to-speech voices for a particular application. The method comprises generating a custom text-to-speech voice by selecting a voice for generating a custom text-to-speech voice associated with a domain, collecting text data associated with the domain from a pre-existing text data source and using the collected text data, generating an in-domain inventory of synthesis speech units by selecting speech units appropriate to the domain via a search of a pre-existing inventory of synthesis speech units, or by recording the minimal inventory for a selected level of synthesis quality. The text-to-speech custom voice for the domain is generated utilizing the in-domain inventory of synthesis speech units. Active learning techniques may also be employed to identify problem phrases wherein only a few minutes of recorded data is necessary to deliver a high quality TTS custom voice.
    Type: Application
    Filed: December 10, 2015
    Publication date: March 31, 2016
    Inventors: Srinivas BANGALORE, Junlan FENG, Mazin GILBERT, Juergen SCHROETER, Ann K. SYRDAL, David SCHULZ
  • Patent number: 9299345
    Abstract: A system, method and computer readable medium that generates a language model from data from a web domain is disclosed. The method may include filtering web data to remove unwanted data from the web domain data, extracting predicate/argument pairs from the filtered web data, generating conversational utterances by merging the extracted predicate/argument pairs into conversational templates, and generating a web data language model using the generated conversational utterances.
    Type: Grant
    Filed: June 20, 2006
    Date of Patent: March 29, 2016
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Mazin Gilbert, Dilek Z. Hakkani-Tur