Patents by Inventor Dilek Z. Hakkani-Tur

Dilek Z. Hakkani-Tur has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8818808
    Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
    Type: Grant
    Filed: February 23, 2005
    Date of Patent: August 26, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
  • Publication number: 20140229179
    Abstract: Disclosed herein is a system, method and computer readable medium storing instructions related to semantic and syntactic information in a language understanding system. The method embodiment of the invention is a method for classifying utterances during a natural language dialog between a human and a computing device. The method comprises receiving a user utterance; generating a semantic and syntactic graph associated with the received utterance, extracting all n-grams as features from the generated semantic and syntactic graph and classifying the utterance. Classifying the utterance may be performed any number of ways such as using the extracted n-grams, a syntactic and semantic graphs or writing rules.
    Type: Application
    Filed: April 15, 2014
    Publication date: August 14, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
  • Publication number: 20140222426
    Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
    Type: Application
    Filed: April 7, 2014
    Publication date: August 7, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
  • Patent number: 8751439
    Abstract: An apparatus and a method for preserving privacy in natural language databases are provided. Natural language input may be received. At least one of sanitizing or anonymizing the natural language input may be performed to form a clean output. The clean output may be stored.
    Type: Grant
    Filed: June 25, 2013
    Date of Patent: June 10, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Yucel Saygin, Min Tang, Gokhan Tur
  • Publication number: 20140156275
    Abstract: State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.
    Type: Application
    Filed: February 10, 2014
    Publication date: June 5, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Allen Louis Gorin, Dilek Z. Hakkani-Tur, Giuseppe Riccardi
  • Patent number: 8700404
    Abstract: Disclosed herein is a system, method and computer readable medium storing instructions related to semantic and syntactic information in a language understanding system. The method embodiment of the invention is a method for classifying utterances during a natural language dialog between a human and a computing device. The method comprises receiving a user utterance; generating a semantic and syntactic graph associated with the received utterance, extracting all n-grams as features from the generated semantic and syntactic graph and classifying the utterance. Classifying the utterance may be performed any number of ways such as using the extracted n-grams, a syntactic and semantic graphs or writing rules.
    Type: Grant
    Filed: August 27, 2005
    Date of Patent: April 15, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
  • Patent number: 8694324
    Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
    Type: Grant
    Filed: May 21, 2012
    Date of Patent: April 8, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
  • Patent number: 8650033
    Abstract: State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: February 11, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Allen Louis Gorin, Dilek Z. Hakkani-Tur, Giuseppe Riccardi
  • Publication number: 20130325443
    Abstract: A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.
    Type: Application
    Filed: July 2, 2013
    Publication date: December 5, 2013
    Applicant: AT&T Intellectual Property ll, L.P.
    Inventors: Lee Begeja, Giuseppe DiFabbrizio, David Crawford Gibbon, Dilek Z. Hakkani-Tur, Zhu Liu, Bernard S. Renger, Behzad Shahraray, Gokhan Tur
  • Patent number: 8589148
    Abstract: A method, system and machine-readable medium are provided for watermarking natural language digital text. A deep structure may be generated and a group of features may be extracted from natural language digital text input. The deep structure may be modified based, at least partly, on a watermark. Natural language digital text output may be generated based on the modified deep structure.
    Type: Grant
    Filed: April 14, 2006
    Date of Patent: November 19, 2013
    Assignees: AT&T Intellectual Property II, L.P., Perdue Research Foundation
    Inventors: Mikhail Mike Atallah, Srinivas Bangalore, Dilek Z. Hakkani-Tur, Giuseppe Riccardi, Mercan Topkara, Umut Topkara
  • Publication number: 20130304451
    Abstract: Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided to select an appropriate machine translation component.
    Type: Application
    Filed: May 10, 2012
    Publication date: November 14, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Ruhi Sarikaya, Daniel Boies, Fethiye Asli Celikyilmaz, Anoop K. Deoras, Dustin Rigg Hillard, Dilek Z. Hakkani-Tur, Gokhan Tur, Fileno A. Alleva
  • Publication number: 20130289984
    Abstract: An apparatus and a method for preserving privacy in natural language databases are provided. Natural language input may be received. At least one of sanitizing or anonymizing the natural language input may be performed to form a clean output. The clean output may be stored.
    Type: Application
    Filed: June 25, 2013
    Publication date: October 31, 2013
    Inventors: Dilek Z. Hakkani-Tur, Yucel Saygin, Min Tang, Gokhan Tur
  • Patent number: 8548805
    Abstract: A system and method are disclosed for providing semi-supervised learning for a spoken language understanding module using semantic role labeling. The method embodiment relates to a method of generating a spoken language understanding module. Steps in the method comprise selecting at least one predicate/argument pair as an intent from a set of the most frequent predicate/argument pairs for a domain, labeling training data using mapping rules associated with the selected at least one predicate/argument pair, training a call-type classification model using the labeled training data, re-labeling the training data using the call-type classification model and iteratively several of the above steps until training set labels converge.
    Type: Grant
    Filed: November 26, 2012
    Date of Patent: October 1, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
  • Patent number: 8542803
    Abstract: A system and a method are provided. A textual transcript of a recorded voice communication is received. Text from a non-voice communication is received. Information about the textual transcript of the recorded voice communication and the text from the non-voice communication is provided such that a user can manage a group of communications, based at least in part on the textual transcript of the recorded voice communication and the text from the non-voice communication.
    Type: Grant
    Filed: August 19, 2005
    Date of Patent: September 24, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Alicia Abella, Brian Scott Amento, Dilek Z. Hakkani-Tur, Larry Stead, Gokhan Tur, Jay Wilpon, Jeremy Huntley Wright
  • Patent number: 8478589
    Abstract: A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.
    Type: Grant
    Filed: January 5, 2005
    Date of Patent: July 2, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Lee Begeja, Giuseppe Di Fabbrizio, David Crawford Gibbon, Dilek Z. Hakkani-Tur, Zhu Liu, Bernard S. Renger, Behzad Shahraray, Gokhan Tur
  • Patent number: 8473451
    Abstract: An apparatus and a method for preserving privacy in natural language databases are provided. Natural language input may be received. At least one of sanitizing or anonymizing the natural language input may be performed to form a clean output. The clean output may be stored.
    Type: Grant
    Filed: March 22, 2005
    Date of Patent: June 25, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Yucel Saygin, Min Tang, Gokhan Tur
  • Patent number: 8321220
    Abstract: A system and method are disclosed for providing semi-supervised learning for a spoken language understanding module using semantic role labeling. The method embodiment relates to a method of generating a spoken language understanding module. Steps in the method comprise selecting at least one predicate/argument pair as an intent from a set of the most frequent predicate/argument pairs for a domain, labeling training data using mapping rules associated with the selected at least one predicate/argument pair, training a call-type classification model using the labeled training data, re-labeling the training data using the call-type classification model and iteratively several of the above steps until training set labels converge.
    Type: Grant
    Filed: November 30, 2005
    Date of Patent: November 27, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
  • Publication number: 20120232898
    Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
    Type: Application
    Filed: May 21, 2012
    Publication date: September 13, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
  • Patent number: 8204749
    Abstract: A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include, for example, differential statistics, joint statistics and distance statistics.
    Type: Grant
    Filed: March 21, 2011
    Date of Patent: June 19, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Jackson J. Liscombe, Guiseppe Riccardi
  • Patent number: 8185399
    Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
    Type: Grant
    Filed: January 5, 2005
    Date of Patent: May 22, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur