Patents by Inventor Gokhan Tur

Gokhan Tur has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150046159
    Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
    Type: Application
    Filed: August 26, 2014
    Publication date: February 12, 2015
    Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
  • Patent number: 8914294
    Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
    Type: Grant
    Filed: April 7, 2014
    Date of Patent: December 16, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
  • Publication number: 20140350931
    Abstract: A Statistical Machine Translation (SMT) model is trained using pairs of sentences that include content obtained from one or more content sources (e.g. feed(s)) with corresponding queries that have been used to access the content. A query click graph may be used to assist in determining candidate pairs for the SMT training data. All or a portion of the candidate pairs may be used to train the SMT model. After training the SMT model using the SMT training data, the SMT model is applied to content to determine predicted queries that may be used to search for the content. The predicted queries are used to train a language model, such as a query language model. The query language model may be interpolated with other language models, such as a background language model, as well as a feed language model trained using the content used in determining the predicted queries.
    Type: Application
    Filed: May 24, 2013
    Publication date: November 27, 2014
    Applicant: Microsoft Corporation
    Inventors: Michael Levit, Dilek Hakkani-Tur, Gokhan Tur
  • Publication number: 20140343942
    Abstract: Systems for improving or generating a spoken language understanding system using a multitask learning method for intent or call-type classification. The multitask learning method aims at training tasks in parallel while using a shared representation. A computing device automatically re-uses the existing labeled data from various applications, which are similar but may have different call-types, intents or intent distributions to improve the performance. An automated intent mapping algorithm operates across applications. In one aspect, active learning is employed to selectively sample the data to be re-used.
    Type: Application
    Filed: May 27, 2014
    Publication date: November 20, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan Tur
  • Publication number: 20140330565
    Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.
    Type: Application
    Filed: May 20, 2014
    Publication date: November 6, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan Tur
  • Publication number: 20140278409
    Abstract: An apparatus and a method for preserving privacy in natural language databases are provided. Natural language input may be received. At least one of sanitizing or anonymizing the natural language input may be performed to form a clean output. The clean output may be stored.
    Type: Application
    Filed: May 28, 2014
    Publication date: September 18, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Yucel Saygin, Min Tang, Gokhan Tur
  • Publication number: 20140278424
    Abstract: Data associated with spoken language may be obtained. An analysis of the obtained data may be initiated for understanding of the spoken language using a deep convex network that is integrated with a kernel trick. The resulting kernel deep convex network may also be constructed by stacking one shallow kernel network over another with concatenation of the output vector of the lower network with the input data vector. A probability associated with a slot that is associated with slot-filling may be determined, based on local, discriminative features that are extracted using the kernel deep convex network.
    Type: Application
    Filed: March 13, 2013
    Publication date: September 18, 2014
    Applicant: Microsoft Corporation
    Inventors: Li Deng, Xiaodong He, Gokhan Tur, Dilek Hakkani-Tur
  • Patent number: 8832064
    Abstract: Open-domain question answering is the task of finding a concise answer to a natural language question using a large domain, such as the Internet. The use of a semantic role labeling approach to the extraction of the answers to an open domain factoid (Who/When/What/Where) natural language question that contains a predicate is described. Semantic role labeling identifies predicates and semantic argument phrases in the natural language question and the candidate sentences. When searching for an answer to a natural language question, the missing argument in the question is matched using semantic parses of the candidate answers. Such a technique may improve the accuracy of a question answering system and may decrease the length of answers for enabling voice interface to a question answering system.
    Type: Grant
    Filed: December 28, 2005
    Date of Patent: September 9, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Svetlana Stenchikova, Gokhan Tur, Dilek Hakkani-Tur
  • Patent number: 8818808
    Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
    Type: Grant
    Filed: February 23, 2005
    Date of Patent: August 26, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
  • Publication number: 20140236575
    Abstract: Structured web pages are accessed and parsed to obtain implicit annotation for natural language understanding tasks. Search queries that hit these structured web pages are automatically mined for information that is used to semantically annotate the queries. The automatically annotated queries may be used for automatically building statistical unsupervised slot filling models without using a semantic annotation guideline. For example, tags that are located on a structured web page that are associated with the search query may be used to annotate the query. The mined search queries may be filtered to create a set of queries that is in a form of a natural language query and/or remove queries that are difficult to parse. A natural language model may be trained using the resulting mined queries. Some queries may be set aside for testing and the model may be adapted using in-domain sentences that are not annotated.
    Type: Application
    Filed: February 21, 2013
    Publication date: August 21, 2014
    Applicant: Microsoft Corporation
    Inventors: Gokhan Tur, Dilek Hakkani-Tur, Larry Heck, Minwoo Jeong, Ye-Yi Wang
  • Publication number: 20140236570
    Abstract: An unsupervised training approach for Spoken Language Understanding (SLU) systems uses the structure of content sources (e.g. semantic knowledge graphs, relational databases, ...) to automatically specify a semantic representation for SLU. The semantic representation is used when creating entity-relation patterns that are used to mine natural language (NL) examples (e.g. NL surface forms from the web and search query click logs). The structure of the content source (e.g. semantic graph) is enriched with the mined NL examples. The NL examples and patterns may be used to automatically train SLU systems in an unsupervised manner that covers the knowledge represented in the structured content.
    Type: Application
    Filed: February 18, 2013
    Publication date: August 21, 2014
    Applicant: Microsoft Corporation
    Inventors: Larry Heck, Dilek Hakkani-Tur, Gokhan Tur
  • Publication number: 20140229179
    Abstract: Disclosed herein is a system, method and computer readable medium storing instructions related to semantic and syntactic information in a language understanding system. The method embodiment of the invention is a method for classifying utterances during a natural language dialog between a human and a computing device. The method comprises receiving a user utterance; generating a semantic and syntactic graph associated with the received utterance; extracting all n-grams as features from the generated semantic and syntactic graph; and classifying the utterance. Classifying the utterance may be performed in any number of ways, such as using the extracted n-grams, syntactic and semantic graphs, or writing rules.
    Type: Application
    Filed: April 15, 2014
    Publication date: August 14, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
  • Publication number: 20140222426
    Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
    Type: Application
    Filed: April 7, 2014
    Publication date: August 7, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
  • Publication number: 20140205985
    Abstract: Disclosed is a method and apparatus for responding to an inquiry from a client via a network. The method and apparatus receive the inquiry from a client via a network. Based on the inquiry, question-answer pairs retrieved from the network are analyzed to determine a response to the inquiry. The QA pairs are not predefined. As a result, the QA pairs have to be analyzed in order to determine whether they are responsive to a particular inquiry. Questions of the QA pairs may be repetitive and, without more, will not be useful in determining whether their corresponding answer responds to an inquiry.
    Type: Application
    Filed: March 19, 2014
    Publication date: July 24, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Junlan Feng, Mazin Gilbert, Dilek Hakkani-Tur, Gokhan Tur
  • Publication number: 20140180676
    Abstract: Click logs are automatically mined to assist in discovering candidate variations for named entities. The named entities may be obtained from one or more sources and include an initial list of named entities. A search may be performed within one or more search engines to determine common phrases that are used to identify the named entity in addition to the named entity initially included in the named entity list. Click logs associated with results of past searches are automatically mined to discover what phrases determined from the searches are candidate variations for the named entity. The candidate variations are scored to assist in determining the variations to include within an understanding model. The variations may also be used when delivering responses and displayed output in the SLU system. For example, instead of using the listed named entity, a popular and/or shortened name may be used by the system.
    Type: Application
    Filed: December 21, 2012
    Publication date: June 26, 2014
    Applicant: Microsoft Corporation
    Inventors: Dustin Hillard, Fethiye Asli Celikyilmaz, Dilek Hakkani-Tur, Rukmini Iyer, Gokhan Tur
  • Publication number: 20140172899
    Abstract: A device may facilitate a query dialog involving queries that successively modify a query state. However, fulfilling such queries in the context of possible query domains, query intents, and contextual meanings of query terms may be difficult. Presented herein are techniques for modifying a query state in view of a query by utilizing a set of query state modifications, each representing a modification of the query state possibly intended by the user while formulating the query (e.g., adding, substituting, or removing query terms; changing the query domain or query intent; and navigating within a hierarchy of saved query states). Upon receiving a query, an embodiment may calculate the probability of the query connoting each query state modification (e.g., using a Bayesian classifier), and parse the query according to a query state modification having a high probability (e.g., mapping respective query terms to query slots within the current query intent).
    Type: Application
    Filed: December 14, 2012
    Publication date: June 19, 2014
    Applicant: Microsoft Corporation
    Inventors: Dilek Hakkani-Tur, Gokhan Tur, Larry Heck, Ashley Fidler, Fethiye Asli Celikyilmaz
  • Patent number: 8751439
    Abstract: An apparatus and a method for preserving privacy in natural language databases are provided. Natural language input may be received. At least one of sanitizing or anonymizing the natural language input may be performed to form a clean output. The clean output may be stored.
    Type: Grant
    Filed: June 25, 2013
    Date of Patent: June 10, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Yucel Saygin, Min Tang, Gokhan Tur
  • Patent number: 8738379
    Abstract: Systems for improving or generating a spoken language understanding system using a multitask learning method for intent or call-type classification. The multitask learning method aims at training tasks in parallel while using a shared representation. A computing device automatically re-uses the existing labeled data from various applications, which are similar but may have different call-types, intents or intent distributions to improve the performance. An automated intent mapping algorithm operates across applications. In one aspect, active learning is employed to selectively sample the data to be re-used.
    Type: Grant
    Filed: December 28, 2009
    Date of Patent: May 27, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan Tur
  • Patent number: 8731924
    Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.
    Type: Grant
    Filed: August 8, 2011
    Date of Patent: May 20, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan Tur
  • Patent number: 8719010
    Abstract: Disclosed is a method and apparatus for responding to an inquiry from a client via a network. The method and apparatus receive the inquiry from a client via a network. Based on the inquiry, question-answer pairs retrieved from the network are analyzed to determine a response to the inquiry. The QA pairs are not predefined. As a result, the QA pairs have to be analyzed in order to determine whether they are responsive to a particular inquiry. Questions of the QA pairs may be repetitive and, without more, will not be useful in determining whether their corresponding answer responds to an inquiry.
    Type: Grant
    Filed: March 1, 2013
    Date of Patent: May 6, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Junlan Feng, Mazin Gilbert, Dilek Hakkani-Tur, Gokhan Tur
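
The two-threshold routing described in the abstract of patent 8914294 (the "automated hidden human") can be sketched roughly as follows. This is a minimal illustration under one reading of that abstract, not the patented implementation: the names `Action` and `route_utterance`, the default threshold values, and the use of a single classifier confidence score are all assumptions introduced here.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes for one classified user utterance."""
    ACCEPT = "accept"      # classification is trusted; continue the dialog
    REPROMPT = "reprompt"  # ask the user to rephrase
    TRANSFER = "transfer"  # hand the user over to a human agent


def route_utterance(confidence, accept_threshold=0.7, reject_threshold=0.3):
    """Pick an action from a classifier confidence score in [0, 1].

    Assumed reading of the abstract:
      - confidence at or above the acceptance threshold: accept
      - confidence between the two thresholds: re-prompt the user
      - confidence below the rejection threshold: transfer to a human
    The threshold defaults are illustrative values, not taken from the patent.
    """
    if not 0.0 <= reject_threshold <= accept_threshold <= 1.0:
        raise ValueError("expected 0 <= reject_threshold <= accept_threshold <= 1")
    if confidence >= accept_threshold:
        return Action.ACCEPT
    if confidence >= reject_threshold:
        return Action.REPROMPT
    return Action.TRANSFER


if __name__ == "__main__":
    for score in (0.9, 0.5, 0.1):
        print(score, route_utterance(score).value)
```

In a data-collection setting like the one the abstract describes, each utterance and its resulting action would also be logged so that it can later be used as training data; that logging step is omitted from this sketch.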