Patents by Inventor Gokhan Tur

Gokhan Tur has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

UNSUPERVISED AND ACTIVE LEARNING IN AUTOMATIC SPEECH RECOGNITION FOR CALL CLASSIFICATION

Publication number: 20150046159

Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.

Type: Application

Filed: August 26, 2014

Publication date: February 12, 2015

Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
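
The abstract above says that a predetermined number of untranscribed utterances are "intelligently selected" for manual transcription, but it does not say how. One common active-learning heuristic is to pick the utterances the recognizer is least confident about. The sketch below assumes that heuristic; the function name, the utterance ids, and the confidence scores are all hypothetical and are not taken from the patent.

```python
from typing import List, Tuple


def select_for_transcription(scored: List[Tuple[str, float]], k: int) -> List[str]:
    """Return the ids of the k utterances with the lowest ASR confidence.

    `scored` holds (utterance_id, confidence) pairs. Using confidence as the
    selection criterion is an assumption, not the patent's stated method.
    """
    ranked = sorted(scored, key=lambda item: item[1])  # least confident first
    return [utt_id for utt_id, _ in ranked[:k]]


# Hypothetical confidence scores for four untranscribed utterances.
scored_utterances = [("u1", 0.92), ("u2", 0.31), ("u3", 0.58), ("u4", 0.12)]
selected = select_for_transcription(scored_utterances, k=2)
print(selected)
```

The selected utterances would go to a human transcriber, while the remaining ones keep their automatic transcriptions for model training.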
System and method of providing an automated data-collection in spoken dialog systems

Patent number: 8914294

Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.

Type: Grant

Filed: April 7, 2014

Date of Patent: December 16, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
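
The two-threshold routing described in the abstract above can be sketched roughly as follows. This is one reading of the abstract, assuming the classifier returns a single confidence score in [0, 1]: at or above the acceptance threshold the utterance is accepted, between the two thresholds the user is re-prompted, and below the rejection threshold the user is transferred to a human. The function name, return labels, and default threshold values are hypothetical.

```python
def route_utterance(confidence: float, accept: float = 0.7, reject: float = 0.3) -> str:
    """Decide what the dialog system does next for one classified utterance.

    The default thresholds are illustrative assumptions; the abstract only
    says that they are predetermined.
    """
    if not 0.0 <= reject <= accept <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= reject <= accept <= 1")
    if confidence >= accept:
        return "accept"             # understood: use the classification
    if confidence >= reject:
        return "reprompt"           # uncertain: ask the user again
    return "transfer_to_human"      # likely a task-specific utterance


for score in (0.9, 0.5, 0.1):
    print(score, route_utterance(score))
```

In every branch the utterance and its classification could still be logged, which matches the abstract's point that the collected data is later used for training.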
LANGUAGE MODEL TRAINED USING PREDICTED QUERIES FROM STATISTICAL MACHINE TRANSLATION

Publication number: 20140350931

Abstract: A Statistical Machine Translation (SMT) model is trained using pairs of sentences that include content obtained from one or more content sources (e.g. feed(s)) with corresponding queries that have been used to access the content. A query click graph may be used to assist in determining candidate pairs for the SMT training data. All or a portion of the candidate pairs may be used to train the SMT model. After training the SMT model using the SMT training data, the SMT model is applied to content to determine predicted queries that may be used to search for the content. The predicted queries are used to train a language model, such as a query language model. The query language model may be interpolated with other language models, such as a background language model, as well as a feed language model trained using the content used in determining the predicted queries.

Type: Application

Filed: May 24, 2013

Publication date: November 27, 2014

Applicant: Microsoft Corporation

Inventors: Michael Levit, Dilek Hakkani-Tur, Gokhan Tur
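
The interpolation of language models mentioned at the end of the abstract above is often done as a weighted sum of the component probabilities. The sketch below assumes simple linear interpolation over unigram models stored as dictionaries; the abstract does not specify the interpolation scheme, and the function name, the example vocabularies, and the weights are hypothetical.

```python
from typing import Dict, Mapping, Sequence


def interpolate(models: Sequence[Mapping[str, float]],
                weights: Sequence[float]) -> Dict[str, float]:
    """Linearly interpolate unigram models: P(w) = sum_i weight_i * P_i(w)."""
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    vocab = set().union(*models)  # every word seen by any component model
    return {
        word: sum(lam * model.get(word, 0.0) for model, lam in zip(models, weights))
        for word in vocab
    }


# Hypothetical toy models: a query LM and a background LM.
query_lm = {"weather": 0.6, "news": 0.4}
background_lm = {"weather": 0.2, "news": 0.3, "music": 0.5}
mixed = interpolate([query_lm, background_lm], [0.5, 0.5])
print(sorted(mixed.items()))
```

Because the weights sum to 1 and each component is a probability distribution, the interpolated model is also a distribution over the combined vocabulary.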
Multitask Learning for Spoken Language Understanding

Publication number: 20140343942

Abstract: Systems for improving or generating a spoken language understanding system using a multitask learning method for intent or call-type classification. The multitask learning method aims at training tasks in parallel while using a shared representation. A computing device automatically re-uses the existing labeled data from various applications, which are similar but may have different call-types, intents or intent distributions to improve the performance. An automated intent mapping algorithm operates across applications. In one aspect, active learning is employed to selectively sample the data to be re-used.

Type: Application

Filed: May 27, 2014

Publication date: November 20, 2014

Applicant: AT&T Intellectual Property II, L.P.

Inventor: Gokhan Tur
Apparatus and Method for Model Adaptation for Spoken Language Understanding

Publication number: 20140330565

Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.

Type: Application

Filed: May 20, 2014

Publication date: November 6, 2014

Applicant: AT&T Intellectual Property II, L.P.

Inventor: Gokhan Tur
PRESERVING PRIVACY IN NATURAL LANGUAGE DATABASES

Publication number: 20140278409

Abstract: An apparatus and a method for preserving privacy in natural language databases are provided. Natural language input may be received. At least one of sanitizing or anonymizing the natural language input may be performed to form a clean output. The clean output may be stored.

Type: Application

Filed: May 28, 2014

Publication date: September 18, 2014

Applicant: AT&T Intellectual Property II, L.P.

Inventors: Dilek Z. Hakkani-Tur, Yucel Saygin, Min Tang, Gokhan Tur
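
The abstract above describes sanitizing or anonymizing natural language input before it is stored, without saying how. As a minimal illustration only, the sketch below replaces e-mail addresses and long digit runs with placeholder tokens using regular expressions. The patterns, the placeholder tokens, and the function name are assumptions made for this example and are not the patent's method.

```python
import re

# Assumed patterns: a simple e-mail matcher and runs of four or more digits
# (for example account or card numbers). Real systems would need far more.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
LONG_NUMBER = re.compile(r"\d{4,}")


def sanitize(text: str) -> str:
    """Return a cleaned copy of `text` with sensitive spans replaced."""
    text = EMAIL.sub("<EMAIL>", text)
    return LONG_NUMBER.sub("<NUMBER>", text)


clean = sanitize("my account is 12345678 and my email is jo@example.com")
print(clean)
```

Only the cleaned output would then be written to the database, so the raw identifiers never reach storage.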
KERNEL DEEP CONVEX NETWORKS AND END-TO-END LEARNING

Publication number: 20140278424

Abstract: Data associated with spoken language may be obtained. An analysis of the obtained data may be initiated for understanding of the spoken language using a deep convex network that is integrated with a kernel trick. The resulting kernel deep convex network may also be constructed by stacking one shallow kernel network over another with concatenation of the output vector of the lower network with the input data vector. A probability associated with a slot that is associated with slot-filling may be determined, based on local, discriminative features that are extracted using the kernel deep convex network.

Type: Application

Filed: March 13, 2013

Publication date: September 18, 2014

Applicant: MICROSOFT CORPORATION

Inventors: Li Deng, Xiaodong He, Gokhan Tur, Dilek Hakkani-Tur
Answer determination for natural language questioning

Patent number: 8832064

Abstract: Open-domain question answering is the task of finding a concise answer to a natural language question using a large domain, such as the Internet. The use of a semantic role labeling approach to the extraction of the answers to an open domain factoid (Who/When/What/Where) natural language question that contains a predicate is described. Semantic role labeling identifies predicates and semantic argument phrases in the natural language question and the candidate sentences. When searching for an answer to a natural language question, the missing argument in the question is matched using semantic parses of the candidate answers. Such a technique may improve the accuracy of a question answering system and may decrease the length of answers for enabling voice interface to a question answering system.

Type: Grant

Filed: December 28, 2005

Date of Patent: September 9, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Svetlana Stenchikova, Gokhan Tur, Dilek Hakkani-Tur
Unsupervised and active learning in automatic speech recognition for call classification

Patent number: 8818808

Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.

Type: Grant

Filed: February 23, 2005

Date of Patent: August 26, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
EXPLOITING THE SEMANTIC WEB FOR UNSUPERVISED NATURAL LANGUAGE SEMANTIC PARSING

Publication number: 20140236575

Abstract: Structured web pages are accessed and parsed to obtain implicit annotation for natural language understanding tasks. Search queries that hit these structured web pages are automatically mined for information that is used to semantically annotate the queries. The automatically annotated queries may be used for automatically building statistical unsupervised slot filling models without using a semantic annotation guideline. For example, tags that are located on a structured web page that are associated with the search query may be used to annotate the query. The mined search queries may be filtered to create a set of queries that is in a form of a natural language query and/or remove queries that are difficult to parse. A natural language model may be trained using the resulting mined queries. Some queries may be set aside for testing and the model may be adapted using in-domain sentences that are not annotated.

Type: Application

Filed: February 21, 2013

Publication date: August 21, 2014

Applicant: MICROSOFT CORPORATION

Inventors: Gokhan Tur, Dilek Hakkani-Tur, Larry Heck, Minwoo Jeong, Ye-Yi Wang
EXPLOITING THE SEMANTIC WEB FOR UNSUPERVISED SPOKEN LANGUAGE UNDERSTANDING

Publication number: 20140236570

Abstract: An unsupervised training approach for Spoken Language Understanding (SLU) systems uses the structure of content sources (e.g. semantic knowledge graphs, relational databases, ...) to automatically specify a semantic representation for SLU. The semantic representation is used when creating entity-relation patterns that are used to mine natural language (NL) examples (e.g. NL surface forms from the web and search query click logs). The structure of the content source (e.g. semantic graph) is enriched with the mined NL examples. The NL examples and patterns may be used to automatically train SLU systems in an unsupervised manner that covers the knowledge represented in the structured content.

Type: Application

Filed: February 18, 2013

Publication date: August 21, 2014

Applicant: MICROSOFT CORPORATION

Inventors: Larry Heck, Dilek Hakkani-Tur, Gokhan Tur
SYSTEM AND METHOD FOR USING SEMANTIC AND SYNTACTIC GRAPHS FOR UTTERANCE CLASSIFICATION

Publication number: 20140229179

Abstract: Disclosed herein is a system, method and computer readable medium storing instructions related to semantic and syntactic information in a language understanding system. The method embodiment of the invention is a method for classifying utterances during a natural language dialog between a human and a computing device. The method comprises receiving a user utterance, generating a semantic and syntactic graph associated with the received utterance, extracting all n-grams as features from the generated semantic and syntactic graph, and classifying the utterance. Classifying the utterance may be performed in any number of ways, such as using the extracted n-grams, syntactic and semantic graphs, or writing rules.

Type: Application

Filed: April 15, 2014

Publication date: August 14, 2014

Applicant: AT&T Intellectual Property II, L.P.

Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
System and Method of Providing an Automated Data-Collection in Spoken Dialog Systems

Publication number: 20140222426

Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.

Type: Application

Filed: April 7, 2014

Publication date: August 7, 2014

Applicant: AT&T Intellectual Property II, L.P.

Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
Method and Apparatus for Responding to an Inquiry

Publication number: 20140205985

Abstract: Disclosed is a method and apparatus for responding to an inquiry from a client via a network. The method and apparatus receive the inquiry from a client via a network. Based on the inquiry, question-answer pairs retrieved from the network are analyzed to determine a response to the inquiry. The QA pairs are not predefined. As a result, the QA pairs have to be analyzed in order to determine whether they are responsive to a particular inquiry. Questions of the QA pairs may be repetitive and, without more, will not be useful in determining whether their corresponding answer responds to an inquiry.

Type: Application

Filed: March 19, 2014

Publication date: July 24, 2014

Applicant: AT&T Intellectual Property II, L.P.

Inventors: Junlan Feng, Mazin Gilbert, Dilek Hakkani-Tur, Gokhan Tur
NAMED ENTITY VARIATIONS FOR MULTIMODAL UNDERSTANDING SYSTEMS

Publication number: 20140180676

Abstract: Click logs are automatically mined to assist in discovering candidate variations for named entities. The named entities may be obtained from one or more sources and include an initial list of named entities. A search may be performed within one or more search engines to determine common phrases that are used to identify the named entity in addition to the named entity initially included in the named entity list. Click logs associated with results of past searches are automatically mined to discover what phrases determined from the searches are candidate variations for the named entity. The candidate variations are scored to assist in determining the variations to include within an understanding model. The variations may also be used when delivering responses and displayed output in the SLU system. For example, instead of using the listed named entity, a popular and/or shortened name may be used by the system.

Type: Application

Filed: December 21, 2012

Publication date: June 26, 2014

Applicant: Microsoft Corporation

Inventors: Dustin Hillard, Fethiye Asli Celikyilmaz, Dilek Hakkani-Tur, Rukmini Iyer, Gokhan Tur
PROBABILITY-BASED STATE MODIFICATION FOR QUERY DIALOGUES

Publication number: 20140172899

Abstract: A device may facilitate a query dialog involving queries that successively modify a query state. However, fulfilling such queries in the context of possible query domains, query intents, and contextual meanings of query terms may be difficult. Presented herein are techniques for modifying a query state in view of a query by utilizing a set of query state modifications, each representing a modification of the query state possibly intended by the user while formulating the query (e.g., adding, substituting, or removing query terms; changing the query domain or query intent; and navigating within a hierarchy of saved query states). Upon receiving a query, an embodiment may calculate the probability of the query connoting each query state modification (e.g., using a Bayesian classifier), and parse the query according to a query state modification having a high probability (e.g., mapping respective query terms to query slots within the current query intent).

Type: Application

Filed: December 14, 2012

Publication date: June 19, 2014

Applicant: Microsoft Corporation

Inventors: Dilek Hakkani-Tur, Gokhan Tur, Larry Heck, Ashley Fidler, Fethiye Asli Celikyilmaz
Preserving privacy in natural language databases

Patent number: 8751439

Abstract: An apparatus and a method for preserving privacy in natural language databases are provided. Natural language input may be received. At least one of sanitizing or anonymizing the natural language input may be performed to form a clean output. The clean output may be stored.

Type: Grant

Filed: June 25, 2013

Date of Patent: June 10, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Dilek Z. Hakkani-Tur, Yucel Saygin, Min Tang, Gokhan Tur
Multitask learning for spoken language understanding

Patent number: 8738379

Abstract: Systems for improving or generating a spoken language understanding system using a multitask learning method for intent or call-type classification. The multitask learning method aims at training tasks in parallel while using a shared representation. A computing device automatically re-uses the existing labeled data from various applications, which are similar but may have different call-types, intents or intent distributions to improve the performance. An automated intent mapping algorithm operates across applications. In one aspect, active learning is employed to selectively sample the data to be re-used.

Type: Grant

Filed: December 28, 2009

Date of Patent: May 27, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventor: Gokhan Tur
Apparatus and method for model adaptation for spoken language understanding

Patent number: 8731924

Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.

Type: Grant

Filed: August 8, 2011

Date of Patent: May 20, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventor: Gokhan Tur
Method and apparatus for responding to an inquiry

Patent number: 8719010

Abstract: Disclosed is a method and apparatus for responding to an inquiry from a client via a network. The method and apparatus receive the inquiry from a client via a network. Based on the inquiry, question-answer pairs retrieved from the network are analyzed to determine a response to the inquiry. The QA pairs are not predefined. As a result, the QA pairs have to be analyzed in order to determine whether they are responsive to a particular inquiry. Questions of the QA pairs may be repetitive and, without more, will not be useful in determining whether their corresponding answer responds to an inquiry.

Type: Grant

Filed: March 1, 2013

Date of Patent: May 6, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Junlan Feng, Mazin Gilbert, Dilek Hakkani-Tur, Gokhan Tur