Patents by Inventor Dilek Z. Hakkani-Tur

Dilek Z. Hakkani-Tur has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230401445
    Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long-short term memory (RNN-LSTM) for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The joint multi-domain recurrent neural network (JRNN) can leverage semantic intents (such as, finding or identifying, e.g., a domain specific goal) and slots (such as, dates, times, locations, subjects, etc.) across multiple domains.
    Type: Application
    Filed: August 29, 2023
    Publication date: December 14, 2023
    Inventors: Dilek Z. Hakkani-Tur, Asli Celikyilmaz, Yun-Nung Chen, Li Deng, Jianfeng Gao, Gokhan Tur, Ye-Yi Wang
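The joint architecture in the entry above is described in prose only; the idea of one shared encoder feeding slot, intent, and domain heads can be sketched minimally in PyTorch. Everything below (layer sizes, mean pooling, class names) is an illustrative assumption, not the patented implementation.

```python
import torch
import torch.nn as nn

class JointSLU(nn.Module):
    """One bi-directional LSTM encoder, three heads: per-token slot tags
    plus per-query intent and domain, trained jointly across domains."""
    def __init__(self, vocab_size=1000, dim=64, n_slots=20, n_intents=10, n_domains=4):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.encoder = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.slot_head = nn.Linear(2 * dim, n_slots)      # sequence tagging
        self.intent_head = nn.Linear(2 * dim, n_intents)  # query-level intent
        self.domain_head = nn.Linear(2 * dim, n_domains)  # query-level domain

    def forward(self, tokens):
        states, _ = self.encoder(self.emb(tokens))  # (batch, seq, 2*dim)
        pooled = states.mean(dim=1)                 # crude whole-query summary
        return (self.slot_head(states),
                self.intent_head(pooled),
                self.domain_head(pooled))

model = JointSLU()
batch = torch.randint(0, 1000, (2, 7))  # two toy 7-token queries
slot_logits, intent_logits, domain_logits = model(batch)
print(slot_logits.shape, intent_logits.shape, domain_logits.shape)
```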
  • Patent number: 11783173
    Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long-short term memory (RNN-LSTM) for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The joint multi-domain recurrent neural network (JRNN) can leverage semantic intents (such as, finding or identifying, e.g., a domain specific goal) and slots (such as, dates, times, locations, subjects, etc.) across multiple domains.
    Type: Grant
    Filed: August 4, 2016
    Date of Patent: October 10, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dilek Z. Hakkani-Tur, Asli Celikyilmaz, Yun-Nung Chen, Li Deng, Jianfeng Gao, Gokhan Tur, Ye-Yi Wang
  • Patent number: 11449744
    Abstract: A processing unit can extract salient semantics to model knowledge carryover, from one turn to the next, in multi-turn conversations. The architecture described herein can use end-to-end memory networks to encode inputs, e.g., utterances, with intents and slots, which can be stored as embeddings in memory, and in decoding the architecture can exploit latent contextual information from memory, e.g., demographic context, visual context, semantic context, etc., e.g., via an attention model, to leverage previously stored semantics for semantic parsing, e.g., for joint intent prediction and slot tagging. In examples, the architecture is configured to build an end-to-end memory network model for contextual, e.g., multi-turn, language understanding; to apply the end-to-end memory network model to multiple turns of conversational input; and to fill slots for output of contextual, e.g., multi-turn, language understanding of the conversational input.
    Type: Grant
    Filed: August 4, 2016
    Date of Patent: September 20, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Nung Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Li Deng, Jianfeng Gao
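The knowledge-carryover mechanism in the entry above boils down to attention over embeddings of earlier turns. A minimal NumPy sketch, with random vectors standing in for real utterance encoders:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 32
# Stand-ins for encoded prior turns ("memory") and the current utterance.
memory = rng.normal(size=(3, dim))   # three earlier turns, embedded
current = rng.normal(size=dim)       # encoding of the new utterance

# Attention over memory: which earlier turns matter for this turn?
scores = memory @ current
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Latent context carried over from the conversation history.
carryover = weights @ memory

# A decoder for joint intent prediction and slot tagging would consume
# the utterance encoding together with the carried-over context.
decoder_input = np.concatenate([current, carryover])
print(weights.round(3), decoder_input.shape)
```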
  • Patent number: 10901500
    Abstract: Improving accuracy in understanding and/or resolving references to visual elements in a visual context associated with a computerized conversational system is described. Techniques described herein leverage gaze input with gestures and/or speech input to improve spoken language understanding in computerized conversational systems. Leveraging gaze input and speech input improves spoken language understanding in conversational systems by improving the accuracy with which the system can resolve references, or interpret a user's intent, with respect to visual elements in a visual context. In at least one example, the techniques herein describe tracking gaze to generate gaze input, recognizing speech input, and extracting gaze features and lexical features from the user input. Based at least in part on the gaze features and lexical features, user utterances directed to visual elements in a visual context can be resolved.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: January 26, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Anna Prokofieva, Fethiye Asli Celikyilmaz, Dilek Z. Hakkani-Tur, Larry Heck, Malcolm Slaney
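The gaze-plus-speech fusion described above can be illustrated with a toy scorer that combines a gaze feature (dwell time on each element) with a lexical feature (word overlap with the element's title). The weights, element names, and dwell times here are invented for the example:

```python
def resolve_reference(utterance, elements, gaze_dwell_ms, w_gaze=0.6, w_lex=0.4):
    """Score each on-screen element by fused gaze and lexical features."""
    words = set(utterance.lower().split())
    total_dwell = sum(gaze_dwell_ms.values()) or 1
    best, best_score = None, float("-inf")
    for name in elements:
        gaze_feat = gaze_dwell_ms.get(name, 0) / total_dwell
        title_words = set(name.lower().split())
        lex_feat = len(words & title_words) / max(len(title_words), 1)
        score = w_gaze * gaze_feat + w_lex * lex_feat
        if score > best_score:
            best, best_score = name, score
    return best

elements = ["play movie", "open settings", "movie trailers"]
# "that one" carries no lexical cue, so gaze resolves the reference.
print(resolve_reference("play that one", elements,
                        gaze_dwell_ms={"movie trailers": 900, "open settings": 100}))
```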
  • Patent number: 10839165
    Abstract: Systems and methods for determining knowledge-guided information for a recurrent neural network (RNN) to guide the RNN in semantic tagging of an input phrase are presented. A knowledge encoding module of a Knowledge-Guided Structural Attention Process (K-SAP) receives an input phrase and, in conjunction with additional sub-components or cooperative components, generates a knowledge-guided vector that is provided with the input phrase to the RNN for linguistic semantic tagging. Generating the knowledge-guided vector comprises at least parsing the input phrase and generating a corresponding hierarchical linguistic structure comprising one or more discrete sub-structures. The sub-structures may be encoded into vectors along with attention weighting identifying those sub-structures that have greater importance in determining the semantic meaning of the input phrase.
    Type: Grant
    Filed: June 18, 2019
    Date of Patent: November 17, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Nung Vivian Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, Jianfeng Gao, Li Deng
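A rough sketch of the K-SAP idea above, assuming random vectors in place of real sub-structure encoders: attention weights the parsed sub-structures, and the resulting knowledge-guided vector rides along with every token into the RNN tagger.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 16
# Stand-ins for encoded sub-structures of a parsed input phrase, e.g. the
# head word plus each dependent sub-tree from a dependency parse.
substructures = rng.normal(size=(4, dim))
sentence_vec = rng.normal(size=dim)   # encoding of the whole phrase

# Attention weighting: which sub-structures matter most for tagging?
scores = substructures @ sentence_vec
attn = np.exp(scores - scores.max())
attn /= attn.sum()

# The knowledge-guided vector summarizes the weighted sub-structures.
knowledge_vec = attn @ substructures

# It is provided with the input phrase: here, appended to each token.
tokens = rng.normal(size=(7, dim))    # token embeddings for the phrase
rnn_inputs = np.concatenate([tokens, np.tile(knowledge_vec, (7, 1))], axis=1)
print(attn.round(3), rnn_inputs.shape)  # (7, 32): tokens carry the knowledge
```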
  • Publication number: 20190391640
    Abstract: Improving accuracy in understanding and/or resolving references to visual elements in a visual context associated with a computerized conversational system is described. Techniques described herein leverage gaze input with gestures and/or speech input to improve spoken language understanding in computerized conversational systems. Leveraging gaze input and speech input improves spoken language understanding in conversational systems by improving the accuracy with which the system can resolve references, or interpret a user's intent, with respect to visual elements in a visual context. In at least one example, the techniques herein describe tracking gaze to generate gaze input, recognizing speech input, and extracting gaze features and lexical features from the user input. Based at least in part on the gaze features and lexical features, user utterances directed to visual elements in a visual context can be resolved.
    Type: Application
    Filed: April 30, 2019
    Publication date: December 26, 2019
    Inventors: Anna Prokofieva, Fethiye Asli Celikyilmaz, Dilek Z. Hakkani-Tur, Larry Heck, Malcolm Slaney
  • Publication number: 20190303440
    Abstract: Systems and methods for determining knowledge-guided information for a recurrent neural network (RNN) to guide the RNN in semantic tagging of an input phrase are presented. A knowledge encoding module of a Knowledge-Guided Structural Attention Process (K-SAP) receives an input phrase and, in conjunction with additional sub-components or cooperative components, generates a knowledge-guided vector that is provided with the input phrase to the RNN for linguistic semantic tagging. Generating the knowledge-guided vector comprises at least parsing the input phrase and generating a corresponding hierarchical linguistic structure comprising one or more discrete sub-structures. The sub-structures may be encoded into vectors along with attention weighting identifying those sub-structures that have greater importance in determining the semantic meaning of the input phrase.
    Type: Application
    Filed: June 18, 2019
    Publication date: October 3, 2019
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yun-Nung Vivian Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, Jianfeng Gao, Li Deng
  • Patent number: 10366163
    Abstract: Systems and methods for determining knowledge-guided information for a recurrent neural network (RNN) to guide the RNN in semantic tagging of an input phrase are presented. A knowledge encoding module of a Knowledge-Guided Structural Attention Process (K-SAP) receives an input phrase and, in conjunction with additional sub-components or cooperative components, generates a knowledge-guided vector that is provided with the input phrase to the RNN for linguistic semantic tagging. Generating the knowledge-guided vector comprises at least parsing the input phrase and generating a corresponding hierarchical linguistic structure comprising one or more discrete sub-structures. The sub-structures may be encoded into vectors along with attention weighting identifying those sub-structures that have greater importance in determining the semantic meaning of the input phrase.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: July 30, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Nung Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, Jianfeng Gao, Li Deng
  • Patent number: 10317992
    Abstract: Improving accuracy in understanding and/or resolving references to visual elements in a visual context associated with a computerized conversational system is described. Techniques described herein leverage gaze input with gestures and/or speech input to improve spoken language understanding in computerized conversational systems. Leveraging gaze input and speech input improves spoken language understanding in conversational systems by improving the accuracy with which the system can resolve references, or interpret a user's intent, with respect to visual elements in a visual context. In at least one example, the techniques herein describe tracking gaze to generate gaze input, recognizing speech input, and extracting gaze features and lexical features from the user input. Based at least in part on the gaze features and lexical features, user utterances directed to visual elements in a visual context can be resolved.
    Type: Grant
    Filed: September 25, 2014
    Date of Patent: June 11, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Anna Prokofieva, Fethiye Asli Celikyilmaz, Dilek Z. Hakkani-Tur, Larry Heck, Malcolm Slaney
  • Patent number: 10199039
    Abstract: A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: February 5, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Lee Begeja, Giuseppe Di Fabbrizio, David Crawford Gibbon, Dilek Z. Hakkani-Tur, Zhu Liu, Bernard S. Renger, Behzad Shahraray, Gokhan Tur
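The library described above is, at heart, a queryable store of utterances keyed by collection phase. A toy version with Python's built-in sqlite3 (the schema and sample rows are invented here, not taken from the patent):

```python
import sqlite3

# In-memory stand-in for the library database of reusable components.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE utterances (
    id INTEGER PRIMARY KEY, phase TEXT, text TEXT, call_type TEXT)""")
con.executemany(
    "INSERT INTO utterances (phase, text, call_type) VALUES (?, ?, ?)",
    [("pilot", "I want to check my balance", "billing"),
     ("pilot", "talk to an agent please", "agent"),
     ("deployment", "cancel my subscription", "cancellation")])

# Retrieve the reusable components collected during one collection phase.
for text, call_type in con.execute(
        "SELECT text, call_type FROM utterances WHERE phase = ?", ("pilot",)):
    print(call_type, "|", text)
```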
  • Patent number: 10140321
    Abstract: An apparatus and a method for preserving privacy in natural language databases are provided. Natural language input may be received. At least one of sanitizing or anonymizing the natural language input may be performed to form a clean output. The clean output may be stored.
    Type: Grant
    Filed: May 28, 2014
    Date of Patent: November 27, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Dilek Z. Hakkani-Tur, Yucel Saygin, Min Tang, Gokhan Tur
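A minimal illustration of sanitizing/anonymizing natural language input to form a clean output, using regular-expression placeholders; the pattern set below is a small invented sample, not the patent's method:

```python
import re

# Sanitizing removes sensitive spans; anonymizing replaces identity-bearing
# tokens with consistent placeholders so the data stays useful for training.
PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<SSN>"),
    (re.compile(r"\b\d{13,16}\b"), "<CARD_NUMBER>"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "<EMAIL>"),
]

def clean(utterance):
    for pattern, placeholder in PATTERNS:
        utterance = pattern.sub(placeholder, utterance)
    return utterance  # the clean output that would be stored

print(clean("my card 4111111111111111 is billed to jane@example.com"))
# -> my card <CARD_NUMBER> is billed to <EMAIL>
```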
  • Patent number: 10073840
    Abstract: A relation detection model training solution mines freely available resources from the World Wide Web to train a relation detection model for use during linguistic processing. The relation detection model training system searches the web for pairs of entities extracted from a knowledge graph that are connected by a specific relation. Performance is enhanced by clipping search snippets to extract patterns that connect the two entities in a dependency tree and by refining the annotations of the relations according to other related entities in the knowledge graph. The relation detection model training solution scales to other domains and languages, pushing the burden from natural language semantic parsing to knowledge base population. The relation detection model training solution exhibits performance comparable to supervised solutions, which require design, collection, and manual labeling of natural language data.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: September 11, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dilek Z. Hakkani-Tur, Gokhan Tur, Larry Paul Heck
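The snippet-clipping step above can be shown with toy data: given entity pairs that a knowledge graph connects with some relation, and hypothetical search snippets mentioning both entities, clip the text between them as a candidate pattern for that relation.

```python
from collections import Counter

# Hypothetical entity pairs for a "directed_by" relation, with snippets
# a web search might return; none of this data comes from the patent.
pairs = [("Titanic", "James Cameron"), ("Avatar", "James Cameron")]
snippets = [
    "Titanic was directed by James Cameron in 1997.",
    "Avatar, the epic film directed by James Cameron, broke records.",
]

# Clip each snippet to the pattern connecting the two entities.
patterns = Counter()
for (e1, e2), snippet in zip(pairs, snippets):
    if e1 in snippet and e2 in snippet:
        start = snippet.index(e1) + len(e1)
        end = snippet.index(e2)
        patterns[snippet[start:end].strip(" ,")] += 1

print(patterns.most_common())
# e.g. [('was directed by', 1), ('the epic film directed by', 1)]
```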
  • Patent number: 9997157
    Abstract: Systems and methods are provided for improving language models for speech recognition by personalizing knowledge sources utilized by the language models to specific users or user-population characteristics. A knowledge source, such as a knowledge graph, is personalized for a particular user by mapping entities or user actions from usage history for the user, such as query logs, to the knowledge source. The personalized knowledge source may be used to build a personal language model by training a language model with queries corresponding to entities or entity pairs that appear in usage history. In some embodiments, a personalized knowledge source for a specific user can be extended based on personalized knowledge sources of similar users.
    Type: Grant
    Filed: May 16, 2014
    Date of Patent: June 12, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Murat Akbacak, Dilek Z. Hakkani-Tur, Gokhan Tur, Larry P. Heck, Benoit Dumoulin
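A toy rendering of the personalization step above: filter a hypothetical query log down to queries that mention knowledge-graph entities, then train simple bigram counts on what survives (the log, entities, and counts are all invented for illustration).

```python
from collections import Counter

# Hypothetical user query log and knowledge-graph entities of one domain.
query_log = ["play sting songs", "sting tour dates", "call mom"]
kg_entities = {"sting": "musician", "madonna": "musician"}

# Personalize: keep only queries whose words map to graph entities,
# then train a simple bigram count model on them.
personal_queries = [q for q in query_log
                    if any(e in q.split() for e in kg_entities)]
bigrams = Counter()
for q in personal_queries:
    words = ["<s>"] + q.split() + ["</s>"]
    bigrams.update(zip(words, words[1:]))

print(bigrams.most_common(3))  # counts behind the personal language model
```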
  • Publication number: 20180067923
    Abstract: Systems and methods for determining knowledge-guided information for a recurrent neural network (RNN) to guide the RNN in semantic tagging of an input phrase are presented. A knowledge encoding module of a Knowledge-Guided Structural Attention Process (K-SAP) receives an input phrase and, in conjunction with additional sub-components or cooperative components, generates a knowledge-guided vector that is provided with the input phrase to the RNN for linguistic semantic tagging. Generating the knowledge-guided vector comprises at least parsing the input phrase and generating a corresponding hierarchical linguistic structure comprising one or more discrete sub-structures. The sub-structures may be encoded into vectors along with attention weighting identifying those sub-structures that have greater importance in determining the semantic meaning of the input phrase.
    Type: Application
    Filed: September 7, 2016
    Publication date: March 8, 2018
    Inventors: Yun-Nung Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, Jianfeng Gao, Li Deng
  • Patent number: 9905223
    Abstract: Disclosed herein is a system, method, and computer-readable medium storing instructions related to semantic and syntactic information in a language understanding system. The method embodiment classifies utterances during a natural language dialog between a human and a computing device. The method comprises receiving a user utterance; generating a semantic and syntactic graph associated with the received utterance; extracting all n-grams as features from the generated semantic and syntactic graph; and classifying the utterance. Classifying the utterance may be performed any number of ways, such as using the extracted n-grams, the syntactic and semantic graph, or manually written rules.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: February 27, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
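The feature-extraction step above ("extract all n-grams from the graph") can be sketched on a toy graph flattened to one annotated path; the words and tags below are invented:

```python
from collections import Counter

# A toy semantic-syntactic graph flattened into one path of annotated
# nodes: (surface word, syntactic/semantic tag).
graph_path = [("i", "PRP"), ("want", "VB:Desire"), ("to", "TO"),
              ("check", "VB:Verify"), ("balance", "NN:Account")]

def ngram_features(path, n_max=3):
    """Extract all tag n-grams from the graph as classifier features."""
    tags = [tag for _, tag in path]
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tags) - n + 1):
            feats["_".join(tags[i:i + n])] += 1
    return feats

features = ngram_features(graph_path)
print(features.most_common(4))
# These counts would feed a standard utterance classifier.
```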
  • Publication number: 20170372199
    Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long-short term memory (RNN-LSTM) for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The joint multi-domain recurrent neural network (JRNN) can leverage semantic intents (such as, finding or identifying, e.g., a domain specific goal) and slots (such as, dates, times, locations, subjects, etc.) across multiple domains.
    Type: Application
    Filed: August 4, 2016
    Publication date: December 28, 2017
    Inventors: Dilek Z. Hakkani-Tur, Asli Celikyilmaz, Yun-Nung Chen, Li Deng, Jianfeng Gao, Gokhan Tur, Ye-Yi Wang
  • Publication number: 20170372200
    Abstract: A processing unit can extract salient semantics to model knowledge carryover, from one turn to the next, in multi-turn conversations. The architecture described herein can use end-to-end memory networks to encode inputs, e.g., utterances, with intents and slots, which can be stored as embeddings in memory, and in decoding the architecture can exploit latent contextual information from memory, e.g., demographic context, visual context, semantic context, etc., e.g., via an attention model, to leverage previously stored semantics for semantic parsing, e.g., for joint intent prediction and slot tagging. In examples, the architecture is configured to build an end-to-end memory network model for contextual, e.g., multi-turn, language understanding; to apply the end-to-end memory network model to multiple turns of conversational input; and to fill slots for output of contextual, e.g., multi-turn, language understanding of the conversational input.
    Type: Application
    Filed: August 4, 2016
    Publication date: December 28, 2017
    Inventors: Yun-Nung Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Li Deng, Jianfeng Gao
  • Patent number: 9679558
    Abstract: Systems and methods are provided for training language models using in-domain-like data collected automatically from one or more data sources. The data sources (such as text data or user-interactional data) are mined for specific types of data, including data related to style, content, and probability of relevance, which are then used for language model training. In one embodiment, a language model is trained from features extracted from a knowledge graph modified into a probabilistic graph, where entity popularities are represented and the popularity information is obtained from data sources related to the knowledge. Embodiments of language models trained from this data are particularly suitable for domain-specific conversational understanding tasks where natural language is used, such as user interaction with a game console or a personal assistant application on personal device.
    Type: Grant
    Filed: May 15, 2014
    Date of Patent: June 13, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Murat Akbacak, Dilek Z. Hakkani-Tur, Gokhan Tur, Larry P. Heck, Benoit Dumoulin
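A minimal sketch of training from a popularity-weighted knowledge graph, as the entry above describes, with invented popularity counts and carrier phrases: each generated phrase contributes n-gram mass in proportion to its entity's popularity.

```python
from collections import Counter

# Hypothetical entity popularities mined from usage data and attached to
# a knowledge graph of movies, plus carrier phrases for the domain.
popularity = {"Inception": 900, "Oblivion": 100}
carriers = ["play {}", "show me reviews for {}"]

# Weight each generated training phrase by its entity's popularity so the
# resulting n-gram counts reflect what users actually ask for.
ngrams = Counter()
for entity, weight in popularity.items():
    for carrier in carriers:
        words = carrier.format(entity).lower().split()
        for bigram in zip(words, words[1:]):
            ngrams[bigram] += weight

print(ngrams.most_common(3))  # popular entities dominate the model
```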
  • Patent number: 9666182
    Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on the utterances that lack a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances without a corresponding manual transcription are intelligently selected and manually transcribed. Both the automatically transcribed utterances and those having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model for call classification is trained from the mined audio data.
    Type: Grant
    Filed: October 5, 2015
    Date of Patent: May 30, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
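The intelligent-selection step above is, in its simplest active-learning form, a matter of ranking automatic transcriptions by confidence and spending the manual-transcription budget on the least confident ones. A toy sketch with made-up ASR output (the patent does not specify this particular criterion):

```python
# Hypothetical ASR output: (utterance_id, hypothesis, confidence).
asr_output = [
    ("u1", "i want to pay my bill", 0.95),
    ("u2", "uh cancel um the thing", 0.41),
    ("u3", "agent please", 0.88),
    ("u4", "what is my balance", 0.57),
]

# Send the least-confident automatic transcriptions to a human;
# keep the confident ones as machine-labeled training data.
BUDGET = 2
by_confidence = sorted(asr_output, key=lambda row: row[2])
to_transcribe = by_confidence[:BUDGET]
auto_labeled = by_confidence[BUDGET:]

print("manual:", [u for u, _, _ in to_transcribe])  # ['u2', 'u4']
print("auto:  ", [u for u, _, _ in auto_labeled])   # ['u3', 'u1']
```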
  • Patent number: 9412363
    Abstract: A model-based approach for on-screen item selection and disambiguation is provided. An utterance may be received by a computing device in response to a display of a list of items for selection on a display screen. A disambiguation model may then be applied to the utterance. The disambiguation model may be utilized to determine whether the utterance is directed to at least one of the displayed items, extract referential features from the utterance, and identify an item from the list corresponding to the utterance, based on the extracted referential features. The computing device may then perform an action which includes selecting the identified item associated with the utterance.
    Type: Grant
    Filed: March 3, 2014
    Date of Patent: August 9, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ruhi Sarikaya, Fethiye Asli Celikyilmaz, Zhaleh Feizollahi, Larry Paul Heck, Dilek Z. Hakkani-Tur
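The referential-feature idea in the last entry can be illustrated with two cue types, ordinals and lexical overlap; this is a deliberately tiny stand-in for the patent's trained disambiguation model, with item titles invented for the example.

```python
ORDINALS = {"first": 0, "second": 1, "third": 2, "last": -1}

def select_item(utterance, items):
    """Pick a displayed item using ordinal and lexical cues in the utterance."""
    words = utterance.lower().split()
    # Referential feature: an ordinal such as "the second one".
    for word in words:
        if word in ORDINALS:
            return items[ORDINALS[word]]
    # Lexical feature: overlap between the utterance and item titles.
    def overlap(item):
        return len(set(words) & set(item.lower().split()))
    best = max(items, key=overlap)
    return best if overlap(best) > 0 else None  # None -> ask the user again

items = ["The Martian", "Gravity", "Interstellar"]
print(select_item("the second one", items))    # Gravity
print(select_item("play the martian", items))  # The Martian
```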