Patents by Inventor Gokhan Tur

Gokhan Tur is named as an inventor on the patent filings listed below. The listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250028321
    Abstract: Described herein is a system for tracking objects and performing dynamic entity resolution using image data. For example, the system may build an environment map and populate the map with objects present in the environment. As the device moves about the environment, it may capture image data and, based on its position and/or configuration of its components, may determine updated locations of objects that move in the environment. Upon receiving a query from a user, based on the location of the objects relative to the device/user, the system can interpret gestures and voice commands to infer which object is specified by the voice command. To build the environment map, the system performs object detection to generate bounding boxes associated with an object, then clusters the bounding boxes into a three-dimensional (3D) object associated with 3D coordinates. As the system tracks the object using the 3D coordinates while maintaining two-dimensional (2D) information (e.g.
    Type: Application
    Filed: October 7, 2024
    Publication date: January 23, 2025
    Inventors: Gunnar Atli Sigurdsson, Robinson Piramuthu, Gokhan Tur
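A rough sketch of the clustering step described in the abstract above. This is an editorial illustration, not the patented implementation: it assumes each 2D detection has already been back-projected to a 3D center point (e.g., from depth and camera pose), and it simply groups nearby points into one 3D object whose coordinates are the mean of its member detections.

```python
import math

def cluster_detections(points_3d, radius=0.5):
    """Greedily group back-projected 3D detection centers that lie within
    `radius` meters of an existing cluster; each cluster stands in for one
    tracked 3D object."""
    clusters = []  # each cluster: {"points": [...], "center": (x, y, z)}
    for p in points_3d:
        match = next((c for c in clusters if math.dist(p, c["center"]) <= radius), None)
        if match is None:
            clusters.append({"points": [p], "center": p})
        else:
            match["points"].append(p)
            n = len(match["points"])
            match["center"] = tuple(sum(q[i] for q in match["points"]) / n for i in range(3))
    return clusters

# Toy usage: two detections of the same mug and one detection of a remote.
detections = [(1.00, 0.50, 0.75), (1.03, 0.52, 0.74), (3.10, 0.20, 0.40)]
print([c["center"] for c in cluster_detections(detections)])
```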
  • Patent number: 12205577
    Abstract: Techniques for rendering visual content, in response to one or more utterances, are described. A device receives one or more utterances that define a parameter(s) for desired output content. A system (or the device) identifies natural language data corresponding to the desired content, and uses natural language generation processes to update the natural language data based on the parameter(s). The system (or the device) then generates an image based on the updated natural language data. The system (or the device) also generates video data of an avatar. The device displays the image and the avatar, and synchronizes movements of the avatar with output of synthesized speech of the updated natural language data. The device may also display subtitles of the updated natural language data, and cause a word of the subtitles to be emphasized when synthesized speech of the word is being output.
    Type: Grant
    Filed: March 30, 2021
    Date of Patent: January 21, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Taehwan Kim, Sanqiang Zhao, Robinson Piramuthu, Seokhwan Kim, Yang Liu, Gokhan Tur, Eshan Bhatnagar
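A minimal sketch of the subtitle-emphasis idea from the entry above. It is illustrative only and assumes the speech synthesizer returns per-word start/end timestamps; the `word_timings` structure and the asterisk emphasis are hypothetical choices, not the patented rendering method.

```python
def emphasized_subtitles(word_timings, playback_time):
    """Render a subtitle line in which the word currently being spoken is
    emphasized. `word_timings` is a list of (word, start_sec, end_sec)
    tuples assumed to come from the TTS engine."""
    rendered = []
    for word, start, end in word_timings:
        rendered.append(f"*{word}*" if start <= playback_time < end else word)
    return " ".join(rendered)

timings = [("Here", 0.0, 0.3), ("is", 0.3, 0.5), ("your", 0.5, 0.8), ("image", 0.8, 1.3)]
print(emphasized_subtitles(timings, playback_time=0.9))  # -> Here is your *image*
```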
  • Patent number: 12117838
    Abstract: Described herein is a system for tracking objects and performing dynamic entity resolution using image data. For example, the system may build an environment map and populate the map with objects present in the environment. As the device moves about the environment, it may capture image data and, based on its position and/or configuration of its components, may determine updated locations of objects that move in the environment. Upon receiving a query from a user, based on the location of the objects relative to the device/user, the system can interpret gestures and voice commands to infer which object is specified by the voice command. To build the environment map, the system performs object detection to generate bounding boxes associated with an object, then clusters the bounding boxes into a three-dimensional (3D) object associated with 3D coordinates. As the system tracks the object using the 3D coordinates while maintaining two-dimensional (2D) information (e.g.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: October 15, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Gunnar Atli Sigurdsson, Robinson Piramuthu, Gokhan Tur
  • Patent number: 12008985
    Abstract: Devices and techniques are generally described for learning personalized responses to declarative natural language inputs. In various examples, a first natural language input may be received. The first natural language input may correspond to intent data corresponding to a declarative user input. In some examples, a dialog session may be initiated with the first user. An action intended by the first user for the first natural language input may be determined based at least in part on the dialog session. In various examples, first data representing the action may be stored in association with second data representing a state described by at least a portion of the first natural language input.
    Type: Grant
    Filed: June 22, 2020
    Date of Patent: June 11, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Qiaozi Gao, Divyanshu Brijmohan Verma, Govindarajan Sundaram Thattai, Qing Ping, Joel Joseph Chengottusseriyil, Ivan Vitomir Stojanovic, Feiyang Niu, Gokhan Tur, Charles J Allen
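The learning loop described in the entry above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the "dialog session" is reduced to a single clarifying prompt, and the state-to-action store is an in-memory dictionary rather than the persistent storage the patent describes.

```python
class DeclarativeMemory:
    """Maps a declared state (e.g., "I'm cold") to the action the user
    intended (e.g., "turn on the heater"), learned via a clarifying dialog."""
    def __init__(self):
        self.state_to_action = {}

    def handle(self, utterance, ask_user):
        action = self.state_to_action.get(utterance)
        if action is None:
            # Declarative input with no known action: start a dialog session.
            action = ask_user(f'You said "{utterance}". What should I do when you say that?')
            self.state_to_action[utterance] = action
        return action

memory = DeclarativeMemory()
# First time the assistant asks; afterwards the learned action is reused.
print(memory.handle("I'm cold", ask_user=lambda prompt: "turn on the heater"))
print(memory.handle("I'm cold", ask_user=lambda prompt: None))
```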
  • Patent number: 12002458
    Abstract: A device capable of autonomous motion may move in an environment and may receive audio data from a microphone. If the device receives a command represented in the audio data that is absent from a set of known commands, the device may prompt the user to explain how to perform the command. The device may save a command template corresponding to the command, which may be used to perform future commands.
    Type: Grant
    Filed: September 4, 2020
    Date of Patent: June 4, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Qiaozi Gao, Govindarajan Sundaram Thattai, Qing Ping, Joel Joseph Chengottusseriyil, Feiyang Niu, Gokhan Tur, Dilek Hakkani-Tur
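A minimal sketch of the command-template idea in the entry above: an unknown command triggers a prompt, and the user's explanation is saved as a template of known steps for future use. The `ask_user` and `run_step` callbacks are hypothetical stand-ins for the device's dialog and motion subsystems.

```python
class CommandTemplates:
    """If a spoken command is absent from the known set, prompt the user to
    explain it as a sequence of known steps and save that as a template."""
    def __init__(self, known_commands):
        self.known = set(known_commands)
        self.templates = {}          # command -> list of known sub-commands

    def execute(self, command, ask_user, run_step):
        if command in self.known:
            run_step(command)
        elif command in self.templates:
            for step in self.templates[command]:
                run_step(step)
        else:
            steps = ask_user(f'I do not know how to "{command}". Which steps should I take?')
            self.templates[command] = steps
            for step in steps:
                run_step(step)

robot = CommandTemplates(known_commands={"go to kitchen", "grab bottle"})
robot.execute("fetch water",
              ask_user=lambda q: ["go to kitchen", "grab bottle"],
              run_step=print)
```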
  • Patent number: 11978437
    Abstract: Devices and techniques are generally described for learning personalized concepts for natural language processing. In various examples, a first natural language input may be received. In some examples, a determination may be made that the first natural language input comprises non-actionable slot data. A dialog session may be initiated with the user. In some examples, first slot data that is indicated by the user during the dialog session may be determined. In various examples, data representing the first slot data may be stored in a database in association with the first natural language input.
    Type: Grant
    Filed: December 11, 2020
    Date of Patent: May 7, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Govindarajan Sundaram Thattai, Qing Ping, Feiyang Niu, Joel Joseph Chengottusseriyil, Prashanth Rajagopal, Qiaozi Gao, Aishwarya Naresh Reganti, Gokhan Tur, Dilek Hakkani-Tur, Rohit Prasad, Premkumar Natarajan
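A small sketch of the personalized-slot idea in the entry above: a non-actionable slot phrase is resolved once through a dialog and the answer is stored for later requests. The in-memory dictionary here is a simplifying assumption, not the patented storage design.

```python
class PersonalSlotStore:
    """Resolves personalized slot phrases (e.g., "my usual pizza place") to
    concrete values, asking the user once and remembering the answer."""
    def __init__(self):
        self.slots = {}              # personal phrase -> resolved slot value

    def resolve(self, phrase, ask_user):
        if phrase not in self.slots:
            self.slots[phrase] = ask_user(f'Which place do you mean by "{phrase}"?')
        return self.slots[phrase]

store = PersonalSlotStore()
print(store.resolve("my usual pizza place", ask_user=lambda q: "Luigi's on 5th"))
print(store.resolve("my usual pizza place", ask_user=lambda q: None))  # reused
```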
  • Publication number: 20230401445
    Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long short-term memory (RNN-LSTM) for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The joint multi-domain recurrent neural network (JRNN) can leverage semantic intents (such as, finding or identifying, e.g., a domain specific goal) and slots (such as, dates, times, locations, subjects, etc.) across multiple domains.
    Type: Application
    Filed: August 29, 2023
    Publication date: December 14, 2023
    Inventors: Dilek Z. Hakkani-Tur, Asli Celikyilmaz, Yun-Nung Chen, Li Deng, Jianfeng Gao, Gokhan Tur, Ye-Yi Wang
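A rough PyTorch sketch of the joint modeling idea described in the entry above (and in the granted version below): one shared bi-directional LSTM encoder with three heads for per-token slot tags, utterance intent, and domain. This is an illustrative toy model, not the patented architecture; all sizes are arbitrary.

```python
import torch
import torch.nn as nn

class JointSLUModel(nn.Module):
    """Bi-LSTM encoder with three heads: slot tags per token, plus intent
    and domain labels for the whole utterance."""
    def __init__(self, vocab_size, n_slots, n_intents, n_domains, emb=64, hid=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.slot_head = nn.Linear(2 * hid, n_slots)
        self.intent_head = nn.Linear(2 * hid, n_intents)
        self.domain_head = nn.Linear(2 * hid, n_domains)

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))   # (batch, time, 2*hid)
        pooled = h.mean(dim=1)                    # crude utterance summary
        return self.slot_head(h), self.intent_head(pooled), self.domain_head(pooled)

# Toy usage: a batch of 2 utterances, 5 token ids each.
model = JointSLUModel(vocab_size=100, n_slots=7, n_intents=4, n_domains=3)
slots, intent, domain = model(torch.randint(0, 100, (2, 5)))
```

Training all three heads with a summed loss is what lets one model share data across domains, which is the multi-task point the abstract makes.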
  • Patent number: 11783173
    Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long short-term memory (RNN-LSTM) for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The joint multi-domain recurrent neural network (JRNN) can leverage semantic intents (such as, finding or identifying, e.g., a domain specific goal) and slots (such as, dates, times, locations, subjects, etc.) across multiple domains.
    Type: Grant
    Filed: August 4, 2016
    Date of Patent: October 10, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dilek Z. Hakkani-Tur, Asli Celikyilmaz, Yun-Nung Chen, Li Deng, Jianfeng Gao, Gokhan Tur, Ye-Yi Wang
  • Patent number: 11449744
    Abstract: A processing unit can extract salient semantics to model knowledge carryover, from one turn to the next, in multi-turn conversations. The architecture described herein can use end-to-end memory networks to encode inputs, e.g., utterances, with intents and slots, which can be stored as embeddings in memory; in decoding, the architecture can exploit latent contextual information from memory, e.g., demographic context, visual context, semantic context, etc., e.g., via an attention model, to leverage previously stored semantics for semantic parsing, e.g., for joint intent prediction and slot tagging. In examples, the architecture is configured to build an end-to-end memory network model for contextual, e.g., multi-turn, language understanding; to apply the end-to-end memory network model to multiple turns of conversational input; and to fill slots for output of contextual, e.g., multi-turn, language understanding of the conversational input.
    Type: Grant
    Filed: August 4, 2016
    Date of Patent: September 20, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Nung Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Li Deng, Jianfeng Gao
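A small numerical sketch of the memory-read step described in the entry above: dot-product attention over embeddings of previous turns produces a history context vector for the current turn. The embeddings are random placeholders here; in the patent the memory would hold learned encodings of earlier utterances and context.

```python
import numpy as np

def attend_over_history(current_vec, memory_vecs):
    """Dot-product attention over embeddings of previous turns: returns a
    context vector summarizing the history, weighted by relevance to the
    current utterance. A stand-in for the memory-network read step."""
    scores = np.array([current_vec @ m for m in memory_vecs])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ np.stack(memory_vecs), weights

rng = np.random.default_rng(0)
history = [rng.normal(size=8) for _ in range(3)]   # embeddings of 3 prior turns
current = rng.normal(size=8)                       # embedding of the current turn
context, attn = attend_over_history(current, history)
# `context` would be combined with `current` before intent/slot prediction.
print(attn.round(3))
```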
  • Publication number: 20220199088
    Abstract: A network computer system for managing a network service (e.g., a transport service) can include a voice-assistant subsystem for generating dialogues and performing actions for service providers of the network service. The network computer system can receive, from a user device, a request for the network service. In response, the network computer system can identify a service provider and transmit an invitation to the provider device of the service provider. In response to the identification of the service provider for the request, the voice-assistant subsystem can trigger an audio voice prompt to be presented on the provider device and a listening period during which the provider device monitors for an audio input from the service provider. Based on the audio input captured by the provider device, the network computer system can determine an intent corresponding to whether the service provider accepts or declines the invitation.
    Type: Application
    Filed: December 28, 2021
    Publication date: June 23, 2022
    Inventors: Lawrence Benjamin Goldstein, Arjun Vora, Gokhan Tur, Manisha Mundhe, Xiaochao Yang
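A deliberately crude sketch of the accept/decline decision described in the entry above (and in the granted version below). It assumes the audio captured during the listening period has already been transcribed; the phrase lists and the fallback behavior are hypothetical, not the patented intent model.

```python
ACCEPT_PHRASES = {"accept", "yes", "take it", "confirm"}
DECLINE_PHRASES = {"decline", "no", "pass", "reject"}

def classify_invitation_response(transcript):
    """Rough intent mapping for the listening period after the voice prompt:
    decides whether the transcribed audio means the service provider accepts
    or declines the invitation."""
    text = transcript.lower().strip()
    if any(p in text for p in ACCEPT_PHRASES):
        return "accept"
    if any(p in text for p in DECLINE_PHRASES):
        return "decline"
    return "unclear"   # e.g., re-prompt or fall back to the touch UI

print(classify_invitation_response("Yes, I'll take it"))   # accept
print(classify_invitation_response("No thanks"))           # decline
```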
  • Patent number: 11244685
    Abstract: A network computer system for managing a network service (e.g., a transport service) can include a voice-assistant subsystem for generating dialogues and performing actions for service providers of the network service. The network computer system can receive, from a user device, a request for the network service. In response, the network computer system can identify a service provider and transmit an invitation to the provider device of the service provider. In response to the identification of the service provider for the request, the voice-assistant subsystem can trigger an audio voice prompt to be presented on the provider device and a listening period during which the provider device monitors for an audio input from the service provider. Based on the audio input captured by the provider device, the network computer system can determine an intent corresponding to whether the service provider accepts or declines the invitation.
    Type: Grant
    Filed: September 4, 2019
    Date of Patent: February 8, 2022
    Assignee: Uber Technologies, Inc.
    Inventors: Lawrence Benjamin Goldstein, Arjun Vora, Gokhan Tur, Manisha Mundhe, Xiaochao Yang
  • Publication number: 20210398524
    Abstract: Devices and techniques are generally described for learning personalized responses to declarative natural language inputs. In various examples, a first natural language input may be received. The first natural language input may correspond to intent data corresponding to a declarative user input. In some examples, a dialog session may be initiated with the first user. An action intended by the first user for the first natural language input may be determined based at least in part on the dialog session. In various examples, first data representing the action may be stored in association with second data representing a state described by at least a portion of the first natural language input.
    Type: Application
    Filed: June 22, 2020
    Publication date: December 23, 2021
    Inventors: Qiaozi Gao, Divyanshu Brijmohan Verma, Govindarajan Sundaram Thattai, Qing Ping, Joel Joseph Chengottusseriyil, Ivan Vitomir Stojanovic, Feiyang Niu, Gokhan Tur, Charles J. Allen
  • Patent number: 10878009
    Abstract: Natural language query translation may be provided. A statistical model may be trained to detect domains according to a plurality of query click log data. Upon receiving a natural language query, the statistical model may be used to translate the natural language query into an action. The action may then be performed and at least one result associated with performing the action may be provided.
    Type: Grant
    Filed: July 24, 2018
    Date of Patent: December 29, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dilek Zeynep Hakkani-Tur, Gokhan Tur, Rukmini Iyer, Larry Paul Heck
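A toy sketch of the click-log idea in the entry above: co-occurrence counts between query words and the domains users clicked on act as a crude statistical domain detector, whose output would then be mapped to an action. This counting model is a simplifying assumption, not the patented statistical model.

```python
from collections import Counter, defaultdict

class ClickLogDomainModel:
    """Estimates which domain a query belongs to from (query, clicked_domain)
    pairs in a query click log, by summing per-word domain counts."""
    def __init__(self):
        self.word_domain_counts = defaultdict(Counter)

    def train(self, click_log):
        for query, domain in click_log:
            for word in query.lower().split():
                self.word_domain_counts[word][domain] += 1

    def predict(self, query):
        totals = Counter()
        for word in query.lower().split():
            totals.update(self.word_domain_counts.get(word, Counter()))
        return totals.most_common(1)[0][0] if totals else None

model = ClickLogDomainModel()
model.train([("cheap flights to boston", "travel"),
             ("showtimes near me", "movies"),
             ("flights from sfo", "travel")])
print(model.predict("flights to seattle"))   # -> travel
```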
  • Patent number: 10839165
    Abstract: Systems and methods for determining knowledge-guided information for a recurrent neural network (RNN) to guide the RNN in semantic tagging of an input phrase are presented. A knowledge encoding module of a Knowledge-Guided Structural Attention Process (K-SAP) receives an input phrase and, in conjunction with additional sub-components or cooperative components, generates a knowledge-guided vector that is provided with the input phrase to the RNN for linguistic semantic tagging. Generating the knowledge-guided vector comprises at least parsing the input phrase and generating a corresponding hierarchical linguistic structure comprising one or more discrete sub-structures. The sub-structures may be encoded into vectors along with attention weighting identifying those sub-structures that have greater importance in determining the semantic meaning of the input phrase.
    Type: Grant
    Filed: June 18, 2019
    Date of Patent: November 17, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Nung Vivian Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, Jianfeng Gao, Li Deng
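A compact sketch of the knowledge-guided vector described in the entry above: encoded parse sub-structures are attention-weighted against the encoded input phrase and summed, and the result is attached to the input given to the tagger. The random vectors stand in for learned encodings; this is an illustration, not the patented K-SAP pipeline.

```python
import numpy as np

def knowledge_guided_vector(substructure_vecs, phrase_vec):
    """Weights encoded sub-structures (e.g., parse chunks) by their attention
    score against the input-phrase encoding and returns their weighted sum,
    standing in for the knowledge-guided vector fed to the RNN tagger."""
    scores = np.array([phrase_vec @ s for s in substructure_vecs])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ np.stack(substructure_vecs)

rng = np.random.default_rng(1)
subs = [rng.normal(size=16) for _ in range(4)]   # encoded parse sub-structures
phrase = rng.normal(size=16)                     # encoded input phrase
kg_vec = knowledge_guided_vector(subs, phrase)
tagger_input = np.concatenate([phrase, kg_vec])  # what the tagger would consume
print(tagger_input.shape)                        # (32,)
```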
  • Patent number: 10755713
    Abstract: A method for assisting a user with one or more desired tasks is disclosed. For example, an executable, generic language understanding module and an executable, generic task reasoning module are provided for execution in the computer processing system. A set of run-time specifications is provided to the generic language understanding module and the generic task reasoning module, comprising one or more models specific to a domain. A language input is then received from a user, an intention of the user is determined with respect to one or more desired tasks, and the user is assisted with the one or more desired tasks, in accordance with the intention of the user.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: August 25, 2020
    Assignee: SRI International
    Inventors: Osher Yadgar, Neil Yorke-Smith, Bart Peintner, Gokhan Tur, Necip Fazil Ayan, Michael J. Wolverton, Girish Acharya, Venkatarama Satyanarayana Parimi, William S. Mark, Wen Wang, Andreas Kathol, Regis Vincent, Horacio E. Franco
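A schematic sketch of the idea in the entry above: the understanding and task-reasoning modules stay generic, and all domain knowledge arrives as a run-time specification. The regex-based matcher and the `spec` dictionary are hypothetical simplifications, not the patented modules or specification format.

```python
import re

class GenericLanguageUnderstanding:
    """Domain-agnostic intent matcher; all domain knowledge comes from the
    run-time specification (here: one regex pattern per intent)."""
    def __init__(self, intent_patterns):
        self.intent_patterns = {i: re.compile(p, re.I) for i, p in intent_patterns.items()}

    def infer_intent(self, utterance):
        for intent, pattern in self.intent_patterns.items():
            if pattern.search(utterance):
                return intent
        return None

class GenericTaskReasoner:
    """Maps an inferred intent to the task steps named in the same run-time
    specification, without hard-coding any domain."""
    def __init__(self, intent_tasks):
        self.intent_tasks = intent_tasks

    def plan(self, intent):
        return self.intent_tasks.get(intent, [])

# A hypothetical run-time specification for a travel domain.
spec = {
    "patterns": {"book_flight": r"\b(book|find)\b.*\bflight\b"},
    "tasks": {"book_flight": ["search flights", "present options", "confirm booking"]},
}
lu = GenericLanguageUnderstanding(spec["patterns"])
reasoner = GenericTaskReasoner(spec["tasks"])
print(reasoner.plan(lu.infer_intent("Please book a flight to Denver")))
```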
  • Publication number: 20200075016
    Abstract: A network computer system for managing a network service (e.g., a transport service) can include a voice-assistant subsystem for generating dialogues and performing actions for service providers of the network service. The network computer system can receive, from a user device, a request for the network service. In response, the network computer system can identify a service provider and transmit an invitation to the provider device of the service provider. In response to the identification of the service provider for the request, the voice-assistant subsystem can trigger an audio voice prompt to be presented on the provider device and a listening period during which the provider device monitors for an audio input from the service provider. Based on the audio input captured by the provider device, the network computer system can determine an intent corresponding to whether the service provider accepts or declines the invitation.
    Type: Application
    Filed: September 4, 2019
    Publication date: March 5, 2020
    Inventors: Lawrence Benjamin Goldstein, Arjun Vora, Gokhan Tur, Manisha Mundhe, Xiaochao Yang
  • Publication number: 20190303440
    Abstract: Systems and methods for determining knowledge-guided information for a recurrent neural network (RNN) to guide the RNN in semantic tagging of an input phrase are presented. A knowledge encoding module of a Knowledge-Guided Structural Attention Process (K-SAP) receives an input phrase and, in conjunction with additional sub-components or cooperative components, generates a knowledge-guided vector that is provided with the input phrase to the RNN for linguistic semantic tagging. Generating the knowledge-guided vector comprises at least parsing the input phrase and generating a corresponding hierarchical linguistic structure comprising one or more discrete sub-structures. The sub-structures may be encoded into vectors along with attention weighting identifying those sub-structures that have greater importance in determining the semantic meaning of the input phrase.
    Type: Application
    Filed: June 18, 2019
    Publication date: October 3, 2019
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yun-Nung Vivian Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, Jianfeng Gao, Li Deng
  • Patent number: 10366336
    Abstract: The present invention relates to a method and apparatus for exploiting human feedback in an intelligent automated assistant. One embodiment of a method for conducting an interaction with a human user includes inferring an intent from data entered by the human user, formulating a response in accordance with the intent, receiving feedback from a human advisor in response to at least one of the inferring and the formulating, wherein the human advisor is a person other than the human user, and adapting at least one model used in at least one of the inferring and the formulating, wherein the adapting is based on the feedback.
    Type: Grant
    Filed: September 1, 2010
    Date of Patent: July 30, 2019
    Assignee: SRI International
    Inventors: Gokhan Tur, Horacio E. Franco, William S. Mark, Norman D. Winarsky, Bart Peintner, Michael J. Wolverton, Neil Yorke-Smith
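A simple sketch of the feedback loop in the entry above: low-confidence inferences are routed to a human advisor (a person other than the end user), and the advisor's corrections are collected so the underlying models can later be adapted. The confidence threshold and callback interfaces are illustrative assumptions only.

```python
class AdvisorAssistedAssistant:
    """Routes low-confidence intent inferences to a human advisor; the
    advisor's corrections are kept as feedback for adapting the model."""
    def __init__(self, infer_intent, ask_advisor, threshold=0.8):
        self.infer_intent = infer_intent
        self.ask_advisor = ask_advisor
        self.threshold = threshold
        self.feedback = []           # (utterance, corrected_intent) pairs

    def handle(self, utterance):
        intent, confidence = self.infer_intent(utterance)
        if confidence < self.threshold:
            intent = self.ask_advisor(utterance, intent)
            self.feedback.append((utterance, intent))   # later used for adaptation
        return intent

assistant = AdvisorAssistedAssistant(
    infer_intent=lambda u: ("schedule_meeting", 0.55),
    ask_advisor=lambda u, guess: "set_reminder",
)
print(assistant.handle("remind me about the standup"))   # -> set_reminder
```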
  • Patent number: 10366163
    Abstract: Systems and methods for determining knowledge-guided information for a recurrent neural network (RNN) to guide the RNN in semantic tagging of an input phrase are presented. A knowledge encoding module of a Knowledge-Guided Structural Attention Process (K-SAP) receives an input phrase and, in conjunction with additional sub-components or cooperative components, generates a knowledge-guided vector that is provided with the input phrase to the RNN for linguistic semantic tagging. Generating the knowledge-guided vector comprises at least parsing the input phrase and generating a corresponding hierarchical linguistic structure comprising one or more discrete sub-structures. The sub-structures may be encoded into vectors along with attention weighting identifying those sub-structures that have greater importance in determining the semantic meaning of the input phrase.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: July 30, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Nung Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, Jianfeng Gao, Li Deng
  • Publication number: 20190130912
    Abstract: A method for assisting a user with one or more desired tasks is disclosed. For example, an executable, generic language understanding module and an executable, generic task reasoning module are provided for execution in the computer processing system. A set of run-time specifications is provided to the generic language understanding module and the generic task reasoning module, comprising one or more models specific to a domain. A language input is then received from a user, an intention of the user is determined with respect to one or more desired tasks, and the user is assisted with the one or more desired tasks, in accordance with the intention of the user.
    Type: Application
    Filed: December 21, 2018
    Publication date: May 2, 2019
    Inventors: Osher Yadgar, Neil Yorke-Smith, Bart Peintner, Gokhan Tur, Necip Fazil Ayan, Michael J. Wolverton, Girish Acharya, Venkatarama Satyanarayana Parimi, William S. Mark, Wen Wang, Andreas Kathol, Regis Vincent, Horacio E. Franco