Patents by Inventor Meltem Oktem

Meltem Oktem has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11960793
    Abstract: A method can include capturing an image, determining an environment in which a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine-learned model, an intent of the user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
    Type: Grant
    Filed: December 30, 2022
    Date of Patent: April 16, 2024
    Assignee: GOOGLE LLC
    Inventors: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
  • Publication number: 20240086063
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
    Type: Application
    Filed: November 22, 2023
    Publication date: March 14, 2024
    Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
  • Patent number: 11842045
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
    Type: Grant
    Filed: August 31, 2022
    Date of Patent: December 12, 2023
    Assignee: Google LLC
    Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
  • Publication number: 20230335116
    Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or a non-personal request. Moreover, in response to determining that the speaker is not a known user of the user device and in response to determining that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device, or can cause an action to be performed by the user device responsive to the utterance.
    Type: Application
    Filed: June 16, 2023
    Publication date: October 19, 2023
    Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
  • Publication number: 20230289134
    Abstract: A method can include capturing an image, determining an environment in which a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine-learned model, an intent of the user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
    Type: Application
    Filed: December 30, 2022
    Publication date: September 14, 2023
    Inventors: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
  • Patent number: 11721326
    Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or a non-personal request. Moreover, in response to determining that the speaker is not a known user of the user device and in response to determining that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device, or can cause an action to be performed by the user device responsive to the utterance.
    Type: Grant
    Filed: January 26, 2022
    Date of Patent: August 8, 2023
    Assignee: GOOGLE LLC
    Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
  • Patent number: 11543888
    Abstract: A method can include capturing an image, determining an environment in which a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine-learned model, an intent of the user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
    Type: Grant
    Filed: June 25, 2020
    Date of Patent: January 3, 2023
    Assignee: GOOGLE LLC
    Inventors: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
  • Publication number: 20220413696
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
    Type: Application
    Filed: August 31, 2022
    Publication date: December 29, 2022
    Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
  • Patent number: 11435898
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
    Type: Grant
    Filed: October 6, 2020
    Date of Patent: September 6, 2022
    Assignee: Google LLC
    Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
  • Publication number: 20220148577
    Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The speaker of the utterance is classified as not a known user of the device. A query that includes the authentication tokens that correspond to known users of the device, a representation of the utterance, and an indication that the speaker was classified as not a known user of the device is provided to the server. A response to the query is received at the device and from the server based on the query.
    Type: Application
    Filed: January 26, 2022
    Publication date: May 12, 2022
    Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
  • Patent number: 11238848
    Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The speaker of the utterance is classified as not a known user of the device. A query that includes the authentication tokens that correspond to known users of the device, a representation of the utterance, and an indication that the speaker was classified as not a known user of the device is provided to the server. A response to the query is received at the device and from the server based on the query.
    Type: Grant
    Filed: December 10, 2019
    Date of Patent: February 1, 2022
    Assignee: Google LLC
    Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
  • Patent number: 11158128
    Abstract: A system and method may provide for spatial and semantic auto-completion of an augmented or mixed reality environment. The system may detect physical objects in a physical environment based on analysis of image frames captured by an image sensor of a computing device. The system may detect spaces in the physical environment that are occupied by the detected physical objects, and may detect spaces that are unoccupied in the physical environment. Based on the identification of the detected physical objects, the system may gain a semantic understanding of the physical environment, and may determine suggested objects for placement in the physical environment based on the semantic understanding. The system may place virtual representations of the suggested objects in a mixed reality scene of the physical environment for user consideration.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: October 26, 2021
    Assignee: GOOGLE LLC
    Inventors: Roza Chojnacka, Meltem Oktem, Rajan Patel, Uday Idnani, Xiyang Luo
  • Patent number: 11145299
    Abstract: Methods, systems, and apparatus, including computer-readable storage devices, for managing voice interface devices. In some implementations, messages are received from a plurality of devices, each of the messages indicating a respective voice input detected by the device that sent the message. Audio signatures are obtained for the voice inputs detected by the plurality of devices. The audio signatures for the voice inputs and times that the voice inputs were detected are evaluated. Based on the evaluation, at least some of the plurality of devices are grouped to form a group of multiple devices that detected a same user utterance. A device from the group is selected to respond to the user utterance, and the multiple devices in the group are managed so that only the selected device outputs a synthesized speech response to the user utterance.
    Type: Grant
    Filed: April 19, 2018
    Date of Patent: October 12, 2021
    Assignee: X Development LLC
    Inventors: Meltem Oktem, Max Benjamin Braun
  • Publication number: 20210019046
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
    Type: Application
    Filed: October 6, 2020
    Publication date: January 21, 2021
    Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
  • Publication number: 20200409469
    Abstract: A method can include capturing an image, determining an environment in which a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine-learned model, an intent of the user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
    Type: Application
    Filed: June 25, 2020
    Publication date: December 31, 2020
    Inventors: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
  • Patent number: 10831366
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
    Type: Grant
    Filed: December 29, 2016
    Date of Patent: November 10, 2020
    Assignee: Google LLC
    Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
  • Publication number: 20200342668
    Abstract: A system and method may provide for spatial and semantic auto-completion of an augmented or mixed reality environment. The system may detect physical objects in a physical environment based on analysis of image frames captured by an image sensor of a computing device. The system may detect spaces in the physical environment that are occupied by the detected physical objects, and may detect spaces that are unoccupied in the physical environment. Based on the identification of the detected physical objects, the system may gain a semantic understanding of the physical environment, and may determine suggested objects for placement in the physical environment based on the semantic understanding. The system may place virtual representations of the suggested objects in a mixed reality scene of the physical environment for user consideration.
    Type: Application
    Filed: April 26, 2019
    Publication date: October 29, 2020
    Inventors: Roza Chojnacka, Meltem Oktem, Rajan Patel, Uday Idnani, Xiyang Luo
  • Publication number: 20200118550
    Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The utterance is classified as spoken by a particular known user of the known users. A query that includes a representation of the utterance and an indication of the particular known user as the speaker is provided using the authentication token of the particular known user.
    Type: Application
    Filed: December 10, 2019
    Publication date: April 16, 2020
    Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
  • Patent number: 10522137
    Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The utterance is classified as spoken by a particular known user of the known users. A query that includes a representation of the utterance and an indication of the particular known user as the speaker is provided using the authentication token of the particular known user.
    Type: Grant
    Filed: April 18, 2018
    Date of Patent: December 31, 2019
    Assignee: Google LLC
    Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
  • Publication number: 20190325865
    Abstract: Methods, systems, and apparatus, including computer-readable storage devices, for managing voice interface devices. In some implementations, messages are received from a plurality of devices, each of the messages indicating a respective voice input detected by the device that sent the message. Audio signatures are obtained for the voice inputs detected by the plurality of devices. The audio signatures for the voice inputs and times that the voice inputs were detected are evaluated. Based on the evaluation, at least some of the plurality of devices are grouped to form a group of multiple devices that detected a same user utterance. A device from the group is selected to respond to the user utterance, and the multiple devices in the group are managed so that only the selected device outputs a synthesized speech response to the user utterance.
    Type: Application
    Filed: April 19, 2018
    Publication date: October 24, 2019
    Inventors: Meltem Oktem, Max Benjamin Braun
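
The abstracts above describe each claimed method only in prose. The Python sketches that follow are loose, non-authoritative illustrations of those descriptions; every function, class, constant, and rule in them is a hypothetical placeholder and nothing in them is taken from the patents themselves. The first sketch corresponds to the gesture-intent entries (patent numbers 11960793 and 11543888): capture an image, determine the environment, detect a hand gesture, infer an intent from the gesture and environment, and execute a task.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    pixels: bytes  # placeholder for raw image data

def capture_image() -> Frame:
    return Frame(pixels=b"")  # stand-in for a camera capture

def detect_environment(frame: Frame) -> str:
    return "kitchen"  # stand-in for an environment classifier

def detect_hand_gesture(frame: Frame) -> str:
    return "point"  # stand-in for a gesture detector applied to an object in the image

def infer_intent(gesture: str, environment: str) -> str:
    # Stand-in for the machine-learned model: a lookup keyed on (gesture, environment).
    intents = {("point", "kitchen"): "identify_object", ("thumbs_up", "kitchen"): "confirm"}
    return intents.get((gesture, environment), "no_op")

def execute_task(intent: str) -> None:
    print(f"executing task for intent: {intent}")

frame = capture_image()
execute_task(infer_intent(detect_hand_gesture(frame), detect_environment(frame)))
```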
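
For the cross input modality learning entries (patent numbers 10831366, 11435898, and 11842045), a minimal sketch assuming the first modality is voice and the second is a keyboard recognizer with a toy vocabulary model: the voice recognizer produces a transcription, wraps each recognized term in an input context structure, and hands it to the keyboard recognizer so that recognizer can update its own model.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class InputContext:
    """Hypothetical input context data structure referencing a recognized term."""
    term: str
    source_modality: str

@dataclass
class KeyboardRecognizer:
    """Stand-in for the second modality recognizer, with a toy unigram vocabulary."""
    vocabulary: Dict[str, int] = field(default_factory=dict)

    def update_model(self, context: InputContext) -> None:
        # Incorporate a term learned from the other modality into this recognizer's model.
        self.vocabulary[context.term] = self.vocabulary.get(context.term, 0) + 1

class VoiceRecognizer:
    """Stand-in for the first modality recognizer."""

    def transcribe(self, audio: str) -> str:
        return audio  # placeholder: treat the 'audio' string as its own transcription

    def share_context(self, transcription: str, peer: KeyboardRecognizer) -> None:
        for term in transcription.split():
            peer.update_model(InputContext(term=term, source_modality="voice"))

voice, keyboard = VoiceRecognizer(), KeyboardRecognizer()
voice.share_context(voice.transcribe("call Meltem"), keyboard)
print(keyboard.vocabulary)  # {'call': 1, 'Meltem': 1}
```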
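
For the shared-device entries (patent number 11721326 and publication 20230335116), a sketch under the assumption that speaker identification and request classification are trivial placeholder functions: a speaker who is not a known user is served only when the request is non-personal.

```python
from typing import Optional

KNOWN_USERS = {"alice", "bob"}           # hypothetical enrolled users of a shared device
PERSONAL_REQUESTS = {"read_my_email"}    # hypothetical personal request types

def identify_speaker(utterance: str) -> Optional[str]:
    return None  # placeholder speaker identification: assume the speaker is a guest

def classify_request(utterance: str) -> str:
    # Placeholder classifier for personal vs. non-personal requests.
    return "read_my_email" if "my email" in utterance else "weather"

def handle(utterance: str) -> str:
    speaker = identify_speaker(utterance)
    request = classify_request(utterance)
    if speaker is None and request in PERSONAL_REQUESTS:
        return "Personal requests require a known user."
    who = speaker if speaker in KNOWN_USERS else "guest"
    return f"responding to '{request}' for {who}"

print(handle("what's the weather"))
```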
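
For the authentication-token entries (patent number 11238848 and publication 20220148577), a sketch of the query described in the abstracts, with a hash standing in for the representation of the utterance and hard-coded strings standing in for the stored tokens; the field names are invented for illustration.

```python
import hashlib
from typing import Dict, Optional

AUTH_TOKENS = {"alice": "token-a", "bob": "token-b"}  # hypothetical stored tokens for known users

def classify_speaker(audio: bytes) -> Optional[str]:
    return None  # placeholder: the speaker is not classified as any known user

def build_query(audio: bytes) -> Dict[str, object]:
    speaker = classify_speaker(audio)
    return {
        "auth_tokens": list(AUTH_TOKENS.values()),       # tokens for the device's known users
        "utterance": hashlib.sha256(audio).hexdigest(),  # compact representation of the utterance
        "known_speaker": speaker,                        # None indicates an unknown speaker
    }

def send_to_server(query: Dict[str, object]) -> str:
    return "server response"  # placeholder for the network round trip and returned response

print(send_to_server(build_query(b"fake audio bytes")))
```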
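
For the augmented reality auto-completion entries (patent number 11158128 and publication 20200342668), a rough sketch in which the object detector, the table of semantic suggestion rules, and the single free-area number are all placeholders for the spatial and semantic understanding the abstract describes.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DetectedObject:
    label: str        # e.g. "sofa"
    footprint: float  # floor area it occupies, in square meters (hypothetical unit)

# Hypothetical semantic rules mapping detected furniture to suggested companion objects.
SUGGESTIONS = {"sofa": "coffee table", "desk": "desk lamp"}

def detect_objects(frames: List[bytes]) -> List[DetectedObject]:
    return [DetectedObject("sofa", 2.0)]  # placeholder for detection over camera frames

def suggest_objects(objects: List[DetectedObject], free_area: float) -> List[str]:
    """Suggest items that fit the scene both semantically and spatially."""
    suggested = []
    for obj in objects:
        candidate = SUGGESTIONS.get(obj.label)
        if candidate and free_area > 1.0:  # only suggest when unoccupied space remains
            suggested.append(candidate)
    return suggested

def place_virtual_objects(suggestions: List[str]) -> None:
    for item in suggestions:
        print(f"rendering a virtual {item} in the mixed reality scene")

scene_objects = detect_objects([b""])
place_virtual_objects(suggest_objects(scene_objects, free_area=3.5))
```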
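
Finally, for the voice interface device management entries (patent number 11145299 and publication 20190325865), a sketch that groups device messages by a shared audio signature and a time window, then picks one device per group to answer; the loudness tie-breaker is an assumption, not something stated in the abstract.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List

@dataclass
class DeviceMessage:
    device_id: str
    signature: str    # stand-in for an audio signature of the detected voice input
    timestamp: float  # seconds since some shared epoch
    loudness: float   # hypothetical selection criterion

def group_same_utterance(messages: List[DeviceMessage], window: float = 1.0) -> List[List[DeviceMessage]]:
    """Group messages whose signatures match and whose timestamps fall within a window."""
    groups: List[List[DeviceMessage]] = []
    ordered = sorted(messages, key=lambda m: (m.signature, m.timestamp))
    for _, same_signature in groupby(ordered, key=lambda m: m.signature):
        bucket: List[DeviceMessage] = []
        for msg in same_signature:
            if bucket and msg.timestamp - bucket[0].timestamp > window:
                groups.append(bucket)
                bucket = []
            bucket.append(msg)
        if bucket:
            groups.append(bucket)
    return groups

def select_responder(group: List[DeviceMessage]) -> str:
    # Only the selected device will speak the synthesized response.
    return max(group, key=lambda m: m.loudness).device_id

messages = [DeviceMessage("kitchen", "sig-1", 10.0, 0.8),
            DeviceMessage("living_room", "sig-1", 10.2, 0.5)]
for group in group_same_utterance(messages):
    print(f"{select_responder(group)} will answer; the rest stay silent")
```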