Patents by Inventor Meltem Oktem
Meltem Oktem has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11960793
Abstract: A method includes capturing an image, determining an environment in which a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine-learned model, an intent of the user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
Type: Grant
Filed: December 30, 2022
Date of Patent: April 16, 2024
Assignee: Google LLC
Inventors: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
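The claimed pipeline (image → environment → gesture → intent → task) might be sketched as follows. This is a minimal illustration only: the detector stubs, intent table, and function names are assumptions, not Google's implementation, and the machine-learned model is replaced by a lookup table.

```python
# Hypothetical sketch of the abstract's flow; all names are illustrative.

def determine_environment(frame):
    # Stand-in for environment classification (e.g. "kitchen", "car").
    return frame.get("environment", "unknown")

def detect_hand_gesture(frame):
    # Stand-in for an object/gesture detector over the captured image.
    return frame.get("gesture", "none")

def infer_intent(gesture, environment):
    # Stand-in for the machine-learned intent model: here a lookup table
    # keyed on (gesture, environment), as the abstract conditions intent
    # on both signals.
    intents = {
        ("point", "kitchen"): "identify_object",
        ("thumbs_up", "car"): "confirm_navigation",
    }
    return intents.get((gesture, environment), "no_op")

def handle_frame(frame):
    environment = determine_environment(frame)
    gesture = detect_hand_gesture(frame)
    # A task would then be executed based on the inferred intent.
    return infer_intent(gesture, environment)

print(handle_frame({"gesture": "point", "environment": "kitchen"}))  # identify_object
```

The point of the table keyed on both gesture and environment is that the same gesture can map to different intents in different settings, which is what the claim's joint conditioning captures.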
-
Publication number: 20240086063
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
Type: Application
Filed: November 22, 2023
Publication date: March 14, 2024
Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
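The cross-modality flow in the abstract (first recognizer produces a transcription, wraps a term in a context structure, and forwards it so the second modality's model learns the term) can be sketched roughly as below. The class names, the voice/keyboard modality pairing, and the lexicon-as-model simplification are all assumptions for illustration, not the patented design.

```python
# Illustrative sketch of cross input modality learning; names are assumed.
from dataclasses import dataclass

@dataclass
class InputContext:
    term: str              # the particular term from the transcription
    source_modality: str   # which modality recognized the term

class KeyboardRecognizer:
    """Second-modality recognizer whose model (here just a lexicon) is updated."""
    def __init__(self):
        self.lexicon = set()

    def update_model(self, context: InputContext):
        # Learning a term from another modality so typing suggestions improve.
        self.lexicon.add(context.term)

class VoiceRecognizer:
    """First-modality recognizer; forwards context structures to its peer."""
    def __init__(self, peer: KeyboardRecognizer):
        self.peer = peer

    def recognize(self, audio: str) -> str:
        transcription = audio  # stand-in for actual speech recognition
        for term in transcription.split():
            self.peer.update_model(InputContext(term, "voice"))
        return transcription

keyboard = KeyboardRecognizer()
voice = VoiceRecognizer(keyboard)
voice.recognize("quokka selfie")
print(sorted(keyboard.lexicon))  # the keyboard model now knows both terms
```

The design point is the direction of the arrow: the recognizer that successfully recognized the input pushes context to the other modality, rather than both modalities training independently.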
-
Patent number: 11842045
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
Type: Grant
Filed: August 31, 2022
Date of Patent: December 12, 2023
Assignee: Google LLC
Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
-
Publication number: 20230335116
Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or a non-personal request. Moreover, in response to determining that the speaker is not a known user of the user device and that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device, or can cause an action to be performed by the user device responsive to the utterance.
Type: Application
Filed: June 16, 2023
Publication date: October 19, 2023
Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
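The gating logic in this abstract (known vs. unknown speaker, personal vs. non-personal request) reduces to a small decision table. The sketch below is a hedged illustration: the speaker check, the request classifier, and the policy for the known-speaker/personal case are assumptions, not the claimed implementation.

```python
# Hypothetical sketch of the speaker/request gating policy.

def is_known_user(speaker, known_users):
    # Stand-in for speaker identification on a shared device.
    return speaker in known_users

def is_personal_request(utterance):
    # Stand-in for a query classifier; markers are illustrative.
    personal_markers = ("my calendar", "my messages", "my email")
    return any(marker in utterance for marker in personal_markers)

def respond(speaker, utterance, known_users):
    if is_known_user(speaker, known_users):
        return "answer"      # known users get full responses
    if not is_personal_request(utterance):
        return "answer"      # unknown speakers still get non-personal answers
    return "decline"         # personal data stays gated from unknown speakers

known = {"alice", "bob"}
print(respond("guest", "what is the weather", known))  # answer
print(respond("guest", "read my messages", known))     # decline
```

The interesting branch is the middle one: a shared device remains useful to guests for non-personal queries while still protecting each known user's personal data.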
-
Publication number: 20230289134
Abstract: A method includes capturing an image, determining an environment in which a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine-learned model, an intent of the user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
Type: Application
Filed: December 30, 2022
Publication date: September 14, 2023
Inventors: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
-
Patent number: 11721326
Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or a non-personal request. Moreover, in response to determining that the speaker is not a known user of the user device and that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device, or can cause an action to be performed by the user device responsive to the utterance.
Type: Grant
Filed: January 26, 2022
Date of Patent: August 8, 2023
Assignee: Google LLC
Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
-
Patent number: 11543888
Abstract: A method includes capturing an image, determining an environment in which a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine-learned model, an intent of the user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
Type: Grant
Filed: June 25, 2020
Date of Patent: January 3, 2023
Assignee: Google LLC
Inventors: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
-
Publication number: 20220413696
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
Type: Application
Filed: August 31, 2022
Publication date: December 29, 2022
Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
-
Patent number: 11435898
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
Type: Grant
Filed: October 6, 2020
Date of Patent: September 6, 2022
Assignee: Google LLC
Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
-
Publication number: 20220148577
Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The speaker of the utterance is classified as not a known user of the device. A query that includes the authentication tokens that correspond to known users of the device, a representation of the utterance, and an indication that the speaker was classified as not a known user of the device is provided to a server. A response to the query is received at the device from the server based on the query.
Type: Application
Filed: January 26, 2022
Publication date: May 12, 2022
Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
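The query assembly described here, together with the known-speaker variant in the related filing 20200118550 (attach only the identified user's token), might be sketched as a single function. The field names, token handling, and dict-based query shape below are assumptions for illustration, not the claimed format.

```python
# Hypothetical sketch of the server query; field names are assumed.

def build_query(utterance_repr, speaker_id, device_tokens):
    """Assemble a query for the server from a device's stored tokens.

    device_tokens maps each known user to that user's authentication token.
    """
    known = speaker_id in device_tokens
    query = {
        "utterance": utterance_repr,   # representation of the utterance
        "speaker_known": known,        # classification result for the speaker
    }
    if known:
        # Known speaker: attach only that user's token.
        query["auth_tokens"] = [device_tokens[speaker_id]]
    else:
        # Unknown speaker: include all known users' tokens plus the flag,
        # letting the server decide how (or whether) to serve the request.
        query["auth_tokens"] = list(device_tokens.values())
    return query

tokens = {"alice": "tok_a", "bob": "tok_b"}
print(build_query("hello_audio", "guest", tokens)["speaker_known"])  # False
```

Keeping the tokens on the device and sending them per-query means the server never needs a persistent account binding for the device itself.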
-
Patent number: 11238848
Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The speaker of the utterance is classified as not a known user of the device. A query that includes the authentication tokens that correspond to known users of the device, a representation of the utterance, and an indication that the speaker was classified as not a known user of the device is provided to a server. A response to the query is received at the device from the server based on the query.
Type: Grant
Filed: December 10, 2019
Date of Patent: February 1, 2022
Assignee: Google LLC
Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
-
Patent number: 11158128
Abstract: A system and method may provide for spatial and semantic auto-completion of an augmented or mixed reality environment. The system may detect physical objects in a physical environment based on analysis of image frames captured by an image sensor of a computing device. The system may detect spaces in the physical environment that are occupied by the detected physical objects, and may detect spaces that are unoccupied in the physical environment. Based on the identification of the detected physical objects, the system may gain a semantic understanding of the physical environment, and may determine suggested objects for placement in the physical environment based on the semantic understanding. The system may place virtual representations of the suggested objects in a mixed reality scene of the physical environment for user consideration.
Type: Grant
Filed: April 26, 2019
Date of Patent: October 26, 2021
Assignee: Google LLC
Inventors: Roza Chojnacka, Meltem Oktem, Rajan Patel, Uday Idnani, Xiyang Luo
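The auto-completion idea (semantic label from detected objects, then virtual suggestions placed into unoccupied space) might be sketched as below. The room labels, the suggestion table, and the pairing of suggestions with empty spaces are illustrative assumptions, not the patented system.

```python
# Simplified sketch of spatial/semantic auto-completion; names are assumed.

SUGGESTIONS = {
    "office": ["lamp", "bookshelf"],
    "dining_room": ["chair", "vase"],
}

def semantic_label(detected_objects):
    # Stand-in for semantic scene understanding built from detections.
    if "desk" in detected_objects:
        return "office"
    if "dining_table" in detected_objects:
        return "dining_room"
    return "unknown"

def suggest_virtual_objects(detected_objects, unoccupied_spaces):
    room = semantic_label(detected_objects)
    suggestions = SUGGESTIONS.get(room, [])
    # Pair each suggested object with an unoccupied space; these pairs
    # would be rendered as virtual objects in the mixed reality scene.
    return list(zip(unoccupied_spaces, suggestions))

print(suggest_virtual_objects(["desk", "monitor"], ["corner", "wall"]))
```

The split between occupied and unoccupied space is what makes the suggestions placeable: semantics choose *what* to suggest, the free-space map chooses *where*.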
-
Patent number: 11145299
Abstract: Methods, systems, and apparatus, including computer-readable storage devices, for managing voice interface devices. In some implementations, messages are received from a plurality of devices, each of the messages indicating a respective voice input detected by the device that sent the message. Audio signatures are obtained for the voice inputs detected by the plurality of devices. The audio signatures for the voice inputs and times that the voice inputs were detected are evaluated. Based on the evaluation, at least some of the plurality of devices are grouped to form a group of multiple devices that detected a same user utterance. A device from the group is selected to respond to the user utterance, and the multiple devices in the group are managed so that only the selected device outputs a synthesized speech response to the user utterance.
Type: Grant
Filed: April 19, 2018
Date of Patent: October 12, 2021
Assignee: X Development LLC
Inventors: Meltem Oktem, Max Benjamin Braun
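The arbitration scheme in this abstract (group devices that heard the same utterance via signatures and timestamps, then let exactly one respond) can be sketched roughly as follows. The equality-based signature match, the time window, and the loudest-device selection rule are assumptions for illustration, not the claimed method.

```python
# Rough sketch of multi-device arbitration; parameters are assumed.

TIME_WINDOW = 0.5  # seconds: detections this close may be one utterance

def same_utterance(a, b):
    # Stand-in for audio-signature comparison plus temporal proximity.
    return a["signature"] == b["signature"] and abs(a["time"] - b["time"]) < TIME_WINDOW

def group_devices(messages):
    """Group device messages that appear to report the same utterance."""
    groups = []
    for msg in messages:
        for group in groups:
            if same_utterance(group[0], msg):
                group.append(msg)
                break
        else:
            groups.append([msg])
    return groups

def select_responder(group):
    # e.g. the device that heard the utterance most clearly answers;
    # the rest of the group is managed to stay silent.
    return max(group, key=lambda m: m["volume"])["device"]

msgs = [
    {"device": "kitchen", "signature": "h1", "time": 0.00, "volume": 0.9},
    {"device": "hall",    "signature": "h1", "time": 0.10, "volume": 0.4},
]
groups = group_devices(msgs)
print([select_responder(g) for g in groups])  # only one device answers
```

Grouping on both signature and time matters: two different rooms can hear two different utterances with similar audio, and the time window keeps those from collapsing into one group.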
-
Publication number: 20210019046
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
Type: Application
Filed: October 6, 2020
Publication date: January 21, 2021
Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
-
Publication number: 20200409469
Abstract: A method includes capturing an image, determining an environment in which a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine-learned model, an intent of the user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
Type: Application
Filed: June 25, 2020
Publication date: December 31, 2020
Inventors: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
-
Patent number: 10831366
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
Type: Grant
Filed: December 29, 2016
Date of Patent: November 10, 2020
Assignee: Google LLC
Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
-
Publication number: 20200342668
Abstract: A system and method may provide for spatial and semantic auto-completion of an augmented or mixed reality environment. The system may detect physical objects in a physical environment based on analysis of image frames captured by an image sensor of a computing device. The system may detect spaces in the physical environment that are occupied by the detected physical objects, and may detect spaces that are unoccupied in the physical environment. Based on the identification of the detected physical objects, the system may gain a semantic understanding of the physical environment, and may determine suggested objects for placement in the physical environment based on the semantic understanding. The system may place virtual representations of the suggested objects in a mixed reality scene of the physical environment for user consideration.
Type: Application
Filed: April 26, 2019
Publication date: October 29, 2020
Inventors: Roza Chojnacka, Meltem Oktem, Rajan Patel, Uday Idnani, Xiyang Luo
-
Publication number: 20200118550
Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The utterance is classified as spoken by a particular known user of the known users. A query that includes a representation of the utterance and an indication of the particular known user as the speaker is provided using the authentication token of the particular known user.
Type: Application
Filed: December 10, 2019
Publication date: April 16, 2020
Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
-
Patent number: 10522137
Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The utterance is classified as spoken by a particular known user of the known users. A query that includes a representation of the utterance and an indication of the particular known user as the speaker is provided using the authentication token of the particular known user.
Type: Grant
Filed: April 18, 2018
Date of Patent: December 31, 2019
Assignee: Google LLC
Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
-
Publication number: 20190325865
Abstract: Methods, systems, and apparatus, including computer-readable storage devices, for managing voice interface devices. In some implementations, messages are received from a plurality of devices, each of the messages indicating a respective voice input detected by the device that sent the message. Audio signatures are obtained for the voice inputs detected by the plurality of devices. The audio signatures for the voice inputs and times that the voice inputs were detected are evaluated. Based on the evaluation, at least some of the plurality of devices are grouped to form a group of multiple devices that detected a same user utterance. A device from the group is selected to respond to the user utterance, and the multiple devices in the group are managed so that only the selected device outputs a synthesized speech response to the user utterance.
Type: Application
Filed: April 19, 2018
Publication date: October 24, 2019
Inventors: Meltem Oktem, Max Benjamin Braun