Patents by Inventor Victor Carbune
Victor Carbune has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240386886
Abstract: Implementations herein relate to customizing an automated assistant using domain-specific resources. One or more resources are processed to generate a natural language representation of the contents of the resources. The natural language representation is utilized to customize an automated assistant for interactions with a user. Various implementations include priming and fine-tuning large language models that are utilized to implement the automated assistant. Various implementations are directed to biasing speech recognition based on terms identified in the resources. Various implementations are directed to customizing the tone of the automated assistant based on information included in the resources.
Type: Application
Filed: May 15, 2023
Publication date: November 21, 2024
Inventors: Matthew Sharifi, Victor Carbune
-
Patent number: 12149773
Abstract: Voice-based interaction with video content being presented by a media player application is enhanced through the use of an automated assistant capable of identifying when a spoken utterance by a user is a request to playback a specific scene in the video content. A query identified in a spoken utterance may be used to access stored scene metadata associated with video content being presented in the vicinity of the user to identify one or more locations in the video content that correspond to the query, such that a media control command may be issued to the media player application to cause the media player application to seek to a particular location in the video content that satisfies the query.
Type: Grant
Filed: September 2, 2022
Date of Patent: November 19, 2024
Assignee: GOOGLE LLC
Inventors: Matthew Sharifi, Victor Carbune
-
Patent number: 12147470
Abstract: A method for handling contradictory queries on a shared device includes receiving a first query issued by a first user, the first query specifying a first long-standing operation for a digital assistant to perform, and while the digital assistant is performing the first long-standing operation, receiving a second query, the second query specifying a second long-standing operation for the digital assistant to perform. The method also includes determining that the second query was issued by another user different from the first user and determining, using a query resolver, that performing the second long-standing operation would conflict with the first long-standing operation. The method further includes identifying one or more compromise operations for the digital assistant to perform, and instructing the digital assistant to perform a selected compromise operation among the identified one or more compromise operations.
Type: Grant
Filed: October 6, 2022
Date of Patent: November 19, 2024
Assignee: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
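The conflict-then-compromise flow described in this abstract can be sketched in a few lines. This is a minimal illustration, not the patented query resolver: the `Operation` class, the `resolve` function, and the split-the-difference compromise policy are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Operation:
    """A long-standing operation on a shared device (hypothetical model)."""
    user: str
    resource: str   # what the operation occupies, e.g. "thermostat"
    setting: float  # a numeric setting such as temperature or volume

def resolve(first: Operation, second: Operation) -> Optional[Operation]:
    """Return a compromise when different users contend for the same
    resource; this sketch simply splits the numeric difference."""
    if first.user == second.user or first.resource != second.resource:
        return None  # same user or unrelated resources: no conflict
    return Operation("shared", first.resource,
                     (first.setting + second.setting) / 2)

# Two users issue contradictory thermostat queries.
a = Operation("alice", "thermostat", 68.0)
b = Operation("bob", "thermostat", 74.0)
print(resolve(a, b).setting)  # 71.0
```

A real resolver would weigh more than one candidate compromise and then select among them, as the claim language suggests; averaging is just the simplest stand-in.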
-
Publication number: 20240378700
Abstract: Systems and methods are provided for generating panoramic imagery. An example method may be performed by one or more processors and includes obtaining first panoramic imagery depicting a geographic area. The method also includes obtaining an image depicting one or more physical objects absent from the first panoramic imagery. Further, the method includes transforming the first panoramic imagery into second panoramic imagery depicting the one or more physical objects and including at least a portion of the first panoramic imagery.
Type: Application
Filed: July 23, 2024
Publication date: November 14, 2024
Inventors: Matthew Sharifi, Victor Carbune
-
Publication number: 20240380970
Abstract: Implementations set forth herein relate to an automated assistant that can control a camera according to one or more conditions specified by a user. A condition can be satisfied when, for example, the automated assistant detects a particular environment feature is apparent. In this way, the user can rely on the automated assistant to identify and capture certain moments without necessarily requiring the user to constantly monitor a viewing window of the camera. In some implementations, a condition for the automated assistant to capture media data can be based on application data and/or other contextual data that is associated with the automated assistant. For instance, a relationship between content in a camera viewing window and other content of an application interface can be a condition upon which the automated assistant captures certain media data using a camera.
Type: Application
Filed: July 25, 2024
Publication date: November 14, 2024
Inventors: Felix Weissenberger, Balint Miklos, Victor Carbune, Matthew Sharifi, Domenico Carbotta, Ray Chen, Kevin Fu, Bogdan Prisacari, Fo Lee, Mucun Lu, Neha Garg, Jacopo Sannazzaro Natta, Barbara Poblocka, Jae Seo, Matthew Miao, Thomas Qian, Luv Kothari
-
Publication number: 20240362746
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that use generative adversarial models to increase the quality of sensor data generated by a first environmental sensor to resemble the quality of sensor data generated by another sensor having a higher quality than the first environmental sensor. A set of first and second training data generated by a first environmental sensor having a first quality and a second sensor having a target quality, respectively, is received. A generative adversarial model is trained, using the set of first training data and the set of second training data, to modify sensor data from the first environmental sensor by reducing a difference in quality between the sensor data generated by the first environmental sensor and sensor data generated by the target environmental sensor.
Type: Application
Filed: July 11, 2024
Publication date: October 31, 2024
Inventors: Victor Carbune, Daniel M. Keysers, Thomas Deselaers
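The adversarial setup in this abstract can be illustrated with a deliberately tiny 1-D example: a "degraded" sensor attenuates the true reading, a linear generator learns to enhance it, and a logistic discriminator tries to tell enhanced samples from target-quality samples. Everything here is an assumption for illustration; the patent's actual models and training procedure are not specified in the abstract.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy data: the target-quality sensor reads the true value; the
# lower-quality sensor attenuates it by half.
random.seed(0)
clean = [random.uniform(0.8, 1.2) for _ in range(64)]   # target-quality samples
degraded = [0.5 * x for x in clean]                     # first-sensor samples

w, b = 1.0, 0.0   # generator g(x) = w*x + b, starts as identity
u, c = 0.0, 0.0   # discriminator d(y) = sigmoid(u*y + c)
lr = 0.05

for _ in range(500):
    # Discriminator step: raise d(real), lower d(fake) (averaged gradients
    # of -log d(real) - log(1 - d(fake)) w.r.t. u and c).
    gu = gc = 0.0
    for r, x in zip(clean, degraded):
        f = w * x + b
        pr, pf = sigmoid(u * r + c), sigmoid(u * f + c)
        gu += -(1 - pr) * r + pf * f
        gc += -(1 - pr) + pf
    u -= lr * gu / len(clean)
    c -= lr * gc / len(clean)
    # Generator step: raise d(fake) so enhanced samples look target-quality
    # (gradient of -log d(g(x)) w.r.t. w and b).
    gw = gb = 0.0
    for x in degraded:
        f = w * x + b
        pf = sigmoid(u * f + c)
        gw += -(1 - pf) * u * x
        gb += -(1 - pf) * u
    w -= lr * gw / len(degraded)
    b -= lr * gb / len(degraded)

real_mean = sum(clean) / len(clean)
gen_mean = sum(w * x + b for x in degraded) / len(degraded)
```

After training, the mean of the enhanced (generated) samples should sit much closer to the target-quality mean than the raw degraded mean did, which is the "reducing a difference in quality" objective in miniature.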
-
Publication number: 20240347060
Abstract: Some implementations process, using warm word model(s), a stream of audio data to determine a portion of the audio data that corresponds to particular word(s) and/or phrase(s) (e.g., a warm word) associated with an assistant command, process, using an automatic speech recognition (ASR) model, a preamble portion of the audio data (e.g., that precedes the warm word) and/or a postamble portion of the audio data (e.g., that follows the warm word) to generate ASR output, and determine, based on processing the ASR output, whether a user intended the assistant command to be performed. Additional or alternative implementations can process the stream of audio data using a speaker identification (SID) model to determine whether the audio data is sufficient to identify the user that provided a spoken utterance captured in the stream of audio data, and determine if that user is authorized to cause performance of the assistant command.
Type: Application
Filed: June 21, 2024
Publication date: October 17, 2024
Inventors: Victor Carbune, Matthew Sharifi, Ondrej Skopek, Justin Lu, Daniel Valcarce, Kevin Kilgour, Mohamad Hassan Rom, Nicolo D'Ercole, Michael Golikov
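The preamble/postamble windowing this abstract describes is easy to picture on a token stream. A minimal sketch, assuming the warm word has already been detected and the stream is represented as recognized tokens (the real system operates on raw audio around the detection point):

```python
def split_around_warm_word(tokens, warm_word):
    """Split a token stream into the preamble before and the postamble
    after the first occurrence of a detected warm word. Returns None if
    the warm word is absent (sketch; not the patented pipeline)."""
    if warm_word not in tokens:
        return None
    i = tokens.index(warm_word)
    return tokens[:i], tokens[i + 1:]

pre, post = split_around_warm_word(["please", "pause", "the", "music"], "pause")
print(pre, post)  # ['please'] ['the', 'music']
```

The ASR output for `pre` and `post` is what the described implementations examine to decide whether the user actually intended the command, rather than merely uttering the warm word in passing.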
-
Patent number: 12117308
Abstract: To present a navigation directions preview, a server device receives a request for navigation directions from a starting location to a destination location and generates a set of navigation directions in response to the request. The set of navigation directions includes a set of route segments for traversing from the starting location to the destination location. The server device selects a subset of the route segments based on characteristics of each route segment in the set of route segments. For each selected route segment, the server device provides a preview of the route segment to be displayed on a client device. The preview of the route segment includes panoramic street level imagery depicting the route segment.
Type: Grant
Filed: August 18, 2020
Date of Patent: October 15, 2024
Assignee: GOOGLE LLC
Inventors: Victor Carbune, Matthew Sharifi
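The segment-selection step above can be sketched as a simple score-and-take-top-k pass. The `complexity` characteristic and the segment dictionaries are hypothetical stand-ins; the abstract leaves the actual segment characteristics open.

```python
def select_preview_segments(segments, k=2):
    """Pick the k route segments most worth previewing, scored here by an
    assumed 'complexity' characteristic (e.g. a turn or maneuver count)."""
    return sorted(segments, key=lambda s: s["complexity"], reverse=True)[:k]

route = [
    {"name": "I-80 stretch", "complexity": 1},
    {"name": "downtown weave", "complexity": 9},
    {"name": "airport ramp", "complexity": 6},
]
for seg in select_preview_segments(route):
    print(seg["name"])  # downtown weave, then airport ramp
```

Each selected segment would then be paired with panoramic street-level imagery for display on the client, per the abstract.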
-
Publication number: 20240338396
Abstract: Techniques are described herein for determining an information gain score for one or more documents of interest to the user and presenting information from the documents based on the information gain score. An information gain score for a given document is indicative of additional information that is included in the document beyond information contained in documents that were previously viewed by the user. In some implementations, the information gain score may be determined for one or more documents by applying data from the documents across a machine learning model to generate an information gain score. Based on the information gain scores of a set of documents, the documents can be provided to the user in a manner that reflects the likely information gain that can be attained by the user if the user were to view the documents.
Type: Application
Filed: May 7, 2024
Publication date: October 10, 2024
Inventors: Victor Carbune, Pedro Gonnet Anders
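The notion of "information beyond documents already viewed" can be made concrete with a crude term-overlap proxy. The abstract's implementations use a machine learning model for this; the novel-term ratio below is only an assumed stand-in to show the shape of the computation.

```python
def information_gain_score(candidate, viewed_docs):
    """Fraction of the candidate document's terms not already covered by
    previously viewed documents (toy proxy for the learned score)."""
    seen = set()
    for doc in viewed_docs:
        seen.update(doc.lower().split())
    terms = set(candidate.lower().split())
    if not terms:
        return 0.0
    return len(terms - seen) / len(terms)

viewed = ["how to repot a plant", "choosing soil for repotting"]
score = information_gain_score("watering schedule after repotting a plant", viewed)
print(score)  # 0.5 -- half the candidate's terms are new to the user
```

Ranking a document set by such a score, highest first, gives the "provided to the user in a manner that reflects the likely information gain" behavior the abstract describes.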
-
Patent number: 12111834
Abstract: Systems and methods for generating and providing outputs in a multi-device system can include leveraging environment-based prompt generation and generative model response generation to provide dynamic response generation and display. The systems and methods can obtain input data associated with one or more computing devices within an environment, can obtain environment data descriptive of the plurality of computing devices within the environment, and can generate a prompt based on the input data and environment data. The prompt can be processed with a generative model to generate a model-generated output. The model-generated output can then be transmitted to a particular computing device of the plurality of computing devices.
Type: Grant
Filed: December 20, 2023
Date of Patent: October 8, 2024
Assignee: GOOGLE LLC
Inventors: Victor Carbune, Arash Sadr, Matthew Sharifi
-
Patent number: 12111875
Abstract: Implementations described herein relate to pairing a location-based automated assistant with a user device. The user device can include, for example, a headphones apparatus and/or a device that is paired with the headphones apparatus. The user device provides an indication that it is present at a location that is associated with a location-based automated assistant. A trust measure is determined that is indicative of trust between the user device and the location-based automated assistant. User information is provided by the user device to the location-based automated assistant. The location-based automated assistant determines response data to provide, via one or more speakers associated with the user device, that is specific to the location and further based on the user information.
Type: Grant
Filed: December 14, 2022
Date of Patent: October 8, 2024
Assignee: GOOGLE LLC
Inventors: Victor Carbune, Matthew Sharifi
-
Patent number: 12112754
Abstract: Implementations relate to an automated assistant that can respond to communications received via a third party application and/or other third party communication modality. The automated assistant can determine that the user is participating in multiple different conversations via multiple different third party communication services. In some implementations, conversations can be processed to identify particular features of the conversations. When the automated assistant is invoked to provide input to a conversation, the automated assistant can compare the input to the identified conversation features in order to select the particular conversation that is most relevant to the input. In this way, the automated assistant can assist with any of multiple disparate conversations that are each occurring via a different third party application.
Type: Grant
Filed: November 20, 2023
Date of Patent: October 8, 2024
Assignee: GOOGLE LLC
Inventors: Victor Carbune, Matthew Sharifi
-
Publication number: 20240328808
Abstract: To provide content-aware audio navigation instructions, a client device executing a mapping application obtains one or more audio navigation directions for traversing from a starting location to a destination location along a route. The client device also identifies electronic media content playing from a source different from the mapping application which is executing at the client device or in proximity to the client device. The client device determines characteristics of the electronic media content and adjusts the audio navigation directions in accordance with the characteristics of the electronic media content. Then the client device presents the adjusted audio navigation directions to a user.
Type: Application
Filed: June 12, 2024
Publication date: October 3, 2024
Inventors: Victor Carbune, Matthew Sharifi
-
Patent number: 12106755
Abstract: Techniques are described herein for warm word arbitration between automated assistant devices. A method includes: determining that warm word arbitration is to be initiated between a first assistant device and one or more additional assistant devices, including a second assistant device; broadcasting, by the first assistant device, to the one or more additional assistant devices, an active set of warm words for the first assistant device; for each of the one or more additional assistant devices, receiving, from the additional assistant device, an active set of warm words for the additional assistant device; identifying a matching warm word included in the active set of warm words for the first assistant device and included in the active set of warm words for the second assistant device; and enabling or disabling detection of the matching warm word by the first assistant device, in response to identifying the matching warm word.
Type: Grant
Filed: January 11, 2022
Date of Patent: October 1, 2024
Assignee: GOOGLE LLC
Inventors: Matthew Sharifi, Victor Carbune
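The matching-warm-word identification at the heart of this method reduces to a set intersection once each device has broadcast its active set. A minimal sketch, assuming warm words are exchanged as plain strings; the broadcast transport and the enable/disable policy are left out:

```python
def find_matching_warm_words(first_active, second_active):
    """Identify warm words active on both devices; these are the candidates
    the first device may disable (or keep, per arbitration policy) so that
    only one device reacts to each warm word."""
    return set(first_active) & set(second_active)

first_device = {"stop", "volume up", "next"}
second_device = {"stop", "pause"}
print(sorted(find_matching_warm_words(first_device, second_device)))  # ['stop']
```

In the described method, detecting the overlap is what triggers the arbitration decision, so that "stop" is not handled twice by two devices hearing the same utterance.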
-
Patent number: 12106758
Abstract: Systems and methods described herein relate to determining whether to incorporate recognized text, that corresponds to a spoken utterance of a user of a client device, into a transcription displayed at the client device, or to cause an assistant command, that is associated with the transcription and that is based on the recognized text, to be performed by an automated assistant implemented by the client device. The spoken utterance is received during a dictation session between the user and the automated assistant. Implementations can process, using automatic speech recognition model(s), audio data that captures the spoken utterance to generate the recognized text. Further, implementations can determine whether to incorporate the recognized text into the transcription or cause the assistant command to be performed based on touch input being directed to the transcription, a state of the transcription, and/or audio-based characteristic(s) of the spoken utterance.
Type: Grant
Filed: May 17, 2021
Date of Patent: October 1, 2024
Assignee: GOOGLE LLC
Inventors: Victor Carbune, Alvin Abdagic, Behshad Behzadi, Jacopo Sannazzaro Natta, Julia Proskurnia, Krzysztof Andrzej Goj, Srikanth Pandiri, Viesturs Zarins, Nicolo D'Ercole, Zaheed Sabur, Luv Kothari
-
Publication number: 20240321277
Abstract: Implementations described herein relate to an application and/or automated assistant that can identify arrangement operations to perform for arranging text during speech-to-text operations, without a user having to expressly identify the arrangement operations. In some instances, a user that is dictating a document (e.g., an email, a text message, etc.) can provide a spoken utterance to an application in order to incorporate textual content. However, in some of these instances, certain corresponding arrangements are needed for the textual content in the document. The textual content that is derived from the spoken utterance can be arranged by the application based on an intent, vocalization features, and/or contextual features associated with the spoken utterance and/or a type of the application associated with the document, without the user expressly identifying the corresponding arrangements.
Type: Application
Filed: May 29, 2024
Publication date: September 26, 2024
Inventors: Victor Carbune, Krishna Sapkota, Behshad Behzadi, Julia Proskurnia, Jacopo Sannazzaro Natta, Justin Lu, Magali Boizot-Roche, Marius Sajgalik, Nicolo D'Ercole, Zaheed Sabur, Luv Kothari
-
Publication number: 20240318971
Abstract: The present disclosure is directed to interactive voice navigation. In particular, a computing system can provide audio information including one or more navigation instructions to a user via a computing system associated with the user. The computing system can activate an audio sensor associated with the computing system. The computing system can collect, using the audio sensor, audio data associated with the user. The computing system can determine, based on the audio data, whether the audio data is associated with one or more navigation instructions. The computing system can, in accordance with a determination that the audio data is associated with one or more navigation instructions, determine a context-appropriate audio response. The computing system can provide the context-appropriate audio response to the user.
Type: Application
Filed: February 29, 2024
Publication date: September 26, 2024
Inventors: Victor Carbune, Matthew Sharifi, Blaise Aguera-Arcas
-
Publication number: 20240312455
Abstract: Implementations relate to transferring actions from a shared device to a personal device that is associated with an account of a user. Some implementations relate to determining that a request is associated with sensitive information, determining that one or more other users are co-present with the shared device, and transferring the request that is related to sensitive information to a personal device of the user. Some implementations relate to determining that a user is no longer co-present with a shared device that is currently performing one or more actions and transferring one or more of the actions to a personal device that is associated with an account of the user.
Type: Application
Filed: March 14, 2023
Publication date: September 19, 2024
Inventors: Matthew Sharifi, Victor Carbune
-
Patent number: 12094454
Abstract: Implementations described herein include detecting a stream of audio data that captures a spoken utterance of the user and that captures ambient noise occurring within a threshold time period of the spoken utterance being spoken by the user. Implementations further include processing a portion of the audio data that includes the ambient noise to determine ambient noise classification(s), processing a portion of the audio data that includes the spoken utterance to generate a transcription, processing both the transcription and the ambient noise classification(s) with a machine learning model to generate a user intent and parameter(s) for the user intent, and performing one or more automated assistant actions based on the user intent and using the parameter(s).
Type: Grant
Filed: January 5, 2022
Date of Patent: September 17, 2024
Assignee: GOOGLE LLC
Inventors: Victor Carbune, Matthew Sharifi
-
Patent number: 12087297
Abstract: A method includes receiving a first instance of raw audio data corresponding to a voice-based command and receiving a second instance of the raw audio data corresponding to an utterance of audible contents for an audio-based communication spoken by a user. When a voice filtering recognition routine determines to activate voice filtering for at least the voice of the user, the method also includes obtaining a respective speaker embedding of the user and processing, using the respective speaker embedding, the second instance of the raw audio data to generate enhanced audio data for the audio-based communication that isolates the utterance of the audible contents spoken by the user and excludes at least a portion of the one or more additional sounds that are not spoken by the user. The method also includes executing.
Type: Grant
Filed: September 9, 2022
Date of Patent: September 10, 2024
Assignee: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
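The speaker-embedding comparison underlying this kind of voice filtering can be sketched with cosine similarity: frames whose embedding is close to the enrolled speaker embedding are kept, the rest are excluded. The toy 2-D vectors, the frame representation, and the 0.8 threshold are all illustrative assumptions; real systems use learned, high-dimensional speaker embeddings (e.g. d-vectors).

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def filter_frames(frames, speaker_embedding, threshold=0.8):
    """Keep audio frames whose embedding matches the enrolled speaker
    embedding; frames dominated by other sounds fall below the threshold."""
    return [label for label, emb in frames
            if cosine(emb, speaker_embedding) >= threshold]

speaker = [1.0, 0.0]  # enrolled speaker embedding (toy)
frames = [
    ("turn", [0.9, 0.1]),      # spoken by the user -> kept
    ("dog_bark", [0.1, 0.9]),  # additional sound -> excluded
]
print(filter_frames(frames, speaker))  # ['turn']
```

Reassembling only the kept frames yields the enhanced audio data the abstract describes: the user's utterance isolated, other sounds excluded.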