Patents by Inventor Victor Carbune

Victor Carbune has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220366910
    Abstract: Systems and methods described herein relate to determining whether to incorporate recognized text, that corresponds to a spoken utterance of a user of a client device, into a transcription displayed at the client device, or to cause an assistant command, that is associated with the transcription and that is based on the recognized text, to be performed by an automated assistant implemented by the client device. The spoken utterance is received during a dictation session between the user and the automated assistant. Implementations can process, using automatic speech recognition model(s), audio data that captures the spoken utterance to generate the recognized text. Further, implementations can determine whether to incorporate the recognized text into the transcription or cause the assistant command to be performed based on touch input being directed to the transcription, a state of the transcription, and/or audio-based characteristic(s) of the spoken utterance.
    Type: Application
    Filed: May 17, 2021
    Publication date: November 17, 2022
    Inventors: Victor Carbune, Alvin Abdagic, Behshad Behzadi, Jacopo Sannazzaro Natta, Julia Proskurnia, Krzysztof Andrzej Goj, Srikanth Pandiri, Viesturs Zarins, Nicolo D'Ercole, Zaheed Sabur, Luv Kothari
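The dictation-versus-command decision described in this abstract can be illustrated with a toy routing function. This is a hedged sketch: the signal names, command prefixes, and rules below are invented for illustration and are not taken from the patent.

```python
# Hypothetical sketch of routing recognized text to dictation or to an
# assistant command, based on touch input and transcription state.

COMMAND_PREFIXES = ("delete", "send", "stop dictation")

def route_recognized_text(recognized_text: str,
                          touch_in_transcription: bool,
                          transcription_empty: bool) -> str:
    """Return 'dictate' to append text, or 'command' to run an assistant command."""
    lowered = recognized_text.strip().lower()
    # Touch input directed at the transcription suggests the user is editing,
    # so utterances matching a known command prefix are treated as commands.
    if touch_in_transcription and lowered.startswith(COMMAND_PREFIXES):
        return "command"
    # An empty transcription cannot be the target of an edit command.
    if transcription_empty:
        return "dictate"
    return "command" if lowered.startswith(COMMAND_PREFIXES) else "dictate"
```

A real implementation would also weigh audio-based characteristics of the utterance, which this sketch omits.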
  • Publication number: 20220366903
    Abstract: Some implementations process, using warm word model(s), a stream of audio data to determine a portion of the audio data that corresponds to particular word(s) and/or phrase(s) (e.g., a warm word) associated with an assistant command, process, using an automatic speech recognition (ASR) model, a preamble portion of the audio data (e.g., that precedes the warm word) and/or a postamble portion of the audio data (e.g., that follows the warm word) to generate ASR output, and determine, based on processing the ASR output, whether a user intended the assistant command to be performed. Additional or alternative implementations can process the stream of audio data using a speaker identification (SID) model to determine whether the audio data is sufficient to identify the user that provided a spoken utterance captured in the stream of audio data, and determine if that user is authorized to cause performance of the assistant command.
    Type: Application
    Filed: May 17, 2021
    Publication date: November 17, 2022
    Inventors: Victor Carbune, Matthew Sharifi, Ondrej Skopek, Justin Lu, Daniel Valcarce, Kevin Kilgour, Mohamad Hassan Rom, Nicolo D'Ercole, Michael Golikov
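The preamble/postamble idea in this abstract can be sketched on token streams. This is illustrative only: a real system operates on audio frames and ASR output, and the negation check below is an invented stand-in for intent determination.

```python
# Sketch of isolating the preamble and postamble around a detected warm word.

def split_around_warm_word(tokens: list[str], warm_word: str):
    """Return (preamble, postamble) token lists around the first warm word."""
    idx = tokens.index(warm_word)          # position of the warm word
    return tokens[:idx], tokens[idx + 1:]

def user_intended_command(preamble: list[str], postamble: list[str]) -> bool:
    # Toy stand-in for processing ASR output: a negation in the preamble
    # ("don't pause") suggests the command was not intended.
    return not any(t in ("don't", "not", "never") for t in preamble)
```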
  • Publication number: 20220355814
    Abstract: To identify driving event sounds during navigation, a client device in a vehicle provides a set of navigation directions for traversing from a starting location to a destination location along a route. During navigation to the destination location, the client device identifies audio that includes a driving event sound from within the vehicle or an area surrounding the vehicle. In response to determining that the audio includes the driving event sound, the client device determines whether the driving event sound is artificial. In response to determining that the driving event sound is artificial, the client device presents a notification to the driver indicating that the driving event sound is artificial or masks the driving event sound to prevent the driver from hearing the driving event sound.
    Type: Application
    Filed: November 18, 2020
    Publication date: November 10, 2022
    Inventors: Matthew Sharifi, Victor Carbune
  • Patent number: 11495217
    Abstract: Techniques are described herein for enabling an automated assistant to adjust its behavior depending on a detected age range and/or “vocabulary level” of a user who is engaging with the automated assistant. In various implementations, data indicative of a user's utterance may be used to estimate one or more of the user's age range and/or vocabulary level. The estimated age range/vocabulary level may be used to influence various aspects of a data processing pipeline employed by an automated assistant. In various implementations, aspects of the data processing pipeline that may be influenced by the user's age range/vocabulary level may include one or more of automated assistant invocation, speech-to-text (“STT”) processing, intent matching, intent resolution (or fulfillment), natural language generation, and/or text-to-speech (“TTS”) processing. In some implementations, one or more tolerance thresholds associated with one or more of these aspects, such as grammatical tolerances, vocabularic tolerances, etc.
    Type: Grant
    Filed: December 27, 2019
    Date of Patent: November 8, 2022
    Assignee: GOOGLE LLC
    Inventors: Pedro Gonnet Anders, Victor Carbune, Daniel Keysers, Thomas Deselaers, Sandro Feuz
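A minimal sketch of the tolerance-threshold idea in this abstract: pipeline tolerances vary with the estimated age range. The ranges and values below are invented for illustration.

```python
# Hypothetical tolerance thresholds per estimated age range; higher values
# mean the assistant is more forgiving of grammar and vocabulary errors.

TOLERANCES = {
    "child": {"grammar": 0.9, "vocabulary": 0.9},  # very forgiving
    "teen":  {"grammar": 0.6, "vocabulary": 0.5},
    "adult": {"grammar": 0.3, "vocabulary": 0.2},  # stricter matching
}

def tolerance_for(estimated_age: int) -> dict:
    """Map an estimated age to the tolerance thresholds the pipeline should use."""
    if estimated_age < 13:
        return TOLERANCES["child"]
    if estimated_age < 18:
        return TOLERANCES["teen"]
    return TOLERANCES["adult"]
```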
  • Patent number: 11494667
Abstract: Example aspects of the present disclosure are directed to systems and methods that enable improved adversarial training of machine-learned models. An adversarial training system can generate improved adversarial training examples by optimizing or otherwise tuning one or more hyperparameters that guide the process of generating the adversarial examples. The adversarial training system can determine, solicit, or otherwise obtain a realism score for an adversarial example generated by the system. The realism score can indicate whether the adversarial example appears realistic. The adversarial training system can adjust or otherwise tune the hyperparameters to produce improved adversarial examples (e.g., adversarial examples that are still high-quality and effective while also appearing more realistic). Through creation and use of such improved adversarial examples, a machine-learned model can be trained to be more robust against (e.g.
    Type: Grant
    Filed: January 18, 2018
    Date of Patent: November 8, 2022
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Thomas Deselaers
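The tuning loop this abstract outlines can be sketched as a simple feedback rule: shrink the perturbation budget when realism scores drop, grow it while examples still look realistic. The function, target, and step values are illustrative assumptions.

```python
# Hedged sketch of realism-driven hyperparameter tuning for adversarial
# example generation (epsilon stands in for a perturbation budget).

def tune_epsilon(epsilon: float, realism_score: float,
                 target: float = 0.8, step: float = 0.9) -> float:
    """Return an adjusted perturbation budget for the next batch."""
    if realism_score < target:
        return epsilon * step        # examples look fake: perturb less
    return min(epsilon / step, 1.0)  # still realistic: can perturb more
```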
  • Patent number: 11494631
    Abstract: Methods, systems, and apparatuses for implementing advanced content retrieval are described. Machine learning methods may be implemented so that a system may predict when a user device may experience network disconnections. The system may also predict the type of content one or more applications on the user device may seek to download during the network disconnection period. Neural networks may be trained based on user activity log data and may implement machine-learning techniques to determine user preferences and settings for advanced content retrieval. The system may predict when a user may want to download content in advance, the type of content the user may be interested in, anticipated network connectivity, and anticipated battery consumption. The system may then generate recommendations for the user device based on the predictions. If a user agrees with the recommendations, the system may obtain and cache the content.
    Type: Grant
    Filed: September 27, 2017
    Date of Patent: November 8, 2022
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Sandro Feuz
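The recommend-then-prefetch flow in this abstract reduces to a decision over predicted disconnection, battery headroom, and user agreement. The thresholds and field names below are hypothetical, not taken from the patent.

```python
# Illustrative decision rule for advance content retrieval.

def should_prefetch(p_disconnect: float, battery_pct: int,
                    user_accepted: bool,
                    p_threshold: float = 0.7, min_battery: int = 30) -> bool:
    """Prefetch only when a disconnection is likely, battery headroom exists,
    and the user accepted the recommendation."""
    return (user_accepted
            and p_disconnect >= p_threshold
            and battery_pct >= min_battery)
```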
  • Publication number: 20220351227
    Abstract: Group actions may be performed on behalf of multiple users based in part on the suitability of the various user devices of the different users to perform such group actions. Different user devices may also be used to generate different query intent determinations for a query such that the query intent determination made by a particular user device may be used to fulfill the query.
    Type: Application
    Filed: July 14, 2022
    Publication date: November 3, 2022
    Inventors: Victor Carbune, Matthew Sharifi
  • Patent number: 11488597
    Abstract: Implementations set forth herein relate to an automated assistant that allows a user to create, edit, and/or share documents without directly interfacing with a document editing application. The user can provide an input to the automated assistant in order to cause the automated assistant to interface with the document editing application and create a document. In order to identify a particular action to perform with respect to a document, and/or identify a particular subsection within the document to direct the action, the automated assistant can rely on semantic annotations. As a user continues to interact with the automated assistant to edit a document, the semantic annotations can be updated according to how the document is changing and/or how the user refers to the document. This can allow the automated assistant to more readily fulfill document-related requests that may lack express details.
    Type: Grant
    Filed: September 8, 2020
    Date of Patent: November 1, 2022
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Matthew Sharifi
  • Patent number: 11490172
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, identify and classify the various video pathways in an interactive video based on the content of these video pathways. A video comprising multiple video segments is obtained from a video library. Each video segment is directly linked to at least one other video segment, and the multiple video segments comprise a beginning segment, intermediate segments (including interactive segments), and final segments. Multiple video pathways in the video are identified. For each identified video pathway, classification data is generated, and each such video pathway is then stored in the video library. When the video is selected from a particular category of the video library, the video segments of a video pathway whose classification matches the classification associated with that category are then displayed.
    Type: Grant
    Filed: July 23, 2019
    Date of Patent: November 1, 2022
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Andrii Maksai, Sandro Feuz
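Identifying every pathway through linked video segments, as this abstract describes, can be sketched as a depth-first walk over a segment graph. The dict-of-lists graph encoding is an assumption made for illustration.

```python
# Sketch of enumerating video pathways in a segment graph: a pathway runs
# from the given start segment to any final segment (one with no outgoing links).

def enumerate_pathways(links: dict, start: str) -> list:
    """Return every pathway from `start` to a final segment."""
    nexts = links.get(start, [])
    if not nexts:                        # final segment: pathway ends here
        return [[start]]
    paths = []
    for seg in nexts:
        for tail in enumerate_pathways(links, seg):
            paths.append([start] + tail)
    return paths
```

Each resulting pathway could then be classified independently and stored under its own category.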
  • Publication number: 20220339542
    Abstract: The disclosed subject matter can receive a source video and identifies one or more player actions based on the source video. A second video can be received that is based on a currently executing game environment. A portion of the source video that exhibits a first gameplay situation that is similar to a gameplay situation in the second video can be determined. A property of the determined portion of the source video can be adjusted to produce a guide video. The guide video can be overlaid on the currently executing game environment.
    Type: Application
    Filed: October 4, 2019
    Publication date: October 27, 2022
    Inventors: Alexandru-Marian Damian, Victor Carbune
  • Publication number: 20220342537
    Abstract: A method includes obtaining proximity information for each of a plurality of assistant-enabled devices within an environment of a user device. Each assistant-enabled device is controllable by an assistant application to perform a respective set of available actions associated with the assistant-enabled device. For each assistant-enabled device, the method also includes determining a proximity score based on the proximity information indicating a proximity estimation of the corresponding assistant-enabled device relative to the user device. The method further includes generating, using the proximity scores determined for the assistant-enabled devices, a ranked list of candidate assistant-enabled devices, and for each corresponding assistant-enabled device in the ranked list, displaying, in a graphical user interface (GUI), a respective set of controls for performing the respective set of actions associated with the corresponding assistant-enabled device.
    Type: Application
    Filed: July 12, 2022
    Publication date: October 27, 2022
    Applicant: Google LLC
    Inventors: Matthew Sharifi, Victor Carbune
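The proximity-ranked device list in this abstract can be sketched with a simple score that decays with estimated distance. The distance-to-score mapping and the dict representation of devices are illustrative assumptions.

```python
# Hedged sketch of ranking assistant-enabled devices by proximity to the
# user device; devices maps a device name to an estimated distance in meters.

def proximity_score(distance_m: float) -> float:
    """Closer devices score higher; the score decays with estimated distance."""
    return 1.0 / (1.0 + distance_m)

def ranked_devices(devices: dict) -> list:
    """Return device names ordered by proximity score, nearest first."""
    return sorted(devices,
                  key=lambda name: proximity_score(devices[name]),
                  reverse=True)
```

A GUI would then render the controls for each device in this ranked order.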
  • Patent number: 11483170
Abstract: Systems and methods for video conference content auto-retrieval and focus based on learned relevance are provided. In accordance with the systems and methods, audio streams and video streams from client devices participating in a video conference are received. Based on the audio streams, a subject being discussed during the video conference at a point in time is determined. A video stream that is most relevant to the subject being discussed during the video conference at the point in time is determined from the video streams. The determined video stream is provided to the client devices for presentation on the client devices while the subject is being discussed during the video conference.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: October 25, 2022
    Assignee: Google LLC
    Inventors: Victor Carbune, Daniel Keysers, Thomas Deselaers
  • Publication number: 20220335932
    Abstract: Systems and methods for determining whether to combine responses from multiple automated assistants. An automated assistant may be invoked by a user utterance, followed by a query, which is provided to a plurality of automated assistants. A first response is received from a first automated assistant and a second response is received from a second automated assistant. Based on similarity between the responses, a primary automated assistant determines whether to combine the responses into a combined response. Once the combined response has been generated, one or more actions are performed in response to the combined response.
    Type: Application
    Filed: April 15, 2021
    Publication date: October 20, 2022
    Inventors: Matthew Sharifi, Victor Carbune
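The similarity test for combining two assistant responses can be sketched with a toy string-similarity measure. `difflib.SequenceMatcher` here is a stand-in chosen for illustration; the abstract does not specify how similarity is computed, and the merge format is invented.

```python
# Illustrative sketch of deciding whether to combine responses from two
# automated assistants based on their similarity.

from difflib import SequenceMatcher

def maybe_combine(first: str, second: str, threshold: float = 0.6):
    """Return a combined response when the two are similar enough, else None."""
    similarity = SequenceMatcher(None, first.lower(), second.lower()).ratio()
    if similarity < threshold:
        return None                     # too different: keep responses separate
    if first.lower() == second.lower():
        return first                    # identical: no need to merge
    return f"{first} Also: {second}"
```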
  • Patent number: 11468052
    Abstract: Methods, systems, and computer readable media related to generating a combined search query based on search parameters of a current search query of a user and search parameters of one or more previously submitted search quer(ies) of the user that are determined to be of the same line of inquiry as the current search query. Two or more search queries may be determined to share a line of inquiry when it is determined that they are within a threshold level of semantic similarity to one another. Once a shared line of inquiry has been identified and a combined search query generated, users may interact with the search parameters and/or the search results to update the search parameters of the combined search query.
    Type: Grant
    Filed: June 25, 2020
    Date of Patent: October 11, 2022
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
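The shared-line-of-inquiry test and query merge in this abstract can be sketched with a crude term-overlap measure. Jaccard overlap is an invented stand-in for semantic similarity; a production system would use something far richer (e.g. embeddings).

```python
# Hedged sketch: detect a shared line of inquiry, then merge query parameters.

def same_line_of_inquiry(q1: str, q2: str, threshold: float = 0.3) -> bool:
    """Treat queries as related when their term overlap clears a threshold."""
    a, b = set(q1.lower().split()), set(q2.lower().split())
    return len(a & b) / len(a | b) >= threshold

def combine_queries(q1: str, q2: str) -> str:
    """Merge the parameters of two queries, preserving first-query order."""
    seen, merged = set(), []
    for term in q1.lower().split() + q2.lower().split():
        if term not in seen:
            seen.add(term)
            merged.append(term)
    return " ".join(merged)
```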
  • Publication number: 20220316917
    Abstract: To provide dynamic generation and suggestion of map tiles, a server device receives from a user device a request for map data for a particular geographic region. The server device obtains a set of user contextual data and a set of candidate map tiles associated with the particular geographic region. The server device then selects one or more of the set of candidate map tiles based on the set of user contextual data, and transmits the one or more selected map tile to the user device for display.
    Type: Application
    Filed: December 19, 2019
    Publication date: October 6, 2022
    Inventors: Victor Carbune, Kevin Allekotte
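Selecting candidate map tiles against user contextual data can be sketched as a scoring pass. The tag-overlap scoring and the dict-of-sets encoding are assumptions made for illustration; the abstract does not say how candidates are ranked.

```python
# Minimal sketch of context-based map tile selection: each candidate tile
# carries a set of tags, scored by overlap with the user's context.

def select_tiles(candidates: dict, context: set, k: int = 2) -> list:
    """Return the k candidate tiles whose tags best match the user context."""
    scored = sorted(candidates,
                    key=lambda tile: len(candidates[tile] & context),
                    reverse=True)
    return scored[:k]
```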
  • Patent number: 11462219
Abstract: A method includes receiving a first instance of raw audio data corresponding to a voice-based command and receiving a second instance of the raw audio data corresponding to an utterance of audible contents for an audio-based communication spoken by a user. When a voice filtering recognition routine determines to activate voice filtering for at least the voice of the user, the method also includes obtaining a respective speaker embedding of the user and processing, using the respective speaker embedding, the second instance of the raw audio data to generate enhanced audio data for the audio-based communication that isolates the utterance of the audible contents spoken by the user and excludes at least a portion of the one or more additional sounds that are not spoken by the user. The method also includes executing.
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: October 4, 2022
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Victor Carbune
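The speaker-embedding filtering step can be sketched on pre-embedded frames: keep only segments whose embedding is close to the enrolled user's. Cosine similarity, the threshold, and the (label, embedding) frame representation are illustrative assumptions; real systems filter raw audio.

```python
# Hedged sketch of voice filtering via speaker embeddings.

def cosine(u, v) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def filter_frames(frames, user_embedding, threshold=0.8):
    """Drop frames whose embedding does not match the enrolled user."""
    return [label for label, emb in frames
            if cosine(emb, user_embedding) >= threshold]
```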
  • Patent number: 11455996
    Abstract: Implementations set forth herein relate to an automated assistant that provides a response for certain user queries based on a level of interaction of the user with respect to the automated assistant. Interaction can be characterized by sensor data, which can be processed using one or more trained machine learning models in order to identify parameters for generating a response. In this way, the response can be limited to preserve computational resources and/or ensure that the response is more readily understood given the amount of interaction exhibited by the user. In some instances, a response that embodies information that is supplemental, to an otherwise suitable response, can be provided when a user is exhibiting a particular level of interaction. In other instances, such supplemental information can be withheld when the user is not exhibiting that particular level of interaction, at least in order to preserve computational resources.
    Type: Grant
    Filed: August 4, 2020
    Date of Patent: September 27, 2022
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Matthew Sharifi
  • Publication number: 20220301129
Abstract: Systems and methods are provided for generating panoramic imagery. An example method may be performed by one or more processors and includes obtaining first panoramic imagery depicting a geographic area. The method also includes obtaining an image depicting one or more physical objects absent from the first panoramic imagery. Further, the method includes transforming the first panoramic imagery into second panoramic imagery depicting the one or more physical objects and including at least a portion of the first panoramic imagery.
    Type: Application
    Filed: September 2, 2020
    Publication date: September 22, 2022
    Inventors: Matthew Sharifi, Victor Carbune
  • Publication number: 20220299335
    Abstract: To provide content-aware audio navigation instructions, a client device executing a mapping application obtains one or more audio navigation directions for traversing from a starting location to a destination location along a route. The client device also identifies electronic media content playing from a source different from the mapping application which is executing at the client device or in proximity to the client device. The client device determines characteristics of the electronic media content and adjusts the audio navigation directions in accordance with the characteristics of the electronic media content. Then the client device presents the adjusted audio navigation directions to a user.
    Type: Application
    Filed: October 22, 2020
    Publication date: September 22, 2022
    Inventors: Victor Carbune, Matthew Sharifi
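The adjustment step in this abstract can be sketched as a mapping from media characteristics to playback adjustments for the next spoken direction. The categories and adjustment fields below are invented for illustration, not drawn from the patent.

```python
# Illustrative sketch of content-aware adjustment of audio navigation directions.

def adjust_directions(media_kind: str) -> dict:
    """Return playback adjustments for the next audio navigation direction."""
    if media_kind == "podcast":
        # Spoken content: pause it rather than talk over it.
        return {"pause_media": True, "duck_media": False}
    if media_kind == "music":
        # Music: duck it and speak over it at normal volume.
        return {"pause_media": False, "duck_media": True}
    # No recognized media: no adjustment needed.
    return {"pause_media": False, "duck_media": False}
```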
  • Publication number: 20220292128
    Abstract: Implementations described herein relate to receiving user input directed to an automated assistant, processing the user input to determine whether data from a server and/or third-party application is needed to perform certain fulfillment of an assistant command included in the user input, and generating a prompt that requests a user consent to transmitting of a request to the server and/or the third-party application to obtain the data needed to perform the certain fulfillment. In implementations where the user consents, the data can be obtained and utilized to perform the certain fulfillment. In implementations where the user does not consent, client data can be generated locally at a client device and utilized to perform alternate fulfillment of the assistant command. In various implementations, the request transmitted to the server and/or third-party application can be modified based on ambient noise captured when the user input is received.
    Type: Application
    Filed: March 12, 2021
    Publication date: September 15, 2022
    Inventors: Matthew Sharifi, Victor Carbune