Patents Assigned to GOOGLE

HOTWORD DETECTION ON MULTIPLE DEVICES

Publication number: 20240233727

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include determining, based on the loudness score, an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.
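
The loudness-to-delay step in this abstract can be illustrated with a short sketch. The abstract only says the delay is determined from the loudness score; the inverse linear mapping (louder audio, shorter wait), the 0-to-1 loudness scale, the 200 ms ceiling, and both function names are assumptions made for illustration, not details taken from the publication.

```python
from typing import Mapping


def delay_ms_for_loudness(loudness: float, max_delay_ms: float = 200.0) -> float:
    """Map a loudness score to a delay in milliseconds.

    Assumption: loudness lies in [0, 1] and louder audio gets a shorter
    delay through a simple linear mapping.
    """
    clamped = min(max(loudness, 0.0), 1.0)  # keep the score inside [0, 1]
    return max_delay_ms * (1.0 - clamped)


def first_to_signal(loudness_by_device: Mapping[str, float]) -> str:
    """Return the (hypothetical) device whose delay would elapse first."""
    return min(
        loudness_by_device,
        key=lambda device: delay_ms_for_loudness(loudness_by_device[device]),
    )
```

Under these assumptions, the device that captured the utterance most loudly would transmit its signal first, which gives the other devices a chance to stand down.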

Type: Application

Filed: March 26, 2024

Publication date: July 11, 2024

Applicant: GOOGLE LLC

Inventors: Jakob Nicolaus FOERSTER, Alexander H. Gruenstein
Concurrent reception of multiple user speech input for translation

Patent number: 12032924

Abstract: An improved translation experience is provided using an auxiliary device, such as a pair of earbuds, and a wirelessly coupled mobile device. Microphones on both the auxiliary device and the mobile device simultaneously capture input from, respectively, a primary user (e.g., wearing the auxiliary device) and a secondary user (e.g., a foreign language speaker providing speech that the primary user desires to translate). Both microphones continually listen, rather than alternating between the mobile device and the auxiliary device. Each device may determine when to endpoint and send a block of speech for translation, for example based on pauses in the speech. Each device may accordingly send the received speech for translation and output, such that it is provided in a natural flow of communication.
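
Pause-based endpointing, as mentioned in this abstract, might look roughly like the following sketch. It splits a stream of per-frame energy values into speech blocks whenever a long enough run of quiet frames occurs. The energy representation, the threshold values, and the function name are all hypothetical; the patent does not specify them here.

```python
from typing import List


def endpoint_blocks(
    energies: List[float],
    silence_threshold: float = 0.1,
    min_pause_frames: int = 3,
) -> List[List[int]]:
    """Group frame indices into speech blocks separated by pauses.

    Assumption: a frame counts as speech when its energy is at or above
    silence_threshold, and a block ends after min_pause_frames quiet frames.
    """
    blocks: List[List[int]] = []
    current: List[int] = []
    silent_run = 0
    for index, energy in enumerate(energies):
        if energy >= silence_threshold:
            current.append(index)
            silent_run = 0
        else:
            silent_run += 1
            # A long enough pause closes the block so it can be sent onward.
            if current and silent_run >= min_pause_frames:
                blocks.append(current)
                current = []
    if current:
        blocks.append(current)  # flush speech still pending at end of input
    return blocks
```

In a two-device setup like the one described, each device would run its own copy of such a loop and send each completed block for translation independently.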

Type: Grant

Filed: October 4, 2021

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventor: Deric Cheng
Systems and methods for analyzing text extracted from images and performing appropriate transformations on the extracted text

Patent number: 12033620

Abstract: The present disclosure provides computer-implemented methods, systems, and devices for responding to requests associated with an image. A computing system obtains an image, wherein the image depicts a first set of textual content. The computing system determines one or more characteristics of the first set of textual content. The computing system determines a response type from a plurality of response types based on the one or more characteristics. The computing system generates a model input, wherein the model input comprises data descriptive of the first set of textual content and a prompt associated with the response type. The computing system provides the model input as an input to a machine-learned language model. The computing system receives a second set of text as an output of the machine-learned language model as a result of the machine-learned language model processing the model input.

Type: Grant

Filed: September 8, 2023

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Fabian Roth, Dounia Berrada, Samer Hassan Hassan, Afroz Mohiuddin, Mikhail Khalman, Ali Essam Ali Elqursh, Belinda Luna Zeng
Image watermarked with a string associated with image metadata associated with the image

Patent number: 12033232

Abstract: The present disclosure provides systems and methods for improved image watermarking to improve robustness and capacity, without degrading perceptibility. Specifically, the systems and methods discussed herein allow for a higher decoding success rate, at the same distortion level and message rate; or a higher message rate, at the same distortion level and decoding success rate. Implementations of these systems utilize a side chain of additional information, available only to the decoder and not the encoder, to achieve asymptotically lossless data compression, allowing the same message to be transmitted in fewer bits.

Type: Grant

Filed: June 18, 2020

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Daan He, Dake He
Systems and methods for extracting information from a physical document

Patent number: 12033412

Abstract: Systems and methods for extracting information from documents are provided. In one example embodiment, a computer-implemented method includes obtaining one or more units of text from an image of a document. The method includes determining one or more annotated values from the one or more units of text and determining a set of candidate labels for each annotated value. The method determines each set of candidate labels by performing a search for the candidate labels based at least in part on a language associated with the document and a location of each annotated value. The method includes determining a canonical label for each annotated value based at least in part on the associated candidate labels, and mapping at least one annotated value to an action that is presented to a user based at least in part on the canonical label associated with the annotated value.

Type: Grant

Filed: January 28, 2019

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Rakesh Iyer, Lisha Ruan
Voice shortcut detection with speaker verification

Patent number: 12033641

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Grant

Filed: January 30, 2023

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
Photorealistic talking faces from audio

Patent number: 12033259

Abstract: Provided is a framework for generating photorealistic 3D talking faces conditioned only on audio input. In addition, the present disclosure provides associated methods to insert generated faces into existing videos or virtual environments. We decompose faces from video into a normalized space that decouples 3D geometry, head pose, and texture. This allows separating the prediction problem into regressions over the 3D face shape and the corresponding 2D texture atlas. To stabilize temporal dynamics, we propose an auto-regressive approach that conditions the model on its previous visual state. We also capture face illumination in our model using audio-independent 3D texture normalization.

Type: Grant

Filed: January 29, 2021

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Vivek Kwatra, Christian Frueh, Avisek Lahiri, John Lewis
Arranging and/or clearing speech-to-text content without a user providing express instructions

Patent number: 12033637

Abstract: Implementations described herein relate to an application and/or automated assistant that can identify arrangement operations to perform for arranging text during speech-to-text operations—without a user having to expressly identify the arrangement operations. In some instances, a user that is dictating a document (e.g., an email, a text message, etc.) can provide a spoken utterance to an application in order to incorporate textual content. However, in some of these instances, certain corresponding arrangements are needed for the textual content in the document. The textual content that is derived from the spoken utterance can be arranged by the application based on an intent, vocalization features, and/or contextual features associated with the spoken utterance and/or a type of the application associated with the document, without the user expressly identifying the corresponding arrangements.

Type: Grant

Filed: June 3, 2021

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Victor Carbune, Krishna Sapkota, Behshad Behzadi, Julia Proskurnia, Jacopo Sannazzaro Natta, Justin Lu, Magali Boizot-Roche, Márius Šajgalík, Nicolo D'Ercole, Zaheed Sabur, Luv Kothari
Optimization of parameters of a system, product, or process

Patent number: 12032464

Abstract: A computer-implemented method is provided for optimization of parameters of a system, product, or process. The method includes establishing an optimization procedure for a system, product, or process. The system, product, or process has an evaluable performance that is dependent on values of one or more adjustable parameters. The method includes receiving one or more prior evaluations of performance of the system, product, or process. The one or more prior evaluations are respectively associated with one or more prior variants of the system, product, or process. The one or more prior variants are each defined by a set of values for the one or more adjustable parameters. The method includes utilizing an optimization algorithm to generate a suggested variant based at least in part on the one or more prior evaluations of performance and the associated set of values.
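
To make the "suggested variant from prior evaluations" idea concrete, here is a minimal sketch. It uses a simple perturb-the-best heuristic, which is a stand-in chosen for brevity and not the optimization algorithm claimed in the patent. The function name, the `step` parameter, and the convention that higher scores are better are all assumptions.

```python
import random
from typing import Dict, List, Optional, Tuple

Params = Dict[str, float]


def suggest_variant(
    prior: List[Tuple[Params, float]],
    step: float = 0.1,
    rng: Optional[random.Random] = None,
) -> Params:
    """Suggest new parameter values from prior (parameters, score) pairs.

    Assumption: higher scores are better. The heuristic takes the best
    prior variant and adds Gaussian noise to each of its parameters.
    """
    if not prior:
        raise ValueError("at least one prior evaluation is required")
    rng = rng or random.Random()
    best_params, _best_score = max(prior, key=lambda item: item[1])
    return {
        name: value + rng.gauss(0.0, step)
        for name, value in best_params.items()
    }
```

A caller would evaluate the suggested variant, append the result to `prior`, and repeat, so each suggestion is informed by all evaluations so far.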

Type: Grant

Filed: June 2, 2017

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Daniel Reuben Golovin, Benjamin Solnik, Subhodeep Moitra, David W. Sculley, II
Panning in a three dimensional environment on a mobile device

Patent number: 12032802

Abstract: This invention relates to panning in a three dimensional environment on a mobile device. In an embodiment, a computer-implemented method navigates a virtual camera in a three dimensional environment on a mobile device having a touch screen. A user input is received indicating that an object has touched a first point on a touch screen of the mobile device and the object has been dragged to a second point on the touch screen. A first target location in the three dimensional environment is determined based on the first point on the touch screen. A second target location in the three dimensional environment is determined based on the second point on the touch screen. Finally, a three dimensional model is moved in the three dimensional environment relative to the virtual camera according to the first and second target locations.

Type: Grant

Filed: July 2, 2021

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventor: David Kornmann
Compound prediction for video coding

Patent number: 12034963

Abstract: Generating a compound predictor block for a current block of video includes generating, for the current block, a first predictor block using one of inter-prediction or intra-prediction and generating a second predictor block. The first predictor block includes a first pixel and the second predictor block includes a second pixel that is co-located with the first pixel. A first weight is determined for the first pixel using a difference between a value of the first pixel and a value of the second pixel. A second weight is determined for the second pixel using the first weight. The compound predictor block is generated by combining the first predictor block and the second predictor block. The compound predictor block includes a weighted pixel that is determined using a weighted sum of the first pixel and the second pixel using the first weight and the second weight.
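
The per-pixel weighting described in this abstract can be sketched as follows. The abstract states only that the first weight depends on the difference between co-located pixels and that the second weight is derived from the first; the specific formula below (a weight that rises from 0.5 to 0.75 as the difference grows, with the second weight as its complement) and the function names are assumptions for illustration.

```python
from typing import List


def compound_pixel(p1: int, p2: int, max_value: int = 255) -> float:
    """Blend two co-located predictor pixels with difference-based weights.

    Assumption: the first weight grows linearly from 0.5 to 0.75 with the
    absolute pixel difference; the second weight is 1 minus the first.
    """
    diff = abs(p1 - p2)
    w1 = 0.5 + 0.25 * (diff / max_value)  # first weight, from the difference
    w2 = 1.0 - w1                         # second weight, from the first
    return w1 * p1 + w2 * p2              # weighted sum of the two pixels


def compound_block(block1: List[List[int]], block2: List[List[int]]) -> List[List[float]]:
    """Apply compound_pixel to every co-located pair in two equal-size blocks."""
    return [
        [compound_pixel(a, b) for a, b in zip(row1, row2)]
        for row1, row2 in zip(block1, block2)
    ]
```

When the two predictors agree, the result is their common value; when they disagree, this particular weighting leans toward the first predictor.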

Type: Grant

Filed: April 28, 2022

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Debargha Mukherjee, James Bankoski, Yue Chen, Yuxin Liu, Sarah Parker
Direct speech-to-speech translation via machine learning

Patent number: 12032920

Abstract: The present disclosure provides systems and methods that train and use machine-learned models such as, for example, sequence-to-sequence models, to perform direct and text-free speech-to-speech translation. In particular, aspects of the present disclosure provide an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation.

Type: Grant

Filed: March 7, 2020

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Ye Jia, Zhifeng Chen, Yonghui Wu, Melvin Johnson, Fadi Biadsy, Ron Weiss, Wolfgang Macherey
Automated assistant performance of a non-assistant application operation(s) in response to a user input that can be limited to a parameter(s)

Patent number: 12032874

Abstract: Implementations set forth herein relate to an automated assistant that can provide a selectable action intent suggestion when a user is accessing a third party application that is controllable via the automated assistant. The action intent can be initialized by the user without explicitly invoking the automated assistant using, for example, an invocation phrase (e.g., “Assistant . . . ”). Rather, the user can initialize performance of the corresponding action by identifying one or more action parameters. In some implementations, the selectable suggestion can indicate that a microphone is active for the user to provide a spoken utterance that identifies a parameter(s). When the action intent is initialized in response to the spoken utterance from the user, the automated assistant can control the third party application according to the action intent and any identified parameter(s).

Type: Grant

Filed: August 7, 2023

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Joseph Lange, Marcin Nowak-Przygodzki
Sparse recovery autoencoder

Patent number: 12033080

Abstract: A sparse dataset is encoded using a data-driven learned sensing matrix. For example, a method includes receiving a dataset of sparse vectors with dimension d from a requesting process, initializing an encoding matrix of dimension k×d, selecting a subset of sparse vectors from the dataset, and updating the encoding matrix via machine learning. Updating the encoding matrix includes using a linear encoder to generate an encoded vector of dimension k for each vector in the subset, the linear encoder using the encoding matrix, using a non-linear decoder to decode each of the encoded vectors, the non-linear decoder using a transpose of the encoding matrix in a projected subgradient, and adjusting the encoding matrix using back propagation. The method also includes returning an embedding of each sparse vector in the dataset of sparse vectors, the embedding being generated with the updated encoding matrix.

Type: Grant

Filed: June 14, 2019

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Xinnan Yu, Shanshan Wu, Daniel Holtmann-Rice, Dmitry Storcheus, Sanjiv Kumar, Afshin Rostamizadeh
Method and Apparatus for Using Image Data to Aid Voice Recognition

Publication number: 20240221745

Abstract: A device performs a method for using image data to aid voice recognition. The method includes the device capturing image data of a vicinity of the device and adjusting, based on the image data, a set of parameters for voice recognition performed by the device. The set of parameters for the device performing voice recognition include, but are not limited to: a trigger threshold of a trigger for voice recognition; a set of beamforming parameters; a database for voice recognition; and/or an algorithm for voice recognition. The algorithm may include using noise suppression or using acoustic beamforming.

Type: Application

Filed: March 15, 2024

Publication date: July 4, 2024

Applicant: GOOGLE TECHNOLOGY HOLDINGS LLC

Inventors: Robert A. Zurek, Adrian M. Schuster, Fu-Lin Shau, Jincheng Wu
Optimization of parameter values for machine-learned models

Patent number: 12026612

Abstract: A computer-implemented method can include receiving, by one or more computing devices, one or more prior evaluations of performance of a machine learning model, the one or more prior evaluations being respectively associated with one or more prior variants of the machine-learning model, the one or more prior variants of the machine-learning model each having been configured using a different set of adjustable parameter values. The method can include utilizing, by the one or more computing devices, an optimization algorithm to generate a suggested variant of the machine-learning model based at least in part on the one or more prior evaluations of performance and the associated set of adjustable parameter values, the suggested variant of the machine-learning model being defined by a suggested set of adjustable parameter values.

Type: Grant

Filed: June 2, 2017

Date of Patent: July 2, 2024

Assignee: GOOGLE LLC

Inventors: Daniel Reuben Golovin, Benjamin Solnik, Subhodeep Moitra, David W. Sculley, II, Gregory Peter Kochanski
Action suggestions for user-selected content

Patent number: 12026593

Abstract: Systems and methods are provided for suggesting actions for selected text based on content displayed on a mobile device. An example method can include converting a selection made via a display device into a query, providing the query to an action suggestion model that is trained to predict an action given a query, each action being associated with a mobile application, receiving one or more predicted actions, and initiating display of the one or more predicted actions on the display device. Another example method can include identifying, from search records, queries where a website is highly ranked, the website being one of a plurality of websites in a mapping of websites to mobile applications. The method can also include generating positive training examples for an action suggestion model from the identified queries, and training the action suggestion model using the positive training examples.

Type: Grant

Filed: October 15, 2020

Date of Patent: July 2, 2024

Assignee: GOOGLE LLC

Inventors: Matthew Sharifi, Daniel Ramage, David Petrou
Passive disambiguation of assistant commands

Patent number: 12027164

Abstract: Implementations set forth herein relate to an automated assistant that can initialize execution of an assistant command associated with an interpretation that is predicted to be responsive to a user input, while simultaneously providing suggestions for alternative assistant command(s) associated with alternative interpretation(s) that is/are also predicted to be responsive to the user input. The alternative assistant command(s) that are suggested can be selectable such that, when selected, the automated assistant can pivot from executing the assistant command to initializing execution of the selected alternative assistant command(s). Further, the alternative assistant command(s) that are suggested can be partially fulfilled prior to any user selection thereof. Accordingly, implementations set forth herein can enable the automated assistant to quickly and efficiently pivot between assistant commands that are predicted to be responsive to the user input.

Type: Grant

Filed: June 16, 2021

Date of Patent: July 2, 2024

Assignee: GOOGLE LLC

Inventors: Brett Barros, Theo Goguely
Automated initiation and adaptation of a dialog with a user via user interface devices of a computing device of the user

Patent number: 12026530

Abstract: Methods and apparatus directed to utilizing an automated messaging system to initiate and/or adapt a dialog with at least one user, where the dialog occurs via user interface input and output devices of at least one computing device of the user. In some of those implementations, the automated messaging system identifies at least one task associated with the user and initiates the dialog with the user based on identifying the task. The automated messaging system may initiate the dialog to provide the user with additional information related to the task and/or to determine, based on user input provided during the dialog, values for one or more parameters of the task. In some implementations, the automated messaging system may further initiate performance of the task utilizing parameters determined during the dialog.

Type: Grant

Filed: November 7, 2022

Date of Patent: July 2, 2024

Assignee: GOOGLE LLC

Inventors: Guangqiang Zhang, Zhou Bailiang
Display screen or portion thereof with transitional graphical user interface

Patent number: D1034631

Type: Grant

Filed: March 15, 2022

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Mourad Sabour, Andrea Iosue, Daniel Sim, Jonathon Marks, Sanjoy Ghosh, Robert Mel Chung, Pengquan Meng
