Abstract: A method for reducing idleness in a machine-learning training system can include performing operations by computing devices. A first set of training operations can access and prepare a plurality of training examples of a set of training data. A second set of training operations can train a machine-learned model based at least in part on the set of training data and can include one or more repeat iterations in which at least a portion of the second set of training operations is repeatedly performed such that the training example(s) are repeatedly used to train the machine-learned model. A rate of the repeat iteration(s) can be based at least in part on an echo factor that can be based at least in part on a comparison of a first computational time of the first set of training operations to a second computational time of the second set of training operations.
Type:
Grant
Filed:
May 11, 2020
Date of Patent:
December 27, 2022
Assignee:
GOOGLE LLC
Inventors:
Dami Choi, Alexandre Tachard Passos, Christopher James Shallue, George Edward Dahl
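A minimal sketch of the idea in the abstract above: each prepared batch is reused a number of times given by an echo factor derived from the ratio of the upstream (data preparation) time to the downstream (training) time. The function names, the floor-of-the-ratio rule, and the batch-level granularity are illustrative assumptions rather than the claimed method.

```python
import math

def echo_factor(upstream_time_s: float, downstream_time_s: float) -> int:
    """Hypothetical echo factor: how many times each prepared batch is reused.

    If the upstream stage (reading/augmenting examples) takes longer than the
    downstream stage (the training step), repeating each batch roughly
    upstream_time / downstream_time times keeps the accelerator from idling.
    """
    if downstream_time_s <= 0:
        return 1
    return max(1, math.floor(upstream_time_s / downstream_time_s))

def train_with_echoing(batches, train_step, upstream_time_s, downstream_time_s):
    """Reuse each prepared batch `e` times before fetching the next one."""
    e = echo_factor(upstream_time_s, downstream_time_s)
    for batch in batches:        # first set of operations: access/prepare examples
        for _ in range(e):       # repeat iterations: second set of operations
            train_step(batch)
```

When the training step is the bottleneck instead, the factor falls back to 1 and the loop reduces to ordinary training.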
Abstract: Systems and methods for multi-attendee video conferencing are described. A system can perform spatial audio modulation techniques in video conference calls based on content type or participant role. In particular, by assigning user roles and content types to specific regions in a two- or three-dimensional audio sound space or “soundstage,” users can identify—simply by listening—the source of the audio (e.g., who the current speaker is and/or whether the sound came from a specific type of content). Thus, in example implementations of the present disclosure, each of a number of conference roles and/or content types can be allocated a particular virtual location within the audio soundstage.
Type:
Grant
Filed:
June 4, 2021
Date of Patent:
December 27, 2022
Assignee:
GOOGLE LLC
Inventors:
Karsten Seipp, Jae Pum Park, Anton Volkov
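As a rough illustration of allocating conference roles and content types to virtual locations on a soundstage, the sketch below pans each mono source to a role-specific azimuth using constant-power stereo panning. The role-to-angle table and the panning law are assumptions of this sketch, not the patented technique.

```python
import numpy as np

# Illustrative assignment of conference roles / content types to azimuth
# positions (degrees, -90 = hard left, +90 = hard right); the layout is assumed.
SOUNDSTAGE = {
    "presenter": 0.0,
    "participant": -30.0,
    "screen_share": 30.0,
    "chat_notification": 60.0,
}

def pan_mono_to_stereo(mono: np.ndarray, azimuth_deg: float) -> np.ndarray:
    """Constant-power pan of a mono signal to a stereo (left, right) pair."""
    theta = np.radians((azimuth_deg + 90.0) / 2.0)   # map [-90, 90] -> [0, 90] degrees
    left_gain, right_gain = np.cos(theta), np.sin(theta)
    return np.stack([left_gain * mono, right_gain * mono], axis=-1)

def place_source(mono: np.ndarray, role_or_content_type: str) -> np.ndarray:
    """Render one audio source at the virtual location allocated to its role."""
    return pan_mono_to_stereo(mono, SOUNDSTAGE.get(role_or_content_type, 0.0))
```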
Abstract: The present disclosure provides systems and methods that include or otherwise leverage use of a multiscale quantization model that is configured to provide a quantized dataset. In particular, the multiscale quantization model can receive and perform vector quantization of a first dataset. The multiscale quantization model can generate a residual dataset based at least in part on a result of the vector quantization. The multiscale quantization model can apply a rotation matrix to the residual dataset to generate a rotated residual dataset that includes a plurality of rotated residuals. The multiscale quantization model can perform reparameterization of each rotated residual in the rotated residual dataset into a direction component and a scale component. The multiscale quantization model can perform product quantization of the direction components of the plurality of rotated residuals, and perform scalar quantization of the scale components of the plurality of rotated residuals.
Type:
Grant
Filed:
May 14, 2018
Date of Patent:
December 20, 2022
Assignee:
GOOGLE LLC
Inventors:
Xiang Wu, David Simcha, Daniel Holtmann-Rice, Sanjiv Kumar, Ananda Theertha Suresh, Ruiqi Guo, Xinnan Yu
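The steps enumerated in the abstract above can be lined up against a compact sketch: coarse vector quantization, residuals, rotation, direction/scale reparameterization, then product quantization of the directions and scalar quantization of the scales. The codebook sizes, the random rotation standing in for a learned one, and the uniform scale bins are assumptions of this sketch.

```python
import numpy as np
from sklearn.cluster import KMeans

def multiscale_quantize(X, n_coarse=16, n_subspaces=4, n_pq_codes=16,
                        n_scale_bins=32, seed=0):
    """Illustrative pipeline following the abstract above. Assumes X has shape
    (n_points, dim) with dim divisible by n_subspaces."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]

    # 1) Coarse vector quantization and residuals.
    coarse = KMeans(n_clusters=n_coarse, n_init=10, random_state=seed).fit(X)
    residuals = X - coarse.cluster_centers_[coarse.labels_]

    # 2) Rotate the residuals (a random orthonormal matrix stands in here).
    Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    rotated = residuals @ Q

    # 3) Reparameterize each rotated residual into direction and scale.
    scales = np.linalg.norm(rotated, axis=1, keepdims=True)
    directions = rotated / np.maximum(scales, 1e-12)

    # 4) Product quantization of the direction components, one codebook per subspace.
    sub = dim // n_subspaces
    pq_codes = [KMeans(n_clusters=n_pq_codes, n_init=4, random_state=seed)
                .fit(directions[:, s * sub:(s + 1) * sub]).labels_
                for s in range(n_subspaces)]

    # 5) Scalar quantization of the scale components into uniform bins.
    edges = np.linspace(scales.min(), scales.max(), n_scale_bins + 1)
    scale_codes = np.clip(np.digitize(scales.ravel(), edges) - 1, 0, n_scale_bins - 1)

    return coarse.labels_, np.column_stack(pq_codes), scale_codes
```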
Abstract: Some implementations relate to performing speech biasing, NLU biasing, and/or other biasing based on historical assistant interaction(s). It can be determined, for one or more given historical interactions of a given user, whether to affect future biasing for (1) the given user account, (2) additional user account(s), and/or (3) the shared assistant device as a whole. Some implementations disclosed herein additionally and/or alternatively relate to: determining, based on utterance(s) of a given user to a shared assistant device, an association of first data and second data; storing the association as accessible to a given user account of the given user; and determining whether to store the association as also accessible by additional user account(s) and/or the shared assistant device.
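A minimal sketch of one way the stored associations could be scoped, assuming a simple in-memory store keyed by user account with an optional device-wide list; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class AssociationStore:
    """Hypothetical store for associations (first data, second data) derived
    from a user's utterances on a shared assistant device."""
    by_account: dict = field(default_factory=dict)   # account_id -> [(first, second), ...]
    device_wide: list = field(default_factory=list)  # visible to every account on the device

    def add(self, first, second, account_id, share_with=(), share_device=False):
        """Store for the given account, and optionally for additional accounts
        and/or the shared assistant device as a whole."""
        self.by_account.setdefault(account_id, []).append((first, second))
        for other in share_with:
            self.by_account.setdefault(other, []).append((first, second))
        if share_device:
            self.device_wide.append((first, second))

    def associations_for(self, account_id):
        """Associations available when biasing for this account on this device."""
        return self.by_account.get(account_id, []) + self.device_wide
```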
Abstract: The present disclosure provides systems and methods for compressing and/or distributing machine learning models. In one example, a computer-implemented method is provided to compress machine-learned models, which includes obtaining, by one or more computing devices, a machine-learned model. The method includes selecting, by the one or more computing devices, a weight to be quantized and quantizing, by the one or more computing devices, the weight. The method includes propagating, by the one or more computing devices, at least a part of a quantization error to one or more non-quantized weights and quantizing, by the one or more computing devices, one or more of the non-quantized weights. The method includes providing, by the one or more computing devices, a quantized machine-learned model.
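A hedged sketch of the quantize-then-propagate-error loop described above, assuming a uniform quantization grid and an even spread of part of each quantization error over the weights not yet quantized; both choices are illustrative, not the disclosed method.

```python
import numpy as np

def quantize_with_error_propagation(weights: np.ndarray, n_levels: int = 256,
                                    spread: float = 0.5) -> np.ndarray:
    """Quantize one weight at a time and push a fraction of its quantization
    error onto the remaining non-quantized weights."""
    w = weights.astype(np.float64).copy().ravel()
    lo, hi = w.min(), w.max()
    step = (hi - lo) / (n_levels - 1) if hi > lo else 1.0

    for i in range(len(w)):
        q = lo + step * round((w[i] - lo) / step)   # snap to the nearest grid point
        err = w[i] - q
        w[i] = q
        if i + 1 < len(w):
            # spread part of the error over the weights not yet quantized
            w[i + 1:] += spread * err / (len(w) - i - 1)

    return w.reshape(weights.shape).astype(weights.dtype)
```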
Abstract: A computer-implemented method can include receiving a first signal corresponding to a first flow of acoustic energy, applying a transform to the received first signal using at least a first amplitude-independent window size at a first frequency and a second amplitude-independent window size at a second frequency, the second amplitude-independent window size improving a temporal response at the second frequency, wherein the second frequency is subject to amplitude reduction due to a resonance phenomenon associated with the first frequency, and storing a first encoded signal, the first encoded signal based on applying the transform to the received first signal.
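One way to read the abstract above is as a multi-resolution analysis in which the window size is fixed per frequency (independent of amplitude), with a shorter window at the frequency whose temporal response needs improving. The sketch below uses a Hann window and arbitrary window sizes and bin indices purely for illustration.

```python
import numpy as np

def windowed_dft_magnitude(signal, center, window_size, freq_bin):
    """Magnitude of one DFT bin computed over a Hann window of a given size."""
    half = window_size // 2
    segment = signal[max(0, center - half):center + half]
    segment = segment * np.hanning(len(segment))
    spectrum = np.fft.rfft(segment, n=window_size)   # zero-pads near the edges
    return np.abs(spectrum[freq_bin])

def two_resolution_analysis(signal, center, long_window=2048, short_window=256,
                            low_bin=4, high_bin=40):
    """Analyze two frequencies with amplitude-independent window sizes: a long
    window at the first frequency and a shorter one at the second frequency to
    improve its temporal response."""
    low = windowed_dft_magnitude(signal, center, long_window, low_bin)
    high = windowed_dft_magnitude(signal, center, short_window, high_bin)
    return low, high
```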
Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures can run relatively faster while using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
Type:
Grant
Filed:
January 28, 2019
Date of Patent:
December 20, 2022
Assignee:
GOOGLE LLC
Inventors:
Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
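A toy sketch of a factorized, per-block search space in the spirit of the abstract above: each block independently samples its own layer configuration, which is what permits layer diversity across the network. The choice lists and the random-sampling controller are assumptions; the actual system would score candidates by accuracy and measured latency and update a search policy.

```python
import random

# Illustrative per-block choices; the factorized hierarchical search space in
# the abstract above partitions the network into blocks and lets each block
# pick its own layer configuration (the values below are assumptions).
BLOCK_CHOICES = {
    "conv_op":     ["mbconv", "conv", "sepconv"],
    "kernel_size": [3, 5],
    "expansion":   [1, 3, 6],
    "num_layers":  [1, 2, 3, 4],
    "se_ratio":    [0.0, 0.25],
}

def sample_architecture(num_blocks=7, seed=None):
    """Sample one candidate: each block draws its own choices independently,
    which is what permits layer diversity throughout the network."""
    rng = random.Random(seed)
    return [{name: rng.choice(options) for name, options in BLOCK_CHOICES.items()}
            for _ in range(num_blocks)]

candidate = sample_architecture(seed=0)  # a search controller would score this
                                         # candidate by accuracy and latency
```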
Abstract: A computer-implemented method for generating video representations utilizing a hierarchical video encoder includes obtaining a video, wherein the video includes a plurality of frames; processing each of the plurality of frames with a machine-learned frame-level encoder model to respectively generate a plurality of frame representations for the plurality of frames, the plurality of frame representations respective to the plurality of frames; determining a plurality of segment representations representative of a plurality of video segments including one or more of the plurality of frames, the plurality of segment representations based at least in part on the plurality of frame representations; processing the plurality of segment representations with a machine-learned segment-level encoder model to generate a plurality of contextualized segment representations; determining a video representation based at least in part on the plurality of contextualized segment representations; and providing the video representation.
Type:
Grant
Filed:
January 29, 2021
Date of Patent:
December 20, 2022
Assignee:
GOOGLE LLC
Inventors:
Vihan Jain, Joonseok Lee, Ming Zhao, Sheide Chammas, Hexiang Hu, Bowen Zhang, Fei Sha, Tze Way Eugene Ie
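A compact PyTorch sketch of the hierarchy described above: a frame-level encoder, pooling of frames into segment representations, a segment-level encoder producing contextualized segment representations, and pooling into a single video representation. The specific layers (MLP frame encoder, Transformer segment encoder, mean pooling) and the dimensions are assumptions of this sketch.

```python
import torch
from torch import nn

class HierarchicalVideoEncoder(nn.Module):
    """Frames -> frame-level encoder -> segment pooling -> segment-level
    encoder -> pooled video representation (layer choices are illustrative)."""

    def __init__(self, frame_dim=2048, hidden_dim=512, frames_per_segment=16):
        super().__init__()
        self.frames_per_segment = frames_per_segment
        self.frame_encoder = nn.Sequential(
            nn.Linear(frame_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim))
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=8,
                                           batch_first=True)
        self.segment_encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, frames):                        # frames: (batch, n_frames, frame_dim)
        b, n, _ = frames.shape
        frame_reprs = self.frame_encoder(frames)      # per-frame representations
        n_seg = n // self.frames_per_segment          # assumes n >= frames_per_segment
        segments = frame_reprs[:, :n_seg * self.frames_per_segment]
        segments = segments.reshape(b, n_seg, self.frames_per_segment, -1).mean(dim=2)
        contextual = self.segment_encoder(segments)   # contextualized segment reprs
        return contextual.mean(dim=1)                 # video representation
```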
Abstract: Systems, apparatuses, and methods for capturing voice messages are provided. In one embodiment, a method can include receiving, by one or more processors of a mobile user device, a user input indicative of a voice message at a first time. The method can further include identifying contextual data indicative of one or more computing devices within proximity of the mobile user device. The method can include providing a set of data for storage in one or more memory devices of the mobile user device. The set of data can indicate the voice message and the contextual data indicative of the computing devices. The method can further include providing an output indicative of the voice message and the contextual data to one or more secure computing devices at a second time.
Type:
Grant
Filed:
December 1, 2020
Date of Patent:
December 13, 2022
Assignee:
GOOGLE LLC
Inventors:
Jonathan Brandt Moeller, Jeremy Drew Payne
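A small sketch of the capture-then-forward flow described above, assuming an in-memory record that pairs the voice message with contextual data about nearby devices and a later flush to secure devices; all names are hypothetical.

```python
import time
from dataclasses import dataclass, field

@dataclass
class VoiceMessageRecord:
    """Hypothetical record pairing a captured voice message with contextual
    data about computing devices in proximity at capture time."""
    audio: bytes
    nearby_devices: list
    captured_at: float = field(default_factory=time.time)

class DeferredUploader:
    """Store records on the mobile device at a first time, then forward them
    to one or more secure computing devices at a second time."""

    def __init__(self):
        self._pending = []

    def capture(self, audio, nearby_devices):
        self._pending.append(VoiceMessageRecord(audio, list(nearby_devices)))

    def flush(self, send_to_secure_device):
        while self._pending:
            send_to_secure_device(self._pending.pop(0))
```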
Abstract: Systems, methods and computer program products are provided for managing contactless transactions. A first tap is performed when a system is placed within a predetermined proximity to a payment terminal. A first select command including an AID corresponding to a first application is received from the payment terminal. A first response based on the first select command is transmitted to the payment terminal. A data request including information indicating supported data types is received from the payment terminal. A second response based on the data request and including transaction data is transmitted to the payment terminal. The transaction data includes at least a portion of commerce data stored in the at least one memory.
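The command/response exchange in the abstract above can be sketched as two handlers: one answering the SELECT command for a supported application AID, and one answering the terminal's data request with stored commerce data filtered to the supported data types. The AID value, field names, and dictionary-based messages are placeholders; a real implementation follows the relevant contactless payment specification.

```python
# Hypothetical stored values; nothing here reflects a real application AID or
# real card data.
SUPPORTED_AIDS = {"A0000000000000": "payment_application"}
STORED_COMMERCE_DATA = {"primary_account_reference": "<stored>",
                        "cardholder_name": "<stored>"}

def handle_select(select_command: dict) -> dict:
    """First response: acknowledge the SELECT for a supported application AID."""
    application = SUPPORTED_AIDS.get(select_command["aid"])
    if application is None:
        return {"status": "not_found"}
    return {"status": "ok", "selected_application": application}

def handle_data_request(data_request: dict) -> dict:
    """Second response: return transaction data limited to the data types the
    terminal reports as supported, drawn from commerce data stored in memory."""
    requested = set(data_request["supported_data_types"])
    transaction_data = {k: v for k, v in STORED_COMMERCE_DATA.items() if k in requested}
    return {"status": "ok", "transaction_data": transaction_data}
```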
Abstract: Systems and methods are provided to pre-train projection networks for use as transferable natural language representation generators. In particular, example pre-training schemes described herein enable learning of transferable deep neural projection representations over randomized locality sensitive hashing (LSH) projections, thereby surmounting the need to store any embedding matrices because the projections can be dynamically computed at inference time.
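A minimal sketch of the kind of randomized LSH projection referred to above: input features are hashed by the signs of random hyperplane projections, regenerated from a fixed seed at inference time so that no embedding matrix needs to be stored. The dimensions and the sign-hash scheme are assumptions of this sketch; the disclosed approach then pre-trains a projection network on top of such projections.

```python
import numpy as np

def lsh_projection(features: np.ndarray, n_bits: int = 1024, seed: int = 0) -> np.ndarray:
    """Binary locality-sensitive-hashing projection of feature vectors.

    The random hyperplanes are regenerated from the seed on the fly, so the
    projection can be computed dynamically at inference time without storing
    an embedding matrix.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((features.shape[-1], n_bits))
    return (features @ planes > 0).astype(np.float32)
```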
Abstract: Text independent speaker recognition models can be utilized by an automated assistant to verify a particular user spoke a spoken utterance and/or to identify the user who spoke a spoken utterance. Implementations can include automatically updating a speaker embedding for a particular user based on previous utterances by the particular user. Additionally or alternatively, implementations can include verifying a particular user spoke a spoken utterance using output generated by both a text independent speaker recognition model as well as a text dependent speaker recognition model. Furthermore, implementations can additionally or alternatively include prefetching content for several users associated with a spoken utterance prior to determining which user spoke the spoken utterance.
Type:
Grant
Filed:
December 2, 2019
Date of Patent:
December 13, 2022
Assignee:
GOOGLE LLC
Inventors:
Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno, Quan Wang
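A hedged sketch of two pieces mentioned in the abstract above: combining text-independent (TI) and text-dependent (TD) similarity scores to verify a speaker, and automatically refreshing a stored speaker embedding from newly verified utterances. The cosine scoring, weights, threshold, and moving-average update are assumptions of this sketch.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def verify_speaker(ti_utterance_emb, td_utterance_emb,
                   ti_profile_emb, td_profile_emb,
                   ti_weight=0.5, threshold=0.7):
    """Combine TI and TD speaker-model scores to verify a spoken utterance."""
    score = (ti_weight * cosine(ti_utterance_emb, ti_profile_emb)
             + (1 - ti_weight) * cosine(td_utterance_emb, td_profile_emb))
    return score >= threshold, score

def update_profile_embedding(profile_emb, utterance_emb, alpha=0.1):
    """Automatically refresh the stored speaker embedding from a newly
    verified utterance with an exponential moving average."""
    updated = (1 - alpha) * profile_emb + alpha * utterance_emb
    return updated / (np.linalg.norm(updated) + 1e-12)
```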
Abstract: The systems and methods described herein can include a digital assistant application that receives sensor signals from sensors installed in a vehicle and determines an entry event into the vehicle. The digital assistant application can receive, responsive to the entry event into the vehicle, a plurality of authentication input signals from a plurality of sensors associated with the vehicle. The digital assistant application can determine a plurality of authentication states based on the plurality of authentication input signals and a plurality of authentication credentials. The digital assistant application can identify an access permission level of a plurality of access permission levels based at least in part on the plurality of identified authentication states. The digital assistant application can identify, responsive to the access permission level, a subset of a set of functionalities available via the vehicle, and provide vehicular access to the subset of functionalities.
Type:
Grant
Filed:
October 6, 2020
Date of Patent:
December 13, 2022
Assignee:
GOOGLE LLC
Inventors:
Haris Ramie, Victor Chan, Vikas Gupta, Lingjun Li
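A toy sketch of mapping authentication states to an access permission level and then to a subset of vehicle functionalities, in the spirit of the abstract above; the signal names, levels, and functionality sets are invented for illustration.

```python
def permission_level(auth_states: dict) -> str:
    """auth_states maps an authentication signal name to whether its input
    matched the stored credential (True/False)."""
    if auth_states.get("face_match") and auth_states.get("voice_match"):
        return "full_access"          # e.g., personal accounts, messaging
    if auth_states.get("voice_match") or auth_states.get("paired_phone"):
        return "standard_access"      # e.g., navigation, media, calls
    return "guest_access"             # basic functions only

FUNCTIONALITY_BY_LEVEL = {
    "full_access": {"navigation", "media", "calls", "messages", "purchases"},
    "standard_access": {"navigation", "media", "calls"},
    "guest_access": {"navigation"},
}

def available_functionalities(auth_states: dict) -> set:
    """Expose only the subset of vehicle functionalities the level allows."""
    return FUNCTIONALITY_BY_LEVEL[permission_level(auth_states)]
```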
Abstract: There is provided a method of operating a wearable heads-up display, which display includes a light source, a light guide, and an incoupler carried by the light guide. The method includes emitting first and second beams having first and second wavelengths respectively, directing the first and second beams towards the incoupler, and directing, by the incoupler, at least a portion of the first and second beams into the light guide. Moreover, the method includes internally reflecting, by the light guide, the portions of the first and second beams to form first and second reflected beams respectively. The first and second beams respectively may have first and second incoupling losses. Furthermore, the method includes adjusting a beam characteristic of at least one of the first and second beams to control a difference between their respective incoupling losses.
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for predicting the accuracy of user submissions. One of the methods includes receiving, from a user, an update to an attribute of an entity related to a topic. If the user is determined to be reliable relative to the topic based on user profile data of the user, the knowledge base is updated with the update to the attribute of the entity.
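A minimal sketch of the reliability gate described above, assuming the user profile records per-topic acceptance statistics; the profile layout, scoring rule, and threshold are assumptions of this sketch.

```python
def topic_reliability(user_profile: dict, topic: str) -> float:
    """Fraction of this user's past edits on the topic that were accepted."""
    stats = user_profile.get("edit_history", {}).get(topic, {"accepted": 0, "total": 0})
    return stats["accepted"] / stats["total"] if stats["total"] else 0.0

def maybe_apply_update(knowledge_base: dict, entity: str, attribute: str,
                       value, user_profile: dict, topic: str,
                       threshold: float = 0.8) -> bool:
    """Update the knowledge base only if the user is deemed reliable relative
    to the topic of the entity being edited."""
    if topic_reliability(user_profile, topic) >= threshold:
        knowledge_base.setdefault(entity, {})[attribute] = value
        return True
    return False
```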
Abstract: Provided are computing systems and methods directed to active learning, which may provide advantages or improvements for active learning applications on skewed data sets. A challenge in training and developing high-quality models for many supervised learning scenarios is obtaining labeled training examples. Provided are systems and methods for active learning on a training dataset that includes both labeled and unlabeled datapoints. In particular, the systems and methods described herein can select (e.g., at each of a number of iterations) a number of the unlabeled datapoints for which labels should be obtained to gain additional labeled datapoints on which to train a machine-learned model (e.g., a machine-learned classifier model). Generally, provided are cost-effective methods and systems for selecting data to improve machine-learned models in applications such as the identification of content items in text, images, and/or audio.
Type:
Grant
Filed:
January 23, 2020
Date of Patent:
December 13, 2022
Assignee:
GOOGLE LLC
Inventors:
Qi Zhao, Abbas Kazerouni, Sandeep Tata, Jing Xie, Marc Najork
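A short sketch of one common selection rule (uncertainty/margin sampling) that could play the role of the per-iteration selection step described above; the disclosed method is not necessarily margin sampling, and the scikit-learn-style predict_proba interface is an assumption.

```python
import numpy as np

def select_for_labeling(model, unlabeled_X: np.ndarray, budget: int = 100) -> np.ndarray:
    """Score unlabeled points by how close the classifier's positive-class
    probability is to 0.5 and request labels for the most uncertain ones."""
    proba = model.predict_proba(unlabeled_X)[:, 1]
    uncertainty = -np.abs(proba - 0.5)        # higher = more uncertain
    return np.argsort(uncertainty)[-budget:]  # indices to send for labeling
```

In a full loop, the selected points would be labeled, appended to the labeled set, and the model retrained before the next selection round.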