Patents Assigned to Google LLC

Streaming Automatic Speech Recognition With Non-Streaming Model Distillation

Publication number: 20240029716

Abstract: A method for training a streaming automatic speech recognition student model includes receiving a plurality of unlabeled student training utterances. The method also includes, for each unlabeled student training utterance, generating a transcription corresponding to the respective unlabeled student training utterance using a plurality of non-streaming automated speech recognition (ASR) teacher models. The method further includes distilling a streaming ASR student model from the plurality of non-streaming ASR teacher models by training the streaming ASR student model using the plurality of unlabeled student training utterances paired with the corresponding transcriptions generated by the plurality of non-streaming ASR teacher models.

Type: Application

Filed: October 4, 2023

Publication date: January 25, 2024

Applicant: Google LLC

Inventors: Thibault Doutre, Wei Han, Min Ma, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Arun Narayanan, Ananya Misra, Yu Zhang, Liangliang Cao
Unified End-To-End Speech Recognition And Endpointing Using A Switch Connection

Publication number: 20240029719

Abstract: A single E2E multitask model includes a speech recognition model and an endpointer model. The speech recognition model includes an audio encoder configured to encode a sequence of audio frames into corresponding higher-order feature representations, and a decoder configured to generate probability distributions over possible speech recognition hypotheses for the sequence of audio frames based on the higher-order feature representations. The endpointer model is configured to operate between a VAD mode and an EOQ detection mode. During the VAD mode, the endpointer model receives input audio frames, and determines, for each input audio frame, whether the input audio frame includes speech. During the EOQ detection mode, the endpointer model receives latent representations for the sequence of audio frames output from the audio encoder, and determines, for each of the latent representations, whether the latent representation includes final silence.

Type: Application

Filed: June 23, 2023

Publication date: January 25, 2024

Applicant: Google LLC

Inventors: Shaan Jagdeep Patrick Bijwadia, Shuo-yiin Chang, Bo Li, Yanzhang He, Tara N. Sainath, Chao Zhang
Flickering Reduction with Partial Hypothesis Re-ranking for Streaming ASR

Publication number: 20240029718

Abstract: A method includes processing, using a speech recognizer, a first portion of audio data to generate a first lattice, and generating a first partial transcription for an utterance based on the first lattice. The method includes processing, using the recognizer, a second portion of the data to generate, based on the first lattice, a second lattice representing a plurality of partial speech recognition hypotheses for the utterance and a plurality of corresponding speech recognition scores. For each particular partial speech recognition hypothesis, the method includes generating a corresponding re-ranked score based on the corresponding speech recognition score and whether the particular partial speech recognition hypothesis shares a prefix with the first partial transcription.

Type: Application

Filed: July 13, 2023

Publication date: January 25, 2024

Applicant: Google LLC

Inventors: Antoine Jean Bruguier, David Qiu, Yanzhang He, Trevor Strohman
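
Illustrative sketch for publication 20240029718: the abstract describes re-ranking each partial hypothesis using its recognition score and whether it shares a prefix with the first partial transcription. The Python below is a minimal sketch of that idea only. The additive `prefix_bonus`, the word-level prefix test, and the function name `rerank` are assumptions made for illustration; the abstract does not say how the two signals are combined.

```python
from typing import List, Tuple


def rerank(
    hypotheses: List[Tuple[str, float]],
    previous_partial: str,
    prefix_bonus: float = 1.0,  # hypothetical weight, not from the abstract
) -> List[Tuple[str, float]]:
    """Return (text, re-ranked score) pairs, best first.

    A hypothesis that starts with the words already shown to the user
    gets a bonus, so the displayed text is less likely to flicker.
    """
    prev_words = previous_partial.split()
    reranked = []
    for text, score in hypotheses:
        words = text.split()
        shares_prefix = words[: len(prev_words)] == prev_words
        bonus = prefix_bonus if shares_prefix else 0.0
        reranked.append((text, score + bonus))
    return sorted(reranked, key=lambda item: item[1], reverse=True)


# Hypotheses from a second lattice, with log-probability-like scores.
second_lattice = [
    ("play sum music", -1.0),
    ("play some music", -1.2),
    ("lay some music", -1.1),
]
best_text, _ = rerank(second_lattice, previous_partial="play some")[0]
print(best_text)
```

Without the bonus, the top-scoring hypothesis would replace the already displayed "play some" with "play sum"; with it, the hypothesis that extends the displayed text is preferred.
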
ATTENTIVE SCORING FUNCTION FOR SPEAKER IDENTIFICATION

Publication number: 20240029742

Abstract: A speaker verification method includes receiving audio data corresponding to an utterance and processing the audio data to generate an evaluation attentive d-vector (ad-vector) representing voice characteristics of the utterance, where the evaluation ad-vector includes n_e style classes each including a respective value vector concatenated with a corresponding routing vector. The method also includes generating, using a self-attention mechanism, at least one multi-condition attention score that indicates a likelihood that the evaluation ad-vector matches a respective reference ad-vector associated with a respective user. The method also includes identifying the speaker of the utterance as the respective user associated with the respective reference ad-vector based on the multi-condition attention score.

Type: Application

Filed: October 2, 2023

Publication date: January 25, 2024

Applicant: Google LLC

Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Yiling Huang, Mert Saglam
Using Aligned Text and Speech Representations to Train Automatic Speech Recognition Models without Transcribed Speech Data

Publication number: 20240029715

Abstract: A method includes receiving training data that includes unspoken textual utterances in a target language. Each unspoken textual utterance is not paired with any corresponding spoken utterance of non-synthetic speech. The method also includes generating a corresponding alignment output for each unspoken textual utterance using an alignment model trained on transcribed speech utterances in one or more training languages each different than the target language. The method also includes generating a corresponding encoded textual representation for each alignment output using a text encoder and training a speech recognition model on the encoded textual representations generated for the alignment outputs. Training the speech recognition model teaches the speech recognition model to learn how to recognize speech in the target language.

Type: Application

Filed: July 20, 2023

Publication date: January 25, 2024

Applicant: Google LLC

Inventors: Andrew Rosenberg, Zhehuai Chen, Ankur Bapna, Yu Zhang, Bhuvana Ramabhadran
Smart-Device-Based Radar System Performing Angular Position Estimation

Publication number: 20240027600

Abstract: Techniques and apparatuses are described that implement a smart-device-based radar system capable of performing angular position estimation. A machine-learned module analyzes complex range data generated to estimate angular positions of objects. The machine-learned module is implemented using a multi-stage architecture. In a local stage, the machine-learned module splits the complex range data into different range intervals and separately processes subsets of the complex range data using individual branch modules. In a global stage, the machine-learned module merges the feature data generated from the individual branch modules using a symmetric function and generates angular position data. By using machine-learning techniques and processing the complex range data directly, the radar system can achieve higher angular resolutions compared to other radar systems that utilize other techniques, such as analog or digital beamforming.

Type: Application

Filed: August 7, 2020

Publication date: January 25, 2024

Applicant: Google LLC

Inventor: Muhammad Muneeb Saleem
DETERMINATION OF USER PRESENCE AND ABSENCE USING WIFI CONNECTIONS

Publication number: 20240031847

Abstract: Systems and techniques are provided for determination of user presence and absence using WiFi connections. Reports may be received from WiFi access points in an environment. The reports may include an identifier of a WiFi device, an indication of a connection to or disconnection from a WiFi access point, a time of the connection or disconnection, and an identifier of the WiFi access point. A connection sequence for the WiFi device may be generated from the reports. Whether the WiFi device is present in or absent from the environment as of a specified time may be determined based on the connection sequence. An indication of presence for a user associated with the WiFi device may be generated if the WiFi device is present in the environment. An indication of absence for the user associated with the WiFi device may be generated if the WiFi device is absent from the environment.

Type: Application

Filed: October 3, 2023

Publication date: January 25, 2024

Applicant: Google LLC

Inventors: Marci Meingast, Andrew Axley, Daniele Midi
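
Illustrative sketch for publication 20240031847: the abstract describes building a connection sequence for a device from access point reports and deciding presence as of a specified time. The Python below is a minimal sketch under one stated assumption: a device counts as present when it is connected to at least one access point as of that time. The `Report` fields mirror the report contents listed in the abstract; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Report:
    device_id: str         # identifier of the WiFi device
    access_point_id: str   # identifier of the reporting access point
    connected: bool        # True for a connection, False for a disconnection
    timestamp: float       # time of the event, in seconds


def connection_sequence(reports: Iterable[Report], device_id: str) -> List[Report]:
    """Time-ordered connect/disconnect events for one device."""
    return sorted(
        (r for r in reports if r.device_id == device_id),
        key=lambda r: r.timestamp,
    )


def is_present(reports: Iterable[Report], device_id: str, as_of: float) -> bool:
    """Assumed rule: present if connected to any access point at `as_of`."""
    connected_aps = set()
    for report in connection_sequence(reports, device_id):
        if report.timestamp > as_of:
            break
        if report.connected:
            connected_aps.add(report.access_point_id)
        else:
            connected_aps.discard(report.access_point_id)
    return bool(connected_aps)


reports = [
    Report("phone", "ap1", True, 10.0),
    Report("phone", "ap1", False, 20.0),
    Report("phone", "ap2", True, 25.0),
    Report("phone", "ap2", False, 40.0),
]
print(is_present(reports, "phone", as_of=30.0))
```

Tracking the set of connected access points, rather than only the latest event, keeps the device marked present while it roams from one access point to another.
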
Joint Speech and Text Streaming Model for ASR

Publication number: 20240028829

Abstract: A method includes receiving training data that includes a set of unspoken textual utterances. For each respective unspoken textual utterance, the method includes tokenizing the respective textual utterance into a sequence of sub-word units, generating a first higher order textual feature representation for a corresponding sub-word unit tokenized from the respective unspoken textual utterance, receiving the first higher order textual feature representation generated by a text encoder, and generating a first probability distribution over possible text units. The method also includes training an encoder based on the first probability distribution over possible text units generated by a first-pass decoder for each respective unspoken textual utterance in the set of unspoken textual utterances.

Type: Application

Filed: July 1, 2023

Publication date: January 25, 2024

Applicant: Google LLC

Inventors: Tara N. Sainath, Zhouyuan Huo, Zhehuai Chen, Yu Zhang, Weiran Wang, Trevor Strohman, Rohit Prakash Prabhavalkar, Bo Li, Ankur Bapna
Biasing voice correction suggestions

Patent number: 11881207

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a voice input from a user device; generating a recognition output; receiving a user selection of one or more terms in the recognition output; receiving a user input of one or more letters replacing the user selected one or more terms; determining suggested correction candidates based in part on the user input and the voice input; and providing one or more suggested correction candidates to the user device as suggested corrected recognition outputs.

Type: Grant

Filed: March 23, 2022

Date of Patent: January 23, 2024

Assignee: Google LLC

Inventors: Evgeny A. Cherepanov, Jakob Nicolaus Foerster, Vikram Sridar, Ishai Rabinovitz, Omer Tabach
Persistent media player

Patent number: 11882330

Abstract: A persistent media player is disclosed. A method for providing the persistent media player includes displaying, by an electronic device, a first portion of a scrollable document in a user interface (UI) of an application executed on the electronic device. The first portion includes a media player that is to present a first media item. The method further includes receiving an input to scroll to a second portion of the scrollable document. The method also includes displaying the second portion of the scrollable document, where the first portion is no longer visible and where the media player continues to be visible.

Type: Grant

Filed: June 17, 2022

Date of Patent: January 23, 2024

Assignee: Google LLC

Inventors: Justin Lewis, Gavin James
Scalable exactly-once data processing using transactional streaming writes

Patent number: 11880290

Abstract: A method for processing data exactly once using transactional stream writes includes receiving, from a client, a batch of data blocks for storage on memory hardware in communication with the data processing hardware. The batch of data blocks is associated with a corresponding sequence number and represents a number of rows of a table stored on the memory hardware. The method also includes partitioning the batch of data blocks into a plurality of sub-batches of data blocks. For each sub-batch of data blocks, the method further includes assigning the sub-batch of data blocks to a buffered stream; writing, using the assigned buffered stream, the sub-batch of data blocks to the memory hardware; updating a storage log with an intent to commit the sub-batch of data blocks using the assigned buffered stream; and committing the sub-batch of data blocks to the memory hardware.

Type: Grant

Filed: February 6, 2023

Date of Patent: January 23, 2024

Assignee: Google LLC

Inventors: Pavan Edara, Reuven Lax, Yi Yang, Gurpreet Singh Nanda
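
Illustrative sketch for patent 11880290: the abstract lists the per-sub-batch steps (assign to a buffered stream, write, log an intent to commit, commit). The Python below walks through those steps with in-memory stand-ins. Skipping a batch whose sequence number was already committed is an assumption about how the sequence number supports exactly-once behavior; the class name `TableWriter`, the fixed `sub_batch_size`, and the log tuple layout are hypothetical, and crash recovery from the log is not modeled.

```python
from typing import List, Set, Tuple


class TableWriter:
    """In-memory stand-in for the table, buffered streams, and storage log."""

    def __init__(self, sub_batch_size: int = 2) -> None:
        self.rows: List[str] = []                          # committed table rows
        self.storage_log: List[Tuple[str, int, int]] = []  # (event, sequence, stream)
        self.committed_sequences: Set[int] = set()
        self.sub_batch_size = sub_batch_size

    def write_batch(self, sequence_number: int, blocks: List[str]) -> bool:
        """Write a batch once; return False if it is a duplicate delivery."""
        if sequence_number in self.committed_sequences:
            return False  # assumed de-duplication by sequence number

        size = self.sub_batch_size
        sub_batches = [blocks[i:i + size] for i in range(0, len(blocks), size)]

        for stream_id, sub_batch in enumerate(sub_batches):
            buffered_stream = list(sub_batch)  # assign and write to a buffered stream
            self.storage_log.append(("intent", sequence_number, stream_id))
            self.rows.extend(buffered_stream)  # commit the sub-batch
            self.storage_log.append(("commit", sequence_number, stream_id))

        self.committed_sequences.add(sequence_number)
        return True


writer = TableWriter(sub_batch_size=2)
writer.write_batch(7, ["row-1", "row-2", "row-3"])
writer.write_batch(7, ["row-1", "row-2", "row-3"])  # client retry, ignored
print(len(writer.rows))
```
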
Robot task optimization based on historical task and location correlated durations

Patent number: 11878425

Abstract: Methods, apparatus, systems, and computer-readable media are provided for optimizing robot-implemented tasks based at least in part on historical task and location correlated duration data collected from one or more robots. Historical task and location correlated duration data may, in some implementations, include durations of different tasks performed in different locations by one or more robots in one or more particular environments, and knowledge of such durations may be used to optimize tasks performed by the same or different robots in the future.

Type: Grant

Filed: December 27, 2021

Date of Patent: January 23, 2024

Assignee: GOOGLE LLC

Inventors: Gregory Prisament, Laura Stoia, Yuchen Wu, Alan Thompson
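
Illustrative sketch for patent 11878425: the abstract describes using durations of past tasks at particular locations to optimize future tasks. The Python below is a minimal sketch that averages historical durations per (task, location) pair and picks the location with the shortest expected duration. The averaging rule, the tuple layout of `history`, and the function names are assumptions for illustration; the abstract does not specify the optimization.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# One record: (task name, location name, observed duration in seconds).
Record = Tuple[str, str, float]


def mean_durations(history: List[Record]) -> Dict[Tuple[str, str], float]:
    """Average observed duration for every (task, location) pair."""
    grouped = defaultdict(list)
    for task, location, seconds in history:
        grouped[(task, location)].append(seconds)
    return {key: mean(values) for key, values in grouped.items()}


def fastest_location(history: List[Record], task: str) -> str:
    """Location where `task` has the lowest average historical duration.

    Raises ValueError if the task never appears in the history.
    """
    candidates = {
        location: duration
        for (name, location), duration in mean_durations(history).items()
        if name == task
    }
    return min(candidates, key=candidates.get)


history = [
    ("vacuum", "kitchen", 300),
    ("vacuum", "kitchen", 340),
    ("vacuum", "hall", 200),
    ("dust", "hall", 100),
]
print(fastest_location(history, "vacuum"))
```
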
Methods, systems, and media for delivering manifestless streaming media content

Patent number: 11882168

Abstract: Methods, systems, and media for delivering manifestless streaming media content are provided. In some embodiments, the method comprises: receiving, from a user device, a request for a URL corresponding to a format of a live stream that is provided in a plurality of formats, wherein the live stream comprises a plurality of segments for each of the plurality of formats; resolving the request to a specific segment of the live stream based on the URL, wherein the resolving comprises: identifying the format of the live stream associated with the request from the plurality of formats based on the URL; and identifying a segment of the plurality of segments corresponding to the identified format to the user device in a response to the request.

Type: Grant

Filed: January 13, 2023

Date of Patent: January 23, 2024

Assignee: Google LLC

Inventors: Tristan Schmelcher, William Cyr, Thomas DeWeese, Nils Krahnstoever, Matthew Carson, Pawel Jurczyk, Thomas Dinger, Jeffrey Calow
Weakly-supervised action localization by sparse temporal pooling network

Patent number: 11881022

Abstract: Systems and methods for a weakly supervised action localization model are provided. Example models according to example aspects of the present disclosure can localize and/or classify actions in untrimmed videos using machine-learned models, such as convolutional neural networks. The example models can predict temporal intervals of human actions given video-level class labels with no requirement of temporal localization information of actions. The example models can recognize actions and identify a sparse set of keyframes associated with actions through adaptive temporal pooling of video frames, wherein the loss function of the model is composed of a classification error and a sparsity of frame selection. Following action recognition with sparse keyframe attention, temporal proposals for action can be extracted using temporal class activation mappings, and final time intervals can be estimated corresponding to target actions.

Type: Grant

Filed: March 10, 2023

Date of Patent: January 23, 2024

Assignee: GOOGLE LLC

Inventors: Ting Liu, Gautam Prasad, Phuc Xuan Nguyen, Bohyung Han
Firewall insights processing and machine learning

Patent number: 11882095

Abstract: A computer-implemented method causes data processing hardware to perform operations for training a firewall utilization model. The operations include receiving firewall utilization data for firewall connection requests during a utilization period. The firewall utilization data includes hit counts for each sub-rule associated with at least one firewall rule. The operations also include generating training data based on the firewall utilization data. The training data includes unused sub-rules corresponding to sub-rules having no hits during the utilization period and hit sub-rules corresponding to sub-rules having more than zero hits during the utilization period. The operations also include training a firewall utilization model on the training data. The operations further include, for each sub-rule associated with the at least one firewall rule, determining a corresponding sub-rule utilization probability indicating a likelihood the sub-rule will be used for a future connection request.

Type: Grant

Filed: April 13, 2021

Date of Patent: January 23, 2024

Assignee: Google LLC

Inventors: Firat Kalaycilar, Xiang Wang, Gregory Lee Slaughter
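
Illustrative sketch for patent 11882095: the abstract describes splitting sub-rules into unused (no hits) and hit (more than zero hits) groups and then estimating, per sub-rule, a probability of future use. The Python below shows the labeling step directly and uses a smoothed frequency over several utilization periods as a simple stand-in for the trained model. That estimator, the sub-rule names, and the function names are assumptions, not the patented model.

```python
from typing import Dict, List, Tuple


def label_sub_rules(hit_counts: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Split sub-rules into (unused, hit) for one utilization period."""
    unused = sorted(rule for rule, hits in hit_counts.items() if hits == 0)
    hit = sorted(rule for rule, hits in hit_counts.items() if hits > 0)
    return unused, hit


def utilization_probability(periods: List[Dict[str, int]], sub_rule: str) -> float:
    """Stand-in estimate: Laplace-smoothed share of periods with any hit."""
    used = sum(1 for counts in periods if counts.get(sub_rule, 0) > 0)
    return (used + 1) / (len(periods) + 2)


periods = [
    {"allow-ssh:10.0.0.0/8": 5, "allow-ssh:192.168.0.0/16": 0},
    {"allow-ssh:10.0.0.0/8": 3, "allow-ssh:192.168.0.0/16": 0},
]
unused, hit = label_sub_rules(periods[0])
print(unused, hit)
print(utilization_probability(periods, "allow-ssh:192.168.0.0/16"))
```

A sub-rule with a low estimated probability is a candidate for review, while the smoothing keeps the estimate away from exactly 0 or 1 when only a few periods have been observed.
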
Determining expected hash-values in functions with control flow

Patent number: 11880688

Abstract: This document describes techniques and apparatuses that enable determining expected hash-values in functions with control flow. A computing device receives a function comprising function instructions within at least three basic blocks connected via multiple execution paths. Hash-input instructions are inserted within a plurality of the basic blocks that indirectly force hash values at the respective insertion points. Hash values at ends of the plurality of the basic blocks are set to a canonical value and an expected hash-value and hash input-values are calculated using a hash function. By using the canonical value and the hash input-values, the expected hash-value is the same regardless of which execution path is executed.

Type: Grant

Filed: September 30, 2020

Date of Patent: January 23, 2024

Assignee: Google LLC

Inventors: Nathaniel Casey Voorhies, Antonio Cortes Perez
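
Illustrative sketch for patent 11880688: the abstract describes inserting hash-input instructions so that the running hash equals a canonical value at the end of each basic block, which makes the expected hash independent of the execution path. The Python below demonstrates that property with a toy, invertible rotate-and-XOR update. The update function, the constant `CANONICAL`, the block contents, and all names are assumptions chosen so the forcing input can be computed directly; a real hash function would need a different way to derive the inputs.

```python
MASK = 0xFFFFFFFF
CANONICAL = 0x12345678  # hypothetical canonical hash value


def rotl1(h: int) -> int:
    """Rotate a 32-bit value left by one bit."""
    return ((h << 1) | (h >> 31)) & MASK


def update(h: int, x: int) -> int:
    """Toy hash update (an assumption, chosen because it is easy to invert)."""
    return rotl1(h) ^ (x & MASK)


def forcing_input(h: int) -> int:
    """Input value that makes update(h, value) equal CANONICAL."""
    return rotl1(h) ^ CANONICAL


def run_block(h: int, instructions) -> int:
    for value in instructions:
        h = update(h, value)
    return h


# Four basic blocks; "then" and "else" are alternative execution paths.
blocks = {"entry": [1, 2], "then": [3], "else": [4, 5], "exit": [6]}

# Every block is entered with the canonical value, so the forcing input
# for each non-exit block can be computed ahead of time.
forced = {
    name: forcing_input(run_block(CANONICAL, body))
    for name, body in blocks.items()
    if name != "exit"
}


def execute(path) -> int:
    h = CANONICAL
    for name in path:
        h = run_block(h, blocks[name])
        if name in forced:
            h = update(h, forced[name])  # inserted hash-input instruction
    return h


expected = run_block(CANONICAL, blocks["exit"])
print(execute(["entry", "then", "exit"]) == expected)
print(execute(["entry", "else", "exit"]) == expected)
```
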
Programmable injector grid plate

Patent number: 11880030

Abstract: A programmable beam blocker includes a liquid-crystal-based grid of pixels, with one pixel, one or more groups of pixels, or a plurality of pixels corresponding to individual beams of light. The application of a voltage through one pixel can change the phase of the liquid crystal material to prevent the transmission of light through it.

Type: Grant

Filed: November 23, 2020

Date of Patent: January 23, 2024

Assignee: Google LLC

Inventors: Thomas L. Haslett, Robert M. Krause, Jill Berger, Kevin Yasumura
Retroreflective join graph generation for relational database queries

Patent number: 11880370

Abstract: A method, system and computer program product for join graph generation based upon a log of previously executed database queries includes a method for generating a join graph for relational database queries. The method includes loading, into memory of a computer, a log of a set of database queries previously executed against data in a database and sequentially parsing each of the queries in the log to identify different semantically characterizable components of each of the queries. The method further includes generating a join graph for each of the queries from corresponding ones of the components. Finally, the method includes selectively adding each of the generated join graphs to a set of join graphs in a data model for the data in the database.

Type: Grant

Filed: February 17, 2022

Date of Patent: January 23, 2024

Assignee: Google LLC

Inventors: Julian Hyde, Jonathan Swenson
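
Illustrative sketch for patent 11880370: the abstract describes parsing each logged query, generating a join graph per query, and selectively adding the graphs to a data model. The Python below is a minimal sketch that recognizes only equality conditions between table-qualified columns by regular expression, and treats "selectively adding" as skipping queries without joins and de-duplicating identical graphs. Both simplifications, and the function names, are assumptions; a real implementation would use a SQL parser and resolve aliases.

```python
import re
from typing import FrozenSet, Iterable, Set, Tuple

Column = Tuple[str, str]               # (table, column)
JoinGraph = FrozenSet[Tuple[Column, Column]]

# Matches conditions such as "orders.customer_id = customers.id".
JOIN_CONDITION = re.compile(r"(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)")


def join_graph(query: str) -> JoinGraph:
    """Edges between tables implied by equality conditions in one query."""
    edges = set()
    for left_table, left_col, right_table, right_col in JOIN_CONDITION.findall(query):
        if left_table.lower() == right_table.lower():
            continue  # a condition within one table is not a join edge
        left = (left_table.lower(), left_col.lower())
        right = (right_table.lower(), right_col.lower())
        edges.add(tuple(sorted([left, right])))  # direction-independent edge
    return frozenset(edges)


def build_model(query_log: Iterable[str]) -> Set[JoinGraph]:
    """Set of distinct, non-empty join graphs found in the log."""
    model: Set[JoinGraph] = set()
    for query in query_log:
        graph = join_graph(query)
        if graph:
            model.add(graph)
    return model


log = [
    "SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id",
    "SELECT name FROM customers",
    "select * from customers join orders on customers.id = orders.customer_id",
]
print(len(build_model(log)))
```

The first and third queries join the same columns in opposite order, so they yield one shared graph, and the second query contributes nothing.
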
Confidence level based controls on a second device

Patent number: 11880559

Abstract: A method includes obtaining proximity information for each of a plurality of assistant-enabled devices within an environment of a user device. Each assistant-enabled device is controllable by an assistant application to perform a respective set of available actions associated with the assistant-enabled device. For each assistant-enabled device, the method also includes determining a proximity score based on the proximity information indicating a proximity estimation of the corresponding assistant-enabled device relative to the user device. The method further includes generating, using the proximity scores determined for the assistant-enabled devices, a ranked list of candidate assistant-enabled devices, and for each corresponding assistant-enabled device in the ranked list, displaying, in a graphical user interface (GUI), a respective set of controls for performing the respective set of actions associated with the corresponding assistant-enabled device.

Type: Grant

Filed: July 12, 2022

Date of Patent: January 23, 2024

Assignee: Google LLC

Inventors: Matthew Sharifi, Victor Carbune
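
Illustrative sketch for patent 11880559: the abstract describes scoring each assistant-enabled device by proximity to the user device, ranking the devices, and showing each device's controls in ranked order. The Python below is a minimal sketch that assumes the proximity information is an estimated distance in meters and that a closer device gets a higher score. The scoring formula, device names, and control names are hypothetical.

```python
from typing import Dict, List, Tuple


def proximity_score(distance_m: float) -> float:
    """Assumed scoring rule: 1.0 at zero distance, falling toward 0.0."""
    return 1.0 / (1.0 + distance_m)


def ranked_controls(
    distances_m: Dict[str, float],
    available_actions: Dict[str, List[str]],
) -> List[Tuple[str, List[str]]]:
    """(device, controls) pairs ordered from nearest to farthest."""
    ranked = sorted(
        distances_m,
        key=lambda device: proximity_score(distances_m[device]),
        reverse=True,
    )
    return [(device, available_actions.get(device, [])) for device in ranked]


distances = {"speaker": 4.0, "tv": 1.0, "thermostat": 9.0}
actions = {
    "speaker": ["play", "pause", "volume"],
    "tv": ["power", "cast"],
    "thermostat": ["set temperature"],
}
for device, controls in ranked_controls(distances, actions):
    print(device, controls)
```
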
Generating notifications that provide context for predicted content interruptions

Patent number: 11882339

Abstract: Implementations set forth herein relate to providing notifications regarding events that may interrupt content being rendered at an interface. The notifications can be preemptive and/or can indicate a predicted time and/or source for the events. The event can be, for example, a person attempting to contact a user who is viewing content at a display interface. The person can be associated with a food delivery that has been ordered by the user via a delivery application. An application, such as an automated assistant application, can predict when the person is expected to arrive with the food delivery, and generate a notification ahead of the person arriving. In some implementations, the notification can be rendered at a scrubber user interface (UI) at a location corresponding to the time that the food delivery is expected to arrive, thereby putting the user on notice of when the streaming content may be interrupted.

Type: Grant

Filed: November 11, 2022

Date of Patent: January 23, 2024

Assignee: GOOGLE LLC

Inventors: Cliff Kuang, Jesse Kaczmarek, Andy Gugel, Jonathan Lee
