Google Patent Applications

Google patent applications that are pending before the United States Patent and Trademark Office (USPTO).

Thermal Mitigation for An Electronic Speaker Device and Associated Apparatuses and Methods

Publication number: 20240155818

Abstract: The present disclosure describes thermal mitigation for an electronic speaker device and associated systems and methods. The thermal mitigation includes monitoring several thermal zones to determine or estimate thermal conditions in corresponding parts of the electronic speaker device. The thermal zones may include a System-on-Chip (SoC) integrated circuit (IC) component, audio components including power-dissipating IC components, and a temperature of an exterior surface of a housing component of the electronic speaker device. To mitigate thermal runaway, different throttling schemes may be triggered based on the thermal zones exceeding certain thermal limits. The throttling schemes may include reducing the amount of power supplied to the SoC, reducing audio power of the audio components to a lower wattage, or manipulating SoC cores such as by disabling one or more of the cores or adjusting utilization of the SoC cores.

Type: Application

Filed: November 8, 2023

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Emil Rahim, Chintan Trehan, Ihab A. Ali, Wilson Tang
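
The zone-based triggering described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the applicants' implementation: the zone names, the temperature limits, and the pairing of each zone with one throttling scheme are all assumptions, since the abstract only says that different schemes may be triggered when zones exceed certain limits.

```python
# Hypothetical per-zone limits in degrees Celsius (values are assumptions).
LIMITS_C = {"soc": 95.0, "audio": 85.0, "skin": 45.0}

# Hypothetical mapping from a zone to the throttling scheme it triggers.
ACTIONS = {
    "soc": "reduce_soc_power",
    "audio": "reduce_audio_power",
    "skin": "limit_soc_cores",
}


def select_throttling(temps_c):
    """Return the throttling schemes for every zone that exceeds its limit."""
    triggered = []
    for zone, limit in LIMITS_C.items():
        # A zone with no reading is treated as being within its limit.
        if temps_c.get(zone, float("-inf")) > limit:
            triggered.append(ACTIONS[zone])
    return triggered
```

Calling `select_throttling({"soc": 96.0, "audio": 80.0, "skin": 46.0})` would select the SoC power and core-limiting schemes while leaving audio power untouched.
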
Systems and Methods for Monitoring High Charge Levels in Rechargeable Batteries

Publication number: 20240154435

Abstract: An indexed sequence of bits in a buffer is allocated for tracking a battery charging state. The indexed sequence of bits has a first number of bits. A battery voltage of a rechargeable battery is sampled at a sampling rate. For each sampled battery voltage, the battery voltage is compared with a voltage threshold. A next bit position in the indexed sequence of bits is identified. In accordance with a determination that a comparison result is true, a predefined first value is added to the next bit position. A second number of bits that are filled with the predefined first value is determined. A ratio between the second number and the first number is also determined. In accordance with a determination that the ratio exceeds a threshold step-down ratio, a battery charge voltage is stepped down. The rechargeable battery is charged to a step-down voltage.

Type: Application

Filed: January 5, 2024

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Michael Jonathon Chen, William Alan Saperstein, James Robert Lim, David Wang
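
The bit-buffer tracking in the abstract above can be sketched as a small ring of bits. The class name, buffer size, and threshold values are hypothetical. Two behaviors are also assumptions that the abstract does not state: a 0 is written when the comparison is false, and the next bit position wraps around to the start of the buffer.

```python
class ChargeTracker:
    """Tracks how often sampled battery voltage exceeds a threshold."""

    def __init__(self, num_bits=16, voltage_threshold=4.3, step_down_ratio=0.75):
        self.num_bits = num_bits                  # the "first number" of bits
        self.voltage_threshold = voltage_threshold
        self.step_down_ratio = step_down_ratio
        self.bits = [0] * num_bits                # indexed sequence of bits
        self.next_pos = 0                         # next bit position to fill

    def sample(self, voltage):
        """Record one voltage sample; return True if charging should step down."""
        # Write the predefined first value (1) when the comparison is true.
        # Writing 0 otherwise is an assumption of this sketch.
        self.bits[self.next_pos] = 1 if voltage > self.voltage_threshold else 0
        # Wrapping to the start of the buffer is also an assumption.
        self.next_pos = (self.next_pos + 1) % self.num_bits
        filled = sum(self.bits)                   # the "second number" of bits
        return filled / self.num_bits > self.step_down_ratio
```

Under these assumptions, a tracker with 4 bits and a step-down ratio of 0.5 reports a step-down on the third consecutive high-voltage sample, when 3 of 4 bits are set.
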
LINK MARGIN IMPROVEMENTS USING A VARIABLE PHYSICAL LAYER SYMBOL RATE

Publication number: 20240155711

Abstract: Various arrangements are presented for increasing a link margin of a wireless audio link. A short-range wireless communication link having a first physical layer (PHY) symbol rate is established between an audio source device and an audio output device. An audio stream is transmitted using the communication link, which includes a connected isochronous stream (CIS) link. A number of packet retransmissions are detected on the CIS. Based on the detected number of packet retransmissions on the CIS, the first PHY symbol rate of the CIS can be altered to a second PHY symbol rate for transmitting the audio stream.

Type: Application

Filed: November 2, 2023

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Li-Xuan Chuo, Qi Jiang, Daniel Barros, Sunil Kumar
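
One way to picture the rate change in the abstract above is a simple rule driven by the retransmission count. The two symbol rates (1 and 2 Msym/s), the watermark values, and the choice to drop to the lower rate when retransmissions are high are all assumptions of this sketch; the abstract only says the PHY symbol rate can be altered based on the detected retransmissions.

```python
RATE_1M = 1  # Msym/s, hypothetical lower symbol rate
RATE_2M = 2  # Msym/s, hypothetical higher symbol rate


def next_phy_rate(current_rate, retransmissions, high_water=8, low_water=1):
    """Pick the PHY symbol rate for the next interval of the audio stream."""
    # Many retransmissions suggest a weak link: fall back to the lower rate.
    if current_rate == RATE_2M and retransmissions >= high_water:
        return RATE_1M
    # Few retransmissions suggest headroom: return to the higher rate.
    if current_rate == RATE_1M and retransmissions <= low_water:
        return RATE_2M
    return current_rate
```

Keeping separate high and low watermarks is a design choice in this sketch that avoids toggling the rate on every small change in the retransmission count.
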
NETWORK ADDRESS TRANSLATION FOR VIRTUAL MACHINES

Publication number: 20240154930

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving a packet from a client, the packet having header information including a destination Internet Protocol (IP) address, a destination port, a source IP address, and a source port, and wherein the source IP address and source port are associated with the client; selecting a destination virtual machine based on the destination port; modifying the packet by replacing the destination IP address in the header information with an IP address of the selected destination virtual machine; and sending the modified packet to the destination virtual machine.

Type: Application

Filed: January 17, 2024

Publication date: May 9, 2024

Applicant: Google LLC

Inventor: Evan K. Anderson
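
The translation step in the abstract above can be sketched as a lookup from destination port to virtual machine address. The `Packet` type, the port-to-VM table, and the addresses are hypothetical, and returning `None` for an unmapped port is an assumption that the abstract does not cover.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes = b""


# Hypothetical mapping from destination port to a virtual machine's IP address.
PORT_TO_VM = {8001: "10.0.0.11", 8002: "10.0.0.12"}


def translate(packet, port_to_vm=PORT_TO_VM):
    """Select a VM by destination port and rewrite the destination IP."""
    vm_ip = port_to_vm.get(packet.dst_port)
    if vm_ip is None:
        return None  # assumption: packets for unmapped ports are not forwarded
    # Only the destination IP changes; the client's source fields are kept.
    return replace(packet, dst_ip=vm_ip)
```

In this sketch the source IP address and source port pass through unchanged, so the selected virtual machine still sees the client's own address.
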
Multi-Adaptive Phase-Changing Device Communications

Publication number: 20240154646

Abstract: In aspects, a base station establishes a wireless connection with a user equipment, UE. The base station determines to include at least a first adaptive phase-changing device, APD, and a second APD in a wireless communication path with the UE. In response to determining to include multiple APDs in the communication path, the base station determines a first surface configuration for a first surface of the first APD and a second surface configuration for a second surface of the second APD. The base station directs the first APD to apply the first surface configuration to the first surface and directs the second APD to apply the second surface configuration to the second surface. The base station communicates with the UE using wireless transmissions that travel along a wireless communication path that includes the first surface of the first APD and the second surface of the second APD.

Type: Application

Filed: March 1, 2021

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Jibing Wang, Erik Richard Stauffer
Determining Expected Hash-Values in Functions with Control Flow

Publication number: 20240152361

Abstract: This document describes techniques and apparatuses that enable determining expected hash-values in functions with control flow. A computing device receives a function comprising function instructions within at least three basic blocks connected via multiple execution paths. Hash-input instructions are inserted within a plurality of the basic blocks that indirectly force hash values at the respective insertion points. Hash values at ends of the plurality of the basic blocks are set to a canonical value and an expected hash-value and hash input-values are calculated using a hash function. By using the canonical value and the hash input-values, the expected hash-value is the same regardless of which execution path is executed.

Type: Application

Filed: January 18, 2024

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Nathaniel Casey Voorhies, Antonio Cortes Perez
BROWSING HIERARCHICAL DATASETS

Publication number: 20240152265

Abstract: A method includes a hierarchical dataset that includes a root-data object and data collections nested under the root-data object. Each data collection includes one or more data objects, each data object associated with one or more other data collections. The method also includes displaying a hierarchical user interface on a screen. The hierarchical user interface includes columns. The columns include data-object columns and data-collection columns, wherein the columns alternate between data-object columns and data-collection columns. Each data-object column displays a list of the one or more data objects of a respective data collection. Each data-collection column displays a list of the one or more data collections of a respective data object; the data-collection columns include a root-data-collection column displaying a list of the one or more data collections associated with the root-data object.

Type: Application

Filed: January 11, 2024

Publication date: May 9, 2024

Applicant: Google LLC

Inventor: Michael Kleinerman
HOTWORD DETECTION ON MULTIPLE DEVICES

Publication number: 20240153507

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance. The actions further include determining that the utterance likely includes a particular, predefined hotword. The actions further include transmitting (i) data indicating that the computing device likely received the particular, predefined hotword, (ii) data identifying the computing device, and (iii) data identifying a group of nearby computing devices that includes the computing device. The actions further include receiving an instruction to commence speech recognition processing on the audio data. The actions further include in response to receiving the instruction to commence speech recognition processing on the audio data, processing at least a portion of the audio data using an automated speech recognizer on the computing device.

Type: Application

Filed: January 18, 2024

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Diego Melendo Casado, Alexander H. Gruenstein, Jakob Nicolaus Foerster
SYNCHRONOUS SOUNDS FOR AUDIO ASSISTANT ON DEVICES

Publication number: 20240152314

Abstract: The various implementations described herein include methods and systems for synchronous audio playback. An electronic device can receive an identification of a first device as a common clock device that has a first internal clock being designated as a master clock. The electronic device receives a synchronized audio playback command that includes audio data to be output and a future playback time. In response to receiving the audio data, the device determines a synchronized audio playback time for audio to be output. An optimal time for output can be calculated and transmitted to the server system for future playback time calculations.

Type: Application

Filed: January 19, 2024

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Kenneth Mackay, Adrian Paul Diaconu, Xiaowei Jiang, Christopher K. Chan
EFFICIENT MACHINE LEARNING MODEL ARCHITECTURE SELECTION

Publication number: 20240152809

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing a machine learning model that is trained to perform a machine learning task. In one aspect, a method comprises receiving a request to train a machine learning model on a set of training examples; determining a set of one or more meta-data values characterizing the set of training examples; using a mapping function to map the set of meta-data values characterizing the set of training examples to data identifying a particular machine learning model architecture; selecting, using the particular machine learning model architecture, a final machine learning model architecture for performing the machine learning task; and training a machine learning model having the final machine learning model architecture on the set of training examples.

Type: Application

Filed: January 15, 2024

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Jyrki A. Alakuijala, Quentin Lascombes De Laroussilhe, Andrey Khorlin, Jeremiah Joseph Harmsen, Andrea Gesmundo
MASSIVE MULTILINGUAL SPEECH-TEXT JOINT SEMI-SUPERVISED LEARNING FOR TEXT-TO-SPEECH

Publication number: 20240153484

Abstract: A method includes receiving training data that includes a plurality of sets of text-to-speech (TTS) spoken utterances each associated with a respective language and including TTS utterances of synthetic speech spoken that includes a corresponding reference speech representation paired with a corresponding input text sequence. For each TTS utterance in each set of the TTS spoken training utterances of the received training data, the method includes generating a corresponding TTS encoded textual representation for the corresponding input text sequence, generating a corresponding speech encoding for the corresponding TTS utterance of synthetic speech, generating a shared encoder output, generating a predicted speech representation for the corresponding TTS utterance of synthetic speech, and determining a reconstruction loss. The method also includes training a TTS model based on the reconstruction losses determined for the TTS utterances in each set of the TTS spoken training utterances.

Type: Application

Filed: October 25, 2023

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Andrew M. Rosenberg, Takaaki Saeki, Zhehuai Chen, Byungha Chun, Bhuvana Ramabhadran
ZERO-SHOT FORM ENTITY QUERY FRAMEWORK

Publication number: 20240153297

Abstract: A method for extracting entities comprises obtaining a document that includes a series of textual fields that includes a plurality of entities. Each entity represents information associated with a predefined category. The method includes generating, using the document, a series of tokens representing the series of textual fields. The method includes generating an entity prompt that includes the series of tokens and one of the plurality of entities and generating a schema prompt that includes a schema associated with the document. The method includes generating a model query that includes the entity prompt and the schema prompt and determining, using an entity extraction model and the model query, a location of the one of the plurality of entities among the series of tokens. The method includes extracting, from the document, the one of the plurality of entities using the location of the one of the plurality of entities.

Type: Application

Filed: November 3, 2023

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Zizhao Zhang, Zifeng Wang, Vincent Perot, Jacob Devlin, Chen-Yu Lee, Guolong Su, Hao Zhang, Tomas Jon Pfister
Multi-Output Decoders for Multi-Task Learning of ASR and Auxiliary Tasks

Publication number: 20240153495

Abstract: A method includes receiving a training dataset that includes one or more spoken training utterances for training an automatic speech recognition (ASR) model. Each spoken training utterance in the training dataset is paired with a corresponding transcription and a corresponding target sequence of auxiliary tokens. For each spoken training utterance, the method includes generating a speech recognition hypothesis for a corresponding spoken training utterance, determining a speech recognition loss based on the speech recognition hypothesis and the corresponding transcription, generating a predicted auxiliary token for the corresponding spoken training utterance, and determining an auxiliary task loss based on the predicted auxiliary token and the corresponding target sequence of auxiliary tokens. The method also includes training the ASR model jointly on the speech recognition loss and the auxiliary task loss determined for each spoken training utterance.

Type: Application

Filed: October 26, 2023

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Weiran Wang, Ding Zhao, Shaojin Ding, Hao Zhang, Shuo-yiin Chang, David Johannes Rybach, Tara N. Sainath, Yanzhang He, Ian McGraw, Shankar Kumar
Contextual Biasing With Text Injection

Publication number: 20240153498

Abstract: A method includes receiving context biasing data that includes a set of unspoken textual utterances corresponding to a particular context. The method also includes obtaining a list of carrier phrases associated with the particular context. For each respective unspoken textual utterance, the method includes generating a corresponding training data pair that includes the respective unspoken textual utterance and a carrier phrase. For each respective training data pair, the method includes tokenizing the respective training data pair into a sequence of sub-word units, generating a first higher order textual feature representation for a corresponding sub-word unit, receiving the first higher order textual feature representation, and generating a first probability distribution over possible text units. The method also includes training a speech recognition model based on the first probability distribution over possible text units.

Type: Application

Filed: October 20, 2023

Publication date: May 9, 2024

Applicant: Google LLC

Inventors: Tara N. Sainath, Rohit Prakash Prabhavalkar, Diamantino Antonio Caseiro, Patrick Maxim Rondon, Cyril Allauzen
SYSTEMS AND METHODS FOR WIRELESSLY PROVIDING AN AUDIO STREAM

Publication number: 20240143272

Abstract: Features described herein pertain to systems and methods for wirelessly providing an audio stream. When audio that is to be output to an audio output device is associated with an application, a set of parameters for modifying an established Connected Isochronous Stream (CIS) of a wireless link between an audio source and the audio output device can be determined and the CIS of the wireless link can be modified based on the set of parameters. The audio that is associated with the application can be output to the audio output device using the modified CIS of the wireless link.

Type: Application

Filed: October 26, 2023

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: Daniel Barros, Sunil Kumar, Li-Xuan Chuo, Qi Jiang
Rejecting Biased Data Using A Machine Learning Model

Publication number: 20240144095

Abstract: A method for rejecting biased data using a machine learning model includes receiving a cluster training data set including a known unbiased population of data and training a clustering model to segment the received cluster training data set into clusters based on data characteristics of the known unbiased population of data. Each cluster of the cluster training data set includes a cluster weight. The method also includes receiving a training data set for a machine learning model and generating training data set weights corresponding to the training data set for the machine learning model based on the clustering model. The method also includes adjusting each training data set weight of the training data set weights to match a respective cluster weight and providing the adjusted training data set to the machine learning model as an unbiased training data set.

Type: Application

Filed: January 5, 2024

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: Christopher Farrar, Steven Ross
Interpretable Tabular Data Learning Using Sequential Sparse Attention

Publication number: 20240144005

Abstract: A method of interpreting tabular data includes receiving, at a deep tabular data learning network (TabNet) executing on data processing hardware, a set of features. For each of multiple sequential processing steps, the method also includes: selecting, using a sparse mask of the TabNet, a subset of relevant features of the set of features; processing using a feature transformer of the TabNet, the subset of relevant features to generate a decision step output and information for a next processing step in the multiple sequential processing steps; and providing the information to the next processing step. The method also includes determining a final decision output by aggregating the decision step outputs generated for the multiple sequential processing steps.

Type: Application

Filed: January 4, 2024

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: Sercan Omer Arik, Tomas Jon Pfister
EXPORTING MODULAR ENCODER FEATURES FOR STREAMING AND DELIBERATION ASR

Publication number: 20240144917

Abstract: A method includes obtaining a base encoder from a pre-trained model, and receiving training data comprising a sequence of acoustic frames characterizing an utterance paired with a ground-truth transcription of the utterance. At each of a plurality of output steps, the method includes: generating, by the base encoder, a first encoded representation for a corresponding acoustic frame; generating, by an exporter network configured to receive a continuous sequence of first encoded representations generated by the base encoder, a second encoded representation for a corresponding acoustic frame; generating, by an exporter decoder, a probability distribution over possible logits; and determining an exporter decoder loss based on the probability distribution over possible logits generated by the exporter decoder at the corresponding output step and the ground-truth transcription.

Type: Application

Filed: October 25, 2023

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: Rami Magdi Fahmi Botros, Rohit Prakash Prabhavalkar, Johan Schalkwyk, Tara N. Sainath, Ciprian Ioan Chelba, Francoise Beaufays
SCALABLE EXACTLY-ONCE DATA PROCESSING USING TRANSACTIONAL STREAMING WRITES

Publication number: 20240143469

Abstract: A method for processing data exactly once using transactional stream writes includes receiving, from a client, a batch of data blocks for storage on memory hardware in communication with the data processing hardware. The batch of data blocks is associated with a corresponding sequence number and represents a number of rows of a table stored on the memory hardware. The method also includes partitioning the batch of data blocks into a plurality of sub-batches of data blocks. For each sub-batch of data blocks, the method further includes assigning the sub-batch of data blocks to a buffered stream; writing, using the assigned buffered stream, the sub-batch of data blocks to the memory hardware; updating a storage log with an intent to commit the sub-batch of data blocks using the assigned buffered stream; and committing the sub-batch of data blocks to the memory hardware.

Type: Application

Filed: December 20, 2023

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: Pavan Edara, Reuven Lax, Ji Yang, Gurpreet Singh Nanda
Camera Assembly with Concave-Shaped Front Face

Publication number: 20240147035

Abstract: The various implementations described herein include a video camera assembly that includes: (1) a housing; (2) an image sensor positioned within the housing and having a field of view corresponding to a scene in the smart home environment; and (3) a concave-shaped front face positioned in front of the image sensor such that light from the scene passes through the front face prior to entering the image sensor; where the front face includes: (a) an inner section corresponding to the image sensor; and (b) an outer section between the housing and the inner section, the outer section having a concave shape that extends from an outer periphery of the outer section to an inner periphery of the outer section; and where the concave shape extends around an entirety of the outer periphery.

Type: Application

Filed: January 10, 2024

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: Mark Kraz, Kevin Edward Booth, Tyler Scott Wilson, Nicholas Webb, Jason Evans Goulden, William Dong, Jeffrey Law, Rochus Jacob, Adam Duckworth Mittleman, Oliver Mueller
Event Based Recording

Publication number: 20240146866

Abstract: An electronic device comprises an image sensor, one or more processors, and memory storing instructions for receiving an event recording profile based on configuration data of the electronic device, the configuration data including a location type or a power type; receiving a plurality of images of a scene captured by the image sensor; detecting a trigger event based on one or more of the plurality of images of the scene; in response to detecting the trigger event, identifying an object of interest in one or more of the plurality of images of the scene; creating an event clip from the stored images that include the object of interest, wherein creating the event clip includes configuring a clip length based on the event recording profile; and providing the event clip for display.

Type: Application

Filed: December 18, 2023

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: John Jordan Nold, Joe Delone Venters, Liana Kong, Scott Mullins
FIREWALL INSIGHTS PROCESSING AND MACHINE LEARNING

Publication number: 20240146695

Abstract: A computer-implemented method causes data processing hardware to perform operations for training a firewall utilization model. The operations include receiving firewall utilization data for firewall connection requests during a utilization period. The firewall utilization data includes hit counts for each sub-rule associated with at least one firewall rule. The operations also include generating training data based on the firewall utilization data. The training data includes unused sub-rules corresponding to sub-rules having no hits during the utilization period and hit sub-rules corresponding to sub-rules having more than zero hits during the utilization period. The operations also include training a firewall utilization model on the training data. The operations further include, for each sub-rule associated with the at least one firewall rule, determining a corresponding sub-rule utilization probability indicating a likelihood the sub-rule will be used for a future connection request.

Type: Application

Filed: December 21, 2023

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: Firat Kalaycilar, Xiang Wang, Gregory Lee Slaughter
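
The training-data step in the abstract above, which separates unused sub-rules from hit sub-rules, can be sketched as a labeling pass over hit counts. The function name, the sub-rule identifiers, and the 0/1 label encoding are assumptions; the model training and probability estimation that follow in the abstract are not shown.

```python
def build_training_data(hit_counts):
    """Label each sub-rule as unused (0) or hit (1) for a utilization period.

    hit_counts maps a sub-rule identifier to its number of hits during the
    period. Identifiers and the 0/1 encoding are hypothetical.
    """
    examples = []
    for sub_rule, hits in sorted(hit_counts.items()):
        label = 1 if hits > 0 else 0  # more than zero hits means "hit"
        examples.append((sub_rule, label))
    return examples
```

For instance, `build_training_data({"rule1/a": 12, "rule1/b": 0})` yields one hit example and one unused example, which a downstream model could use as positive and negative labels.
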
END-TO-END SPEECH DIARIZATION VIA ITERATIVE SPEAKER EMBEDDING

Publication number: 20240144957

Abstract: A method includes receiving an input audio signal corresponding to utterances spoken by multiple speakers. The method also includes encoding the input audio signal into a sequence of T temporal embeddings. During each of a plurality of iterations each corresponding to a respective speaker of the multiple speakers, the method includes selecting a respective speaker embedding for the respective speaker by determining a probability that the corresponding temporal embedding includes a presence of voice activity by a single new speaker for which a speaker embedding was not previously selected during a previous iteration and selecting the respective speaker embedding for the respective speaker as the temporal embedding. The method also includes, at each time step, predicting a respective voice activity indicator for each respective speaker of the multiple speakers based on the respective speaker embeddings selected during the plurality of iterations and the temporal embedding.

Type: Application

Filed: December 19, 2023

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: David Grangier, Neil Zeghidour, Oliver Teboul
METHODS, SYSTEMS, AND MEDIA FOR PRESENTING NOTIFICATIONS INDICATING RECOMMENDED CONTENT

Publication number: 20240146985

Abstract: Methods, systems, and media for presenting notifications indicating recommended content are provided. A notification of recommended content can be provided. An indication that a user device has initiated a casting session with a display device can be received. A request for recommended content to be presented on the display device can be received. A media content item can be identified based on at least one media content item that has been previously selected by a user account associated with the user device. A notification can be generated that includes an indication of the identified media content item and a selectable input that, when selected, causes the identified media content item to begin being presented on the display device.

Type: Application

Filed: January 11, 2024

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: Justin Lewis, Richard Rapp
CONFIDENCE-BASED APPLICATION-SPECIFIC USER INTERACTIONS

Publication number: 20240134462

Abstract: This application is directed to a method for controlling user experience (UX) operations on an electronic device that executes an application. A touchless UX operation associated with the application has an initiation condition including at least detection of a presence and a gesture in a required proximity range with a required confidence level. The electronic device then determines from a first sensor signal the proximity of the presence with respect to the electronic device. In accordance with a determination that the determined proximity is in the required proximity range, the electronic device determines from a second sensor signal a gesture associated with the proximity of the presence and an associated confidence level of the determination of the gesture. In accordance with a determination that the determined gesture and associated confidence level satisfy the initiation condition, the electronic device initializes the touchless UX operation associated with the application.

Type: Application

Filed: January 2, 2024

Publication date: April 25, 2024

Applicant: Google LLC

Inventors: Ashton Udall, Andrew Christopher Felch, James Paul Tobin
Knowledge Distillation with Domain Mismatch For Speech Recognition

Publication number: 20240135918

Abstract: A method includes receiving distillation data including a plurality of out-of-domain training utterances. For each particular out-of-domain training utterance of the distillation data, the method includes generating a corresponding augmented out-of-domain training utterance, and generating, using a teacher ASR model trained on training data corresponding to a target domain, a pseudo-label corresponding to the corresponding augmented out-of-domain training utterance. The method also includes distilling a student ASR model from the teacher ASR model by training the student ASR model using the corresponding augmented out-of-domain training utterances paired with the corresponding pseudo-labels generated by the teacher ASR model.

Type: Application

Filed: October 16, 2023

Publication date: April 25, 2024

Applicant: Google LLC

Inventors: Tien-Ju Yang, You-Chi Cheng, Shankar Kumar, Jared Lichtarge, Ehsan Amid, Yuxin Ding, Rajiv Mathews, Mingqing Chen
Using Memory Protection Data

Publication number: 20240135042

Abstract: The present disclosure describes techniques and apparatuses that are directed to using memory protection data within a computing device. Techniques include allocating regions of a memory for storing application data and protection data. Techniques also include creating a bitmap having bit values corresponding to memory blocks within the allocated regions. The one or more bit values can be indicative of whether application data and/or protection data are present in a memory block. The techniques and apparatuses can enable memory protection, such as memory security (e.g., encryption) and memory safety (e.g., error correction code (ECC) usage), to be efficiently used while permitting discontiguous memory allocations and without substantial operating system modification.

Type: Application

Filed: February 16, 2021

Publication date: April 25, 2024

Applicant: Google LLC

Inventors: Yanru Li, Deepti Vijayalakshmi Sriramagiri
Signal Adjustments in User Equipment-Coordination Set Joint Transmissions

Publication number: 20240137073

Abstract: Techniques described herein describe aspects of signal adjustments in user equipment-coordination set, UECS, joint transmissions. A base station analyzes a first joint transmission from multiple user equipments, UEs, participating in a UECS, where the multiple UEs include a coordinating UE of the UECS and at least one non-coordinating UE participating in the UECS. The base station determines that the first joint transmission fails to meet a performance metric and directs the multiple UEs participating in the UECS to add signal adjustments to a second joint transmission.

Type: Application

Filed: January 18, 2022

Publication date: April 25, 2024

Applicant: Google LLC

Inventors: Jibing Wang, Erik Richard Stauffer
RESIDUAL ADAPTERS FOR FEW-SHOT TEXT-TO-SPEECH SPEAKER ADAPTATION

Publication number: 20240135915

Abstract: A method for residual adapters for few-shot text-to-speech speaker adaptation includes obtaining a text-to-speech (TTS) model configured to convert text into representations of synthetic speech, the TTS model pre-trained on an initial training data set. The method further includes augmenting the TTS model with a stack of residual adapters. The method includes receiving an adaptation training data set including one or more spoken utterances spoken by a target speaker, each spoken utterance in the adaptation training data set paired with corresponding input text associated with a transcription of the spoken utterance. The method also includes adapting, using the adaptation training data set, the TTS model augmented with the stack of residual adapters to learn how to synthesize speech in a voice of the target speaker by optimizing the stack of residual adapters while parameters of the TTS model are frozen.

Type: Application

Filed: October 23, 2023

Publication date: April 25, 2024

Applicant: Google LLC

Inventors: Nobuyuki Morioka, Byungha Chun, Nanxin Chen, Yu Zhang, Yifan Ding
Universal Monolingual Output Layer for Multilingual Speech Recognition

Publication number: 20240135923

Abstract: A method includes receiving a sequence of acoustic frames as input to a multilingual automated speech recognition (ASR) model configured to recognize speech in a plurality of different supported languages and generating, by an audio encoder of the multilingual ASR, a higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The method also includes generating, by a language identification (LID) predictor of the multilingual ASR, a language prediction representation for a corresponding higher order feature representation. The method also includes generating, by a decoder of the multilingual ASR, a probability distribution over possible speech recognition results based on the corresponding higher order feature representation, a sequence of non-blank symbols, and a corresponding language prediction representation. The decoder includes a monolingual output layer having a plurality of output nodes each sharing a plurality of language-specific wordpiece models.

Type: Application

Filed: October 11, 2023

Publication date: April 25, 2024

Applicant: Google LLC

Inventors: Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Shuo-yiin Chang
METHOD FOR SPEECH-TO-SPEECH CONVERSION

Publication number: 20240135117

Abstract: The present disclosure relates to a streaming speech-to-speech conversion model, where an encoder runs in real time while a user is speaking, then after the speaking stops, a decoder generates output audio in real time. A streaming-based approach produces an acceptable delay with minimal loss in conversion quality when compared to other non-streaming server-based models. A hybrid model approach combines look-ahead in the encoder and a non-causal stacker with non-causal self-attention.

Type: Application

Filed: October 23, 2023

Publication date: April 25, 2024

Applicant: GOOGLE LLC

Inventors: Oleg RYBAKOV, Fadi BIADSY
IDENTIFY MALICIOUS SOFTWARE

Publication number: 20240134980

Abstract: A method for identifying malicious software includes receiving and executing a software application, identifying a plurality of uniform resource identifiers the software application interacts with during execution of the software application, and generating a vector representation for the software application using a feed-forward neural network configured to receive the plurality of uniform resource identifiers as feature inputs. The method also includes determining similarity scores for a pool of training applications, each similarity score associated with a corresponding training application and indicating a level of similarity between the vector representation for the software application and a respective vector representation for the corresponding training application.

Type: Application

Filed: December 20, 2023

Publication date: April 25, 2024

Applicant: Google LLC

Inventors: Richard Cannings, Sai Deep Tetali, Mo Yu, Salvador Mandujano
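
The similarity-scoring step in the abstract above can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so cosine similarity is an assumption here, and the names `cosine_similarity` and `similarity_scores` are hypothetical. The vectors stand in for the output of the feed-forward network, which is not modeled.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero-length vector has no direction; treat it as dissimilar.
    return dot / norm if norm else 0.0


def similarity_scores(
    app_vector: List[float], training_vectors: Dict[str, List[float]]
) -> Dict[str, float]:
    """One score per training application, keyed by a hypothetical app name."""
    return {
        name: cosine_similarity(app_vector, vec)
        for name, vec in training_vectors.items()
    }


# Hypothetical vectors for two training applications.
pool = {"training_app_a": [1.0, 0.0], "training_app_b": [0.0, 1.0]}
scores = similarity_scores([1.0, 0.0], pool)
```

In this toy pool, the application under test scores highest against `training_app_a`, whose vector points in the same direction.
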
EVALUATION-BASED SPEAKER CHANGE DETECTION EVALUATION METRICS

Publication number: 20240135934

Abstract: A method includes obtaining a multi-utterance training sample that includes audio data characterizing utterances spoken by two or more different speakers and obtaining ground-truth speaker change intervals indicating time intervals in the audio data where speaker changes among the two or more different speakers occur. The method also includes processing the audio data to generate a sequence of predicted speaker change tokens using a sequence transduction model. For each corresponding predicted speaker change token, the method includes labeling the corresponding predicted speaker change token as correct when the predicted speaker change token overlaps with one of the ground-truth speaker change intervals. The method also includes determining a precision metric of the sequence transduction model based on a number of the predicted speaker change tokens labeled as correct and a total number of the predicted speaker change tokens in the sequence of predicted speaker change tokens.

Type: Application

Filed: October 9, 2023

Publication date: April 25, 2024

Applicant: Google LLC

Inventors: Guanlong Zhao, Quan Wang, Han Lu, Yiling Huang, Jason Pelecanos
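
The precision metric described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: each predicted speaker change token is reduced to a single timestamp, "overlaps" is taken to mean the timestamp falls inside a ground-truth interval, and an empty prediction list is scored as 0.0. The names `label_tokens` and `precision` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec) of a ground-truth speaker change


def label_tokens(token_times: List[float], truth: List[Interval]) -> List[bool]:
    """Label a predicted token correct if it falls inside any ground-truth interval."""
    return [any(start <= t <= end for start, end in truth) for t in token_times]


def precision(token_times: List[float], truth: List[Interval]) -> float:
    """Tokens labeled correct divided by the total number of predicted tokens."""
    labels = label_tokens(token_times, truth)
    if not labels:
        return 0.0  # assumed convention when nothing was predicted
    return sum(labels) / len(labels)


predicted = [1.2, 4.0, 7.5, 9.9]          # hypothetical token timestamps
ground_truth = [(1.0, 1.5), (7.0, 8.0)]   # hypothetical change intervals
print(precision(predicted, ground_truth))  # prints 0.5
```

Two of the four predicted tokens land inside a ground-truth interval, so the precision is 2/4.
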
CONVERSATION-AWARE PROACTIVE NOTIFICATIONS FOR A VOICE INTERFACE DEVICE

Publication number: 20240135914

Abstract: A method for proactive notifications in a voice interface device includes: receiving a first user voice request for an action with a future performance time; assigning the first user voice request to a voice assistant service for performance; subsequent to the receiving, receiving a second user voice request and in response to the second user voice request initiating a conversation with the user; and during the conversation: receiving a notification from the voice assistant service of performance of the action; triggering a first audible announcement to the user to indicate a transition from the conversation and interrupting the conversation; triggering a second audible announcement to the user to indicate performance of the action; and triggering a third audible announcement to the user to indicate a transition back to the conversation and rejoining the conversation.

Type: Application

Filed: January 2, 2024

Publication date: April 25, 2024

Applicant: Google LLC

Inventors: Kenneth Mixter, Daniel Colish, Tuan Nguyen
LANGUAGE MODELS USING DOMAIN-SPECIFIC MODEL COMPONENTS

Publication number: 20240127807

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language models using domain-specific model components. In some implementations, context data for an utterance is obtained. A domain-specific model component is selected from among multiple domain-specific model components of a language model based on the non-linguistic context of the utterance. A score for a candidate transcription for the utterance is generated using the selected domain-specific model component and a baseline model component of the language model that is domain-independent. A transcription for the utterance is determined using the score, and the transcription is provided as output of an automated speech recognition system.

Type: Application

Filed: December 21, 2023

Publication date: April 18, 2024

Applicant: Google LLC

Inventors: Fadi Biadsy, Diamantino Antonio Caseiro
DUAL BAND WIRELESS COMMUNICATIONS FOR MULTIPLE CONCURRENT AUDIO STREAMS

Publication number: 20240129658

Abstract: Various arrangements for performing wireless device-to-device communication are presented. An audio output device, such as an earbud or pair of earbuds, can establish a connection with an audio source via a first Bluetooth interface that communicates using a Bluetooth communication protocol on a 2.4 GHz Bluetooth frequency band. The audio output device can negotiate that Bluetooth frequency-shifted communication, such as on a 5 or 6 GHz frequency band, is available for use with the audio source. The audio output device may then perform Bluetooth frequency-shifted communication with the audio source such that the audio output device receives an audio stream from the audio source using Bluetooth frequency-shifted communication and the Bluetooth communication protocol.

Type: Application

Filed: October 14, 2022

Publication date: April 18, 2024

Applicant: Google LLC

Inventor: Daniel Barros
Computerized Methods and Apparatus for Data Cloning

Publication number: 20240126656

Abstract: Methods for creating a live copy of a data object from a production system for use by third party applications include receiving at least one request for a copy of production data from an application; creating a live backup copy; creating a flash copy of the live backup copy, and a flash copy bitmap; creating a modified version of the live backup copy by changing a subset of data in the live backup copy; recording the changed subset of data using the flash copy bitmap; mounting the modified version of the live backup copy to the application; and transforming the modified version of the live backup copy back to the live backup copy when unmounting the modified version of the live backup copy of the production data from the application by applying changes associated with the flash copy bitmap to the live backup copy.

Type: Application

Filed: October 23, 2023

Publication date: April 18, 2024

Applicant: Google LLC

Inventors: Yeganjaiah Gottemukkula, Madhav Mutalik, Siddhartha Karnik, Tracy Melbourne Taylor
Trusted Computing for Digital Devices

Publication number: 20240126886

Abstract: This document describes techniques and systems for providing trusted computing for digital devices. The techniques and systems may use cryptographic algorithms to provide trusted computing and processing. By doing so, the techniques help ensure authentic computation and prevent nefarious acts. For example, a method is described that receives a signature associated with a designee and validates the signature. The signature may be associated with a designee of a host computing device, and the signature may be generated according to firmware associated with an integrated circuit of the host computing device and a first private key of a first asymmetric key pair. Signature validation may be based on a second asymmetric key pair having a second private key and a second public key, the second private key stored in write-once memory of the host computing device.

Type: Application

Filed: February 24, 2021

Publication date: April 18, 2024

Applicant: Google LLC

Inventors: Oskar Gerhard Senft, Miguel Angel Osorio Lozano, Timothy Jay Chen, Dominic Anthony Rizzo
PHYSICAL LAYER IMPROVEMENTS FOR SHORT RANGE WIRELESS COMMUNICATIONS

Publication number: 20240129699

Abstract: Various arrangements are presented that provide improvements of short-range wireless communications, such as Bluetooth LE Audio communication. An audio source device may determine that unidirectional audio is to be output. In response to determining that unidirectional audio is to be output, a first physical layer (PHY) configuration can be set for a first communication link in the downlink direction from the audio source device to the audio output device. A second PHY configuration can be set for the communication link in the uplink direction from the audio output device to the audio source device. The first PHY configuration has a greater symbol rate than the second PHY configuration.

Type: Application

Filed: August 1, 2023

Publication date: April 18, 2024

Applicant: Google LLC

Inventors: Sunil Kumar, Victor Yeh
PUPPETEERING A REMOTE AVATAR BY FACIAL EXPRESSIONS

Publication number: 20240127523

Abstract: A method includes receiving a first facial framework and a first captured image of a face. The first facial framework corresponds to the face at a first frame and includes a first facial mesh of facial information. The method also includes projecting the first captured image onto the first facial framework and determining a facial texture corresponding to the face based on the projected first captured image. The method also includes receiving a second facial framework at a second frame that includes a second facial mesh of facial information and updating the facial texture based on the received second facial framework. The method also includes displaying the updated facial texture as a three-dimensional avatar. The three-dimensional avatar corresponds to a virtual representation of the face.

Type: Application

Filed: December 21, 2023

Publication date: April 18, 2024

Applicant: Google LLC

Inventors: Tarek Hefny, Nicholas Reiter, Brandon Young, Arun Kandoor, Dillon Cower
NETWORK ANOMALY DETECTION

Publication number: 20240127055

Abstract: A method for detecting network anomalies includes receiving a control message from a cellular network and extracting one or more features from the control message. The method also includes predicting a potential label for the control message using a predictive model configured to receive the one or more extracted features from the control message as feature inputs. Here, the predictive model is trained on a set of training control messages where each training control message includes one or more corresponding features and an actual label. The method further includes determining that a probability of the potential label satisfies a confidence threshold. The method also includes analyzing the control message to determine whether the control message corresponds to a respective network performance issue. When the control message impacts network performance, the method includes communicating the network performance issue to a network entity responsible for the network performance issue.

Type: Application

Filed: December 28, 2023

Publication date: April 18, 2024

Applicant: GOOGLE LLC

Inventors: James PEROULAS, Poojita THUKRAL, Dutt KALAPATAPU, Andreas TERZIS, Krishna SAYANA
CREATING DYNAMIC DATA-BOUND CONTAINER HOSTED VIEWS AND EDITABLE FORMS

Publication number: 20240119222

Abstract: A method for using a user-fillable form in a host container includes receiving, at a host container, a user-fillable form bound to dynamic data from an underlying data source where the user-fillable form has a data structure generated by prepopulated coding. The method further includes translating the user-fillable form into a hostable format for the host container. The method also includes rendering, using the hostable format for the host container, the user-fillable form in a user interface. The method further includes receiving, at the user interface of the host container, from a user of the host container, a data entry for input to the user-fillable form and updating, by the host container, the dynamic data from the underlying data source by persisting data from the data entry in a data store associated with the underlying data source.

Type: Application

Filed: December 15, 2023

Publication date: April 11, 2024

Applicant: Google LLC

Inventors: Michael Jeffrey Procopio, Sarmad Hashmi
ADAPTIVE ARTIFICIAL NEURAL NETWORK SELECTION TECHNIQUES

Publication number: 20240119286

Abstract: Computer-implemented techniques can include obtaining, by a client computing device, a digital media item and a request for a processing task on the digital item and determining a set of operating parameters based on (i) available computing resources at the client computing device and (ii) a condition of a network. Based on the set of operating parameters, the client computing device or a server computing device can select one of a plurality of artificial neural networks (ANNs), each ANN defining which portions of the processing task are to be performed by the client and server computing devices. The client and server computing devices can coordinate processing of the processing task according to the selected ANN. The client computing device can also obtain final processing results corresponding to a final evaluation of the processing task and generate an output based on the final processing results.

Type: Application

Filed: December 15, 2023

Publication date: April 11, 2024

Applicant: GOOGLE LLC

Inventors: Matthew SHARIFI, Jakob Nicolaus FOERSTER
EARBUD-TO-EARBUD CROSS-ACKNOWLEDGEMENT AND COMMUNICATION RELAY

Publication number: 20240121549

Abstract: Various arrangements for short-range wireless communication between audio output devices, such as true wireless earbuds, are presented herein. A first earbud of a pair of earbuds may determine that a first audio packet addressed to the first earbud from an audio source was not properly received. However, a second earbud of the pair of earbuds may properly receive the first audio packet addressed to the first earbud. The second earbud can then, directly to the first earbud, transmit a cross acknowledgement indicating that the second earbud properly received the audio packet.

Type: Application

Filed: June 2, 2023

Publication date: April 11, 2024

Applicant: Google LLC

Inventors: Daniel Barros, Sunil Kumar
JOINT CONNECTED ISOCHRONOUS STREAM COMMUNICATION WITH CROSS ACKNOWLEDGEMENT

Publication number: 20240121064

Abstract: Various arrangements for short-range wireless communication are presented herein. An earbud of a pair of true wireless earbuds can receive an audio packet addressed to the other earbud of the pair. A single connected isochronous stream (CIS) within a connected isochronous group (CIG) may be present between the pair of true wireless earbuds and an audio source which transmitted the audio packet. The earbud can transmit a cross-acknowledgement indicating receipt of the audio packet to the other earbud. The earbud can also transmit audio data from the audio packet to the other earbud after the cross acknowledgement.

Type: Application

Filed: July 5, 2023

Publication date: April 11, 2024

Applicant: Google LLC

Inventors: Daniel Barros, Sunil Kumar
LOCATION-BASED RESPONSES TO TELEPHONE REQUESTS

Publication number: 20240119936

Abstract: A method for receiving processed information at a remote device is described. The method includes transmitting from the remote device a verbal request to a first information provider and receiving a digital message from the first information provider in response to the transmitted verbal request. The digital message includes a symbolic representation indicator associated with a symbolic representation of the verbal request and data used to control an application. The method also includes transmitting, using the application, the symbolic representation indicator to a second information provider for generating results to be displayed on the remote device.

Type: Application

Filed: December 18, 2023

Publication date: April 11, 2024

Applicant: Google LLC

Inventors: Gudmundur HAFSTEINSSON, Michael J. Lebeau, Natalia Marmasse, Sumit Agarwal, Dipochand Nishar
Mitigating Display Diffraction Flares for Under-Display Sensing

Publication number: 20240118772

Abstract: This document describes systems and techniques directed at mitigating display diffraction flares for under-display sensing. In aspects, an equation may be derived that models the effects of a display in producing a diffraction phenomenon at an image plane of a sensing region for an under-display light-sensing device. The equation may be used to determine an arrangement (e.g., an optimized arrangement) of components (e.g., sub-pixels) within the display that minimizes a diffraction efficiency for at least one diffraction order and, thereby, mitigates an intensity and/or a prevalence of optical artifacts in light-sensing data. In implementations, an image intensity point-spread-function is utilized to calculate diffraction efficiencies for respective diffraction orders (e.g., the lowest diffraction orders, the diffraction orders with the greatest brightness).

Type: Application

Filed: December 11, 2023

Publication date: April 11, 2024

Applicant: Google LLC

Inventors: Xi Chen, Changgeng Liu, Ion Bita, Marek Mienko
EARBUD-TO-EARBUD COMMUNICATION RELAY

Publication number: 20240121550

Abstract: Various arrangements of wireless earbuds are presented. A first earbud can include a first speaker, a first processing system, and a first wireless communication interface that communicates with an audio source device using Bluetooth communications. A second earbud can include a second speaker, a second processing system, and a second wireless communication interface that communicates with the audio source device and the first earbud using Bluetooth communications. The first earbud and the second earbud may be configured to wirelessly communicate with each other following completion of a first connected isochronous stream (CIS) event for the first earbud and a second CIS event for the second earbud within a connected isochronous group (CIG) event.

Type: Application

Filed: June 2, 2023

Publication date: April 11, 2024

Applicant: Google LLC

Inventors: Sunil Kumar, Daniel Barros
Voice Query Handling in an Environment with Multiple Users

Publication number: 20240119944

Abstract: A method includes detecting multiple users, receiving a first query issued by a first user, the first query including a command for a digital assistant to perform a first action, and enabling a round robin mode to control performance of actions commanded by queries. The method also includes, while performing the first action, receiving audio data corresponding to a second query including a command to perform a second action, performing speaker identification on the audio data, determining that the second query was spoken by the first user, preventing performing the second action, and prompting at least another user to issue a query. The method further includes receiving a third query issued by a second user, the third query including a command for the digital assistant to perform a third action, and when the digital assistant completes performing the first action, executing performance of the third action.

Type: Application

Filed: October 6, 2022

Publication date: April 11, 2024

Applicant: Google LLC

Inventors: Matthew Sharifi, Victor Carbune
Aggregatable Application Programming Interface

Publication number: 20240118956

Abstract: A method for an aggregatable application programming interface (API) includes receiving, from a third party service, an aggregation request requesting aggregation of client data from a client of the third party service. The method also includes receiving, from an API executed by a client device of the client, a first portion of the client data. The method includes storing the first portion of the client data and receiving, from the API, a second portion of the client data. The method includes determining that the second portion of the client data is a final portion of the client data. In response, the method includes aggregating the first portion of the client data with the second portion of the client data. The method also includes transmitting the aggregated client data to the third party service.

Type: Application

Filed: October 11, 2022

Publication date: April 11, 2024

Applicant: Google LLC

Inventor: Naitian Liu
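
The store-then-aggregate flow in the abstract above can be sketched as follows. The abstract does not specify how portions are combined or how the final portion is recognized, so this sketch assumes byte concatenation and an explicit `is_final` flag. The class name `Aggregator` and its `receive` method are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class Aggregator:
    """Buffers portions of client data until the final portion arrives."""

    def __init__(self) -> None:
        # Stored portions, keyed by a hypothetical client identifier.
        self._pending: Dict[str, List[bytes]] = defaultdict(list)

    def receive(self, client_id: str, portion: bytes, is_final: bool) -> Optional[bytes]:
        """Store a portion; return the aggregated data once the final portion is seen."""
        self._pending[client_id].append(portion)
        if not is_final:
            return None
        # Assumed aggregation: concatenate portions in arrival order.
        return b"".join(self._pending.pop(client_id))


agg = Aggregator()
first = agg.receive("client-1", b"hello ", is_final=False)   # stored, nothing to send yet
result = agg.receive("client-1", b"world", is_final=True)    # aggregated data to transmit
```

Here `first` is `None` because the data is incomplete, and `result` holds the combined data that would be transmitted to the third party service.
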
