Google Patent Applications

Google patent applications that are pending before the United States Patent and Trademark Office (USPTO).

Switching Network For Dynamically Reconfigurable Power Plane

Publication number: 20230315170

Abstract: A system including a power bus configured to supply power to a plurality of server racks arranged within a space of a building, a first power source connection positioned at a first side of the building and configured to supply power from a first power source to the power bus, a second power source connection positioned at a second side of the building different from the first side and configured to supply power from a second power source to the power bus, and a plurality of diverter switches arranged within the power bus. Each diverter switch may be configured to receive a respective control signal and, responsive to the respective control signal, redirect power within the power bus.

Type: Application

Filed: June 12, 2023

Publication date: October 5, 2023

Applicant: Google LLC

Inventors: Drazena Brocilo, Selver Corhodzic
Predictive Load Transient Based Voltage Regulator Turbo for Voltage Droop Minimization

Publication number: 20230318456

Abstract: Controlling voltage supplied to a load includes predicting a load current transient, generating a turbo signal in response to predicting the load current transient, and increasing, in response to the turbo signal, responsiveness of a voltage regulator supplying voltage to the load.

Type: Application

Filed: April 4, 2022

Publication date: October 5, 2023

Applicant: Google LLC

Inventors: Chenhao Nan, Qiong Wang, Kaushik Vaidyanathan, Houle Gan, Xin Li
Robustness Metric for Cloud Providers

Publication number: 20230315527

Abstract: A method includes receiving a system independence query requesting determination of a level of independence between a first system and a second system. The method includes obtaining a first set of time-series data including a first series of data points listed in time order and obtaining a second set of time-series data including a second series of data points listed in time order. Each data point of the first and second series of data points represents a respective system value of a feature associated with the first and second system. The method includes determining an amount of correlation between the first set of time-series data and the second set of time-series data. When the amount of correlation between the first set of time-series data and the second set of time-series data satisfies a correlation threshold, the method includes reporting that the first system and the second system are independent.
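
The correlation check in this abstract can be illustrated with a minimal sketch. It assumes Pearson correlation as the measure and assumes that "satisfies a correlation threshold" means the absolute correlation is at or below the threshold; the abstract specifies neither, and the names `pearson`, `are_independent`, and the default threshold are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant series is treated as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def are_independent(series_a, series_b, threshold=0.2):
    """Report independence when |correlation| is at or below the threshold.

    The threshold value and the direction of the comparison are assumptions.
    """
    return abs(pearson(series_a, series_b)) <= threshold
```

For example, `are_independent([1, 2, 3, 4], [2, 4, 6, 8])` reports the two series as not independent, because they move together exactly.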

Type: Application

Filed: March 30, 2022

Publication date: October 5, 2023

Applicant: Google LLC

Inventors: Krzysztof Duleba, John Heizelman
HOME MONITORING AND CONTROL SYSTEM

Publication number: 20230314231

Abstract: This application is directed to a home monitoring and control system including a doorbell installed at a door of a home. The doorbell has a button configured to, upon being pressed, wirelessly initiate a first communication to indicate presence of a person at the door. The doorbell also has a camera configured to capture video data within a field of view, and a processor configured to cause a communication component to enable the first communication and wirelessly stream via a remote server the video data captured by the camera to a monitoring device associated with an occupant of the home.

Type: Application

Filed: June 9, 2023

Publication date: October 5, 2023

Applicant: Google LLC

Inventors: Anthony M. Fadell, Matthew L. Rogers, Yoky Matsuoka, David Sloo, Maxime Veron, Isabel I. Guenette, Shigefumi Honjo
Enhancing Domain Keys Identified Mail (DKIM) Signatures

Publication number: 20230318844

Abstract: A method for securing messages includes obtaining, at a message server, a message for a user of a message service hosted by the message server. The message includes a header and the header includes a digital signature signed by an author of the message and a list of one or more recipients of the message. The method includes determining whether the digital signature by the author is valid and determining, using the list of one or more recipients, whether the user is a declared recipient of the message. When the digital signature by the author is valid and the user is the declared recipient of the message, the method includes delivering the message to a user device of the user. When the digital signature by the author is valid and the user is not the declared recipient of the message, the method includes alerting the user.
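
The delivery decision described above can be sketched as a small function. Signature verification itself is out of scope here and is passed in as a boolean; the function name, the return labels, and the case-insensitive address comparison are assumptions for illustration, and the abstract does not say what happens when the signature is invalid.

```python
def route_message(signature_valid, user_address, declared_recipients):
    """Decide what to do with a message for one user.

    Returns "deliver", "alert", or "unspecified" (hypothetical labels).
    """
    if not signature_valid:
        # The abstract does not cover this case; placeholder only.
        return "unspecified"
    # Assumption: addresses are compared case-insensitively.
    declared = {addr.strip().lower() for addr in declared_recipients}
    if user_address.strip().lower() in declared:
        return "deliver"
    return "alert"
```

A validly signed message that reaches a user who is not on the signed recipient list therefore produces an alert rather than a normal delivery.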

Type: Application

Filed: April 1, 2022

Publication date: October 5, 2023

Applicant: Google LLC

Inventor: Wei-Haw Chuang
Interface for Communicating a Threshold in a Camera

Publication number: 20230319399

Abstract: This document describes techniques and systems that enable an interface for communicating a threshold in a camera. An electronic device recognizes an in-camera, drag gesture that triggers a camera application to switch modes from a real-time display mode (displaying real-time preview images in a viewfinder) to a buffer-display mode, which displays frames recorded in the camera buffer. During the motion of the drag gesture, the electronic device provides dynamic visual feedback indicating a relation between a drag distance of the drag gesture and a target threshold for the drag gesture. For simplicity and conciseness, the visual feedback can be combined with the virtual shutter control. After meeting the threshold, the user releases the touch input of the drag gesture and the system triggers the camera application to switch modes. This allows capture of a “missed” moment that was recorded in the camera buffer but not stored in non-volatile memory.

Type: Application

Filed: July 19, 2021

Publication date: October 5, 2023

Applicant: Google LLC

Inventor: Rachit Gupta
Alignment Prediction to Inject Text into Automatic Speech Recognition Training

Publication number: 20230317059

Abstract: A method includes receiving training data that includes unspoken textual utterances, un-transcribed non-synthetic speech utterances, and transcribed non-synthetic speech utterances. Each unspoken textual utterance is not paired with any corresponding spoken utterance of non-synthetic speech. Each un-transcribed non-synthetic speech utterance is not paired with a corresponding transcription. Each transcribed non-synthetic speech utterance is paired with a corresponding transcription. The method also includes generating a corresponding alignment output for each unspoken textual utterance of the received training data using an alignment model. The method also includes pre-training an audio encoder on the alignment outputs generated for the unspoken textual utterances, the un-transcribed non-synthetic speech utterances, and the transcribed non-synthetic speech utterances to teach the audio encoder to jointly learn shared speech and text representations.

Type: Application

Filed: February 13, 2023

Publication date: October 5, 2023

Applicant: Google LLC

Inventors: Andrew M. Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno Mengibar
CONTROLLING DUAL-MODE BLUETOOTH LOW ENERGY MULTIMEDIA DEVICES

Publication number: 20230319479

Abstract: The description relates to a device (CTRL-DEV) for controlling a dual-mode Bluetooth low energy multimedia device (DM-BLE), the dual-mode BLE multimedia device comprising a first sound system (SS1) and a second sound system (SS2) which are arranged to simultaneously stream an input multimedia stream, the first and second sound systems being respectively associated with at least one first Bluetooth multimedia device (SPK1, SPK2, SPKN) and at least one Bluetooth multimedia device (BLE-SPK1, BLE-SPK2, BLE-SPKN). The description also refers to a dual-mode Bluetooth low energy multimedia device (DM-BLE?), a method, a computer program and a non-transitory computer-readable storage medium.

Type: Application

Filed: April 13, 2023

Publication date: October 5, 2023

Applicant: Google LLC

Inventors: Thomas Girardier, Julien Goupy, Nicolas Guezellot Prudhomme
Adaptable Workload System

Publication number: 20230315551

Abstract: A method includes determining a cluster reliability of a computing cluster including a maximum computing capacity and representative of a reliability of the computing cluster when utilizing an entirety of the maximum computing capacity. The operations include receiving a provisioning request of the computing cluster including a threshold reliability of the computing cluster. In response to the provisioning request, determining, using the cluster reliability, a reserved computing capacity of the computing cluster based on the threshold reliability. The reserved computing capacity is less than the maximum computing capacity. Based on the reserved computing capacity and the maximum computing capacity, the operations include determining an unreserved computing capacity of the computing cluster. The operations include provisioning the computing cluster for execution of a user workload. The user workload executes on the unreserved computing capacity.

Type: Application

Filed: March 30, 2022

Publication date: October 5, 2023

Applicant: Google LLC

Inventors: Gobind Jit Singh Johar, Stephen James Muir, Philip William Stoneman, William Mark Pulford, Jonathon Buckley, Bodie William Francis, Andrew Oates
Access Point Device

Publication number: 20230319979

Abstract: This document describes an access point device and associated systems and methods. The techniques and systems include an access point device that includes a housing with an antenna carrier, a circuit board assembly, a heat sink, and a heat shield positioned within the housing. The housing includes a top housing member connected to a bottom housing member. The top housing member includes a concave-down top-end portion connected to a generally cylindrical vertical wall via rounded corners. The antenna carrier supports multiple antennas positioned proximate to an inner surface of the vertical wall. The heat sink is positioned between the antenna carrier and the circuit board assembly. The circuit board assembly is positioned between the heat shield and the heat sink, and the heat shield is positioned between the circuit board assembly and the bottom housing member.

Type: Application

Filed: June 8, 2023

Publication date: October 5, 2023

Applicant: Google LLC

Inventors: Yau-Shing Lee, Rolando Willcox Esparza, George Liu, Wing Tung Wong, Frédéric Heckmann, Vivian W. Tang
Systems and Methods of Detecting and Responding to a Visitor to a Smart Home Environment

Publication number: 20230306826

Abstract: A method of detecting and responding to a visitor to a smart home environment via an electronic greeting system of the smart home environment, including determining that a visitor is approaching an entryway of the smart home environment; initiating a facial recognition operation while the visitor is approaching the entryway; initiating an observation window in response to the determination that a visitor is approaching the entryway; obtaining context information from one or more sensors of the smart home environment during the observation window; and at the end of the time window, initiating a response to the detected approach of the visitor based on the context information and/or an outcome of the facial recognition operation.

Type: Application

Filed: June 1, 2023

Publication date: September 28, 2023

Applicant: Google LLC

Inventors: Jason Evans Goulden, Rengarajan Aravamudhan, Hae Rim Jeong, Michael Dixon, James Edward Stewart, Sayed Yusef Shafi, Sahana Mysore, Seungho Yang, Yu-An Lien, Christopher Charles Burns, Rajeev Nongpiur, Jeffrey Boyd
Speech Recognition Using Word or Phoneme Time Markers Based on User Input

Publication number: 20230306965

Abstract: A method for separating target speech from background noise contained in an input audio signal includes receiving the input audio signal captured by a user device, wherein the input audio signal corresponds to target speech of multiple words spoken by a target user and containing background noise in the presence of the user device while the target user spoke the multiple words in the target speech. The method also includes receiving a sequence of time markers input by the target user in cadence with the target user speaking the multiple words in the target speech, and correlating the sequence of time markers with the input audio signal to generate enhanced audio features that separate the target speech from the background noise in the input audio signal. The method also includes processing, using a speech recognition model, the enhanced audio features to generate a transcription of the target speech.

Type: Application

Filed: January 30, 2023

Publication date: September 28, 2023

Applicant: Google LLC

Inventor: Dongeek Shin
QUERY RESTARTABILITY

Publication number: 20230306028

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for restarting a query using a token. One of the methods includes receiving, by a computer from a requesting device, a query; determining, using a data storage system, a current result responsive to the query; generating, using the current result, a restart token that represents operations performed to determine a plurality of results responsive to the query including the current result responsive to the query and that can be used to determine a new result responsive to the query that was not included in the plurality of results responsive to the query; and providing, to the requesting device, a message that includes a) first data for the restart token that represents operations performed to determine the plurality of results responsive to the query and b) second data for the current result responsive to the query.

Type: Application

Filed: May 26, 2023

Publication date: September 28, 2023

Applicant: Google LLC

Inventors: Yevgeniy Kogan, Rajesh Rao, Sergey Melnik
GARBAGE COLLECTION FOR DATA STORAGE

Publication number: 20230305733

Abstract: Methods, systems, apparatus, including computer programs encoded on computer storage media, for reclaiming storage space in a storage environment. In one aspect, the method includes actions of aggregating data that is indicative of access to one or more data objects, determining a future storage cost associated with each of a plurality of data objects, determining an access window for each of the plurality of data objects, identifying a data object based on (i) the future storage cost that satisfies a predetermined threshold and (ii) a data object access window, providing a notification to a user device that requests feedback from a user indicating whether the data object can be deleted, and in response to receiving data that indicates that the data object can be deleted, generating an instruction to cause deletion of the data object upon the expiration of the access window.

Type: Application

Filed: March 26, 2022

Publication date: September 28, 2023

Applicant: Google LLC

Inventors: Konstantinos Nikoloudakis, Sven Koehler, Danyao Wang, Sahand Saba, Long Fei, Simon Tyler Wise, David Halladay Schneider
Automatic Detection and Mitigation of Denial-of-Service Attacks

Publication number: 20230308476

Abstract: A method for mitigating network abuse includes obtaining a first set of network traffic messages of network traffic currently received by a network service and determining, via a first model, whether network abuse is occurring based on the first set of network traffic messages. When the network abuse is occurring, the method includes obtaining a second set of current network traffic messages. The method also includes, for each network traffic message in the second set of network traffic messages, labeling, via a second model, the network traffic message as an abusing network traffic message or a non-abusing network traffic message. The method also includes generating, via a third model, at least one network traffic rule. Each network traffic rule, when implemented, reduces an effect of the abusing network traffic messages.

Type: Application

Filed: May 9, 2023

Publication date: September 28, 2023

Applicant: Google LLC

Inventors: Francois Pepin, Andre Lloyd Perlee Harder, Prajakta Joshi, Amitabha Roy, Saila Talagadadeevi, Emil Kiner, Chia-Tung Kuo, Jiayu Ye
Thermal Gradient Battery Monitoring System and Methods

Publication number: 20230304951

Abstract: A battery pack includes a battery, a first temperature sensor configured to provide a first temperature value associated with a temperature of the battery, a heat source disposed proximate to the battery and configured to heat the battery, a second temperature sensor configured to provide a second temperature value associated with a temperature of the heat source, and a control board coupled to the first temperature sensor and the second temperature sensor, wherein the control board is configured to receive the first temperature value and the second temperature value. The control board is configured to compare the first temperature value and the second temperature value to determine a temperature gradient between the battery and the heat source and transmit an alert if the temperature gradient exceeds a first temperature gradient threshold.
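
The comparison performed by the control board can be sketched as follows. This assumes temperatures in degrees Celsius and takes the gradient as the absolute difference between the two readings; the function name and units are hypothetical, and the abstract does not define how the gradient is computed.

```python
def check_gradient(battery_temp_c, heat_source_temp_c, threshold_c):
    """Return (gradient, alert) for one pair of sensor readings.

    Assumption: the gradient is the absolute difference of the readings.
    """
    gradient = abs(heat_source_temp_c - battery_temp_c)
    return gradient, gradient > threshold_c
```

With a 10-degree threshold, readings of 25 and 40 would raise an alert, while readings of 25 and 30 would not.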

Type: Application

Filed: May 4, 2023

Publication date: September 28, 2023

Applicant: Google LLC

Inventors: David Wang, Arun Raghupathy, James Robert Lim, Ihab A. Ali, Chang Hong Ye
UNIVERSAL HAND CONTROLLER

Publication number: 20230305630

Abstract: Techniques of controlling electronic devices using gestures use a wearable device on a user which translates, via a model, user movements into signals that both identify an electronic device to be controlled and a specific action to take with regard to that electronic device. The wearable device includes an inertial measurement unit (IMU) sensor and a photoplethysmography (PPG) sensor and measures six degrees of freedom (6DOF). The model is a convolutional neural network (CNN) that takes x, y, and z-acceleration signals generated by the IMU and PPG and places each acceleration component generated from each sensor in a separate channel. The CNN takes the input from each channel and generates a respective, separate model for each channel. The outputs at each of the stacked layers are combined in a fully connected layer to produce CNN output identifying an electronic device and a control for the electronic device.

Type: Application

Filed: March 28, 2022

Publication date: September 28, 2023

Applicant: Google LLC

Inventors: Dongeek Shin, Ricardo John Campbell
Spatial Audio Communication Between Devices with Speaker Array and/or Microphone Array

Publication number: 20230308825

Abstract: The technology generally relates to spatial audio communication between devices. For example, a first device and a second device may be connected via a communication link. The first device may capture audio signals in an environment through two or more microphones. The first device may encode the captured audio with direction information. The first device may transmit the encoded audio via the communication link to the second device. The second device may decode the encoded audio to be output by one or more speakers of the second device. The second device may output the decoded audio to recreate positions of the captured audio signals.

Type: Application

Filed: March 21, 2023

Publication date: September 28, 2023

Applicant: Google LLC

Inventors: Jian Guo, Frances Maria Hui Hong Kwee
Streaming End-to-end Multilingual Speech Recognition with Joint Language Identification

Publication number: 20230306958

Abstract: A method includes receiving a sequence of acoustic frames as input to an automatic speech recognition (ASR) model. The method also includes generating, by a first encoder, a first higher order feature representation for a corresponding acoustic frame. The method also includes generating, by a second encoder, a second higher order feature representation for a corresponding first higher order feature representation. The method also includes generating, by a language identification (ID) predictor, a language prediction representation based on a concatenation of the first higher order feature representation and the second higher order feature representation. The method also includes generating, by a first decoder, a first probability distribution over possible speech recognition hypotheses based on a concatenation of the second higher order feature representation and the language prediction representation.

Type: Application

Filed: March 23, 2023

Publication date: September 28, 2023

Applicant: Google LLC

Inventors: Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Sepand Mavandadi, Shuo-yiin Chang, Parisa Haghani
LABEL PROPAGATION IN A DISTRIBUTED SYSTEM

Publication number: 20230306060

Abstract: Data are maintained in a distributed computing system that describe a graph. The graph represents relationships among items. The graph has a plurality of vertices that represent the items and a plurality of edges connecting the plurality of vertices. At least one vertex of the plurality of vertices includes a set of label values indicating the at least one vertex's strength of association with a label from a set of labels. The set of labels describe possible characteristics of an item represented by the at least one vertex. At least one edge of the plurality of edges includes a set of label weights for influencing label values that traverse the at least one edge. A label propagation algorithm is executed for a plurality of the vertices in the graph in parallel for a series of synchronized iterations to propagate labels through the graph.
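
A minimal single-machine sketch of synchronized label propagation is shown below. The update rule (each vertex sums its incoming neighbors' label values scaled by the edge's label weight, with seed vertices held fixed) is an assumption chosen for illustration; the abstract does not state the rule, and the sketch runs sequentially rather than in parallel across a distributed system.

```python
def propagate(values, edges, seeds, iterations):
    """Run synchronized label propagation for a fixed number of iterations.

    values: {vertex: {label: strength}} initial label values
    edges:  {(src, dst): {label: weight}} directed edges with label weights
    seeds:  {vertex: {label: strength}} vertices whose values stay fixed
    """
    current = {v: dict(labels) for v, labels in values.items()}
    for _ in range(iterations):
        # Every update reads the previous snapshot, so the iteration is
        # synchronized even though this loop is sequential.
        nxt = {v: {} for v in current}
        for (src, dst), weights in edges.items():
            for label, weight in weights.items():
                contribution = weight * current[src].get(label, 0.0)
                nxt[dst][label] = nxt[dst].get(label, 0.0) + contribution
        for v, labels in seeds.items():
            nxt[v] = dict(labels)
        current = nxt
    return current
```

On a chain a → b → c with edge weights of 0.5 and a seeded with strength 1.0, the label reaches b after one iteration and c after two, attenuated at each hop.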

Type: Application

Filed: June 1, 2023

Publication date: September 28, 2023

Applicant: Google LLC

Inventors: Matthew H. Austern, James C. Dehnert, Aart J.C. Bik, Grzegorz J. Czajkowski, Grzegorz Malewicz
Time Series Forecasting

Publication number: 20230297583

Abstract: A method for time series forecasting includes receiving a time series forecasting query from a user requesting the data processing hardware to perform a plurality of time series forecasts. Each time series forecast is a forecast of future data based on respective current data. Simultaneously, for each time series forecast of the plurality of time series forecasts requested by the time series forecasting query, the method includes training a plurality of models for the respective time series forecast. The method also includes determining which model of the plurality of models best fits the respective time series forecast and forecasting the future data based on the determined best fitting model and the respective current data. The method also includes returning, to the user, the forecasted future data for each of the plurality of time series forecasts requested by the time series forecasting query.
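
The select-the-best-model step can be sketched with three toy candidates. The candidate models (`naive`, `mean_model`, `drift`), the holdout-based mean absolute error criterion, and all function names are assumptions for illustration; the abstract does not name the models or the fit measure, and this sketch processes the series one after another rather than simultaneously.

```python
def naive(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def mean_model(history, horizon):
    """Repeat the historical mean."""
    return [sum(history) / len(history)] * horizon


def drift(history, horizon):
    """Extend the straight line from the first to the last observation."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (step + 1) for step in range(horizon)]


MODELS = {"naive": naive, "mean": mean_model, "drift": drift}


def best_forecast(history, horizon, holdout=2):
    """Pick the candidate with the lowest holdout error, then forecast."""
    train, test = history[:-holdout], history[-holdout:]

    def error(model):
        preds = model(train, holdout)
        return sum(abs(p - t) for p, t in zip(preds, test)) / holdout

    name = min(MODELS, key=lambda n: error(MODELS[n]))
    # Refit on the full history before producing the final forecast.
    return name, MODELS[name](history, horizon)


def forecast_all(series_by_id, horizon):
    """Forecast every series named in one query."""
    return {sid: best_forecast(h, horizon) for sid, h in series_by_id.items()}
```

Each series needs at least `holdout + 2` points so that the drift candidate has two training observations to work with.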

Type: Application

Filed: May 25, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Xi Cheng, Amir H. Hormati, Lisa Yin, Umar Syed
Optimal Time-to-Event Modeling for Longitudinal Prediction of Open Entities

Publication number: 20230297899

Abstract: A method for optimal time-to-event (TTE) modeling includes obtaining a forecast request requesting performance of a TTE forecast forecasting an amount of time an event will occur after a starting point in time. The method includes obtaining a cutoff value representing an amount of time after the starting point in time that the event has not occurred. The method also includes forecasting, using an uncertainty forecasting model, the amount of time the event will occur after the starting point in time and updating the forecasted amount of time based on the cutoff value. The method also includes returning the updated forecasted amount of time the event will occur after the starting point in time.

Type: Application

Filed: March 14, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Jingtao Wang, Wangyang Zhang, Michael Peter Perrone
Deliberation by Text-Only and Semi-Supervised Training

Publication number: 20230298563

Abstract: A method of text-only and semi-supervised training for deliberation includes receiving training data including unspoken textual utterances that are each not paired with any corresponding spoken utterance of non-synthetic speech, and training a deliberation model that includes a text encoder and a deliberation decoder on the unspoken textual utterances. The method also includes receiving, at the trained deliberation model, first-pass hypotheses and non-causal acoustic embeddings. The first-pass hypotheses are generated by a recurrent neural network-transducer (RNN-T) decoder for the non-causal acoustic embeddings encoded by a non-causal encoder. The method also includes encoding, using the text encoder, the first-pass hypotheses generated by the RNN-T decoder, and generating, using the deliberation decoder attending to both the first-pass hypotheses and the non-causal acoustic embeddings, second-pass hypotheses.

Type: Application

Filed: March 18, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Ke Hu, Tara N. Sainath, Yanzhang He, Rohit Prabhavalkar, Sepand Mavandadi, Weiran Wang, Trevor Strohman
Hotphrase Triggering Based On A Sequence Of Detections

Publication number: 20230298588

Abstract: A method includes receiving audio data corresponding to an utterance spoken by the user and captured by the user device. The utterance includes a command for a digital assistant to perform an operation. The method also includes determining, using a hotphrase detector configured to detect each trigger word in a set of trigger words associated with a hotphrase, whether any of the trigger words in the set of trigger words are detected in the audio data during the corresponding fixed-duration time window. The method also includes identifying, in the audio corresponding to the utterance, the hotphrase when each other trigger word in the set of trigger words was also detected in the audio data. The method also includes triggering an automated speech recognizer to perform speech recognition on the audio data when the hotphrase is identified in the audio data corresponding to the utterance.

Type: Application

Filed: May 25, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Victor Carbune, Matthew Sharifi
Using Non-Parallel Voice Conversion for Speech Conversion Models

Publication number: 20230298565

Abstract: A method includes receiving a set of training utterances each including a non-synthetic speech representation of a corresponding utterance, and for each training utterance, generating a corresponding synthetic speech representation by using a voice conversion model. The non-synthetic speech representation and the synthetic speech representation form a corresponding training utterance pair. At each of a plurality of output steps for each training utterance pair, the method also includes generating, for output by a speech recognition model, a first probability distribution over possible non-synthetic speech recognition hypotheses for the non-synthetic speech representation and a second probability distribution over possible synthetic speech recognition hypotheses for the synthetic speech representation.

Type: Application

Filed: April 25, 2022

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Andrew M. Rosenberg, Gary Wang, Bhuvana Ramabhadran, Fadi Biadsy
Emotionally Intelligent Responses to Information Seeking Questions

Publication number: 20230298580

Abstract: A method for generating emotionally intelligent responses to information seeking questions includes receiving audio data corresponding to a query spoken by a user and captured by an assistant-enabled device associated with the user, and processing, using a speech recognition model, the audio data to determine a transcription of the query. The method also includes performing query interpretation on the transcription of the query to identify an emotional state of the user that spoke the query, and an action to perform. The method also includes obtaining a response preamble based on the emotional state of the user and performing the identified action to obtain information responsive to the query. The method further includes generating a response including the obtained response preamble followed by the information responsive to the query.

Type: Application

Filed: March 18, 2022

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Madelaine Plauché, Kate Beryl Berman
Optimizing Personal VAD for On-Device Speech Recognition

Publication number: 20230298591

Abstract: A computer-implemented method includes receiving a sequence of acoustic frames corresponding to an utterance and generating a reference speaker embedding for the utterance. The method also includes receiving a target speaker embedding for a target speaker and generating feature-wise linear modulation (FiLM) parameters including a scaling vector and a shifting vector based on the target speaker embedding. The method also includes generating an affine transformation output that scales and shifts the reference speaker embedding based on the FiLM parameters. The method also includes generating a classification output indicating whether the utterance was spoken by the target speaker based on the affine transformation output.
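
The feature-wise linear modulation (FiLM) step can be sketched in plain Python. The scaling and shifting vectors are produced here by two hypothetical linear maps (`w_scale`, `w_shift`) applied to the target speaker embedding; in a trained model these would be learned layers, and the classifier that consumes the modulated output is omitted.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def film_params(target_embedding, w_scale, w_shift):
    """Derive the scaling and shifting vectors from the target embedding.

    w_scale and w_shift are hypothetical stand-ins for learned weights.
    """
    return matvec(w_scale, target_embedding), matvec(w_shift, target_embedding)


def film_modulate(reference_embedding, scale, shift):
    """Affine transformation: scale * reference + shift, element by element."""
    return [g * r + b for g, r, b in zip(scale, reference_embedding, shift)]
```

The modulation is element-wise, so the scaling and shifting vectors must have the same length as the reference speaker embedding.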

Type: Application

Filed: March 17, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Shaojin Ding, Rajeev Rikhye, Qiao Liang, Yanzhang He, Quan Wang, Arun Narayanan, Tom O'Malley, Ian McGraw
Scalable Model Specialization Framework for Speech Model Personalization

Publication number: 20230298574

Abstract: A method for speech conversion includes obtaining a speech conversion model configured to convert input utterances of human speech directly into corresponding output utterances of synthesized speech. The method further includes receiving a speech conversion request including input audio data corresponding to an utterance spoken by a target speaker associated with atypical speech and a speaker identifier uniquely identifying the target speaker. The method includes activating, using the speaker identifier, a particular sub-model for biasing the speech conversion model to recognize a type of the atypical speech associated with the target speaker identified by the speaker identifier.

Type: Application

Filed: March 15, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Fadi Biadsy, Youzheng Chen, Xia Zhang, Oleg Rybakov, Andrew M. Rosenberg, Pedro J. Moreno Mengibar
Microphone Array Configuration Invariant, Streaming, Multichannel Neural Enhancement Frontend for Automatic Speech Recognition

Publication number: 20230298612

Abstract: A multichannel neural frontend speech enhancement model for speech recognition includes a speech cleaner, a stack of self-attention blocks each having a multi-headed self attention mechanism, and a masking layer. The speech cleaner receives, as input, a multichannel noisy input signal and a multichannel contextual noise signal, and generates, as output, a single channel cleaned input signal. The stack of self-attention blocks receives, as input, at an initial block of the stack of self-attention blocks, a stacked input including the single channel cleaned input signal and a single channel noisy input signal, and generates, as output, from a final block of the stack of self-attention blocks, an un-masked output. The masking layer receives, as input, the single channel noisy input signal and the un-masked output, and generates, as output, enhanced input speech features corresponding to a target utterance.

Type: Application

Filed: February 20, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Joseph Caroselli, Arun Narayanan, Tom O'Malley
Rare Word Recognition with LM-aware MWER Training

Publication number: 20230298570

Abstract: A method includes generating, using an audio encoder, a higher-order feature representation for each acoustic frame in a sequence of acoustic frames; generating, using a decoder, based on the higher-order feature representation, a plurality of speech recognition hypotheses, each hypothesis corresponding to a candidate transcription of an utterance and having an associated first likelihood score; generating, using an external language model, for each speech recognition hypothesis, a second likelihood score; determining, using a learnable fusion module, for each speech recognition hypothesis, a set of fusion weights based on the higher-order feature representation and the speech recognition hypothesis; and generating, using the learnable fusion module, for each speech recognition hypothesis, a third likelihood score based on the first likelihood score, the second likelihood score, and the set of fusion weights, the audio encoder and decoder trained using minimum additive error rate training in the presence of t

Type: Application

Filed: March 21, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Weiran Wang, Tongzhou Chen, Tara N. Sainath, Ehsan Variani, Rohit Prakash Prabhavalkar, Ronny Huang, Bhuvana Ramabhadran, Neeraj Gaur, Sepand Mavandadi, Charles Caleb Peyser, Trevor Strohman, Yanzhang He, David Rybach
End-to-End Streaming Keyword Spotting

Publication number: 20230298576

Abstract: A method for training hotword detection includes receiving a training input audio sequence including a sequence of input frames that define a hotword that initiates a wake-up process on a device. The method also includes feeding the training input audio sequence into an encoder and a decoder of a memorized neural network. Each of the encoder and the decoder of the memorized neural network includes sequentially-stacked single value decomposition filter (SVDF) layers. The method further includes generating a logit at each of the encoder and the decoder based on the training input audio sequence. For each of the encoder and the decoder, the method includes smoothing each respective logit generated from the training input audio sequence, determining a max pooling loss from a probability distribution based on each respective logit, and optimizing the encoder and the decoder based on all max pooling losses associated with the training input audio sequence.

Type: Application

Filed: May 23, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Raziel Alvarez Guevara, Hyun Jin Park, Patrick Violette
4-bit Conformer with Accurate Quantization Training for Speech Recognition

Publication number: 20230298569

Abstract: A method for training a model includes obtaining a plurality of training samples. Each respective training sample of the plurality of training samples includes a respective speech utterance and a respective textual utterance representing a transcription of the respective speech utterance. The method includes training, using quantization aware training with native integer operations, an automatic speech recognition (ASR) model on the plurality of training samples. The method also includes quantizing the trained ASR model to an integer target fixed-bit width. The quantized trained ASR model includes a plurality of weights. Each weight of the plurality of weights includes an integer with the target fixed-bit width. The method includes providing the quantized trained ASR model to a user device.
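
The quantization step in this abstract can be illustrated with a minimal sketch. It is not the method claimed in the application: it shows only plain symmetric rounding of already-trained weights to a fixed bit width and does not model quantization aware training. The function names and the per-tensor scale are assumptions made for illustration.

```python
from typing import List, Tuple


def quantize_symmetric(weights: List[float], bits: int = 4) -> Tuple[List[int], float]:
    """Map float weights to signed integers that fit in `bits` bits.

    Hypothetical helper: uses one scale for the whole list (per-tensor scaling).
    """
    qmax = 2 ** (bits - 1) - 1   # largest representable integer, 7 for 4 bits
    qmin = -qmax - 1             # smallest representable integer, -8 for 4 bits
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step, then clamp into the representable range.
    quantized = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    q, s = quantize_symmetric([-1.0, 0.25, 0.0, 1.0], bits=4)
    print(q, s)
    print(dequantize(q, s))
```

Each stored weight is then an integer in the range -8 to 7, and the reconstruction error per weight is at most half of one scale step.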

Type: Application

Filed: March 20, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Shaojin Ding, Oleg Rybakov, Phoenix Meadowlark, Shivani Agrawal, Yanzhang He, Lukasz Lew
Freeze Words

Publication number: 20230298575

Abstract: A method for detecting freeze words includes receiving audio data that corresponds to an utterance spoken by a user and captured by a user device associated with the user. The method also includes processing, using a speech recognizer, the audio data to determine that the utterance includes a query for a digital assistant to perform an operation. The speech recognizer is configured to trigger endpointing of the utterance after a predetermined duration of non-speech in the audio data. Before the predetermined duration of non-speech, the method includes detecting a freeze word in the audio data. In response to detecting the freeze word in the audio data, the method also includes triggering a hard microphone closing event at the user device. The hard microphone closing event prevents the user device from capturing any audio subsequent to the freeze word.

Type: Application

Filed: May 23, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Matthew Sharifi, Aleksandar Kracun
Generalized Automatic Speech Recognition for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation

Publication number: 20230298609

Abstract: A method for training a generalized automatic speech recognition model for joint acoustic echo cancellation, speech enhancement, and voice separation includes receiving a plurality of training utterances paired with corresponding training contextual signals. The training contextual signals include a training contextual noise signal including noise prior to the corresponding training utterance, a training reference audio signal, and a training speaker vector including voice characteristics of a target speaker that spoke the corresponding training utterance. The operations also include training, using a contextual signal dropout strategy, a contextual frontend processing model on the training utterances to learn how to predict enhanced speech features. Here, the contextual signal dropout strategy uses a predetermined probability to drop out each of the training contextual signals during training of the contextual frontend processing model.
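
A contextual signal dropout strategy of the kind described here could be sketched as follows. The abstract does not say what replaces a dropped signal, so zero-filling is an assumption, as are the function name and the dictionary layout of the signals.

```python
import random
from typing import Dict, List


def drop_context_signals(
    signals: Dict[str, List[float]],
    drop_prob: float,
    rng: random.Random,
) -> Dict[str, List[float]]:
    """Independently drop each contextual signal with probability `drop_prob`.

    Assumption: a dropped signal is replaced by zeros of the same length,
    so the model input keeps a fixed shape.
    """
    result: Dict[str, List[float]] = {}
    for name, values in signals.items():
        if rng.random() < drop_prob:
            result[name] = [0.0] * len(values)
        else:
            result[name] = list(values)
    return result


if __name__ == "__main__":
    example = {
        "noise_context": [0.1, 0.2, 0.3],
        "reference_audio": [0.5, 0.4],
        "speaker_vector": [1.0, -1.0, 0.5],
    }
    print(drop_context_signals(example, drop_prob=0.5, rng=random.Random(0)))
```

Applying this to every training example means the model sees all combinations of present and absent contextual signals over the course of training.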

Type: Application

Filed: February 19, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Tom O'Malley, Quan Wang, Arun Narayanan
Inter-Intra Prediction With Implicit Models

Publication number: 20230291925

Abstract: Video coding in accordance with an inter-intra prediction model may include coding an inter-prediction motion vector for a current block of a current frame, obtaining spatial block-context pixels oriented relative to the current block, generating an inter-prediction block, generating a corresponding set of reference block-context pixels oriented relative to the inter-prediction block, identifying inter-intra prediction parameters that correspond with minimizing error between the spatial block-context pixels and the reference block-context pixels, generating a prediction block for the current block by, for a current pixel of the current block, obtaining an inter-prediction pixel, determining a predictor for the current pixel using a combination of the inter-prediction pixel and the inter-intra prediction parameters, and including the predictor in the prediction block.

Type: Application

Filed: July 1, 2020

Publication date: September 14, 2023

Applicant: Google LLC

Inventors: Debargha Mukherjee, Yue Chen, Urvang Joshi, Sarah Parker, Elliott Karpilovsky, Hui Su
ECHO DETECTION

Publication number: 20230291840

Abstract: A method includes receiving a microphone audio signal and a playout audio signal, and determining a frequency representation of the microphone audio signal and a frequency representation of the playout audio signal. For each frequency representation, the method also includes determining features based on the frequency representation. Each feature corresponds to a pair of frequencies of the frequency representation and a period of time between the pair of frequencies. The method also includes determining that a match occurs between a first feature based on the frequency representation of the microphone audio signal and a second feature based on the frequency representation of the playout audio signal, and determining that a delay value between the first feature and the second feature corresponds to an echo within the microphone audio signal.
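
The feature matching outlined in this abstract can be sketched roughly as below. The sketch assumes spectral peaks have already been extracted as (frame index, frequency bin) pairs; the peak representation, the `max_gap` window, and the voting threshold are illustrative assumptions, not details from the application.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Peak = Tuple[int, int]              # (frame index, frequency bin)
Feature = Tuple[int, int, int]      # (first frequency, second frequency, frames between)


def pair_features(peaks: List[Peak], max_gap: int = 3) -> Dict[Feature, List[int]]:
    """Build features from pairs of peaks that are at most `max_gap` frames apart.

    Returns a map from feature to the frame indices where that feature starts.
    """
    ordered = sorted(peaks)
    features: Dict[Feature, List[int]] = {}
    for i, (t1, f1) in enumerate(ordered):
        for t2, f2 in ordered[i + 1:]:
            dt = t2 - t1
            if dt > max_gap:
                break               # peaks are time-sorted, so later ones are further away
            if dt > 0:
                features.setdefault((f1, f2, dt), []).append(t1)
    return features


def estimate_echo_delay(
    mic_peaks: List[Peak], playout_peaks: List[Peak], min_votes: int = 2
) -> Optional[int]:
    """Return the delay (in frames) supported by the most matching features, if any."""
    mic = pair_features(mic_peaks)
    playout = pair_features(playout_peaks)
    votes: Counter = Counter()
    for feature, playout_times in playout.items():
        for t_play in playout_times:
            for t_mic in mic.get(feature, []):
                if t_mic >= t_play:  # an echo cannot precede the playout
                    votes[t_mic - t_play] += 1
    if not votes:
        return None
    delay, count = votes.most_common(1)[0]
    return delay if count >= min_votes else None


if __name__ == "__main__":
    playout = [(0, 10), (1, 20), (2, 15), (4, 30)]
    mic = [(t + 5, f) for t, f in playout]   # same peaks, five frames later
    print(estimate_echo_delay(mic, playout))
```

A consistent delay across many matched features suggests that the microphone signal contains a delayed copy of the playout signal.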

Type: Application

Filed: May 19, 2023

Publication date: September 14, 2023

Applicant: Google LLC

Inventors: Alexandre Loiko, Marcus Wirebrand, Samuel Martin Zackrisson, Iva Creusen, Mans Gustaf Sebastian Ullberg, Alessio Bazzica, Daniel Johansson
LANGUAGE MODEL BIASING SYSTEM

Publication number: 20230290339

Abstract: Methods, systems, and apparatus for receiving audio data corresponding to a user utterance and context data, identifying an initial set of one or more n-grams from the context data, generating an expanded set of one or more n-grams based on the initial set of n-grams, adjusting a language model based at least on the expanded set of n-grams, determining one or more speech recognition candidates for at least a portion of the user utterance using the adjusted language model, adjusting a score for a particular speech recognition candidate determined to be included in the expanded set of n-grams, determining a transcription of user utterance that includes at least one of the one or more speech recognition candidates, and providing the transcription of the user utterance for output.

Type: Application

Filed: May 16, 2023

Publication date: September 14, 2023

Applicant: Google LLC

Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
CONTROLLING HEAD-MOUNTED DEVICE WITH GESTURES INTO WEARABLE DEVICE

Publication number: 20230280591

Abstract: A method performed by a head-mounted device can include, based on a front-facing camera included in the head-mounted device capturing an image of a wearable device, configuring the head-mounted device to receive input via the wearable device, determining that a gesture received by the wearable device includes a request to launch an application, and, in response to determining that the gesture includes the request to launch the application, launching the application.

Type: Application

Filed: March 2, 2022

Publication date: September 7, 2023

Applicant: Google LLC

Inventors: Dongeek Shin, Isaac Allen Fehr, Sean Kyungmok Bae, Ding Xu
Device Communications Using a Human Transmission Channel

Publication number: 20230283706

Abstract: This document describes techniques and apparatuses directed at device communications using a human transmission channel. In aspects, a computing device having an ultrasonic sensor is configured to receive ultrasonic signals transmitted through a physical medium associated with a user and convert the ultrasonic signal into a first electrical signal. Upon generating the first electrical signal, the computing device can execute commands included in the first electrical signal and/or transmit the commands to a network and devices wirelessly connected thereto. In so doing, the number of smart features can be reduced, and communications between computing devices can be employed using a human transmission channel without a pairing event.

Type: Application

Filed: March 1, 2022

Publication date: September 7, 2023

Applicant: Google LLC

Inventor: Alejandro Kauffmann
FILE OPERATION TASK OPTIMIZATION

Publication number: 20230281041

Abstract: A method includes receiving, by a data processing apparatus, a plurality of file operation requests, each file operation request including a priority, a deadline, and an operation type and representing a request to perform an operation on at least one file maintained in a distributed file system; identifying, by the data processing apparatus, a group of file operation requests to be executed together from the plurality of file operation requests, the identification based at least in part on at least one of: the file operations in the group of file operations being directed to a same storage system, or file operations in the group of file operations sharing a common operation type; and sending a request to execute the group of file operation requests to a system configured to perform the group of file operation requests.
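
The grouping step can be sketched as follows. The abstract allows grouping on a shared storage system or a shared operation type; this sketch keys on both at once, which is one possible instance. The `FileOpRequest` fields and the ordering inside each group are assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FileOpRequest:
    """Hypothetical request record; field names are illustrative."""
    path: str
    storage_system: str
    op_type: str
    priority: int
    deadline: float


def group_requests(
    requests: List[FileOpRequest],
) -> Dict[Tuple[str, str], List[FileOpRequest]]:
    """Group requests that target the same storage system and share an operation type."""
    groups: Dict[Tuple[str, str], List[FileOpRequest]] = defaultdict(list)
    for request in requests:
        groups[(request.storage_system, request.op_type)].append(request)
    # Assumption: within a group, earlier deadlines run first, then higher priority.
    for batch in groups.values():
        batch.sort(key=lambda r: (r.deadline, -r.priority))
    return dict(groups)


if __name__ == "__main__":
    pending = [
        FileOpRequest("/a", "cell-1", "delete", priority=1, deadline=30.0),
        FileOpRequest("/b", "cell-1", "delete", priority=5, deadline=10.0),
        FileOpRequest("/c", "cell-2", "copy", priority=2, deadline=20.0),
    ]
    for key, batch in group_requests(pending).items():
        print(key, [r.path for r in batch])
```

Each resulting group could then be sent as a single execution request to the system that performs the operations.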

Type: Application

Filed: May 13, 2023

Publication date: September 7, 2023

Applicant: Google LLC

Inventors: Chi Ma, Kenneth J. Goldman, Yonggang Zhao, Stephen P.G. Gildea
Contextually Relevant Suggestions

Publication number: 20230281205

Abstract: A method includes receiving a query requesting a digital assistant service to perform an action. The query includes a gesture-based query input by a user in response to the user performing a predetermined gesture detected by a gesture input device. The method also includes resolving a user intent of the query based on the predetermined gesture performed by the user, receiving a contextual signal associated with the user when the user performed the predetermined gesture, and generating a contextually-relevant response to the query based on the resolved user intent and the contextual signal.

Type: Application

Filed: March 1, 2022

Publication date: September 7, 2023

Applicant: Google LLC

Inventor: Ramprasad Sedouram
Structured Video Documents

Publication number: 20230281248

Abstract: A method includes receiving a content feed that includes audio data corresponding to speech utterances and processing the content feed to generate a semantically-rich, structured document. The structured document includes a transcription of the speech utterances and includes a plurality of words each aligned with a corresponding audio segment of the audio data that indicates a time when the word was recognized in the audio data. During playback of the content feed, the method also includes receiving a query from a user requesting information contained in the content feed and processing, by a large language model, the query and the structured document to generate a response to the query. The response conveys the requested information contained in the content feed. The method also includes providing, for output from a user device associated with the user, the response to the query.

Type: Application

Filed: March 2, 2023

Publication date: September 7, 2023

Applicant: Google LLC

Inventors: Johan Schalkwyk, Francoise Beaufays
SMART-HOME DEVICE PLACEMENT AND INSTALLATION USING AUGMENTED-REALITY VISUALIZATIONS

Publication number: 20230281935

Abstract: A method for guiding installation of smart-home devices may include capturing, by a camera of a mobile computing device, a view of an installation location for a smart-home device; rendering, by the mobile computing device, a view of a virtual object that represents a real-world obstruction that will interfere with the operation or installation of the smart-home device; and displaying, by the mobile computing device, the view of the virtual object that represents the real-world obstruction with the view of the installation location on the display of the mobile computing device.

Type: Application

Filed: February 28, 2023

Publication date: September 7, 2023

Applicant: Google LLC

Inventors: Adam Mittleman, Jason Chamberlain, Jacobi Grillo, Daniel Biran, Mark Kraz, Lauren Chanen, Daniel Foran, David Fichou, William Dong, Bao-Tram Phan Nguyen, Brian Silverstein, Yash Modi, Alex Finlayson, Dongeek Shin
Smart Device Management Resource Picker

Publication number: 20230281283

Abstract: A method for a smart device management resource picker includes receiving an authorization request from a third party. The authorization request requests access to a user resource managed by a device manager. The device manager manages access controls associated with a plurality of user devices, the access controls being configured by a user. The method also includes determining whether the third party is authorized to access the user resource managed by the device manager. When the third party is authorized to access the user resource managed by the device manager, the method includes determining whether the user has configured access controls at the device manager that govern the user resource subject to the authorization request. When the user has configured a respective access control that governs the user resource subject to the authorization request, the method includes communicating a response to the authorization request based on the respective access control.

Type: Application

Filed: May 15, 2023

Publication date: September 7, 2023

Applicant: Google LLC

Inventors: Vipul Modani, Matthew Marshall, Di Zhu, Prem Kumar
Adaptive Frequency Control in Integrated Circuits

Publication number: 20230280816

Abstract: This document describes systems and techniques for adaptive frequency control in integrated circuits. In response to operating conditions that permit a lower frequency of a clock signal, the described systems and techniques dynamically reduce the clock frequency without adjusting the frequency of an input clock signal. The clock frequency is decreased by gating a fraction of the input clock signal and stretching the ungated cycles by an offset amount. By dynamically adjusting the clock frequency in this manner, an integrated circuit can change its clock frequency more quickly and maintain the supply voltage closer to a lower voltage limit to reduce power consumption and allow safer operations.

Type: Application

Filed: July 27, 2020

Publication date: September 7, 2023

Applicant: Google LLC

Inventors: Derek James Basehore, Nick Sanders
Magnetic Sensing Device for Lid

Publication number: 20230280797

Abstract: The technology provides for a magnetic sensing device. The device includes a magnetic sensor configured to generate a first output triggered by a first polarity and a second output triggered by a second polarity. The device includes a first magnet, a second magnet, and a third magnet. The device may be configured such that, when the second magnet is not within a predetermined distance from the first magnet, a magnetic field from the first magnet having the first polarity causes the first output and the second output to have a first set of values. The device may be configured such that, when the second magnet is within the predetermined distance from the first magnet, a magnetic field from the third magnet having the second polarity causes the first output and the second output to have a second set of values.

Type: Application

Filed: March 13, 2023

Publication date: September 7, 2023

Applicant: Google LLC

Inventors: Yao Ding, Hui Li
Contextual Biasing for Speech Recognition

Publication number: 20230274736

Abstract: A method of biasing speech recognition includes receiving audio data encoding an utterance and obtaining a set of one or more biasing phrases corresponding to a context of the utterance. Each biasing phrase in the set of one or more biasing phrases includes one or more words. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data and grapheme and phoneme data derived from the set of one or more biasing phrases to generate an output of the speech recognition model. The method also includes determining a transcription for the utterance based on the output of the speech recognition model.

Type: Application

Filed: May 4, 2023

Publication date: August 31, 2023

Applicant: Google LLC

Inventors: Rohit Prakash Prabhavalkar, Golan Pundak, Tara N. Sainath, Antoine Jean Bruguier
User Equipment-Coordination Set Full-Duplex Communication

Publication number: 20230276424

Abstract: Methods, devices, systems, and means for coordinating full-duplex communication are described in which a user equipment, UE, configured as a coordinating user equipment for a user equipment-coordination set, UECS, selects a first subset of UEs in the UECS to jointly receive downlink signals and selects a second subset of UEs in the UECS to jointly transmit uplink signals. The coordinating UE receives uplink data to transmit to the network entity and receives, from the first subset of UEs, demodulated and sampled downlink data that is received concurrently with joint-transmission of the uplink data. The coordinating UE combines the samples received from each UE in the first subset of UEs and jointly processes the combined samples to provide decoded data using the received uplink data to cancel crosstalk of uplink signals for the transmitted uplink data to downlink signals to the received downlink data.

Type: Application

Filed: July 1, 2021

Publication date: August 31, 2023

Applicant: Google LLC

Inventors: Jibing Wang, Erik Richard Stauffer
Processing Sensor Data with Multi-Model System on Resource-Constrained Device

Publication number: 20230274147

Abstract: Methods, systems, and computer-readable media for multi-model processing on resource-constrained devices. A resource-constrained device can determine, based on a battery-life for a battery of the device, whether to process input through a first model or a second model. The first model can be a gating model that is more energy efficient to execute, and the second model can be a main model that is more accurate than the gating model. Depending on the current battery-life and/or other criteria, the system can process, through the gating model, sensor input that can record activity performed by a user of the resource-constrained device. If the gating model predicts an activity performed by the user that is recorded by the sensor data, the device can process the same or additional input through the main model. Overall power consumption can be reduced with a minimum accuracy maintained over processing input only through the main model.
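
The two-stage processing described in this abstract might look like the sketch below. The abstract leaves the exact battery policy open, so the rule used here (run the main model directly when the battery is above a threshold, otherwise gate first) is an assumption, as are the function name, the callable model interfaces, and the threshold value.

```python
from typing import Callable, Optional, Sequence


def classify_activity(
    samples: Sequence[float],
    gating_model: Callable[[Sequence[float]], bool],
    main_model: Callable[[Sequence[float]], str],
    battery_fraction: float,
    low_battery_threshold: float = 0.2,
) -> Optional[str]:
    """Run the cheap gating model first when battery is low.

    Assumption: with enough battery the main model runs directly; otherwise
    the main model runs only if the gating model predicts an activity.
    """
    if battery_fraction >= low_battery_threshold:
        return main_model(samples)
    if not gating_model(samples):
        return None                 # nothing detected, skip the expensive model
    return main_model(samples)


if __name__ == "__main__":
    # Stand-in models for illustration only.
    def gate(xs: Sequence[float]) -> bool:
        return max(xs) > 0.5

    def main(xs: Sequence[float]) -> str:
        return "running" if sum(xs) > 2 else "walking"

    print(classify_activity([0.1, 0.2], gate, main, battery_fraction=0.1))
    print(classify_activity([0.9, 0.8, 0.7], gate, main, battery_fraction=0.1))
```

The saving comes from the low-battery path, where most inputs are rejected by the gating model and never reach the main model.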

Type: Application

Filed: May 5, 2023

Publication date: August 31, 2023

Applicant: Google LLC

Inventors: Chun-Te Chu, Claire Jaja, Kara Vaillancourt, Oleg Veryovka
Controlling Expressivity In End-to-End Speech Synthesis Systems

Publication number: 20230274728

Abstract: A system for generating an output audio signal includes a context encoder, a text-prediction network, and a text-to-speech (TTS) model. The context encoder is configured to receive one or more context features associated with current input text and process the one or more context features to generate a context embedding associated with the current input text. The text-prediction network is configured to process the current input text and the context embedding to predict, as output, a style embedding for the current input text. The style embedding specifies a specific prosody and/or style for synthesizing the current input text into expressive speech. The TTS model is configured to process the current input text and the style embedding to generate an output audio signal of expressive speech of the current input text. The output audio signal has the specific prosody and/or style specified by the style embedding.

Type: Application

Filed: May 9, 2023

Publication date: August 31, 2023

Applicant: Google LLC

Inventors: Daisy Stanton, Eric Dean Battenberg, Russell John Wyatt Skerry-Ryan, Soroosh Mariooryad, David Teh-hwa Kao, Thomas Edward Bagby, Sean Matthew Shannon
