Patents by Inventor Ananda Theertha Suresh

Ananda Theertha Suresh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Fast Speculative Decoding Using Multiple Parallel Drafts

Publication number: 20250209355

Abstract: Systems and methods are provided for low-latency autoregressive generation of sequence output based on a plurality of parallel draft sequences. A lower-latency machine-learned model (e.g., having a smaller number of parameters than a model of interest) can generate a plurality of draft sequences comprising a plurality of draft tokens per sequence. A machine-learned model of interest (e.g., having a high latency per token) can evaluate a plurality of respective conditional probabilities for the respective draft tokens in parallel. An output sequence comprising one or more accepted draft tokens, corrected tokens, and/or additional tokens can be generated based on the draft tokens and conditional probabilities.

Type: Application

Filed: December 20, 2024

Publication date: June 26, 2025

Inventors: Ziteng Sun, Ananda Theertha Suresh, Jae Hun Ro, Ahmad Beirami, Himanshu Jain, Xinnan Yu, Michael Dennis Riley, Sanjiv Kumar
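
The abstract describes a target model scoring draft tokens and emitting accepted or corrected tokens. As a rough illustration only, the following sketch shows the widely used single-draft accept/reject rule from speculative sampling. It is not the multi-draft selection procedure of this application, which the abstract does not spell out. The function names and the list-of-probabilities representation are assumptions made for this example.

```python
import random


def accept_or_correct(p, q, draft_token, rng):
    """Return (token, accepted) for one draft token.

    p: target-model probabilities over the vocabulary (list of floats).
    q: draft-model probabilities over the same vocabulary.
    """
    if q[draft_token] <= 0.0:
        raise ValueError("draft token has zero probability under the draft model")
    # Accept the draft token with probability min(1, p / q).
    if rng.random() < min(1.0, p[draft_token] / q[draft_token]):
        return draft_token, True
    # On rejection, emit a corrected token drawn from the residual max(p - q, 0).
    residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    token = rng.choices(range(len(p)), weights=residual)[0]
    return token, False


def verify_draft(draft_tokens, target_probs, draft_probs, rng):
    """Check draft tokens left to right and stop at the first rejection."""
    output = []
    for tok, p, q in zip(draft_tokens, target_probs, draft_probs):
        token, accepted = accept_or_correct(p, q, tok, rng)
        output.append(token)
        if not accepted:
            break
    return output
```

In practice the target probabilities for all draft positions would come from one parallel forward pass of the larger model; here they are passed in as plain lists.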

Systems and methods for communication efficient distributed mean estimation

Patent number: 12219004

Abstract: The present disclosure provides systems and methods for communication efficient distributed mean estimation. In particular, aspects of the present disclosure can be implemented by a system in which a number of vectors reside on a number of different clients, and a centralized server device seeks to estimate the mean of such vectors. According to one aspect of the present disclosure, a client computing device can rotate a vector by a random rotation matrix and then subsequently perform probabilistic quantization on the rotated vector. According to another aspect of the present disclosure, subsequent to quantization but prior to transmission, the client computing device can encode the quantized vector according to a variable length coding scheme (e.g., by computing variable length codes).

Type: Grant

Filed: August 31, 2023

Date of Patent: February 4, 2025

Assignee: GOOGLE LLC

Inventors: Ananda Theertha Suresh, Sanjiv Kumar, Hugh Brendan McMahan, Xinnan Yu
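
The rotate-then-quantize idea in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the random rotation is taken to be a normalized Walsh-Hadamard transform combined with random sign flips (one common choice, not confirmed by the abstract), the quantizer uses two levels per vector, and the variable length coding step is omitted. All function names are hypothetical.

```python
import math
import random


def fwht(v):
    """Normalized fast Walsh-Hadamard transform; len(v) must be a power of two."""
    v = list(v)
    n = len(v)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [x * scale for x in v]


def rotate(v, signs):
    """Apply the rotation R = H D, where D is a diagonal matrix of random signs."""
    return fwht([x * s for x, s in zip(v, signs)])


def unrotate(z, signs):
    """Apply R^-1 = D H (normalized H and D are each their own inverse)."""
    return [x * s for x, s in zip(fwht(z), signs)]


def stochastic_binary_quantize(v, rng):
    """Round each coordinate to min(v) or max(v) so that its expectation is unchanged."""
    lo, hi = min(v), max(v)
    if hi == lo:
        return list(v)
    return [hi if rng.random() < (x - lo) / (hi - lo) else lo for x in v]


def estimate_mean(client_vectors, signs, rng):
    """Server side: average the quantized rotated vectors, then undo the rotation."""
    total = [0.0] * len(signs)
    for v in client_vectors:
        quantized = stochastic_binary_quantize(rotate(v, signs), rng)
        total = [t + x for t, x in zip(total, quantized)]
    average = [t / len(client_vectors) for t in total]
    return unrotate(average, signs)
```

The clients and the server must share the same sign vector, for example by deriving it from a shared seed. Each client then only needs to send the two levels and one bit per coordinate.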

Sampled softmax with Random Fourier features

Patent number: 12205005

Abstract: Systems and methods for low bias negative sampling of classes according to the sampled softmax method are described herein. The systems and methods can include training a machine-learned model for classifying inputs into one or more classes of a plurality of classes, each of the plurality of classes having an associated class embedding in a plurality of class embeddings. The systems and methods can include selecting, by the one or more computing devices, one or more negative classes from the plurality of classes based at least in part on a probability distribution approximating a softmax distribution, wherein the probability distribution is determined based at least in part on a Random Fourier Features map.

Type: Grant

Filed: July 17, 2020

Date of Patent: January 21, 2025

Assignee: GOOGLE LLC

Inventors: Xinnan Yu, Ankit Singh Rawat, Jiecao Chen, Ananda Theertha Suresh, Sanjiv Kumar
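
To make the role of a Random Fourier Features map concrete, the sketch below builds a standard cosine/sine feature map whose inner products approximate a unit-bandwidth Gaussian kernel. For unit-norm vectors h and c, exp(h·c) equals e times that Gaussian kernel value, which is why such a map can stand in for softmax scores. How the sampling distribution is actually formed is not given in the abstract, so that step is not shown; the function names here are illustrative assumptions.

```python
import math
import random


def make_frequencies(dim, num_features, rng):
    """Draw num_features random frequency vectors w ~ N(0, I)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_features)]


def rff_map(x, freqs):
    """Map x to [cos(w.x), sin(w.x)] / sqrt(D) for each frequency vector w."""
    scale = 1.0 / math.sqrt(len(freqs))
    feats = []
    for w in freqs:
        t = sum(wi * xi for wi, xi in zip(w, x))
        feats.append(scale * math.cos(t))
        feats.append(scale * math.sin(t))
    return feats


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def gaussian_kernel(x, y):
    """Exact kernel exp(-||x - y||^2 / 2) that the feature map approximates."""
    return math.exp(-0.5 * sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
```

With a few thousand features, `dot(rff_map(x, freqs), rff_map(y, freqs))` is typically close to `gaussian_kernel(x, y)`; the approximation error shrinks as the number of features grows.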

Structured orthogonal random features for kernel-based machine learning

Patent number: 12079700

Abstract: Techniques of generating input for a kernel-based machine learning system that uses a kernel to perform classification operations on data involve generating unbiased estimators for Gaussian kernels according to a new framework called Structured Orthogonal Random Features (SORF). The unbiased estimator K_SORF of the kernel involves a linear transformation matrix W_SORF computed using products of a set of pairs of matrices, each pair including an orthogonal matrix and a respective diagonal matrix whose elements are real numbers following a specified probability distribution. Typically, the orthogonal matrix is a Walsh-Hadamard matrix, the specified probability distribution is a Rademacher distribution, and there are at least two, usually three, pairs of matrices multiplied together to form the linear transformation matrix W_SORF.

Type: Grant

Filed: October 25, 2017

Date of Patent: September 3, 2024

Assignee: GOOGLE LLC

Inventors: Daniel Holtmann-Rice, Sanjiv Kumar, Xinnan Yu, Krzysztof Marcin Choromanski, Ananda Theertha Suresh
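
The product of Walsh-Hadamard and Rademacher diagonal matrices described in the abstract can be applied without ever forming a dense matrix. The sketch below assumes three pairs, W = (sqrt(d) / sigma) · H D1 H D2 H D3, with H normalized; the sqrt(d) / sigma scaling and the function names are assumptions for this example, since the abstract does not state them.

```python
import math
import random


def hadamard(v):
    """Normalized fast Walsh-Hadamard transform; len(v) must be a power of two."""
    v = list(v)
    n = len(v)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [x * scale for x in v]


def rademacher(d, rng):
    """Diagonal of a Rademacher matrix: independent +1 / -1 entries."""
    return [rng.choice((-1.0, 1.0)) for _ in range(d)]


def sorf_transform(x, sign_blocks, sigma=1.0):
    """Compute (sqrt(d) / sigma) * H D1 H D2 H D3 x for sign_blocks = [D1, D2, D3]."""
    v = list(x)
    # Matrices act right to left, so the last diagonal block is applied first.
    for signs in reversed(sign_blocks):
        v = hadamard([s * vi for s, vi in zip(signs, v)])
    c = math.sqrt(len(x)) / sigma
    return [c * vi for vi in v]
```

Each pair costs O(d log d) time, compared with O(d^2) for multiplying by a dense random matrix. The transformed vector would then be passed through cosine and sine features to estimate the kernel.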

Systems and Methods for Communication Efficient Distributed Mean Estimation

Publication number: 20240098138

Abstract: The present disclosure provides systems and methods for communication efficient distributed mean estimation. In particular, aspects of the present disclosure can be implemented by a system in which a number of vectors reside on a number of different clients, and a centralized server device seeks to estimate the mean of such vectors. According to one aspect of the present disclosure, a client computing device can rotate a vector by a random rotation matrix and then subsequently perform probabilistic quantization on the rotated vector. According to another aspect of the present disclosure, subsequent to quantization but prior to transmission, the client computing device can encode the quantized vector according to a variable length coding scheme (e.g., by computing variable length codes).

Type: Application

Filed: August 31, 2023

Publication date: March 21, 2024

Inventors: Ananda Theertha Suresh, Sanjiv Kumar, Hugh Brendan McMahan, Xinnan Yu

Exponential modeling with deep learning features

Patent number: 11922322

Abstract: Aspects of the present disclosure enable humanly-specified relationships to contribute to a mapping that enables compression of the output structure of a machine-learned model. An exponential model such as a maximum entropy model can leverage a machine-learned embedding and the mapping to produce a classification output. In such fashion, the feature discovery capabilities of machine-learned models (e.g., deep networks) can be synergistically combined with relationships developed based on human understanding of the structural nature of the problem to be solved, thereby enabling compression of model output structures without significant loss of accuracy. These compressed models provide improved applicability to “on device” or other resource-constrained scenarios.

Type: Grant

Filed: January 30, 2023

Date of Patent: March 5, 2024

Assignee: GOOGLE LLC

Inventors: Mitchel Weintraub, Ananda Theertha Suresh, Ehsan Variani

Multiscale quantization for fast similarity search

Patent number: 11874866

Abstract: The present disclosure provides systems and methods that include or otherwise leverage use of a multiscale quantization model that is configured to provide a quantized dataset. In particular, the multiscale quantization model can receive and perform vector quantization of a first dataset. The multiscale quantization model can generate a residual dataset based at least in part on a result of the vector quantization. The multiscale quantization model can apply a rotation matrix to the residual dataset to generate a rotated residual dataset that includes a plurality of rotated residuals. The multiscale quantization model can perform reparameterization of each rotated residual in the rotated residual dataset into a direction component and a scale component. The multiscale quantization model can perform product quantization of the direction components of the plurality of rotated residuals, and perform scalar quantization of the scale components of the plurality of rotated residuals.

Type: Grant

Filed: December 14, 2022

Date of Patent: January 16, 2024

Assignee: GOOGLE LLC

Inventors: Xiang Wu, David Simcha, Daniel Holtmann-Rice, Sanjiv Kumar, Ananda Theertha Suresh, Ruiqi Guo, Xinnan Yu
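
Part of the pipeline in the abstract can be sketched in a few lines: vector quantization, the residual, and the split of that residual into a direction and a scale. The rotation step and the product quantization of the direction are left out to keep the example short. The codebook, the scale levels, and all function names are hypothetical.

```python
import math


def nearest(v, codebook):
    """Index of the codebook entry closest to v in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(v, codebook[i])),
    )


def residual(v, center):
    """Difference between a vector and its assigned codebook entry."""
    return [a - b for a, b in zip(v, center)]


def split_direction_scale(r):
    """Reparameterize r as (unit direction, scale) so that r = scale * direction."""
    scale = math.sqrt(sum(x * x for x in r))
    if scale == 0.0:
        return list(r), 0.0
    return [x / scale for x in r], scale


def quantize_scale(scale, levels):
    """Scalar quantization: snap the scale to the closest allowed level."""
    return min(levels, key=lambda level: abs(level - scale))
```

Separating the scale means the direction codebook only has to cover unit vectors, whatever the spread of residual norms in the dataset.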

Systems and methods for communication efficient distributed mean estimation

Patent number: 11785073

Abstract: The present disclosure provides systems and methods for communication efficient distributed mean estimation. In particular, aspects of the present disclosure can be implemented by a system in which a number of vectors reside on a number of different clients, and a centralized server device seeks to estimate the mean of such vectors. According to one aspect of the present disclosure, a client computing device can rotate a vector by a random rotation matrix and then subsequently perform probabilistic quantization on the rotated vector. According to another aspect of the present disclosure, subsequent to quantization but prior to transmission, the client computing device can encode the quantized vector according to a variable length coding scheme (e.g., by computing variable length codes).

Type: Grant

Filed: October 15, 2021

Date of Patent: October 10, 2023

Assignee: GOOGLE LLC

Inventors: Ananda Theertha Suresh, Sanjiv Kumar, Hugh Brendan McMahan, Xinnan Yu

Exponential Modeling with Deep Learning Features

Publication number: 20230186096

Abstract: Aspects of the present disclosure enable humanly-specified relationships to contribute to a mapping that enables compression of the output structure of a machine-learned model. An exponential model such as a maximum entropy model can leverage a machine-learned embedding and the mapping to produce a classification output. In such fashion, the feature discovery capabilities of machine-learned models (e.g., deep networks) can be synergistically combined with relationships developed based on human understanding of the structural nature of the problem to be solved, thereby enabling compression of model output structures without significant loss of accuracy. These compressed models provide improved applicability to “on device” or other resource-constrained scenarios.

Type: Application

Filed: January 30, 2023

Publication date: June 15, 2023

Inventors: Mitchel Weintraub, Ananda Theertha Suresh, Ehsan Variani

PRIVACY-PRESERVING MACHINE LEARNING MODEL IMPLEMENTATIONS

Publication number: 20230130021

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing privacy-preserving machine learning models (e.g., neural networks) in secure multi-party computing environments. Methods can include computing an output of a particular layer of a neural network deployed in a two-computing-system environment using a cosine activation function.

Type: Application

Filed: October 26, 2022

Publication date: April 27, 2023

Inventors: Wittawat Jitkrittum, Michal Mateusz Lukasik, Ananda Theertha Suresh, Xinnan Yu, Gang Wang

Multiscale Quantization for Fast Similarity Search

Publication number: 20230123941

Abstract: The present disclosure provides systems and methods that include or otherwise leverage use of a multiscale quantization model that is configured to provide a quantized dataset. In particular, the multiscale quantization model can receive and perform vector quantization of a first dataset. The multiscale quantization model can generate a residual dataset based at least in part on a result of the vector quantization. The multiscale quantization model can apply a rotation matrix to the residual dataset to generate a rotated residual dataset that includes a plurality of rotated residuals. The multiscale quantization model can perform reparameterization of each rotated residual in the rotated residual dataset into a direction component and a scale component. The multiscale quantization model can perform product quantization of the direction components of the plurality of rotated residuals, and perform scalar quantization of the scale components of the plurality of rotated residuals.

Type: Application

Filed: December 14, 2022

Publication date: April 20, 2023

Inventors: Xiang Wu, David Simcha, Daniel Holtmann-Rice, Sanjiv Kumar, Ananda Theertha Suresh, Ruiqi Guo, Xinnan Yu

Exponential modeling with deep learning features

Patent number: 11568260

Abstract: Aspects of the present disclosure enable humanly-specified relationships to contribute to a mapping that enables compression of the output structure of a machine-learned model. An exponential model such as a maximum entropy model can leverage a machine-learned embedding and the mapping to produce a classification output. In such fashion, the feature discovery capabilities of machine-learned models (e.g., deep networks) can be synergistically combined with relationships developed based on human understanding of the structural nature of the problem to be solved, thereby enabling compression of model output structures without significant loss of accuracy. These compressed models provide improved applicability to “on device” or other resource-constrained scenarios.

Type: Grant

Filed: October 16, 2019

Date of Patent: January 31, 2023

Assignee: GOOGLE LLC

Inventors: Mitchel Weintraub, Ananda Theertha Suresh, Ehsan Variani

Multiscale quantization for fast similarity search

Patent number: 11531695

Abstract: The present disclosure provides systems and methods that include or otherwise leverage use of a multiscale quantization model that is configured to provide a quantized dataset. In particular, the multiscale quantization model can receive and perform vector quantization of a first dataset. The multiscale quantization model can generate a residual dataset based at least in part on a result of the vector quantization. The multiscale quantization model can apply a rotation matrix to the residual dataset to generate a rotated residual dataset that includes a plurality of rotated residuals. The multiscale quantization model can perform reparameterization of each rotated residual in the rotated residual dataset into a direction component and a scale component. The multiscale quantization model can perform product quantization of the direction components of the plurality of rotated residuals, and perform scalar quantization of the scale components of the plurality of rotated residuals.

Type: Grant

Filed: May 14, 2018

Date of Patent: December 20, 2022

Assignee: GOOGLE LLC

Inventors: Xiang Wu, David Simcha, Daniel Holtmann-Rice, Sanjiv Kumar, Ananda Theertha Suresh, Ruiqi Guo, Xinnan Yu

Regression and Time Series Forecasting

Publication number: 20220383145

Abstract: A method for regression and time series forecasting includes obtaining a set of hierarchical time series, each time series in the set of hierarchical time series including a plurality of time series data values. The method includes determining, using the set of hierarchical time series, a basis regularization of the set of hierarchical time series and an embedding regularization of the set of hierarchical time series. The method also includes training a model using the set of hierarchical time series and a loss function based on the basis regularization and the embedding regularization. The method includes forecasting, using the trained model and one of the time series in the set of hierarchical time series, an expected time series data value in the one of the time series.

Type: Application

Filed: May 25, 2022

Publication date: December 1, 2022

Applicant: Google LLC

Inventors: Rajat Sen, Shuxin Nie, Yaguang Li, Abhimanyu Das, Nicolas Loeff, Ananda Theertha Suresh, Pranjal Awasthi, Biswajit Paria

Systems and Methods for Communication Efficient Distributed Mean Estimation

Publication number: 20220046082

Abstract: The present disclosure provides systems and methods for communication efficient distributed mean estimation. In particular, aspects of the present disclosure can be implemented by a system in which a number of vectors reside on a number of different clients, and a centralized server device seeks to estimate the mean of such vectors. According to one aspect of the present disclosure, a client computing device can rotate a vector by a random rotation matrix and then subsequently perform probabilistic quantization on the rotated vector. According to another aspect of the present disclosure, subsequent to quantization but prior to transmission, the client computing device can encode the quantized vector according to a variable length coding scheme (e.g., by computing variable length codes).

Type: Application

Filed: October 15, 2021

Publication date: February 10, 2022

Inventors: Ananda Theertha Suresh, Sanjiv Kumar, Hugh Brendan McMahan, Xinnan Yu

Systems and methods for communication efficient distributed mean estimation

Patent number: 11196800

Abstract: The present disclosure provides systems and methods for communication efficient distributed mean estimation. In particular, aspects of the present disclosure can be implemented by a system in which a number of vectors reside on a number of different clients, and a centralized server device seeks to estimate the mean of such vectors. According to one aspect of the present disclosure, a client computing device can rotate a vector by a random rotation matrix and then subsequently perform probabilistic quantization on the rotated vector. According to another aspect of the present disclosure, subsequent to quantization but prior to transmission, the client computing device can encode the quantized vector according to a variable length coding scheme (e.g., by computing variable length codes).

Type: Grant

Filed: September 19, 2017

Date of Patent: December 7, 2021

Assignee: Google LLC

Inventors: Ananda Theertha Suresh, Sanjiv Kumar, Hugh Brendan McMahan, Xinnan Yu

PRIVACY PRESERVING MACHINE LEARNING MODEL TRAINING

Publication number: 20210049298

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for privacy preserving training of a machine learning model.

Type: Application

Filed: August 14, 2020

Publication date: February 18, 2021

Inventors: Ananda Theertha Suresh, Xinnan Yu, Sanjiv Kumar, Sashank Jakkam Reddi, Venkatadheeraj Pichapati

Sampled Softmax with Random Fourier Features

Publication number: 20210019654

Abstract: Systems and methods for low bias negative sampling of classes according to the sampled softmax method are described herein. The systems and methods can include training a machine-learned model for classifying inputs into one or more classes of a plurality of classes, each of the plurality of classes having an associated class embedding in a plurality of class embeddings. The systems and methods can include selecting, by the one or more computing devices, one or more negative classes from the plurality of classes based at least in part on a probability distribution approximating a softmax distribution, wherein the probability distribution is determined based at least in part on a Random Fourier Features map.

Type: Application

Filed: July 17, 2020

Publication date: January 21, 2021

Inventors: Xinnan Yu, Ankit Singh Rawat, Jiecao Chen, Ananda Theertha Suresh, Sanjiv Kumar

Hierarchical quantization for fast inner product search

Patent number: 10719509

Abstract: Implementations provide an efficient system for calculating inner products between high-dimensionality vectors. An example method includes clustering database items represented as vectors, selecting a cluster center for each cluster, and storing the cluster center as an entry in a first layer codebook. The method also includes, for each database item, calculating a residual based on the cluster center for the cluster the database item is assigned to and projecting the residual into subspaces. The method also includes determining, for each of the subspaces, an entry in a second layer codebook for the subspace, and storing the entry in the first layer codebook and the respective entry in the second layer codebook for each of the subspaces as a quantized vector for the database item. The entry can be used to categorize an item represented by a query vector or to provide database items responsive to a query vector.

Type: Grant

Filed: October 11, 2016

Date of Patent: July 21, 2020

Assignee: GOOGLE LLC

Inventors: Sanjiv Kumar, David Morris Simcha, Ananda Theertha Suresh, Ruiqi Guo, Xinnan Yu, Daniel Holtmann-Rice
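
The two-layer encoding in the abstract lends itself to a short sketch: a first-layer entry for the cluster center, plus one second-layer entry per subspace of the residual. The example assumes the cluster centers and subspace codebooks already exist (the clustering and codebook training steps are not shown), and that subspaces are equal-width contiguous slices. All names are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def nearest(v, codebook):
    """Index of the codebook entry closest to v in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(v, codebook[i])),
    )


def encode(item, centers, sub_codebooks):
    """Return (center index, [one codeword index per subspace]) for a database item."""
    c = nearest(item, centers)
    r = [a - b for a, b in zip(item, centers[c])]
    width = len(item) // len(sub_codebooks)
    codes = [
        nearest(r[k * width:(k + 1) * width], codebook)
        for k, codebook in enumerate(sub_codebooks)
    ]
    return c, codes


def approx_inner_product(query, code, centers, sub_codebooks):
    """Approximate query . item as query . center plus per-subspace codeword terms."""
    c, codes = code
    width = len(query) // len(sub_codebooks)
    total = dot(query, centers[c])
    for k, (codebook, j) in enumerate(zip(sub_codebooks, codes)):
        total += dot(query[k * width:(k + 1) * width], codebook[j])
    return total
```

For a fixed query, the per-subspace dot products against every codeword can be computed once and reused, so scoring each database item reduces to a handful of table lookups and additions.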

Multiscale Quantization for Fast Similarity Search

Publication number: 20200183964

Abstract: The present disclosure provides systems and methods that include or otherwise leverage use of a multiscale quantization model that is configured to provide a quantized dataset. In particular, the multiscale quantization model can receive and perform vector quantization of a first dataset. The multiscale quantization model can generate a residual dataset based at least in part on a result of the vector quantization. The multiscale quantization model can apply a rotation matrix to the residual dataset to generate a rotated residual dataset that includes a plurality of rotated residuals. The multiscale quantization model can perform reparameterization of each rotated residual in the rotated residual dataset into a direction component and a scale component. The multiscale quantization model can perform product quantization of the direction components of the plurality of rotated residuals, and perform scalar quantization of the scale components of the plurality of rotated residuals.

Type: Application

Filed: May 14, 2018

Publication date: June 11, 2020

Inventors: Xiang Wu, David Simcha, Daniel Holtmann-Rice, Sanjiv Kumar, Ananda Theertha Suresh, Ruiqi Guo, Xinnan Yu
