Patents by Inventor Christopher Lott

Christopher Lott has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

EFFICIENT SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

Publication number: 20250245430

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for efficiently generating a response to a query input in a generative artificial intelligence model. An example method generally includes generating, based on an input prompt and using a first machine learning model, a set of tokens including one or more subsets of tokens. Each respective subset of the one or more subsets corresponds to a respective portion of a response to the input prompt and includes a fixed number of tokens corresponding to a beam width for a beam search through the set of tokens. The set of tokens is output to a second machine learning model for verification, and information identifying a selected sequence of tokens from the generated set of tokens is received from the second machine learning model. The selected sequence of tokens is output as the response to the input prompt.

Type: Application

Filed: January 26, 2024

Publication date: July 31, 2025

Inventors: Wonseok JEON, Mukul GAGRANI, Mingu LEE, Raghavv GOEL, Junyoung PARK, Christopher LOTT

Speculative decoding in autoregressive generative artificial intelligence models

Patent number: 12373494

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
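The selection step described in this abstract can be illustrated with a small sketch. This is one plausible reading, not the claimed method: it assumes the candidates are single tokens drafted from one draft distribution, that each candidate is accepted with probability min(1, target/draft), and that "recursive adjustment of a target distribution" means replacing the target with the normalized positive part of (target - draft) after each rejection. The names `residual`, `sample`, and `select_token` are hypothetical.

```python
import random

def residual(target, draft):
    """Adjusted target after a rejection: normalize max(target - draft, 0)."""
    diff = [max(t - d, 0.0) for t, d in zip(target, draft)]
    total = sum(diff)
    if total == 0.0:
        # target == draft everywhere; nothing to adjust.
        return list(target)
    return [x / total for x in diff]

def sample(dist, u):
    """Inverse-CDF sampling from dist using a uniform draw u in [0, 1)."""
    cumulative = 0.0
    for token, prob in enumerate(dist):
        cumulative += prob
        if u < cumulative:
            return token
    return len(dist) - 1

def select_token(target, draft, candidates, rng):
    """Try each drafted candidate in turn against a recursively adjusted target."""
    for token in candidates:
        # Candidates are assumed to be drawn from draft, so draft[token] > 0.
        accept_prob = min(1.0, target[token] / draft[token])
        if rng.random() < accept_prob:
            return token
        # Rejected: adjust the target before testing the next candidate.
        target = residual(target, draft)
    # Every candidate was rejected: fall back to the adjusted target.
    return sample(target, rng.random())

if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]   # target (second model) distribution, illustrative values
    q = [0.2, 0.5, 0.3]   # draft (first model) distribution, illustrative values
    print(select_token(p, q, [1, 0], random.Random(0)))
```

Passing the random source in as `rng` keeps the sketch reproducible; any object with a `random()` method returning a float in [0, 1) works.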

Type: Grant

Filed: December 13, 2023

Date of Patent: July 29, 2025

Assignee: QUALCOMM Incorporated

Inventors: Christopher Lott, Mingu Lee, Wonseok Jeon, Roland Memisevic

SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

Publication number: 20250231989

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.

Type: Application

Filed: April 4, 2025

Publication date: July 17, 2025

Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC

REINFORCED TOTAL VARIATION DISTANCE LOSS FOR MACHINE LEARNING MODELS

Publication number: 20250225374

Abstract: Systems and techniques are described herein for training a machine learning (ML) model. For instance, a process can include obtaining, from a teacher ML model, a first prediction based on an input. The process can further include obtaining, from a student ML model, a second prediction based on the input. The process can include determining a loss based on a difference between the second prediction and the first prediction. For instance, the loss can include a variance-reduced total variation distance (TVD) loss based on an unbiased gradient estimate of the loss. The process can further include backpropagating the loss through the student ML model to train the student ML model.
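For orientation, the quantity underlying this loss can be computed as follows. This is a minimal sketch of the standard total variation distance between two discrete distributions, assuming the teacher and student predictions are probability vectors over the same vocabulary. It does not show the variance reduction or the gradient estimator mentioned in the abstract, and the function name is hypothetical.

```python
def total_variation_distance(teacher_probs, student_probs):
    """TVD between two discrete distributions: half the L1 distance."""
    if len(teacher_probs) != len(student_probs):
        raise ValueError("distributions must have the same length")
    return 0.5 * sum(abs(t - s) for t, s in zip(teacher_probs, student_probs))

if __name__ == "__main__":
    teacher = [0.7, 0.2, 0.1]   # illustrative values
    student = [0.5, 0.3, 0.2]   # illustrative values
    print(total_variation_distance(teacher, student))
```

The value is 0 when the two predictions agree exactly and 1 when they put all mass on different outcomes.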

Type: Application

Filed: January 8, 2024

Publication date: July 10, 2025

Inventors: Raghavv GOEL, Mukul GAGRANI, Wonseok JEON, Mingu LEE, Christopher LOTT

HYBRID GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

Publication number: 20250200358

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using a generative artificial intelligence model. An example method generally includes receiving an input for processing. A prompt representing the received input is generated based on the received input, contextual information associated with the received prompt, and a prompt-generating artificial intelligence model. The generated prompt is output to a generative artificial intelligence model for processing. A response to the generated prompt is received from the generative artificial intelligence model and output as a response to the received input.

Type: Application

Filed: December 18, 2023

Publication date: June 19, 2025

Inventors: Mingu LEE, Christopher LOTT, Joseph Binamira SORIAGA, Jilei HOU, Muralidhar Reddy AKULA, Jeffrey Baginsky GEHLHAAR

CODE GENERATION USING MACHINE LEARNING MODELS

Publication number: 20250173127

Abstract: Systems and techniques are described for performing code generation using machine learning models (e.g., large language models). For example, a computing device can generate, based on input data, second input data for a machine learning model. The computing device can generate, based on the second input data, a prompt. The computing device can apply a beam search with sampling on the prompt to generate a set of output samples. The computing device can further apply a static analysis to the set of output samples to generate a set of samples and can output the set of samples.
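The final filtering step can be sketched as below. The abstract does not say which static analysis is applied, so this example assumes the simplest possible stand-in: the generated samples are Python source strings, and a sample is kept only if it parses. The function names are hypothetical.

```python
import ast

def passes_static_check(source):
    """Return True if the source parses as Python (a stand-in static analysis)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True

def filter_samples(samples):
    """Keep only the generated samples that pass the static check, in order."""
    return [s for s in samples if passes_static_check(s)]

if __name__ == "__main__":
    candidates = ["x = 1 + 2", "def f(:\n    pass", "print('ok')"]
    print(filter_samples(candidates))
```

A real pipeline would likely use a stronger check (type checking, linting, or compilation); parsing is used here only because it is available in the standard library.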

Type: Application

Filed: November 29, 2023

Publication date: May 29, 2025

Inventors: Weiliang ZENG, James Randall EZICK, Christopher LOTT, Joseph Binamira SORIAGA, Piero ZAPPI, Mingu LEE, Arvind Vardarajan SANTHANAM

SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

Publication number: 20250148015

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.

Type: Application

Filed: January 7, 2025

Publication date: May 8, 2025

Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC

PRACTICAL ACTIVATION RANGE RESTRICTION FOR NEURAL NETWORK QUANTIZATION

Publication number: 20250124265

Abstract: A processor-implemented method determines a practical domain for a following function in a following layer of an artificial neural network. The artificial neural network includes a leading function in a leading layer and the following function in the following layer, which is a subsequent consecutive layer of the artificial neural network. The method also sets a first quantization range of an output activation of the leading function based on the practical domain.
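The effect of restricting a quantization range can be shown with a small calculation. This sketch assumes uniform quantization and assumes, purely for illustration, that the following function is a sigmoid treated as saturated outside [-8, 8]; neither the function nor the bounds come from the abstract, and the helper names are hypothetical.

```python
def restrict_range(observed, practical):
    """Intersect the observed activation range with the practical domain."""
    lo = max(observed[0], practical[0])
    hi = min(observed[1], practical[1])
    if lo > hi:
        raise ValueError("ranges do not overlap")
    return lo, hi

def step_size(value_range, bits=8):
    """Step size of a uniform quantizer covering value_range with 2**bits levels."""
    lo, hi = value_range
    return (hi - lo) / (2 ** bits - 1)

if __name__ == "__main__":
    observed = (-40.0, 55.0)   # illustrative output range of the leading function
    practical = (-8.0, 8.0)    # assumed domain where the following function is not saturated
    restricted = restrict_range(observed, practical)
    print(step_size(observed), step_size(restricted))
```

Under these assumptions the restricted range yields a smaller step size, so the same number of bits resolves the values that actually affect the following function more finely.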

Type: Application

Filed: December 18, 2023

Publication date: April 17, 2025

Inventors: Mingu LEE, Kanghwan JANG, Christopher LOTT, Liang ZHANG

MULTI-RESOLUTION FIELD REPRESENTATIONS IN NEURAL NETWORKS

Publication number: 20250094780

Abstract: Certain aspects provide techniques and apparatuses for efficiently processing inputs in a neural network using multiple receptive field sizes. An example method includes partitioning a first input into a first set of channels and a second set of channels. At a first layer of a neural network, the first set of channels and the second set of channels are convolved into a first output having a smaller dimensionality than a dimensionality of the first input. The first set of channels and the first output are concatenated into a second input. The second input is convolved into a second output via a second layer of the neural network, wherein the second output merges a first receptive field generated by the first layer with a larger second receptive field generated by the second layer. One or more actions are taken based on at least one of the first output and the second output.

Type: Application

Filed: September 15, 2023

Publication date: March 20, 2025

Inventors: Kartikeya BHARDWAJ, Piero ZAPPI, Paul Nicholas WHATMOUGH, Christopher LOTT, Viswanath GANAPATHY, Chirag Sureshbhai PATEL, Joseph Binamira SORIAGA

RE-ARRANGING FEED FORWARD NETWORKS (FFNs) IN TRANSFORMER-BASED MODELS

Publication number: 20250094793

Abstract: A processor-implemented method for image or text processing includes receiving, by an artificial neural network (ANN) model, a set of tokens corresponding to an input. A token interaction block of the ANN model processes the set of tokens according to each channel of the input to generate a spatial mixture of a set of features for each channel of the input. A feed forward network block of the ANN model generates a mixture of channel features based on the spatial mixture of the set of features for each channel of the input. An attention block of the ANN model determines a set of attended features of the mixture of channel features according to a set of attention weights. In turn, the ANN model generates an inference based on the set of attended features of the mixture of channel features.

Type: Application

Filed: September 19, 2023

Publication date: March 20, 2025

Inventors: Manish Kumar SINGH, Tianyu JIANG, Hsin-Pai CHENG, Kartikeya BHARDWAJ, Hong CAI, Mingu LEE, Munawar HAYAT, Christopher LOTT, Fatih Murat PORIKLI

EFFICIENT ADAPTER-BASED CONTEXT SWITCH IN ARTIFICIAL INTELLIGENCE (AI) ACCELERATION DEVICES

Publication number: 20250077313

Abstract: A processor-implemented method for generating a default adapter for context switching includes analyzing a first neural network model and one or more adapters. The first neural network model is pre-trained and each of the adapters is configured with an architecture and parameters for performing a different downstream task of a set of downstream tasks. A default adapter is defined based on a capacity of the one or more adapters. The default adapter is applied to one or more layers of the first neural network model during a context switch to replace one of the adapters for a different task. A graph corresponding to the first neural network model is unchanged.

Type: Application

Filed: August 31, 2023

Publication date: March 6, 2025

Inventors: Simyung CHANG, Kyu Woong HWANG, Juntae LEE, Kyuhong SHIM, Jihwan BANG, Seunghan YANG, Jaeseong YOU, Minseop PARK, Christopher LOTT

Speculative decoding in autoregressive generative artificial intelligence models

Patent number: 12229192

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.

Type: Grant

Filed: December 13, 2023

Date of Patent: February 18, 2025

Assignee: QUALCOMM Incorporated

Inventors: Christopher Lott, Mingu Lee, Wonseok Jeon, Roland Memisevic

ACCELERATING INFERENCING IN GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

Publication number: 20250021761

Abstract: Techniques and apparatus are provided for generating a response to a query input into a generative artificial intelligence model. An example method generally includes generating, based on an input query and a first generative artificial intelligence model, a sequence of tokens corresponding to a candidate response to the input query. The sequence of tokens and the input query are output to a second generative artificial intelligence model for verification. One or more first guidance signals for the generated sequence of tokens are received from the second generative artificial intelligence model. The candidate response to the input query is revised based on the generated sequence of tokens and the one or more first guidance signals, and the revised candidate response is output as a response to the received input query.

Type: Application

Filed: December 19, 2023

Publication date: January 16, 2025

Inventors: Arvind Vardarajan SANTHANAM, Joseph Binamira SORIAGA, Roland MEMISEVIC, Mingu LEE, Christopher LOTT

TRANSFORMER WITH MULTI-SCALE MULTI-CONTEXT ATTENTIONS

Publication number: 20240428576

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. A transformed version of image pixels is accessed as input to an attention layer of a machine learning model. A number of local attention operations to apply, in one transformer, to the transformed version of image pixels is selected based at least in part on a size of the transformed version of image pixels. A transformer output for the attention layer of the machine learning model is generated based on applying the number of local attention operations and at least one global attention operation to the transformed version of image pixels.

Type: Application

Filed: March 22, 2024

Publication date: December 26, 2024

Inventors: Tianyu JIANG, Manish Kumar SINGH, Hsin-Pai CHENG, Hong CAI, Mingu LEE, Kartikeya BHARDWAJ, Christopher LOTT, Fatih Murat PORIKLI

EFFICIENT OPTIMIZATION OF TENSOR REMATERIALIZATION AND PAGING FOR NEURAL NETWORKS

Publication number: 20240386237

Abstract: A processor-implemented method includes receiving a graph representing an artificial neural network (ANN). The graph includes multiple nodes connected by edges and each node represents an operation. Retention intervals are determined for the multiple node outputs based on rematerialization constraints and paging constraints. The retention intervals correspond to a time interval for retaining each node output in at least one local memory. A sequence of tasks for executing the multiple nodes of the graph representing the ANN is determined based on the retention intervals.

Type: Application

Filed: October 26, 2023

Publication date: November 21, 2024

Inventors: Burak BARTAN, Edward TEAGUE, Christopher LOTT

HYBRID GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

Publication number: 20240362468

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using a generative artificial intelligence model. An example method generally includes receiving an input for processing. A prompt representing the received input is generated based on the received input, contextual information associated with the received prompt, and a prompt-generating artificial intelligence model. The generated prompt is output to a generative artificial intelligence model for processing. A response to the generated prompt is received from the generative artificial intelligence model and output as a response to the received input.

Type: Application

Filed: December 18, 2023

Publication date: October 31, 2024

Inventors: Mingu LEE, Christopher LOTT, Joseph Binamira SORIAGA, Jilei HOU, Muralidhar Reddy AKULA, Jeffrey Baginsky GEHLHAAR

Exploiting activation sparsity in deep neural networks

Patent number: 12131130

Abstract: A method of exploiting activation sparsity in deep neural networks is described. The method includes retrieving an activation tensor and a weight tensor where the activation tensor is a sparse activation tensor. The method also includes generating a compressed activation tensor comprising non-zero activations of the activation tensor, where the compressed activation tensor has fewer columns than the activation tensor. The method further includes processing the compressed activation tensor and the weight tensor to generate an output tensor.
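The idea of multiplying a compressed activation tensor by the weights can be sketched in a few lines. This example assumes two-dimensional tensors stored as nested lists and assumes that compression means dropping activation columns that are entirely zero, together with the matching weight rows; the abstract does not specify the compression scheme, and all function names are hypothetical.

```python
def compress_columns(activations):
    """Drop all-zero columns; return the compressed tensor and the kept column indices."""
    n_cols = len(activations[0])
    keep = [j for j in range(n_cols) if any(row[j] != 0 for row in activations)]
    compressed = [[row[j] for j in keep] for row in activations]
    return compressed, keep

def matmul(a, b, n_out):
    """Plain matrix product of a (rows x k) and b (k x n_out)."""
    return [[sum(a_row[k] * b[k][j] for k in range(len(b))) for j in range(n_out)]
            for a_row in a]

def sparse_matmul(activations, weights):
    """Multiply using only the activation columns that contain non-zero values."""
    compressed, keep = compress_columns(activations)
    kept_weights = [weights[j] for j in keep]   # weight rows matching kept columns
    return matmul(compressed, kept_weights, len(weights[0]))

if __name__ == "__main__":
    act = [[1, 0, 2, 0],
           [0, 0, 3, 0]]          # illustrative sparse activations
    w = [[1, 2], [3, 4], [5, 6], [7, 8]]
    print(sparse_matmul(act, w))
```

Because the dropped columns contribute only zeros to every dot product, the result matches the dense product while touching fewer columns.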

Type: Grant

Filed: February 2, 2023

Date of Patent: October 29, 2024

Assignee: QUALCOMM Incorporated

Inventors: Rexford Alan Hill, Aaron Douglass Lamb, Michael Goldfarb, Amin Ansari, Christopher Lott

SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

Publication number: 20240354345

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.

Type: Application

Filed: December 13, 2023

Publication date: October 24, 2024

Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC

SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

Publication number: 20240354346

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.

Type: Application

Filed: December 13, 2023

Publication date: October 24, 2024

Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC

SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

Publication number: 20240320433

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using generative models. The method generally includes generating, based on an input query and a first generative model, a first plurality of sets of tokens. The first plurality of sets of tokens are output to a second generative model for verification. While waiting to receive an indication of a selected set of tokens from the first plurality of sets of tokens, a second plurality of sets of tokens are speculatively generated. The indication of a selected set of tokens from the first plurality of sets of tokens is received. Tokens from the second plurality of sets of tokens associated with the selected set of tokens are output to the second generative model for verification, and the selected set of tokens is output as a response to the input query.

Type: Application

Filed: October 2, 2023

Publication date: September 26, 2024

Inventors: Christopher LOTT, Mingu LEE, Joseph Binamira SORIAGA, Jilei HOU
