Patents by Inventor Christopher Lott

Christopher Lott is named as an inventor on the following patent filings. The listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250245430
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for efficiently generating a response to a query input in a generative artificial intelligence model. An example method generally includes generating, based on an input prompt and using a first machine learning model, a set of tokens including one or more subsets of tokens. Each respective subset of the one or more subsets corresponds to a respective portion of a response to the input prompt and includes a fixed number of tokens corresponding to a beam width for a beam search through the set of tokens. The set of tokens is output to a second machine learning model for verification, and information identifying a selected sequence of tokens from the generated set of tokens is received from the second machine learning model. The selected sequence of tokens is output as the response to the input prompt.
    Type: Application
    Filed: January 26, 2024
    Publication date: July 31, 2025
    Inventors: Wonseok JEON, Mukul GAGRANI, Mingu LEE, Raghavv GOEL, Junyoung PARK, Christopher LOTT
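    The draft-then-verify flow described in this abstract, where a smaller model proposes a fixed beam-width set of candidate tokens per response position and a larger model selects the final sequence, can be illustrated with a minimal sketch. The toy draft_model and target_verify functions below are placeholders (not the models from the filing), and the sizes are invented; only the control flow is shown.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB, BEAM_WIDTH, DRAFT_STEPS = 50, 3, 4  # toy sizes, not from the filing

    def draft_model(prefix):
        """Toy draft model: for each of DRAFT_STEPS positions, propose the
        BEAM_WIDTH most likely next tokens under a random distribution."""
        subsets = []
        for _ in range(DRAFT_STEPS):
            probs = rng.dirichlet(np.ones(VOCAB))
            subsets.append(np.argsort(probs)[-BEAM_WIDTH:][::-1])  # top-B tokens
        return subsets  # one fixed-size subset per response position

    def target_verify(prefix, subsets):
        """Toy verifier: the larger model scores every candidate in each subset
        and returns one selected token per position (here, chosen at random)."""
        return [int(rng.choice(subset)) for subset in subsets]

    prompt = [1, 2, 3]                      # token ids of the input prompt
    candidates = draft_model(prompt)        # beam-width candidates per position
    response = target_verify(prompt, candidates)
    print("selected sequence:", response)
    ```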
  • Patent number: 12373494
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
    Type: Grant
    Filed: December 13, 2023
    Date of Patent: July 29, 2025
    Assignee: QUALCOMM Incorporated
    Inventors: Christopher Lott, Mingu Lee, Wonseok Jeon, Roland Memisevic
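    The "recursive adjustment of a target distribution" in this family of filings is reminiscent of the renormalization step used in published speculative-sampling schemes: when a drafted token is rejected, the target distribution is adjusted by subtracting the draft distribution and renormalizing before resampling. The sketch below shows only that generic adjustment rule, not the specific multi-candidate selection claimed here.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def speculative_accept(p, q, drafted_token):
        """Generic speculative-sampling step (illustrative, not the claimed method).
        p: target-model distribution, q: draft-model distribution over the vocab."""
        if rng.random() < min(1.0, p[drafted_token] / q[drafted_token]):
            return drafted_token                      # draft token accepted
        residual = np.maximum(p - q, 0.0)             # adjusted target distribution
        residual /= residual.sum()
        return int(rng.choice(len(p), p=residual))    # resample from the adjustment

    p = np.array([0.5, 0.3, 0.2])   # toy target distribution
    q = np.array([0.2, 0.5, 0.3])   # toy draft distribution
    print(speculative_accept(p, q, drafted_token=1))
    ```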
  • Publication number: 20250231989
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
    Type: Application
    Filed: April 4, 2025
    Publication date: July 17, 2025
    Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC
  • Publication number: 20250225374
    Abstract: Systems and techniques are described herein for training a machine learning (ML) model. For instance, a process can include obtaining, from a teacher ML model, a first prediction based on an input. The process can further include obtaining, from a student ML model, a second prediction based on the input. The process can include determining a loss based on a difference between the second prediction and the first prediction. For instance, the loss can include a variance-reduced total variation distance (TVD) loss based on an unbiased gradient estimate of the loss. The process can further include backpropagating the loss through the student ML model to train the student ML model.
    Type: Application
    Filed: January 8, 2024
    Publication date: July 10, 2025
    Inventors: Raghavv GOEL, Mukul GAGRANI, Wonseok JEON, Mingu LEE, Christopher LOTT
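    The total variation distance (TVD) between teacher and student output distributions, used here as a distillation loss, is straightforward to write down. This is a minimal PyTorch sketch of plain TVD distillation; the variance-reduced, unbiased gradient estimator described in the abstract is not reproduced.

    ```python
    import torch
    import torch.nn.functional as F

    def tvd_loss(student_logits, teacher_logits):
        """Plain total variation distance between student and teacher predictions
        (0.5 * L1 distance between the two probability vectors), averaged over the
        batch. The variance-reduced gradient estimator from the filing is omitted."""
        p_teacher = F.softmax(teacher_logits, dim=-1).detach()
        p_student = F.softmax(student_logits, dim=-1)
        return 0.5 * (p_student - p_teacher).abs().sum(dim=-1).mean()

    student_logits = torch.randn(4, 10, requires_grad=True)
    teacher_logits = torch.randn(4, 10)
    loss = tvd_loss(student_logits, teacher_logits)
    loss.backward()   # gradients flow into the student only
    print(float(loss))
    ```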
  • Publication number: 20250200358
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using a generative artificial intelligence model. An example method generally includes receiving an input for processing. A prompt representing the received input is generated based on the received input, contextual information associated with the received input, and a prompt-generating artificial intelligence model. The generated prompt is output to a generative artificial intelligence model for processing. A response to the generated prompt is received from the generative artificial intelligence model and output as a response to the received input.
    Type: Application
    Filed: December 18, 2023
    Publication date: June 19, 2025
    Inventors: Mingu LEE, Christopher LOTT, Joseph Binamira SORIAGA, Jilei HOU, Muralidhar Reddy AKULA, Jeffrey Baginsky GEHLHAAR
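    The two-stage flow described above, where a prompt-generating model rewrites the raw input and its context into a prompt that a separate generative model then answers, can be sketched as a simple pipeline. Both model callables below are placeholders standing in for real LLM calls.

    ```python
    def generate_prompt(user_input, context, prompt_model):
        """Stage 1: a prompt-generating model turns the raw input plus contextual
        information into a prompt for the downstream generative model."""
        return prompt_model(f"Context: {context}\nRewrite as a prompt: {user_input}")

    def answer(user_input, context, prompt_model, generative_model):
        """Stage 2: the generated prompt is sent to the generative model, and its
        reply is returned as the response to the original input."""
        prompt = generate_prompt(user_input, context, prompt_model)
        return generative_model(prompt)

    # Placeholder models standing in for real model calls.
    prompt_model = lambda text: f"[prompt derived from: {text}]"
    generative_model = lambda prompt: f"[response to {prompt}]"
    print(answer("What's my battery status?", "device telemetry",
                 prompt_model, generative_model))
    ```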
  • Publication number: 20250173127
    Abstract: Systems and techniques are described for performing code generation using machine learning models (e.g., large language models). For example, a computing device can generate, based on input data, second input data for a machine learning model. The computing device can generate, based on the second input data, a prompt. The computing device can apply a beam search with sampling on the prompt to generate a set of output samples. The computing device can further apply a static analysis to the set of output samples to generate a set of samples and can output the set of samples.
    Type: Application
    Filed: November 29, 2023
    Publication date: May 29, 2025
    Inventors: Weiliang ZENG, James Randall EZICK, Christopher LOTT, Joseph Binamira SORIAGA, Piero ZAPPI, Mingu LEE, Arvind Vardarajan SANTHANAM
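    The filtering step in this entry, applying static analysis to candidate code samples before returning them, can be approximated with Python's own ast module as the analyzer; the beam-search-with-sampling decoder itself is represented only by a list of pre-generated candidate strings, and the filing's static analysis may be far richer than a syntax check.

    ```python
    import ast

    def filter_by_static_analysis(samples):
        """Keep only candidate programs that pass a basic static check
        (here: they parse as valid Python)."""
        valid = []
        for code in samples:
            try:
                ast.parse(code)
                valid.append(code)
            except SyntaxError:
                pass
        return valid

    candidates = [
        "def add(a, b):\n    return a + b",
        "def add(a, b) return a + b",       # malformed sample is filtered out
    ]
    print(filter_by_static_analysis(candidates))
    ```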
  • Publication number: 20250148015
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
    Type: Application
    Filed: January 7, 2025
    Publication date: May 8, 2025
    Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC
  • Publication number: 20250124265
    Abstract: A processor-implemented method determines a practical domain for a following function in a following layer of an artificial neural network. The artificial neural network includes a leading function in a leading layer and the following function in the following layer, which is a subsequent consecutive layer of the artificial neural network. The method also sets a first quantization range of an output activation of the leading function based on the practical domain.
    Type: Application
    Filed: December 18, 2023
    Publication date: April 17, 2025
    Inventors: Mingu LEE, Kanghwan JANG, Christopher LOTT, Liang ZHANG
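    The idea of clipping a leading layer's output-activation quantization range to the practical domain of the following function can be illustrated with a saturating nonlinearity: outside some interval the following function is effectively constant, so quantization levels are only spent inside it. The sigmoid example and its assumed practical domain of [-8, 8] below are illustrative choices, not values from the filing.

    ```python
    def quantization_params(observed_min, observed_max, practical_domain, n_bits=8):
        """Clip the observed activation range of the leading layer to the practical
        domain of the following function, then derive an affine quantization scale."""
        lo = max(observed_min, practical_domain[0])
        hi = min(observed_max, practical_domain[1])
        scale = (hi - lo) / (2 ** n_bits - 1)
        return lo, hi, scale

    # Example: the following function is a sigmoid, assumed saturated outside [-8, 8].
    lo, hi, scale = quantization_params(observed_min=-40.0, observed_max=25.0,
                                        practical_domain=(-8.0, 8.0))
    print(lo, hi, scale)   # quantization levels cover [-8, 8] instead of [-40, 25]
    ```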
  • Publication number: 20250094780
    Abstract: Certain aspects provide techniques and apparatuses for efficiently processing inputs in a neural network using multiple receptive field sizes. An example method includes partitioning a first input into a first set of channels and a second set of channels. At a first layer of a neural network, the first set of channels and the second set of channels are convolved into a first output having a smaller dimensionality than a dimensionality of the first input. The first set of channels and the first output are concatenated into a second input. The second input is convolved into a second output via a second layer of the neural network, wherein the second output merges a first receptive field generated by the first layer with a larger second receptive field generated by the second layer. One or more actions are taken based on at least one of the first output and the second output.
    Type: Application
    Filed: September 15, 2023
    Publication date: March 20, 2025
    Inventors: Kartikeya BHARDWAJ, Piero ZAPPI, Paul Nicholas WHATMOUGH, Christopher LOTT, Viswanath GANAPATHY, Chirag Sureshbhai PATEL, Joseph Binamira SORIAGA
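    A rough PyTorch reading of this entry is sketched below: the input channels are split into two groups, convolved into a first output with fewer channels ("smaller dimensionality" is interpreted here as channel count, which is an assumption), and that output is concatenated with the first channel group before a second convolution whose output merges both layers' receptive fields. All layer sizes are arbitrary toy choices.

    ```python
    import torch
    import torch.nn as nn

    class TwoStageReceptiveField(nn.Module):
        """Illustrative two-layer block; sizes are toy values, not from the filing."""
        def __init__(self, in_ch=16, mid_ch=8, out_ch=32, split=8):
            super().__init__()
            self.split = split
            # First layer: both channel groups are convolved into a narrower output.
            self.conv1 = nn.Conv2d(in_ch, mid_ch, kernel_size=3, padding=1)
            # Second layer: sees conv1's output plus the first channel group,
            # so its receptive field merges both layers' fields.
            self.conv2 = nn.Conv2d(mid_ch + split, out_ch, kernel_size=3, padding=1)

        def forward(self, x):
            first_group = x[:, : self.split]          # first set of channels
            out1 = self.conv1(x)                      # first output (fewer channels)
            out2 = self.conv2(torch.cat([first_group, out1], dim=1))
            return out1, out2

    block = TwoStageReceptiveField()
    out1, out2 = block(torch.randn(1, 16, 32, 32))
    print(out1.shape, out2.shape)
    ```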
  • Publication number: 20250094793
    Abstract: A processor-implemented method for image or text processing includes receiving, by an artificial neural network (ANN) model, a set of tokens corresponding to an input. A token interaction block of the ANN model processes the set of tokens according to each channel of the input to generate a spatial mixture of a set of features for each channel of the input. A feed forward network block of the ANN model generates a mixture of channel features based on the spatial mixture of the set of features for each channel of the input. An attention block of the ANN model determines a set of attended features of the mixture of channel features according to a set of attention weights. In turn, the ANN model generates an inference based on the set of attended features of the mixture of channel features.
    Type: Application
    Filed: September 19, 2023
    Publication date: March 20, 2025
    Inventors: Manish Kumar SINGH, Tianyu JIANG, Hsin-Pai CHENG, Kartikeya BHARDWAJ, Hong CAI, Mingu LEE, Munawar HAYAT, Christopher LOTT, Fatih Murat PORIKLI
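    The block structure in this abstract (a per-channel token-interaction stage, a feed-forward channel-mixing stage, and an attention stage) resembles published token-mixing architectures. The sketch below uses standard PyTorch layers and invented dimensions, so it is only an approximation of the described ANN model, not its claimed architecture.

    ```python
    import torch
    import torch.nn as nn

    class MixAttendBlock(nn.Module):
        def __init__(self, num_tokens=16, dim=32, heads=4):
            super().__init__()
            self.token_mix = nn.Linear(num_tokens, num_tokens)   # spatial mixing per channel
            self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                     nn.Linear(4 * dim, dim))     # channel-feature mixing
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, tokens):                                # (batch, tokens, dim)
            mixed = self.token_mix(tokens.transpose(1, 2)).transpose(1, 2)
            channel_mix = self.ffn(mixed)
            attended, _ = self.attn(channel_mix, channel_mix, channel_mix)
            return attended                                       # features fed to an inference head

    block = MixAttendBlock()
    print(block(torch.randn(2, 16, 32)).shape)
    ```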
  • Publication number: 20250077313
    Abstract: A processor-implemented method for generating a default adapter for context switching includes analyzing a first neural network model and one or more adapters. The first neural network model is pre-trained, and each of the adapters is configured with an architecture and parameters for performing a different downstream task of a set of downstream tasks. A default adapter is defined based on a capacity of the one or more adapters. The default adapter is applied to one or more layers of the first neural network model during a context switch to replace one of the adapters for a different task. A graph corresponding to the first neural network model is unchanged.
    Type: Application
    Filed: August 31, 2023
    Publication date: March 6, 2025
    Inventors: Simyung CHANG, Kyu Woong HWANG, Juntae LEE, Kyuhong SHIM, Jihwan BANG, Seunghan YANG, Jaeseong YOU, Minseop PARK, Christopher LOTT
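    One way to read "default adapter" is a placeholder adapter with the same shape and capacity as the task adapters but no effect on the output, so it can be swapped in during a context switch without altering the compiled graph. The LoRA-style adapter and zero-initialized default below are an assumed concretization for illustration, not the filing's definition.

    ```python
    import torch
    import torch.nn as nn

    class LowRankAdapter(nn.Module):
        """LoRA-style adapter (an assumed concretization of the filing's adapters)."""
        def __init__(self, dim=32, rank=4):
            super().__init__()
            self.down = nn.Linear(dim, rank, bias=False)
            self.up = nn.Linear(rank, dim, bias=False)

        def forward(self, x):
            return x + self.up(self.down(x))   # residual adapter update

    def make_default_adapter(dim=32, rank=4):
        """Default adapter: same architecture and capacity as the task adapters,
        but its update is zero, so swapping it in leaves the model's behavior
        (and its graph) unchanged."""
        adapter = LowRankAdapter(dim, rank)
        nn.init.zeros_(adapter.up.weight)      # zero update => identity behavior
        return adapter

    task_adapter = LowRankAdapter()
    default_adapter = make_default_adapter()
    x = torch.randn(1, 32)
    # Context switch: the default adapter replaces the task adapter in place;
    # the module interface (and hence the graph) stays the same.
    print(torch.allclose(default_adapter(x), x))
    ```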
  • Patent number: 12229192
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
    Type: Grant
    Filed: December 13, 2023
    Date of Patent: February 18, 2025
    Assignee: QUALCOMM Incorporated
    Inventors: Christopher Lott, Mingu Lee, Wonseok Jeon, Roland Memisevic
  • Publication number: 20250021761
    Abstract: Techniques and apparatus for generating a response to a query input into a generative artificial intelligence model. An example method generally includes generating, based on an input query and a first generative artificial intelligence model, a sequence of tokens corresponding to a candidate response to the input query. The sequence of tokens and the input query are output to a second generative artificial intelligence model for verification. One or more first guidance signals for the generated sequence of tokens are received from the second generative artificial intelligence model. The candidate response to the input query is revised based on the generated sequence of tokens and the one or more first guidance signals, and the revised candidate response is output as a response to the received input query.
    Type: Application
    Filed: December 19, 2023
    Publication date: January 16, 2025
    Inventors: Arvind Vardarajan SANTHANAM, Joseph Binamira SORIAGA, Roland MEMISEVIC, Mingu LEE, Christopher LOTT
  • Publication number: 20240428576
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. A transformed version of image pixels is accessed as input to an attention layer of a machine learning model. A number of local attention operations to apply, in one transformer, to the transformed version of image pixels is selected based at least in part on a size of the transformed version of image pixels. A transformer output for the attention layer of the machine learning model is generated based on applying the number of local attention operations and at least one global attention operation to the transformed version of image pixels.
    Type: Application
    Filed: March 22, 2024
    Publication date: December 26, 2024
    Inventors: Tianyu JIANG, Manish Kumar SINGH, Hsin-Pai CHENG, Hong CAI, Mingu LEE, Kartikeya BHARDWAJ, Christopher LOTT, Fatih Murat PORIKLI
  • Publication number: 20240386237
    Abstract: A processor-implemented method includes receiving a graph representing an artificial neural network (ANN). The graph includes multiple nodes connected by edges, and each node represents an operation. Retention intervals are determined for the outputs of the multiple nodes based on rematerialization constraints and paging constraints. The retention intervals correspond to a time interval for retaining each node output in at least one local memory. A sequence of tasks for executing the multiple nodes of the graph representing the ANN is determined based on the retention intervals.
    Type: Application
    Filed: October 26, 2023
    Publication date: November 21, 2024
    Inventors: Burak BARTAN, Edward TEAGUE, Christopher LOTT
  • Publication number: 20240362468
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using a generative artificial intelligence model. An example method generally includes receiving an input for processing. A prompt representing the received input is generated based on the received input, contextual information associated with the received input, and a prompt-generating artificial intelligence model. The generated prompt is output to a generative artificial intelligence model for processing. A response to the generated prompt is received from the generative artificial intelligence model and output as a response to the received input.
    Type: Application
    Filed: December 18, 2023
    Publication date: October 31, 2024
    Inventors: Mingu LEE, Christopher LOTT, Joseph Binamira SORIAGA, Jilei HOU, Muralidhar Reddy AKULA, Jeffrey Baginsky GEHLHAAR
  • Patent number: 12131130
    Abstract: A method of exploiting activation sparsity in deep neural networks is described. The method includes retrieving an activation tensor and a weight tensor where the activation tensor is a sparse activation tensor. The method also includes generating a compressed activation tensor comprising non-zero activations of the activation tensor, where the compressed activation tensor has fewer columns than the activation tensor. The method further includes processing the compressed activation tensor and the weight tensor to generate an output tensor.
    Type: Grant
    Filed: February 2, 2023
    Date of Patent: October 29, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Rexford Alan Hill, Aaron Douglass Lamb, Michael Goldfarb, Amin Ansari, Christopher Lott
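    The compression described in this patent, keeping only the non-zero parts of a sparse activation tensor so the matrix multiply touches fewer columns, can be illustrated in NumPy. The column-pruning rule below is a simplified stand-in for the hardware-oriented scheme in the patent; it only demonstrates why dropping all-zero activation columns (and the matching weight rows) leaves the output tensor unchanged.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Sparse activation tensor: many columns are entirely zero.
    activations = rng.random((4, 8))
    activations[:, [1, 3, 4, 6]] = 0.0
    weights = rng.random((8, 5))

    # Compressed activation tensor keeps only the non-zero columns (fewer columns),
    # and the matching rows of the weight tensor are selected to preserve the product.
    nonzero_cols = np.where(activations.any(axis=0))[0]
    compressed = activations[:, nonzero_cols]
    output = compressed @ weights[nonzero_cols]

    assert np.allclose(output, activations @ weights)   # same output tensor, less work
    print(compressed.shape, output.shape)
    ```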
  • Publication number: 20240354345
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
    Type: Application
    Filed: December 13, 2023
    Publication date: October 24, 2024
    Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC
  • Publication number: 20240354346
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
    Type: Application
    Filed: December 13, 2023
    Publication date: October 24, 2024
    Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC
  • Publication number: 20240320433
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using generative models. The method generally includes generating, based on an input query and a first generative model, a first plurality of sets of tokens. The first plurality of sets of tokens are output to a second generative model for verification. While waiting to receive an indication of a selected set of tokens from the first plurality of sets of tokens, a second plurality of sets of tokens are speculatively generated. The indication of a selected set of tokens from the first plurality of sets of tokens is received. Tokens from the second plurality of sets of tokens associated with the selected set of tokens are output to the second generative model for verification, and the selected set of tokens is output as a response to the input query.
    Type: Application
    Filed: October 2, 2023
    Publication date: September 26, 2024
    Inventors: Christopher LOTT, Mingu LEE, Joseph Binamira SORIAGA, Jilei HOU
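    The overlap described in this last entry, drafting continuations for every candidate while the verifier is still deciding which candidate to keep, can be sketched with a thread pool. The toy draft and verify functions below stand in for the two generative models; only the pipelining pattern is illustrated.

    ```python
    from concurrent.futures import ThreadPoolExecutor
    import time

    def verify(candidates):
        """Toy verifier (second generative model): slow, returns the chosen index."""
        time.sleep(0.1)
        return 0

    def draft_continuations(candidate):
        """Toy draft model: speculatively extend one candidate while verification runs."""
        return [candidate + [t] for t in (7, 8)]

    candidates = [[1, 2], [1, 3], [1, 4]]          # first plurality of token sets

    with ThreadPoolExecutor() as pool:
        pending = pool.submit(verify, candidates)   # verification runs in the background
        # Speculate on every candidate so work is ready whichever one is selected.
        speculative = {i: draft_continuations(c) for i, c in enumerate(candidates)}
        chosen = pending.result()

    # Only the continuations of the selected candidate are sent on for verification.
    print("selected:", candidates[chosen], "next round:", speculative[chosen])
    ```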