Patents by Inventor Christopher Lott

Christopher Lott has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250148015
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
    Type: Application
    Filed: January 7, 2025
    Publication date: May 8, 2025
    Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC
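
The abstract for 20250148015 does not specify how the "recursive adjustment of a target distribution" works, so the following is only a minimal sketch of the selection step, assuming the second model exposes a per-candidate score: the distribution over candidate responses is repeatedly reweighted by the verifier's scores before one candidate is sampled. The function names (`select_candidate`, `verifier_score`) are hypothetical.

```python
import math
import random

def select_candidate(candidates, verifier_score, rounds=3, temperature=1.0):
    """Pick one candidate response from several drafts.

    `candidates` is a list of token sequences produced by a first (draft)
    model; `verifier_score(tokens)` is assumed to return a log-likelihood
    style score from a second (verifier) model.  The distribution over
    candidates is re-normalized over several rounds, sharpening it toward
    the verifier's preference before sampling, as a stand-in for the
    "recursive adjustment" named in the abstract.
    """
    logits = [verifier_score(c) / temperature for c in candidates]
    probs = _softmax(logits)
    for _ in range(rounds):
        # Re-weight by the verifier scores again and re-normalize, which
        # progressively concentrates mass on high-scoring drafts.
        probs = _softmax([math.log(p + 1e-12) + l for p, l in zip(probs, logits)])
    return random.choices(candidates, weights=probs, k=1)[0]

def _softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]
```
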
  • Publication number: 20250124265
    Abstract: A processor-implemented method determines a practical domain for a following function in a following layer of an artificial neural network. The artificial neural network includes a leading function in a leading layer and the following function in the following layer, which is a subsequent consecutive layer of the artificial neural network. The method also sets a first quantization range of an output activation of the leading function based on the practical domain.
    Type: Application
    Filed: December 18, 2023
    Publication date: April 17, 2025
    Inventors: Mingu LEE, Kanghwan JANG, Christopher LOTT, Liang ZHANG
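
A minimal sketch of the idea in 20250124265, assuming the "practical domain" is the input interval over which the following function still responds to its input (a saturating function such as a sigmoid is effectively flat outside roughly plus or minus seven). The helper names and the gradient threshold are illustrative choices, not the patent's method.

```python
import numpy as np

def practical_domain(follow_fn, lo=-100.0, hi=100.0, eps=1e-3, n=100_000):
    """Estimate the input interval outside of which `follow_fn` is flat
    (its derivative magnitude stays below `eps`), i.e. its practical domain."""
    xs = np.linspace(lo, hi, n)
    ys = follow_fn(xs)
    # Keep the region where the function still responds to its input.
    active = np.abs(np.gradient(ys, xs)) > eps
    idx = np.where(active)[0]
    return xs[idx[0]], xs[idx[-1]]

def quant_range_for_leading_output(follow_fn):
    """Set the leading layer's activation quantization range to the
    practical domain of the following function, rather than to the full
    observed activation range."""
    return practical_domain(follow_fn)

if __name__ == "__main__":
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    qmin, qmax = quant_range_for_leading_output(sigmoid)
    print(f"quantization range for leading output: [{qmin:.2f}, {qmax:.2f}]")
```
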
  • Publication number: 20250094793
    Abstract: A processor-implemented method for image or text processing includes receiving, by an artificial neural network (ANN) model, a set of tokens corresponding to an input. A token interaction block of the ANN model processes the set of tokens according to each channel of the input to generate a spatial mixture of a set of features for each channel of the input. A feed forward network block of the ANN model generates a mixture of channel features based on the spatial mixture of the set of features for each channel of the input. An attention block of the ANN model determines a set of attended features of the mixture of channel features according to a set of attention weights. In turn, the ANN model generates an inference based on the set of attended features of the mixture of channel features.
    Type: Application
    Filed: September 19, 2023
    Publication date: March 20, 2025
    Inventors: Manish Kumar SINGH, Tianyu JIANG, Hsin-Pai CHENG, Kartikeya BHARDWAJ, Hong CAI, Mingu LEE, Munawar HAYAT, Christopher LOTT, Fatih Murat PORIKLI
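
A hedged PyTorch sketch of the block structure described in 20250094793: a token-interaction step that mixes across token positions for each channel, a feed-forward network that mixes across channels, and an attention step over the result. The residual connections, hidden sizes, and the toy inference head are assumptions.

```python
import torch
import torch.nn as nn

class MixerAttentionBlock(nn.Module):
    """Per-channel spatial token mixing, a channel-mixing feed-forward
    network, then attention over the mixed channel features."""

    def __init__(self, num_tokens: int, dim: int, num_heads: int = 4):
        super().__init__()
        # Token interaction: mixes information across token positions,
        # applied independently to every channel.
        self.token_mix = nn.Linear(num_tokens, num_tokens)
        # Feed-forward network: mixes information across channels.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        # Attention over the channel-mixed features.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.head = nn.Linear(dim, 1)  # toy inference head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        x = x + self.token_mix(x.transpose(1, 2)).transpose(1, 2)
        x = x + self.channel_mlp(x)
        attended, _ = self.attn(x, x, x)
        return self.head(attended.mean(dim=1))  # (batch, 1) toy prediction

if __name__ == "__main__":
    block = MixerAttentionBlock(num_tokens=16, dim=64)
    print(block(torch.randn(2, 16, 64)).shape)  # torch.Size([2, 1])
```
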
  • Publication number: 20250094780
    Abstract: Certain aspects provide techniques and apparatuses for efficiently processing inputs in a neural network using multiple receptive field sizes. An example method includes partitioning a first input into a first set of channels and a second set of channels. At a first layer of a neural network, the first set of channels and the second set of channels are convolved into a first output having a smaller dimensionality than a dimensionality of the first input. The first set of channels and the first output are concatenated into a second input. The second input is convolved into a second output via a second layer of the neural network, wherein the second output merges a first receptive field generated by the first layer with a larger second receptive field generated by the second layer. One or more actions are taken based on at least one of the first output and the second output.
    Type: Application
    Filed: September 15, 2023
    Publication date: March 20, 2025
    Inventors: Kartikeya BHARDWAJ, Piero ZAPPI, Paul Nicholas WHATMOUGH, Christopher LOTT, Viswanath GANAPATHY, Chirag Sureshbhai PATEL, Joseph Binamira SORIAGA
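
A hedged PyTorch sketch of the two-layer scheme in 20250094780, assuming a channel split of 16/16, a reduced 8-channel first output, and 3x3 convolutions; the second convolution re-reads the first channel set together with the first output, so its output mixes the two receptive fields.

```python
import torch
import torch.nn as nn

class TwoReceptiveFieldBlock(nn.Module):
    """The input channels are partitioned into two sets and convolved into
    a first output with reduced channel dimensionality; the first channel
    set is then concatenated with that output and convolved again, so the
    second output merges a small and a larger receptive field."""

    def __init__(self, in_channels: int = 32, split: int = 16, mid: int = 8):
        super().__init__()
        self.split = split
        # First layer: small receptive field over both channel sets.
        self.conv1 = nn.Conv2d(in_channels, mid, kernel_size=3, padding=1)
        # Second layer: consumes conv1's output, so its effective receptive
        # field is larger, and it also re-reads the first channel set.
        self.conv2 = nn.Conv2d(split + mid, mid, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor):
        first, second = x[:, : self.split], x[:, self.split :]
        out1 = self.conv1(torch.cat([first, second], dim=1))
        out2 = self.conv2(torch.cat([first, out1], dim=1))
        return out1, out2

if __name__ == "__main__":
    out1, out2 = TwoReceptiveFieldBlock()(torch.randn(1, 32, 56, 56))
    print(out1.shape, out2.shape)
```
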
  • Publication number: 20250077313
    Abstract: A processor-implemented method for generating a default adapter for context switching includes analyzing a first neural network model and one or more adapters. The first neural network model is pre-trained and each of the adapters is configured with an architecture and parameters for performing a different downstream task of a set of downstream tasks. A default adapter is defined based on a capacity of the one or more adapters. The default adapter is applied to one or more layers of the first neural network model during a context switch to replace one of the adapters for a different task. A graph corresponding to the first neural network model is unchanged.
    Type: Application
    Filed: August 31, 2023
    Publication date: March 6, 2025
    Inventors: Simyung CHANG, Kyu Woong HWANG, Juntae LEE, Kyuhong SHIM, Jihwan BANG, Seunghan YANG, Jaeseong YOU, Minseop PARK, Christopher LOTT
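
A minimal sketch of the default-adapter idea in 20250077313, assuming LoRA-style low-rank adapters and taking "capacity" to mean the largest adapter rank. The default adapter is zero-initialized so it behaves as a no-op while keeping an adapter of the expected shape in place, which is one way to leave the model graph unchanged across context switches; the class and function names are hypothetical.

```python
import torch
import torch.nn as nn

class LoRAAdapter(nn.Module):
    """Low-rank adapter applied alongside a frozen linear layer."""

    def __init__(self, dim: int, rank: int):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)

    def forward(self, x):
        return self.up(self.down(x))

def make_default_adapter(task_adapters, dim: int) -> LoRAAdapter:
    """Define a default adapter sized to the capacity (here, the maximum
    rank) of the task-specific adapters.  Its weights are zeroed so it is
    a behavioral no-op; swapping it in during a context switch keeps an
    adapter of the same shape present, so the model graph is unchanged."""
    capacity = max(a.down.out_features for a in task_adapters)
    default = LoRAAdapter(dim, capacity)
    nn.init.zeros_(default.down.weight)
    nn.init.zeros_(default.up.weight)
    return default

if __name__ == "__main__":
    adapters = [LoRAAdapter(64, r) for r in (4, 8, 16)]
    default = make_default_adapter(adapters, dim=64)
    x = torch.randn(2, 64)
    print(torch.allclose(default(x), torch.zeros_like(x)))  # True
```
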
  • Patent number: 12229192
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
    Type: Grant
    Filed: December 13, 2023
    Date of Patent: February 18, 2025
    Assignee: QUALCOMM Incorporated
    Inventors: Christopher Lott, Mingu Lee, Wonseok Jeon, Roland Memisevic
  • Publication number: 20250021761
    Abstract: Techniques and apparatus for generating a response to a query input into a generative artificial intelligence model. An example method generally includes generating, based on an input query and a first generative artificial intelligence model, a sequence of tokens corresponding to a candidate response to the input query. The sequence of tokens and the input query are output to a second generative artificial intelligence model for verification. One or more first guidance signals for the generated sequence of tokens are received from the second generative artificial intelligence model. The candidate response to the input query is revised based on the generated sequence of tokens and the one or more first guidance signals, and the revised candidate response is output as a response to the received input query.
    Type: Application
    Filed: December 19, 2023
    Publication date: January 16, 2025
    Inventors: Arvind Vardarajan SANTHANAM, Joseph Binamira SORIAGA, Roland MEMISEVIC, Mingu LEE, Christopher LOTT
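
A control-flow sketch of the draft, verify, and revise loop in 20250021761. The `draft_model`, `verifier_model`, and `revise` callables are hypothetical placeholders, and the guidance-signal format (position, suggestion pairs) is an assumption.

```python
def answer_with_verification(query, draft_model, verifier_model, max_rounds=2):
    """Draft-verify-revise loop.

    `draft_model(prompt)` is assumed to return a candidate token sequence
    and `verifier_model(query, tokens)` to return guidance signals, e.g. a
    list of (position, suggestion) pairs; both callables stand in for the
    first and second generative models.
    """
    tokens = draft_model(query)
    for _ in range(max_rounds):
        guidance = verifier_model(query, tokens)
        if not guidance:            # verifier accepts the candidate
            break
        tokens = revise(tokens, guidance)
    return tokens

def revise(tokens, guidance):
    """Apply guidance signals by replacing the flagged positions."""
    tokens = list(tokens)
    for position, suggestion in guidance:
        tokens[position] = suggestion
    return tokens
```
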
  • Publication number: 20240428576
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. A transformed version of image pixels is accessed as input to an attention layer of a machine learning model. A number of local attention operations to apply, in one transformer, to the transformed version of image pixels is selected based at least in part on a size of the transformed version of image pixels. A transformer output for the attention layer of the machine learning model is generated based on applying the number of local attention operations and at least one global attention operation to the transformed version of image pixels.
    Type: Application
    Filed: March 22, 2024
    Publication date: December 26, 2024
    Inventors: Tianyu JIANG, Manish Kumar SINGH, Hsin-Pai CHENG, Hong CAI, Mingu LEE, Kartikeya BHARDWAJ, Christopher LOTT, Fatih Murat PORIKLI
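
The rule in 20240428576 for choosing the number of local attention operations is not spelled out in the abstract, so the sketch below assumes a simple size-based heuristic (more local passes as the token count grows past a window size, with one global pass added by the caller); the function and its parameters are illustrative only.

```python
import math

def num_local_attention_ops(num_tokens: int, window: int = 64,
                            max_ops: int = 4) -> int:
    """Choose how many local (windowed) attention operations to apply in
    one transformer block, based on the size of the transformed image
    pixels: roughly one extra local pass per doubling of the token count
    over the window size, capped at `max_ops`.  A global attention pass
    is assumed to be applied afterwards by the caller.
    """
    if num_tokens <= window:
        return 1
    return min(max_ops, 1 + int(math.log2(num_tokens / window)))

if __name__ == "__main__":
    for n in (49, 196, 784, 3136):
        print(n, "tokens ->", num_local_attention_ops(n), "local op(s) + 1 global")
```
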
  • Publication number: 20240386237
    Abstract: A processor-implemented method includes receiving a graph representing an artificial neural network (ANN). The graph includes multiple nodes connected by edges and each node represents an operation. Retention intervals are determined for the multiple node outputs based on rematerialization constraints and paging constraints. The retention intervals correspond to a time interval for retaining each node output in at least one local memory. A sequence of tasks for executing the multiple nodes of the graph representing the ANN is determined based on the retention intervals.
    Type: Application
    Filed: October 26, 2023
    Publication date: November 21, 2024
    Inventors: Burak BARTAN, Edward TEAGUE, Christopher LOTT
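
A minimal sketch of the retention-interval bookkeeping in 20240386237, ignoring the rematerialization and paging constraints: each node's output is retained from its position in an execution order until its last consumer runs. The function name and graph encoding are illustrative.

```python
from collections import defaultdict

def retention_intervals(edges, order):
    """Compute a retention interval for every node output: the span of
    positions in an execution order from when the output is produced
    until its last consumer runs.

    `edges` is a list of (producer, consumer) pairs and `order` is a
    topologically sorted list of node names.
    """
    position = {node: t for t, node in enumerate(order)}
    consumers = defaultdict(list)
    for src, dst in edges:
        consumers[src].append(dst)
    intervals = {}
    for node in order:
        last_use = max((position[c] for c in consumers[node]),
                       default=position[node])
        intervals[node] = (position[node], last_use)
    return intervals

if __name__ == "__main__":
    edges = [("a", "b"), ("a", "d"), ("b", "c"), ("c", "d")]
    print(retention_intervals(edges, ["a", "b", "c", "d"]))
    # {'a': (0, 3), 'b': (1, 2), 'c': (2, 3), 'd': (3, 3)}
```
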
  • Publication number: 20240362468
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using a generative artificial intelligence model. An example method generally includes receiving an input for processing. A prompt representing the received input is generated based on the received input, contextual information associated with the received input, and a prompt-generating artificial intelligence model. The generated prompt is output to a generative artificial intelligence model for processing. A response to the generated prompt is received from the generative artificial intelligence model and output as a response to the received input.
    Type: Application
    Filed: December 18, 2023
    Publication date: October 31, 2024
    Inventors: Mingu LEE, Christopher LOTT, Joseph Binamira SORIAGA, Jilei HOU, Muralidhar Reddy AKULA, Jeffrey Baginsky GEHLHAAR
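
A minimal sketch of the two-stage pipeline in 20240362468: a prompt-generating model turns the raw input and its contextual information into a prompt, and a second generative model answers that prompt. Both model callables and the prompt template are hypothetical placeholders.

```python
def answer(user_input, context, prompt_model, generative_model):
    """Two-stage pipeline: a prompt-generating model rewrites the raw
    input (plus contextual information) into a prompt, and a second
    generative model answers that prompt."""
    prompt = prompt_model(
        f"Rewrite the following request as a clear, self-contained prompt.\n"
        f"Context: {context}\nRequest: {user_input}"
    )
    return generative_model(prompt)

if __name__ == "__main__":
    # Trivial stand-ins so the sketch runs end to end.
    def prompt_model(text):
        return text.splitlines()[-1].replace("Request:", "Q:")

    def generative_model(prompt):
        return f"(model response to {prompt!r})"

    print(answer("battery tips?", "user is on a phone", prompt_model, generative_model))
```
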
  • Patent number: 12131130
    Abstract: A method of exploiting activation sparsity in deep neural networks is described. The method includes retrieving an activation tensor and a weight tensor where the activation tensor is a sparse activation tensor. The method also includes generating a compressed activation tensor comprising non-zero activations of the activation tensor, where the compressed activation tensor has fewer columns than the activation tensor. The method further includes processing the compressed activation tensor and the weight tensor to generate an output tensor.
    Type: Grant
    Filed: February 2, 2023
    Date of Patent: October 29, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Rexford Alan Hill, Aaron Douglass Lamb, Michael Goldfarb, Amin Ansari, Christopher Lott
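
A NumPy sketch of the activation-sparsity scheme in 12131130, assuming whole columns of the activation tensor are the unit of sparsity: the zero columns are dropped to form the compressed activation tensor, the matching weight rows are dropped, and the smaller product equals the dense one.

```python
import numpy as np

def sparse_matmul(activations: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Build a compressed activation tensor containing only the non-zero
    columns, drop the matching rows of the weight tensor, and multiply,
    so the zero activations contribute no work."""
    nonzero_cols = np.any(activations != 0, axis=0)
    compressed_act = activations[:, nonzero_cols]   # fewer columns
    compressed_w = weights[nonzero_cols, :]         # matching rows
    return compressed_act @ compressed_w

if __name__ == "__main__":
    act = np.array([[1.0, 0.0, 2.0, 0.0],
                    [3.0, 0.0, 0.0, 0.0]])
    w = np.random.randn(4, 5)
    assert np.allclose(sparse_matmul(act, w), act @ w)
    print("compressed result matches the dense product")
```
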
  • Publication number: 20240354345
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
    Type: Application
    Filed: December 13, 2023
    Publication date: October 24, 2024
    Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC
  • Publication number: 20240354346
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
    Type: Application
    Filed: December 13, 2023
    Publication date: October 24, 2024
    Inventors: Christopher LOTT, Mingu LEE, Wonseok JEON, Roland MEMISEVIC
  • Publication number: 20240320433
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using generative models. The method generally includes generating, based on an input query and a first generative model, a first plurality of sets of tokens. The first plurality of sets of tokens are output to a second generative model for verification. While waiting to receive an indication of a selected set of tokens from the first plurality of sets of tokens, a second plurality of sets of tokens are speculatively generated. The indication of a selected set of tokens from the first plurality of sets of tokens is received. Tokens from the second plurality of sets of tokens associated with the selected set of tokens are output to the second generative model for verification, and the selected set of tokens is output as a response to the input query.
    Type: Application
    Filed: October 2, 2023
    Publication date: September 26, 2024
    Inventors: Christopher LOTT, Mingu LEE, Joseph Binamira SORIAGA, Jilei HOU
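
A sketch of the overlap described in 20240320433, assuming the verifier call can be issued asynchronously: while the second model chooses among the first drafts, the first model already speculates on a continuation of each draft, and only the speculation matching the verifier's choice is kept. The model callables are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

def speculative_round(query, draft_model, verifier_model, num_drafts=4):
    """One round of draft generation overlapped with verification.

    `draft_model(prefix)` and `verifier_model(query, drafts)` are
    hypothetical stand-ins; the verifier returns the index of the
    selected draft.
    """
    first_drafts = [draft_model(query) for _ in range(num_drafts)]
    with ThreadPoolExecutor() as pool:
        # Kick off verification of the first drafts...
        verify_future = pool.submit(verifier_model, query, first_drafts)
        # ...and speculate on continuations of each draft in the meantime.
        second_drafts = {i: draft_model(query + d) for i, d in enumerate(first_drafts)}
        selected_index = verify_future.result()
    # Only the speculation matching the verifier's choice is kept and sent
    # on for verification in the next round.
    return first_drafts[selected_index], second_drafts[selected_index]

if __name__ == "__main__":
    def draft(prefix):
        return prefix + "*"

    def verify(query, drafts):
        return 0  # always pick the first draft in this toy example

    print(speculative_round("q", draft, verify))
```
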
  • Publication number: 20240249128
    Abstract: A processor-implemented method for rematerialization for an artificial neural network (ANN) includes receiving a graph representing the ANN. The graph includes multiple nodes connected by edges and each node represents an operation. Retention intervals for the nodes are determined based on a precedence constraint for the nodes. The retention intervals correspond to a time interval for retaining each node output in a local memory. One of the nodes to recompute is determined based on the retention intervals.
    Type: Application
    Filed: July 17, 2023
    Publication date: July 25, 2024
    Inventors: Burak BARTAN, Edward TEAGUE, Christopher LOTT
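
A short sketch of the recompute-selection step in 20240249128, reusing retention intervals of the kind computed in the sketch after publication 20240386237 earlier in this listing; picking the node whose output is held longest is an assumed heuristic, not the patent's stated criterion.

```python
def node_to_recompute(intervals):
    """Given per-node retention intervals (start, end), pick a
    rematerialization candidate: the node whose output must be held in
    local memory for the longest span."""
    return max(intervals, key=lambda n: intervals[n][1] - intervals[n][0])

if __name__ == "__main__":
    intervals = {"a": (0, 3), "b": (1, 2), "c": (2, 3), "d": (3, 3)}
    print(node_to_recompute(intervals))  # 'a' is held longest, so recompute it
```
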
  • Publication number: 20240211312
    Abstract: A processor-implemented method for compiler optimization using node symmetry includes receiving a representation of an artificial neural network (ANN) including multiple nodes coupled via multiple edges. One or more symmetric sets of nodes are determined based on one or more of a set of attributes for each node or a connectivity of the nodes via the edges. One or more of an order or a schedule for executing the nodes is generated based on the one or more symmetric sets of nodes.
    Type: Application
    Filed: December 21, 2022
    Publication date: June 27, 2024
    Inventors: Weiliang ZENG, Christopher LOTT, Edward TEAGUE, Yang YANG, Wonseok JEON, Muntasir Amin MALLICK, Mukul GAGRANI, Piero ZAPPI, Joseph Binamira SORIAGA
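
A simplified sketch of symmetric-set detection for 20240211312: nodes are grouped by a signature built from their attributes and the attributes of their neighbors. A single signature pass only approximates graph symmetry, and the encodings are illustrative.

```python
from collections import defaultdict

def symmetric_node_sets(nodes, edges):
    """Group nodes of a compute graph into candidate symmetric sets.

    `nodes` maps node name -> hashable attribute tuple (e.g. op type and
    shape); `edges` is a list of (src, dst) pairs.  Nodes sharing the
    same attributes and the same neighbor attributes form one set.
    """
    inputs, outputs = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        outputs[src].add(dst)
        inputs[dst].add(src)
    groups = defaultdict(list)
    for name, attrs in nodes.items():
        signature = (
            attrs,
            frozenset(nodes[p] for p in inputs[name]),
            frozenset(nodes[c] for c in outputs[name]),
        )
        groups[signature].append(name)
    return [set(g) for g in groups.values() if len(g) > 1]

if __name__ == "__main__":
    nodes = {"in": ("input",), "c1": ("conv", 3), "c2": ("conv", 3), "add": ("add",)}
    edges = [("in", "c1"), ("in", "c2"), ("c1", "add"), ("c2", "add")]
    print(symmetric_node_sets(nodes, edges))  # [{'c1', 'c2'}]
```
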
  • Publication number: 20240119301
    Abstract: A processor-implemented method includes sampling, according to a priority sampling policy, a set of node priorities from a computation graph. Each node priority of the set of node priorities may be associated with a respective node on the computation graph. Additionally, each node may represent an operation of a task performed by an artificial neural network. The method also includes converting, via a list scheduling function, the node priorities to a schedule that associates each node of the computation graph with a processor of a group of processors of a device associated with the artificial neural network, the schedule associated with a makespan. The method further includes performing the task in accordance with the schedule.
    Type: Application
    Filed: September 11, 2023
    Publication date: April 11, 2024
    Inventors: Wonseok JEON, Mukul GAGRANI, Weiliang ZENG, Edward TEAGUE, Burak BARTAN, Piero ZAPPI, Christopher LOTT
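
A minimal sketch of 20240119301's pipeline with uniform random priorities standing in for the learned priority sampling policy: sampled priorities are converted by a list scheduler into a processor assignment, and the makespan of that schedule is reported. Unit node durations are an assumption.

```python
import random

def list_schedule(edges, nodes, num_procs=2, duration=1):
    """Sample node priorities and convert them into a schedule with a
    simple list scheduler.  Returns (assignment, makespan)."""
    priority = {n: random.random() for n in nodes}
    preds = {n: set() for n in nodes}
    for src, dst in edges:
        preds[dst].add(src)
    finish = {}                      # node -> finish time
    proc_free = [0.0] * num_procs    # next free time per processor
    assignment = {}
    remaining = set(nodes)
    while remaining:
        ready = [n for n in remaining if preds[n].issubset(finish)]
        node = max(ready, key=priority.get)  # highest sampled priority first
        p = min(range(num_procs), key=lambda i: proc_free[i])
        start = max(proc_free[p],
                    max((finish[q] for q in preds[node]), default=0.0))
        finish[node] = start + duration
        proc_free[p] = finish[node]
        assignment[node] = p
        remaining.remove(node)
    return assignment, max(finish.values())

if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
    print(list_schedule(edges, nodes))
```
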
  • Publication number: 20240118923
    Abstract: A processor-implemented method includes generating, by a scheduling model, a group of schedules from a computation graph associated with a task, each node on the computation graph being associated with an operation of an artificial neural network, each schedule of the group of schedules associating each node of the computation graph with a processor of a group of processors of a hardware device. The processor-implemented method also includes testing one or more schedules of the group of schedules on the hardware device or a model of the hardware device. The processor-implemented method further includes selecting a schedule of the one or more schedules based on testing the one or more schedules, the selected schedule satisfying a selection condition.
    Type: Application
    Filed: August 31, 2023
    Publication date: April 11, 2024
    Inventors: Corrado RAINONE, Wei David ZHANG, Roberto BONDESAN, Markus PESCHL, Mukul GAGRANI, Wonseok JEON, Edward TEAGUE, Piero ZAPPI, Weiliang ZENG, Christopher LOTT
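
A short generate-and-test sketch for 20240118923: each candidate schedule is measured on the hardware or a model of it through a hypothetical `measure_latency` callable, and the fastest schedule meeting a latency budget is selected; the budget-based selection condition is an assumption.

```python
def select_schedule(candidate_schedules, measure_latency, budget_ms):
    """Try each candidate schedule on the hardware (or a model of it) and
    keep the fastest one that satisfies the selection condition."""
    measured = [(measure_latency(s), s) for s in candidate_schedules]
    feasible = [(t, s) for t, s in measured if t <= budget_ms]
    if not feasible:
        raise ValueError("no schedule satisfies the selection condition")
    return min(feasible, key=lambda ts: ts[0])[1]
```
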
  • Patent number: 11948559
    Abstract: Various embodiments include methods and devices for implementing automatic grammar augmentation for improving voice command recognition accuracy in systems with a small footprint acoustic model. Alternative expressions that may capture acoustic model decoding variations may be added to a grammar set. An acoustic model-specific statistical pronunciation dictionary may be derived by running the acoustic model through a large general speech dataset and constructing a command-specific candidate set containing potential grammar expressions. Greedy based and cross-entropy-method (CEM) based algorithms may be utilized to search the candidate set for augmentations with improved recognition accuracy.
    Type: Grant
    Filed: March 21, 2022
    Date of Patent: April 2, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Yang Yang, Anusha Lalitha, Jin Won Lee, Christopher Lott
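
A sketch of the greedy half of the search in 11948559: candidate grammar expressions are added one at a time whenever they improve recognition accuracy on a validation set. The `accuracy` callable is a hypothetical placeholder and the cross-entropy-method variant is not shown.

```python
def greedy_augment(base_grammar, candidates, accuracy, max_additions=5):
    """Greedily augment a grammar set: repeatedly add the candidate
    expression that most improves recognition accuracy, stopping when no
    candidate helps.  `accuracy(grammar)` is assumed to score the small
    acoustic model's recognition accuracy over a validation set."""
    grammar = list(base_grammar)
    remaining = set(candidates)
    best_score = accuracy(grammar)
    for _ in range(max_additions):
        scored = [(accuracy(grammar + [c]), c) for c in remaining]
        top_score, top_candidate = max(scored, key=lambda sc: sc[0])
        if top_score <= best_score:
            break                     # no candidate improves accuracy
        grammar.append(top_candidate)
        remaining.remove(top_candidate)
        best_score = top_score
    return grammar
```
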
  • Publication number: 20240037150
    Abstract: A processor-implemented method for generating a schedule for executing operations of a compute graph includes receiving a graph including multiple nodes connected by edges. Each of the multiple nodes represents an operation to be executed. A set of sequences for executing the nodes is determined based on one or more precedence constraints. One or more sequences are selected from the set of sequences based on a memory constraint associated with a device for executing the nodes. A schedule for executing the nodes on the device is generated based on the selected one or more sequences.
    Type: Application
    Filed: August 1, 2022
    Publication date: February 1, 2024
    Inventors: Yang YANG, Mukul GAGRANI, Wonseok JEON, Edward TEAGUE, Weiliang ZENG, Piero ZAPPI, Corrado RAINONE, Christopher LOTT
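
A tiny-graph sketch of the two steps in 20240037150: enumerate execution sequences that satisfy the precedence constraints, then keep only those whose peak memory fits the device's memory constraint. Unit-sized outputs freed after their last consumer, and exhaustive enumeration, are simplifying assumptions.

```python
from itertools import permutations

def feasible_sequences(nodes, edges, mem_limit, out_size=1):
    """Enumerate precedence-respecting execution sequences and keep those
    whose peak memory stays within `mem_limit`.  Exhaustive enumeration
    is only practical for tiny graphs."""
    preds = {n: {s for s, d in edges if d == n} for n in nodes}
    succs = {n: {d for s, d in edges if s == n} for n in nodes}
    keep = []
    for order in permutations(nodes):
        pos = {n: i for i, n in enumerate(order)}
        if any(pos[p] > pos[n] for n in nodes for p in preds[n]):
            continue                              # violates precedence
        # A node's output is freed after its last consumer runs.
        free_at = {n: max((pos[c] for c in succs[n]), default=pos[n]) for n in nodes}
        live = peak = 0
        for step, node in enumerate(order):
            live += out_size                       # node's output is produced
            peak = max(peak, live)
            live -= sum(out_size for m in nodes if free_at[m] == step)
        if peak <= mem_limit:
            keep.append(order)
    return keep

if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
    print(len(feasible_sequences(nodes, edges, mem_limit=3)))  # 2
```
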