Patents by Inventor Karamvir CHATHA

Karamvir CHATHA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11861467
    Abstract: Certain aspects of the present disclosure provide techniques for adaptively executing machine learning models on a computing device. An example method generally includes receiving weight information for a machine learning model to be executed on a computing device. The received weight information is reduced into quantized weight information having a reduced bit size relative to the received weight information. First inferences are performed using the machine learning model and the received weight information, and second inferences are performed using the machine learning model and the quantized weight information. Results of the first and second inferences are compared, and when it is determined that results of the second inferences are within a threshold performance level of results of the first inferences, one or more subsequent inferences are performed using the machine learning model and the quantized weight information.
    Type: Grant
    Filed: March 5, 2020
    Date of Patent: January 2, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Serag Gadelrab, Karamvir Chatha, Ofer Rosenberg
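
The adaptive quantization flow described in the abstract can be sketched in Python. This is a minimal illustration only: the symmetric linear quantizer, the single-layer `infer` stand-in, and the relative-error comparison are assumptions for the sketch, not the patented implementation.

```python
import numpy as np

def quantize_weights(w, bits=8):
    """Reduce float weights to a lower-bit representation (symmetric linear quantization)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def infer(x, w):
    """Stand-in for a model forward pass: a single linear layer."""
    return x @ w

def choose_weights(x, w_full, threshold=0.01):
    """Run first inferences with the received weights and second inferences with
    the quantized weights; keep the quantized weights for subsequent inferences
    only if results stay within the threshold performance level."""
    q, scale = quantize_weights(w_full)
    w_quant = q.astype(np.float32) * scale   # dequantize for the comparison
    full = infer(x, w_full)
    reduced = infer(x, w_quant)
    rel_err = np.max(np.abs(full - reduced) / (np.abs(full) + 1e-9))
    return (w_quant, "quantized") if rel_err <= threshold else (w_full, "full")
```

In practice the comparison metric would be a task-level performance measure (e.g., accuracy) rather than raw output error.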
  • Patent number: 11614941
    Abstract: An apparatus for hardware acceleration for use in operating a computational network is configured for determining that a loop structure including one or more loops is to be executed by a first processor. Each of the one or more loops includes a set of operations. The loop structure may be configured as a nested loop, a cascaded loop, or a combination of the two. A second processor may be configured to decouple overhead operations of the loop structure from compute operations of the loop structure. The apparatus accelerates processing of the loop structure by simultaneously processing the overhead operations using the second processor separately from processing the compute operations based on the configuration to operate the computational network.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: March 28, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Amrit Panda, Francisco Perez, Karamvir Chatha
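
The overhead/compute decoupling can be illustrated with a small producer/consumer sketch: one worker handles only the loop bookkeeping (index generation for a nested loop) while another performs only the compute operations. The thread-and-queue structure and the element-wise addition workload are assumptions chosen to make the idea concrete, not the patented design.

```python
from queue import Queue
from threading import Thread

def overhead_processor(shape, work_queue):
    """Second processor: runs the loop overhead (index/address generation)
    decoupled from the compute operations."""
    rows, cols = shape
    for i in range(rows):            # nested loop structure
        for j in range(cols):
            work_queue.put((i, j))   # emit the index for the compute side
    work_queue.put(None)             # sentinel: loop structure exhausted

def compute_processor(a, b, out, work_queue):
    """First processor: performs only the compute operations, consuming
    pre-generated indices instead of running loop overhead itself."""
    while (idx := work_queue.get()) is not None:
        i, j = idx
        out[i][j] = a[i][j] + b[i][j]

def run(a, b):
    rows, cols = len(a), len(a[0])
    out = [[0] * cols for _ in range(rows)]
    q = Queue(maxsize=16)
    t = Thread(target=overhead_processor, args=((rows, cols), q))
    t.start()
    compute_processor(a, b, out, q)
    t.join()
    return out
```

The bounded queue lets index generation run ahead of the compute stream, which is the software analogue of overlapping loop overhead with compute in hardware.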
  • Publication number: 20210279635
    Abstract: Certain aspects of the present disclosure provide techniques for adaptively executing machine learning models on a computing device. An example method generally includes receiving weight information for a machine learning model to be executed on a computing device. The received weight information is reduced into quantized weight information having a reduced bit size relative to the received weight information. First inferences are performed using the machine learning model and the received weight information, and second inferences are performed using the machine learning model and the quantized weight information. Results of the first and second inferences are compared, and when it is determined that results of the second inferences are within a threshold performance level of results of the first inferences, one or more subsequent inferences are performed using the machine learning model and the quantized weight information.
    Type: Application
    Filed: March 5, 2020
    Publication date: September 9, 2021
    Inventors: Serag GADELRAB, Karamvir CHATHA, Ofer ROSENBERG
  • Patent number: 10871964
    Abstract: A method, a computer-readable medium, and an apparatus for a sparse neural network are provided. The apparatus may include a hardware accelerator. The apparatus may determine, for each pair of operands to be processed by a MAR unit, whether both operands of the pair are non-zero. The apparatus may prevent a pair of operands to be processed by the MAR unit from being loaded to a multiplier of the MAR unit when an operand of the pair of operands is zero. The apparatus may place the pair of operands into one of a plurality of queues when both operands of the pair of operands are non-zero.
    Type: Grant
    Filed: December 29, 2016
    Date of Patent: December 22, 2020
    Assignee: QUALCOMM Incorporated
    Inventors: Yatish Girish Turakhia, Javid Jaffari, Amrit Panda, Karamvir Chatha
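
The zero-skipping behavior for the sparse-neural-network accelerator can be sketched as follows. The round-robin queue assignment and the `sparse_mac` helper are illustrative assumptions; the patent describes a hardware MAR (multiply-accumulate) unit, not this software loop.

```python
from collections import deque

def sparse_mac(pairs, num_queues=4):
    """Skip multiply-accumulate work for operand pairs containing a zero;
    distribute surviving non-zero pairs across queues feeding the MAR unit."""
    queues = [deque() for _ in range(num_queues)]
    for k, (a, b) in enumerate(pairs):
        if a == 0 or b == 0:
            continue                       # pair is never loaded into the multiplier
        queues[k % num_queues].append((a, b))
    acc = 0
    multiplies = 0
    for q in queues:
        while q:
            a, b = q.popleft()
            acc += a * b                   # only non-zero pairs reach the multiplier
            multiplies += 1
    return acc, multiplies
```

Because a product with a zero operand contributes nothing to the accumulation, skipping those pairs preserves the result while reducing multiplier activity.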
  • Publication number: 20200089497
    Abstract: Systems and methods for minimizing control variance overhead in a dataflow processor include receiving a generating instruction specifying at least an acknowledge predicate based on a first number, a second number, and a first value, wherein a true branch comprises the first number of consumer instructions of the generating instruction based on the first value, used as a first predicate, being true; and a false branch comprises a second number of consumer instructions of the generating instruction based on the first value, used as the first predicate, being false. The acknowledge predicate is evaluated to be a selected number, which is the first number if the first value is true, or the second number if the first value is false. The generating instruction is fired upon the selected number of acknowledge arcs being received from the true branch or the false branch.
    Type: Application
    Filed: September 18, 2018
    Publication date: March 19, 2020
    Inventors: Rakesh KOMURAVELLI, Amin ANSARI, Ramesh Chandra CHAUHAN, Karamvir CHATHA
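
The acknowledge-predicate rule reduces to a simple selection: the number of acknowledge arcs the generating instruction must collect before re-firing depends on which branch the predicate took. The function names below are hypothetical; this is only a sketch of the counting rule, not the dataflow-processor mechanism itself.

```python
def acknowledge_count(pred_value, n_true, n_false):
    """The acknowledge predicate evaluates to the first number when the
    predicate value is true, and to the second number when it is false."""
    return n_true if pred_value else n_false

def can_fire(acks_received, pred_value, n_true, n_false):
    """The generating instruction fires once the selected number of
    acknowledge arcs has arrived from the taken branch."""
    return acks_received >= acknowledge_count(pred_value, n_true, n_false)
```

The point is that a branch with fewer consumers returns fewer acknowledgments, so the producer does not stall waiting for acks from the untaken branch.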
  • Publication number: 20190303156
    Abstract: An apparatus for hardware acceleration for use in operating a computational network is configured for determining that a loop structure including one or more loops is to be executed by a first processor. Each of the one or more loops includes a set of operations. The loop structure may be configured as a nested loop, a cascaded loop, or a combination of the two. A second processor may be configured to decouple overhead operations of the loop structure from compute operations of the loop structure. The apparatus accelerates processing of the loop structure by simultaneously processing the overhead operations using the second processor separately from processing the compute operations based on the configuration to operate the computational network.
    Type: Application
    Filed: March 30, 2018
    Publication date: October 3, 2019
    Inventors: Amrit PANDA, Francisco PEREZ, Karamvir CHATHA
  • Patent number: 10037306
    Abstract: Computing a non-linear function ƒ(x) in hardware or embedded systems can be complex and resource intensive. In one or more aspects of the disclosure, a method, a computer-readable medium, and an apparatus are provided for computing a non-linear function ƒ(x) accurately and efficiently in hardware using look-up tables (LUTs) and interpolation or extrapolation. The apparatus may be a processor. The processor computes a non-linear function ƒ(x) for an input variable x, where ƒ(x)=g(y(x),z(x)). The processor determines an integer n by determining a position of a most significant bit (MSB) of an input variable x. In addition, the processor determines a value for y(x) based on a first look-up table and the determined integer n. Also, the processor determines a value for z(x) based on n and the input variable x, and based on a second look-up table. Further, the processor computes ƒ(x) based on the determined values for y(x) and z(x).
    Type: Grant
    Filed: September 1, 2016
    Date of Patent: July 31, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Dexu Lin, Edward Liao, Somdeb Majumdar, Aaron Lamb, Karamvir Chatha
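
The LUT-plus-MSB scheme can be made concrete with log2 as the non-linear function: the MSB position n gives the integer part y(x) directly, and a table indexed by the leading mantissa bits gives the fractional part z(x), combined as g(y, z) = y + z. The table size, the truncation (no interpolation), and the choice of log2 are assumptions for this sketch, not the patented method.

```python
import math

FRAC_BITS = 6
# Hypothetical second LUT: log2(1 + m / 2^FRAC_BITS) for each mantissa index m
Z_LUT = [math.log2(1 + m / 2**FRAC_BITS) for m in range(2**FRAC_BITS)]

def msb_position(x):
    """Position of the most significant set bit of a positive integer."""
    return x.bit_length() - 1

def approx_log2(x):
    """LUT-based log2(x): n from the MSB gives the integer part y(x);
    the bits below the MSB index the table for the fractional part z(x)."""
    n = msb_position(x)
    if n >= FRAC_BITS:
        frac = (x >> (n - FRAC_BITS)) & (2**FRAC_BITS - 1)
    else:
        frac = (x << (FRAC_BITS - n)) & (2**FRAC_BITS - 1)
    return n + Z_LUT[frac]          # g(y(x), z(x)) = y(x) + z(x)
```

With 6 fractional bits the truncation error is bounded by roughly log2(1 + 1/64) ≈ 0.022; a larger table or interpolation between adjacent entries tightens this.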
  • Publication number: 20180189056
    Abstract: A method, a computer-readable medium, and an apparatus for a sparse neural network are provided. The apparatus may include a hardware accelerator. The apparatus may determine, for each pair of operands to be processed by a MAR unit, whether both operands of the pair are non-zero. The apparatus may prevent a pair of operands to be processed by the MAR unit from being loaded to a multiplier of the MAR unit when an operand of the pair of operands is zero. The apparatus may place the pair of operands into one of a plurality of queues when both operands of the pair of operands are non-zero.
    Type: Application
    Filed: December 29, 2016
    Publication date: July 5, 2018
    Inventors: Yatish Girish TURAKHIA, Javid JAFFARI, Amrit PANDA, Karamvir CHATHA
  • Publication number: 20180164866
    Abstract: A method, a computer-readable medium, and an apparatus for reducing power consumption of a neural network are provided. The apparatus may retrieve, from a tag storage, at least one tag value of a first tag value for a weight in the neural network or a second tag value for an activation in the neural network. The first tag value may indicate whether the weight is zero and the second tag value may indicate whether the activation is zero. The weight and the activation are to be loaded to a multiplier of a multiplier-accumulator unit as a pair of operands. The apparatus may determine whether the at least one tag value indicates a zero value. The apparatus may disable loading the weight and the activation to the multiplier when the at least one tag value indicates a zero value. The apparatus may disable updating of zero-value activations.
    Type: Application
    Filed: December 13, 2016
    Publication date: June 14, 2018
    Inventors: Yatish Girish TURAKHIA, Javid JAFFARI, Amrit PANDA, Karamvir CHATHA
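
The tag-gated multiplier load can be sketched as below. Unlike checking operand values directly, the decision here reads only the stored tags: a set tag marks a zero weight or zero activation, and the pair is never loaded into the multiplier. The list-based representation and the `mac_with_tags` helper are assumptions for illustration.

```python
def mac_with_tags(weights, activations, w_tags, a_tags):
    """Gate multiplier loads on tag values: w_tags[i] is True when the weight
    is zero, a_tags[i] is True when the activation is zero."""
    acc = 0
    loads = 0
    for w, a, wt, at in zip(weights, activations, w_tags, a_tags):
        if wt or at:        # a tag indicates a zero value: disable the load
            continue
        acc += w * a
        loads += 1          # only pairs with both tags clear reach the multiplier
    return acc, loads
```

Consulting small tag bits instead of full operand values is what saves power: the wide weight/activation reads and the multiplier toggling are avoided for every tagged-zero pair.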
  • Publication number: 20180060278
    Abstract: Computing a non-linear function ƒ(x) in hardware or embedded systems can be complex and resource intensive. In one or more aspects of the disclosure, a method, a computer-readable medium, and an apparatus are provided for computing a non-linear function ƒ(x) accurately and efficiently in hardware using look-up tables (LUTs) and interpolation or extrapolation. The apparatus may be a processor. The processor computes a non-linear function ƒ(x) for an input variable x, where ƒ(x)=g(y(x),z(x)). The processor determines an integer n by determining a position of a most significant bit (MSB) of an input variable x. In addition, the processor determines a value for y(x) based on a first look-up table and the determined integer n. Also, the processor determines a value for z(x) based on n and the input variable x, and based on a second look-up table. Further, the processor computes ƒ(x) based on the determined values for y(x) and z(x).
    Type: Application
    Filed: September 1, 2016
    Publication date: March 1, 2018
    Inventors: Dexu LIN, Edward LIAO, Somdeb MAJUMDAR, Aaron LAMB, Karamvir CHATHA