Patents by Inventor Balaji CALIDAS

Balaji CALIDAS has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Methods and apparatus for tensor object support in machine learning workloads

Patent number: 11481865

Abstract: The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may modify at least one texture memory object to support a data structure for one or more tensor objects. The apparatus may also determine one or more supported memory layouts for the one or more tensor objects based on the modified at least one texture memory object. Additionally, the apparatus may access data associated with the one or more tensor objects based on the one or more supported memory layouts, the data for each of the one or more tensor objects corresponding to at least one data instruction. The apparatus may also execute the at least one data instruction based on the accessed data associated with the one or more tensor objects.
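
As an illustration only, the idea of exposing a tensor through a texture memory object can be sketched as an index mapping. The abstract does not say which memory layouts are supported, so the layout below (four tensor channels packed into the four components of each texel, with channel groups placed side by side along the texture width) and the name `texel_coord` are assumptions made for this sketch, not the patented method.

```python
def texel_coord(h, w, c, width):
    """Map element (h, w, c) of an H x W x C tensor to a texture location.

    Hypothetical layout: each texel holds 4 consecutive channels, and each
    group of 4 channels occupies its own width-wide strip of the texture.
    Returns (x, y, component), where component 0..3 selects R, G, B or A.
    """
    channel_group = c // 4          # which strip of the texture
    component = c % 4               # which component inside the texel
    x = channel_group * width + w   # strips are laid out left to right
    y = h
    return x, y, component


# Element at row 2, column 3, channel 5 of a tensor that is 8 columns wide.
print(texel_coord(2, 3, 5, width=8))
```

Under this assumed layout, a shader that reads one texel receives four neighbouring channels at once, which is one reason a texture-backed layout can suit machine learning workloads.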

Type: Grant

Filed: February 11, 2021

Date of Patent: October 25, 2022

Assignee: QUALCOMM Incorporated

Inventors: Elina Kamenetskaya, Liang Li, Andrew Evan Gruber, Jeffrey Leger, Balaji Calidas, Ruihao Zhang

METHODS AND APPARATUS FOR TENSOR OBJECT SUPPORT IN MACHINE LEARNING WORKLOADS

Publication number: 20220253969

Abstract: The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may modify at least one texture memory object to support a data structure for one or more tensor objects. The apparatus may also determine one or more supported memory layouts for the one or more tensor objects based on the modified at least one texture memory object. Additionally, the apparatus may access data associated with the one or more tensor objects based on the one or more supported memory layouts, the data for each of the one or more tensor objects corresponding to at least one data instruction. The apparatus may also execute the at least one data instruction based on the accessed data associated with the one or more tensor objects.

Type: Application

Filed: February 11, 2021

Publication date: August 11, 2022

Inventors: Elina KAMENETSKAYA, Liang LI, Andrew Evan GRUBER, Jeffrey LEGER, Balaji CALIDAS, Ruihao ZHANG

Methods and apparatus to facilitate improving processing of machine learning primitives

Patent number: 11263064

Abstract: The present disclosure relates to methods and apparatus for machine learning processing. For example, disclosed techniques facilitate improving execution of machine learning primitives. Aspects of the present disclosure may store a command stream generated by an application in a buffer, the command stream including a plurality of machine learning primitives for execution by a graphics processor. Further, aspects of the present disclosure identify, after receiving a request from the application to finalize the buffer, two or more machine learning primitives of the buffer that may be replaced with a fused shader kernel. Additionally, aspects of the present disclosure may store the fused shader kernel in the buffer to generate a fused command buffer.
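
The fusion step described in this abstract can be sketched roughly as follows. The command buffer is modelled as a plain list of primitive names, and the `FUSION_RULES` table, the kernel names, and the `finalize` function are hypothetical stand-ins chosen for this example; the abstract does not say how candidate primitives are matched.

```python
# Hypothetical table: a run of primitives -> the fused shader kernel replacing it.
FUSION_RULES = {
    ("conv2d", "bias_add", "relu"): "fused_conv2d_bias_relu",
    ("matmul", "bias_add"): "fused_matmul_bias",
}


def finalize(command_buffer):
    """Return a fused command buffer built from a list of primitive names.

    Scans left to right and replaces any run of two or more primitives that
    matches a rule with the corresponding fused kernel. Longer rules are
    tried first so the largest available fusion wins.
    """
    patterns = sorted(FUSION_RULES, key=len, reverse=True)
    fused = []
    i = 0
    while i < len(command_buffer):
        for pattern in patterns:
            if tuple(command_buffer[i:i + len(pattern)]) == pattern:
                fused.append(FUSION_RULES[pattern])
                i += len(pattern)
                break
        else:
            # No rule matched at this position: keep the primitive unchanged.
            fused.append(command_buffer[i])
            i += 1
    return fused


stream = ["conv2d", "bias_add", "relu", "matmul", "bias_add", "softmax"]
print(finalize(stream))
```

Running the fusion only when the application finalizes the buffer means the whole command stream is visible at once, so a match is never cut short by a primitive that has not been recorded yet.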

Type: Grant

Filed: December 30, 2019

Date of Patent: March 1, 2022

Assignee: QUALCOMM Incorporated

Inventors: Hitendra Gangani, Balaji Calidas, Jeremy Williams

METHODS AND APPARATUS FOR DYNAMIC SHADER SELECTION FOR MACHINE LEARNING

Publication number: 20220058476

Abstract: The present disclosure relates to methods and apparatus for selecting a sequence of shaders for performing a machine-learning operation on a graphics processing unit (GPU). The apparatus can receive a request to perform a machine-learning operation. The apparatus can determine a plurality of sequences of shaders that are capable of performing the machine-learning operation. The apparatus can determine a cost for each sequence of the plurality of sequences of shaders based on a cost function associated with each shader. The apparatus can execute a selected sequence of shaders of the plurality of sequences of shaders having a lowest cost.
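
A minimal sketch of the selection rule in this abstract: sum a per-shader cost over each candidate sequence and run the cheapest one. The shader names, the cost formulas, and the single `size` parameter are invented for illustration; the abstract only states that each shader has an associated cost function.

```python
# Hypothetical per-shader cost functions of a problem size (arbitrary units).
SHADER_COSTS = {
    "im2col": lambda size: 2.0 * size,
    "gemm": lambda size: 5.0 * size,
    "direct_conv": lambda size: 9.0 * size,
    "winograd_transform": lambda size: 1.5 * size + 40.0,
    "winograd_multiply": lambda size: 3.0 * size,
}

# Hypothetical candidate sequences that can all perform the same operation.
CANDIDATES = [
    ("im2col", "gemm"),
    ("direct_conv",),
    ("winograd_transform", "winograd_multiply"),
]


def sequence_cost(sequence, size):
    """Total cost of a sequence: the sum of its shaders' individual costs."""
    return sum(SHADER_COSTS[shader](size) for shader in sequence)


def select_sequence(candidates, size):
    """Return the candidate sequence with the lowest total cost."""
    return min(candidates, key=lambda seq: sequence_cost(seq, size))


for size in (10, 100):
    print(size, select_sequence(CANDIDATES, size))
```

With these made-up costs the choice flips from the `im2col`/`gemm` pair at size 10 to the Winograd pair at size 100, which shows why the selection is made per request rather than fixed ahead of time.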

Type: Application

Filed: August 19, 2020

Publication date: February 24, 2022

Inventors: Balaji CALIDAS, Michael Collins GALLASPY, Diego MARTINEZ

Adaptive dispatch for acceleration of deep neural networks on graphic processing units

Patent number: 11145024

Abstract: Methods, systems, and devices for processing are described. A device may parse a set of layers of a deep neural network. The set of layers may be associated with a set of machine learning operations of the deep neural network. The device may determine one or more layer parameters based on the determined set of layers. In some aspects, the device may determine an execution time associated with executing a shader dispatch based on the one or more layer parameters. The device may batch the shader dispatch to a command buffer based on the execution time and process the command buffer based on the batching. The device may determine a target execution time based on an assembly time associated with the command buffer, a processing time associated with the command buffer, a frequency level associated with processing the command buffer, the one or more layer parameters, or some combination thereof.
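
The batching idea in this abstract can be sketched as a greedy grouping of shader dispatches under a time budget. The linear timing model, the `MACS_PER_US` constant, and the layer dictionaries are assumptions for this example. The target execution time is passed in as a plain number here, whereas the abstract derives it from assembly time, processing time, frequency level, and layer parameters.

```python
MACS_PER_US = 1000  # hypothetical throughput: multiply-accumulates per microsecond


def estimate_time_us(layer):
    """Estimate a dispatch's execution time from one layer parameter (MAC count)."""
    return layer["macs"] / MACS_PER_US


def batch_dispatches(layers, target_us):
    """Group dispatches into command buffers of roughly target_us each.

    A new command buffer is started whenever adding the next dispatch would
    push the current buffer's estimated time past the target.
    """
    buffers, current, elapsed = [], [], 0.0
    for layer in layers:
        t = estimate_time_us(layer)
        if current and elapsed + t > target_us:
            buffers.append(current)
            current, elapsed = [], 0.0
        current.append(layer["name"])
        elapsed += t
    if current:
        buffers.append(current)
    return buffers


network = [
    {"name": "conv1", "macs": 3000},
    {"name": "conv2", "macs": 4000},
    {"name": "fc", "macs": 6000},
    {"name": "softmax", "macs": 1000},
]
print(batch_dispatches(network, target_us=8))
```

A dispatch that exceeds the target on its own still gets a buffer to itself in this sketch, so no layer is ever dropped.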

Type: Grant

Filed: December 27, 2019

Date of Patent: October 12, 2021

Assignee: QUALCOMM Incorporated

Inventors: Balaji Calidas, Joshua Walter Kelly, Avinash Seetharamaiah, Jonnala Gadda Nagendra Kumar, Hitendra Mohan Gangani

METHODS AND APPARATUS TO FACILITATE TILE-BASED GPU MACHINE LEARNING ACCELERATION

Publication number: 20210240524

Abstract: The present disclosure relates to methods and apparatus for machine learning processing. For example, disclosed techniques facilitate tile-based GPU machine learning acceleration. Aspects of the present disclosure can determine a tile size based on a memory size of a first memory and a job input size associated with executing a computational job. In some examples, the computational job may be one of a quantity of computational jobs configured to execute a machine learning primitive. Aspects of the present disclosure can also load, based on the tile size, input data associated with a batch of computational jobs from a second memory to the first memory. Further, aspects of the present disclosure can generate batch output data by executing the batch of computational jobs using the input data loaded to the first memory. Additionally, aspects of the present disclosure can store the generated batch output data to the second memory.
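
The load, execute, store loop in this abstract can be sketched as below. Treating the tile size as "how many jobs fit in the first memory" is one reading of the abstract, not a statement of the claimed method, and the function names and the use of Python lists to stand in for the two memories are assumptions for this example.

```python
def tile_size(first_memory_bytes, job_input_bytes):
    """Number of jobs whose inputs fit in the first (small, fast) memory."""
    return max(1, first_memory_bytes // job_input_bytes)


def run_tiled(job_inputs, first_memory_bytes, job_input_bytes, kernel):
    """Execute jobs in batches sized to the first memory.

    job_inputs stands in for data held in the second (large) memory, and the
    returned list stands in for the batch outputs stored back to it.
    """
    jobs_per_batch = tile_size(first_memory_bytes, job_input_bytes)
    second_memory_out = []
    for start in range(0, len(job_inputs), jobs_per_batch):
        # Load one batch of inputs from the second memory into the first.
        first_memory = list(job_inputs[start:start + jobs_per_batch])
        # Execute the whole batch against the data held in the first memory.
        batch_output = [kernel(x) for x in first_memory]
        # Store the batch output back to the second memory.
        second_memory_out.extend(batch_output)
    return second_memory_out


print(tile_size(1024, 256))
print(run_tiled(list(range(10)), 1024, 256, lambda x: x * x))
```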

Type: Application

Filed: January 31, 2020

Publication date: August 5, 2021

Inventors: Hitendra Mohan GANGANI, Balaji CALIDAS, Murat BALCI

ADAPTIVE DISPATCH FOR ACCELERATION OF DEEP NEURAL NETWORKS ON GRAPHIC PROCESSING UNITS

Publication number: 20210201433

Abstract: Methods, systems, and devices for processing are described. A device may parse a set of layers of a deep neural network. The set of layers may be associated with a set of machine learning operations of the deep neural network. The device may determine one or more layer parameters based on the determined set of layers. In some aspects, the device may determine an execution time associated with executing a shader dispatch based on the one or more layer parameters. The device may batch the shader dispatch to a command buffer based on the execution time and process the command buffer based on the batching. The device may determine a target execution time based on an assembly time associated with the command buffer, a processing time associated with the command buffer, a frequency level associated with processing the command buffer, the one or more layer parameters, or some combination thereof.

Type: Application

Filed: December 27, 2019

Publication date: July 1, 2021

Inventors: Balaji CALIDAS, Joshua Walter Kelly, Avinash Seetharamaiah, Jonnala Gadda Nagendra Kumar, Hitendra Mohan Gangani

METHODS AND APPARATUS TO FACILITATE IMPROVING PROCESSING OF MACHINE LEARNING PRIMITIVES

Publication number: 20210200608

Abstract: The present disclosure relates to methods and apparatus for machine learning processing. For example, disclosed techniques facilitate improving execution of machine learning primitives. Aspects of the present disclosure may store a command stream generated by an application in a buffer, the command stream including a plurality of machine learning primitives for execution by a graphics processor. Further, aspects of the present disclosure identify, after receiving a request from the application to finalize the buffer, two or more machine learning primitives of the buffer that may be replaced with a fused shader kernel. Additionally, aspects of the present disclosure may store the fused shader kernel in the buffer to generate a fused command buffer.

Type: Application

Filed: December 30, 2019

Publication date: July 1, 2021

Inventors: Hitendra GANGANI, Balaji CALIDAS, Jeremy WILLIAMS