Patents by Inventor Sumesh Udayakumaran

Sumesh Udayakumaran has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11243752
    Abstract: Described herein are techniques for generating a stitched shader program. The techniques include identifying a set of shader programs to include in the stitched shader program, wherein the set includes at least one multiversion shader program that includes a first version of instructions and a second version of instructions, wherein the first version of instructions uses a first number of resources that is different than a second number of resources used by the second version of instructions. The techniques also include combining the set of shader programs to form the stitched shader program. The techniques further include determining a number of resources for the stitched shader program. The techniques also include, based on the determined number of resources, modifying the instructions corresponding to the multiversion shader program to, when executed, execute either the first version of instructions or the second version of instructions.
    Type: Grant
    Filed: July 11, 2019
    Date of Patent: February 8, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Sumesh Udayakumaran
  • Publication number: 20210011697
    Abstract: Described herein are techniques for generating a stitched shader program. The techniques include identifying a set of shader programs to include in the stitched shader program, wherein the set includes at least one multiversion shader program that includes a first version of instructions and a second version of instructions, wherein the first version of instructions uses a first number of resources that is different than a second number of resources used by the second version of instructions. The techniques also include combining the set of shader programs to form the stitched shader program. The techniques further include determining a number of resources for the stitched shader program. The techniques also include, based on the determined number of resources, modifying the instructions corresponding to the multiversion shader program to, when executed, execute either the first version of instructions or the second version of instructions.
    Type: Application
    Filed: July 11, 2019
    Publication date: January 14, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Sumesh Udayakumaran
  • Patent number: 10026145
    Abstract: Techniques for allowing for concurrent execution of multiple different tasks and preempted prioritized execution of tasks on a shader processor. In an example operation, a driver executed by a central processing unit (CPU) configures GPU resources based on needs of a first “host” shader to allow the first shader to execute “normally” on the GPU. The GPU may observe two sets of tasks, “guest” tasks. Based on, for example, detecting an availability of resources, the GPU may determine a “guest” task may be run while the “host” task is running. A second “guest” shader executes on a GPU by using resources that were configured for the first “host” shader if there are available resources and, in some examples, additional resources are obtained through software-programmable means.
    Type: Grant
    Filed: December 13, 2016
    Date of Patent: July 17, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Alexei Vladimirovich Bourd, Maxim Kazakov, Chunhui Mei, Sumesh Udayakumaran
  • Publication number: 20180165786
    Abstract: Techniques for allowing for concurrent execution of multiple different tasks and preempted prioritized execution of tasks on a shader processor. In an example operation, a driver executed by a central processing unit (CPU) configures GPU resources based on needs of a first “host” shader to allow the first shader to execute “normally” on the GPU. The GPU may observe two sets of tasks, “guest” tasks. Based on, for example, detecting an availability of resources, the GPU may determine a “guest” task may be run while the “host” task is running. A second “guest” shader executes on a GPU by using resources that were configured for the first “host” shader if there are available resources and, in some examples, additional resources are obtained through software-programmable means.
    Type: Application
    Filed: December 13, 2016
    Publication date: June 14, 2018
    Inventors: Alexei Vladimirovich Bourd, Maxim Kazakov, Chunhui Mei, Sumesh Udayakumaran
  • Patent number: 9747104
    Abstract: In one example, a method includes responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR, copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register, copying, by the initial logic unit and during a second clock cycle, the second value to the initial pipeline register, copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR, and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: August 29, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Lin Chen, Yun Du, Sumesh Udayakumaran, Chihong Zhang, Andrew Evan Gruber
  • Patent number: 9329867
    Abstract: This disclosure describes techniques for allocating registers in a computing system that supports vector physical registers. The techniques for allocating registers may allocate physical registers to vector virtual registers based on priority information that is indicative of a relative importance of allocating respective vector virtual registers as vectors rather than scalars. The techniques for allocating registers may involve allocating physical registers to the vector virtual registers in an order that is determined based on the priority information.
    Type: Grant
    Filed: September 23, 2014
    Date of Patent: May 3, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Sumesh Udayakumaran, Se Jong Oh
  • Publication number: 20150324196
    Abstract: In one example, a method includes responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR, copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register, copying, by the initial logic unit and during a second clock cycle, the second value to the initial pipeline register, copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR, and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.
    Type: Application
    Filed: May 12, 2014
    Publication date: November 12, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Lin Chen, Yun Du, Sumesh Udayakumaran, Chihong Zhang, Andrew Evan Gruber
  • Publication number: 20150193234
    Abstract: This disclosure describes techniques for allocating registers in a computing system that supports vector physical registers. The techniques for allocating registers may allocate physical registers to vector virtual registers based on priority information that is indicative of a relative importance of allocating respective vector virtual registers as vectors rather than scalars. The techniques for allocating registers may involve allocating physical registers to the vector virtual registers in an order that is determined based on the priority information.
    Type: Application
    Filed: September 23, 2014
    Publication date: July 9, 2015
    Inventors: Sumesh Udayakumaran, Se Jong Oh
  • Patent number: 8933954
    Abstract: In general, aspects of this disclosure describe a compiler for allocation of physical registers for storing constituent scalar values of a non-scalar value. In some examples, the compiler, executing on a processor, may receive an instruction for operation on a non-scalar value. The compiler may divide the instruction into a plurality of instructions for operation on constituent scalar values of the non-scalar value. The compiler may allocate a plurality of physical registers to store the constituent scalar values.
    Type: Grant
    Filed: March 23, 2011
    Date of Patent: January 13, 2015
    Assignee: QUALCOMM Incorporated
    Inventor: Sumesh Udayakumaran
  • Patent number: 8732679
    Abstract: A new computer-compiler architecture includes code analysis processes in which loops present in an intermediate instruction set are transformed into more efficient loops prior to fully executing the intermediate instruction set. The compiler architecture starts by generating the equivalent intermediate instructions for the original high level source code. For each loop in the intermediate instructions, a total cycle cost is calculated using a cycle cost table associated with the compiler. The compiler then generates intermediate code for replacement loops in which all conversion instructions are removed. The cycle costs for these new transformed loops are then compared against the total cycle cost for the original loops. If the total cycle costs exceed the new cycle costs, the compiler will replace the original loops in the intermediate instructions with the new transformed loops prior to generation of final code using the instruction set of the processor.
    Type: Grant
    Filed: March 16, 2010
    Date of Patent: May 20, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Sumesh Udayakumaran, Chihong Zhang
  • Patent number: 8494770
    Abstract: A method and system for calculating savings routes for display on a portable computing device (PCD) are described. The method includes receiving at least one of a product category and a service category from an operator of a PCD. The PCD may also receive a destination address. With this information, circle of influence data based on an offer for at least one product or service corresponding to the product category or service category may be generated and provided to the PCD. The circle of influence data may impact edge weights of a graph search algorithm. The graph search algorithm solves a single-source shortest path problem for a graph with non-negative edge path costs. The circles of influence in combination with the graph search algorithm allow a PCD to calculate one or more savings routes based on a start point and the desired destination address provided by the operator of the PCD.
    Type: Grant
    Filed: March 15, 2011
    Date of Patent: July 23, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Babak Forutanpour, Wolfgang G. Frank, Sumesh Udayakumaran
  • Publication number: 20120242673
    Abstract: In general, aspects of this disclosure describe a compiler for allocation of physical registers for storing constituent scalar values of a non-scalar value. In some examples, the compiler, executing on a processor, may receive an instruction for operation on a non-scalar value. The compiler may divide the instruction into a plurality of instructions for operation on constituent scalar values of the non-scalar value. The compiler may allocate a plurality of physical registers to store the constituent scalar values.
    Type: Application
    Filed: March 23, 2011
    Publication date: September 27, 2012
    Applicant: QUALCOMM Incorporated
    Inventor: Sumesh Udayakumaran
  • Publication number: 20120239288
    Abstract: A method and system for calculating savings routes for display on a portable computing device (PCD) are described. The method includes receiving at least one of a product category and a service category from an operator of a PCD. The PCD may also receive a destination address. With this information, circle of influence data based on an offer for at least one product or service corresponding to the product category or service category may be generated and provided to the PCD. The circle of influence data may impact edge weights of a graph search algorithm. The graph search algorithm solves a single-source shortest path problem for a graph with non-negative edge path costs. The circles of influence in combination with the graph search algorithm allow a PCD to calculate one or more savings routes based on a start point and the desired destination address provided by the operator of the PCD.
    Type: Application
    Filed: March 15, 2011
    Publication date: September 20, 2012
    Inventors: Babak Forutanpour, Wolfgang G. Frank, Sumesh Udayakumaran
  • Publication number: 20110231830
    Abstract: A new computer-compiler architecture includes code analysis processes in which loops present in an intermediate instruction set are transformed into more efficient loops prior to fully executing the intermediate instruction set. The compiler architecture starts by generating the equivalent intermediate instructions for the original high level source code. For each loop in the intermediate instructions, a total cycle cost is calculated using a cycle cost table associated with the compiler. The compiler then generates intermediate code for replacement loops in which all conversion instructions are removed. The cycle costs for these new transformed loops are then compared against the total cycle cost for the original loops. If the total cycle costs exceed the new cycle costs, the compiler will replace the original loops in the intermediate instructions with the new transformed loops prior to generation of final code using the instruction set of the processor.
    Type: Application
    Filed: March 16, 2010
    Publication date: September 22, 2011
    Applicant: QUALCOMM Incorporated
    Inventors: Sumesh Udayakumaran, Chihong Zhang
  • Patent number: 7367024
    Abstract: A highly predictable, low-overhead, and yet dynamic memory allocation methodology for embedded systems with scratch-pad memory is presented. The dynamic memory allocation methodology for global and stack data (i) accounts for changing program requirements at runtime; (ii) has no software-caching tags; (iii) requires no run-time checks; (iv) has extremely low overheads; and (v) yields 100% predictable memory access times. The methodology provides that data that is about to be accessed frequently is copied into the SRAM using compiler-inserted code at fixed and infrequent points in the program. Earlier data is evicted if necessary.
    Type: Grant
    Filed: September 21, 2004
    Date of Patent: April 29, 2008
    Assignee: University of Maryland
    Inventors: Rajeev Kumar Barua, Sumesh Udayakumaran
  • Publication number: 20060080372
    Abstract: A highly predictable, low-overhead, and yet dynamic memory allocation methodology for embedded systems with scratch-pad memory is presented. The dynamic memory allocation methodology for global and stack data (i) accounts for changing program requirements at runtime; (ii) has no software-caching tags; (iii) requires no run-time checks; (iv) has extremely low overheads; and (v) yields 100% predictable memory access times. The methodology provides that data that is about to be accessed frequently is copied into the SRAM using compiler-inserted code at fixed and infrequent points in the program. Earlier data is evicted if necessary.
    Type: Application
    Filed: September 21, 2004
    Publication date: April 13, 2006
    Inventors: Rajeev Barua, Sumesh Udayakumaran
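
Illustrative note on the savings-route abstract (patent 8494770 and publication 20120239288): the abstract describes a graph search that solves a single-source shortest path problem with non-negative edge costs, where circle of influence data may change the edge weights. The Python sketch below is one possible reading, not the patented method. It assumes Dijkstra's algorithm as the graph search (the abstract does not name one), and the function names, the per-node discount model, and the clamping of weights at zero are all hypothetical choices made for illustration.

```python
import heapq


def shortest_costs(graph, start):
    """Dijkstra's algorithm: cheapest cost from start to each reachable node.

    graph maps a node to a dict of {neighbor: non-negative edge cost}.
    """
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


def apply_offers(graph, discounts):
    """Hypothetical model of circle of influence data.

    Every edge that enters a node listed in discounts gets cheaper by that
    amount. Weights are clamped at zero so the non-negative edge cost
    requirement stated in the abstract still holds.
    """
    return {
        node: {
            neighbor: max(0.0, weight - discounts.get(neighbor, 0.0))
            for neighbor, weight in neighbors.items()
        }
        for node, neighbors in graph.items()
    }


# Assumed toy road graph: two routes from "start" to "dest".
roads = {
    "start": {"a": 4.0, "b": 5.0},
    "a": {"dest": 4.0},
    "b": {"dest": 4.0},
    "dest": {},
}

plain = shortest_costs(roads, "start")
with_offer = shortest_costs(apply_offers(roads, {"b": 3.0}), "start")
print(plain["dest"], with_offer["dest"])
```

In this toy graph the route through "a" is cheapest without offers, while an offer near "b" makes the route through "b" cheaper, which is the kind of rerouting the abstract attributes to circles of influence.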