Patents by Inventor Sooraj Puthoor

Sooraj Puthoor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

HARDWARE ACCELERATED DYNAMIC WORK CREATION ON A GRAPHICS PROCESSING UNIT

Publication number: 20250028554

Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.

Type: Application

Filed: October 8, 2024

Publication date: January 23, 2025

Inventors: Anthony GUTIERREZ, Sooraj PUTHOOR
Allocation of resources when processing at memory level through memory request scheduling

Patent number: 12204774

Abstract: An apparatus includes a memory controller that includes logic to receive a first memory request having a first request type and a second memory request having a second request type. The apparatus also includes a scheduling unit that includes logic to schedule an order of the first and second memory requests for execution based upon a first parameter value and a second parameter value. The first parameter value corresponds to a utility and energy cost for the first memory request and the second parameter value corresponds to a utility and energy cost for the second memory request.

Type: Grant

Filed: November 14, 2022

Date of Patent: January 21, 2025

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Alexandru Dutu, Nuwan Jayasena, Yasuko Eckert, Niti Madan, Sooraj Puthoor
Atomic Execution of Processing-in-Memory Operations

Publication number: 20240419330

Abstract: Scheduling processing-in-memory transactions in systems with multiple memory controllers is described. In accordance with the described techniques, an addressing system segments operations of a transaction into multiple microtransactions, where each microtransaction includes a subset of the transaction operations that are scheduled by a corresponding one of the multiple memory controllers. Each transaction, and its associated microtransactions, is assigned a transaction identifier based on a current counter value maintained at the multiple memory controllers, and the multiple memory controllers schedule execution of microtransactions based on associated transaction identifiers to ensure atomic execution of operations for a transaction without interruption by operations of a different transaction.

Type: Application

Filed: June 19, 2023

Publication date: December 19, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Alexandru Dutu, Sooraj Puthoor
VIRTUAL PARTITIONING A PROCESSOR-IN-MEMORY ("PIM")

Publication number: 20240394199

Abstract: Process isolation for a PIM device includes: receiving, from a process, a call to allocate a virtual address space where the process stores a PIM configuration context; allocating the virtual address space including mapping a physical address space including PIM device configuration registers to the virtual address space only if the physical address space is not mapped to another process's virtual address space; and programming the PIM device configuration space according to the configuration context. When a PIM command is executed, a translation mechanism determines whether there is a valid mapping of a virtual address of the PIM command to a physical address of a PIM resource, such as a LIS entry. If a valid mapping exists, the translation is completed and the resource is accessed, but if there is not a valid mapping, the translation fails and the process is blocked from accessing the PIM resource.

Type: Application

Filed: June 17, 2024

Publication date: November 28, 2024

Inventors: SOORAJ PUTHOOR, MUHAMMAD AMBER HASSAAN, ASHWIN AJI, MICHAEL L. CHU, NUWAN JAYASENA
Hardware accelerated dynamic work creation on a graphics processing unit

Patent number: 12131186

Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.

Type: Grant

Filed: November 23, 2022

Date of Patent: October 29, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Anthony Gutierrez, Sooraj Puthoor
Multi-kernel wavefront scheduler

Patent number: 12099867

Abstract: Systems, apparatuses, and methods for implementing a multi-kernel wavefront scheduler are disclosed. A system includes at least a parallel processor coupled to one or more memories, wherein the parallel processor includes a command processor and a plurality of compute units. The command processor launches multiple kernels for execution on the compute units. Each compute unit includes a multi-level scheduler for scheduling wavefronts from multiple kernels for execution on its execution units. A first level scheduler creates scheduling groups by grouping together wavefronts based on the priority of their kernels. Accordingly, wavefronts from kernels with the same priority are grouped together in the same scheduling group by the first level scheduler. Next, the first level scheduler selects, from a plurality of scheduling groups, the highest priority scheduling group for execution. Then, a second level scheduler schedules wavefronts for execution from the scheduling group selected by the first level scheduler.

Type: Grant

Filed: May 30, 2018

Date of Patent: September 24, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Sooraj Puthoor, Joseph Gross, Xulong Tang, Bradford Michael Beckmann
FPGA-based programmable data analysis and compression front end for GPU

Patent number: 12099789

Abstract: Methods, devices, and systems for information communication. Information transmitted from a host to a graphics processing unit (GPU) is received by information analysis circuitry of a field-programmable gate array (FPGA). A pattern in the information is determined by the information analysis circuitry. A predicted information pattern is determined, by the information analysis circuitry, based on the information. An indication of the predicted information pattern is transmitted to the host. Responsive to a signal from the host based on the predicted information pattern, the FPGA is reprogrammed to implement decompression circuitry based on the predicted information pattern. In some implementations, the information includes a plurality of packets. In some implementations, the predicted information pattern includes a pattern in a plurality of packets. In some implementations, the predicted information pattern includes a zero data pattern.

Type: Grant

Filed: December 10, 2020

Date of Patent: September 24, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Kevin Y. Cheng, Sooraj Puthoor, Onur Kayiran
Resource Access Control

Publication number: 20240220265

Abstract: Resource access control is described. In accordance with the described techniques, a process (e.g., an application process, a system process, etc.) issues an instruction seeking access to a computation resource (e.g., a processor resource, a memory resource, etc.) to perform a computation task. An execution context for the instruction is checked to determine whether the execution context includes a resource indicator indicating permission to access the processor resource. Alternatively or additionally, the instruction is checked against an access table which identifies processes that are permitted and/or not permitted to access the computation resource.

Type: Application

Filed: December 28, 2022

Publication date: July 4, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Sooraj Puthoor, Nuwan S. Jayasena
Scheduling Processing-in-Memory Transactions

Publication number: 20240220160

Abstract: Scheduling processing-in-memory transactions is described. In accordance with the described techniques, a memory controller receives a transaction header from a host, where the transaction header describes a number of operations to be executed by a processing-in-memory component as part of performing the transaction. The memory controller adds the transaction header to a buffer and sends either an acknowledgement message or a negative acknowledgement message to the host, based on a current load of the processing-in-memory component. The acknowledgement message causes the host to send operations of the transaction for execution by the processing-in-memory component and the negative acknowledgement message causes the host to refrain from sending the operations of the transaction for execution by the processing-in-memory component.

Type: Application

Filed: December 29, 2022

Publication date: July 4, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Alexandru Dutu, Sooraj Puthoor
PROCESS ISOLATION FOR A PROCESSOR-IN-MEMORY ("PIM") DEVICE

Publication number: 20240220164

Abstract: Process isolation for a PIM device through exclusive locking includes receiving, from a process, a call requesting ownership of a PIM device. The request includes one or more PIM configuration parameters. The exclusive locking technique also includes granting the process ownership of the PIM device responsive to determining that ownership is available. The PIM device is configured according to the PIM configuration parameters.

Type: Application

Filed: March 14, 2024

Publication date: July 4, 2024

Inventors: SOORAJ PUTHOOR, MUHAMMAD AMBER HASSAAN, ASHWIN AJI, MICHAEL L. CHU, NUWAN JAYASENA
PARTITION AND ISOLATION OF A PROCESSING-IN-MEMORY (PIM) DEVICE

Publication number: 20240211256

Abstract: An apparatus that manages multi-process execution in a processing-in-memory (“PIM”) device includes a gatekeeper configured to: receive an identification of one or more registered PIM processes; receive, from a process, a memory request that includes a PIM command; if the requesting process is a registered PIM process and another registered PIM process is active on the PIM device, perform a context switch of PIM state between the registered PIM processes; and issue the PIM command of the requesting process to the PIM device.

Type: Application

Filed: March 11, 2024

Publication date: June 27, 2024

Inventors: SOORAJ PUTHOOR, MUHAMMAD AMBER HASSAAN, ASHWIN AJI, MICHAEL L. CHU, NUWAN JAYASENA
Virtual partitioning a processor-in-memory (“PIM”)

Patent number: 12019560

Abstract: Process isolation for a PIM device includes: receiving, from a process, a call to allocate a virtual address space where the process stores a PIM configuration context; allocating the virtual address space including mapping a physical address space including PIM device configuration registers to the virtual address space only if the physical address space is not mapped to another process's virtual address space; and programming the PIM device configuration space according to the configuration context. When a PIM command is executed, a translation mechanism determines whether there is a valid mapping of a virtual address of the PIM command to a physical address of a PIM resource, such as a LIS entry. If a valid mapping exists, the translation is completed and the resource is accessed, but if there is not a valid mapping, the translation fails and the process is blocked from accessing the PIM resource.

Type: Grant

Filed: December 20, 2021

Date of Patent: June 25, 2024

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Sooraj Puthoor, Muhammad Amber Hassaan, Ashwin Aji, Michael L. Chu, Nuwan Jayasena
ALLOCATION OF RESOURCES WHEN PROCESSING AT MEMORY LEVEL THROUGH MEMORY REQUEST SCHEDULING

Publication number: 20240160364

Abstract: An apparatus includes a memory controller that includes logic to receive a first memory request having a first request type and a second memory request having a second request type. The apparatus also includes a scheduling unit that includes logic to schedule an order of the first and second memory requests for execution based upon a first parameter value and a second parameter value. The first parameter value corresponds to a utility and energy cost for the first memory request and the second parameter value corresponds to a utility and energy cost for the second memory request.

Type: Application

Filed: November 14, 2022

Publication date: May 16, 2024

Inventors: ALEXANDRU DUTU, NUWAN JAYASENA, YASUKO ECKERT, NITI MADAN, SOORAJ PUTHOOR
Executing Kernel Workgroups Across Multiple Compute Unit Types

Publication number: 20240111591

Abstract: Portions of programs, oftentimes referred to as kernels, are written by programmers to target a particular type of compute unit, such as a central processing unit (CPU) core or a graphics processing unit (GPU) core. When executing a kernel, the kernel is separated into multiple parts referred to as workgroups, and each workgroup is provided to a compute unit for execution. Usage of one type of compute unit is monitored and, in response to the one type of compute unit being idle, one or more workgroups targeting another type of compute unit are executed on the one type of compute unit. For example, usage of CPU cores is monitored, and in response to the CPU cores being idle, one or more workgroups targeting GPU cores are executed on the CPU cores.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Bradford Michael Beckmann, Sooraj Puthoor
Process isolation for a processor-in-memory (“PIM”) device

Patent number: 11934698

Abstract: Process isolation for a PIM device through exclusive locking includes receiving, from a process, a call requesting ownership of a PIM device. The request includes one or more PIM configuration parameters. The exclusive locking technique also includes granting the process ownership of the PIM device responsive to determining that ownership is available. The PIM device is configured according to the PIM configuration parameters.

Type: Grant

Filed: December 20, 2021

Date of Patent: March 19, 2024

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Sooraj Puthoor, Muhammad Amber Hassaan, Ashwin Aji, Michael L. Chu, Nuwan Jayasena
Partition and isolation of a processing-in-memory (PIM) device

Patent number: 11934827

Abstract: An apparatus that manages multi-process execution in a processing-in-memory (“PIM”) device includes a gatekeeper configured to: receive an identification of one or more registered PIM processes; receive, from a process, a memory request that includes a PIM command; if the requesting process is a registered PIM process and another registered PIM process is active on the PIM device, perform a context switch of PIM state between the registered PIM processes; and issue the PIM command of the requesting process to the PIM device.

Type: Grant

Filed: December 20, 2021

Date of Patent: March 19, 2024

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Sooraj Puthoor, Muhammad Amber Hassaan, Ashwin Aji, Michael L. Chu, Nuwan Jayasena
Implementing heterogeneous wavefronts on a graphics processing unit (GPU)

Patent number: 11875425

Abstract: Implementing heterogeneous wavefronts on a graphics processing unit (GPU) is disclosed. A scheduler assigns heterogeneous wavefronts for execution on a compute unit of a processing device. The heterogeneous wavefronts include different types of wavefronts such as vector compute wavefronts and service-level wavefronts that vary in resource requirements and instruction sets. As one example, heterogeneous wavefronts may include scalar wavefronts and vector compute wavefronts that execute on scalar units and vector units, respectively. Distinct sets of instructions are executed for the heterogeneous wavefronts on the compute unit. Heterogeneous wavefronts are processed in the same pipeline of the processing device.

Type: Grant

Filed: December 28, 2020

Date of Patent: January 16, 2024

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Sooraj Puthoor, Bradford Beckmann, Nuwan Jayasena, Anthony Gutierrez
Hardware assisted fine-grained data movement

Patent number: 11868809

Abstract: A processor includes a task scheduling unit and a compute unit coupled to the task scheduling unit. The task scheduling unit performs a task dependency assessment of a task dependency graph and task data requirements that correspond to each task of the plurality of tasks. Based on the task dependency assessment, the task scheduling unit schedules a first task of the plurality of tasks and a second proxy object of a plurality of proxy objects specified by the task data requirements such that a memory transfer of the second proxy object of the plurality of proxy objects occurs while the first task is being executed.

Type: Grant

Filed: January 11, 2023

Date of Patent: January 9, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Muhammad Amber Hassaan, Anirudh Mohan Kaushik, Sooraj Puthoor, Gokul Subramanian Ravi, Bradford Beckmann, Ashwin Aji
Hardware assisted fine-grained data movement

Patent number: 11734059

Abstract: A processor includes a task scheduling unit and a compute unit coupled to the task scheduling unit. The task scheduling unit performs a task dependency assessment of a task dependency graph and task data requirements that correspond to each task of the plurality of tasks. Based on the task dependency assessment, the task scheduling unit schedules a first task of the plurality of tasks and a second proxy object of a plurality of proxy objects specified by the task data requirements such that a memory transfer of the second proxy object of the plurality of proxy objects occurs while the first task is being executed.

Type: Grant

Filed: March 19, 2020

Date of Patent: August 22, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Muhammad Amber Hassaan, Anirudh Mohan Kaushik, Sooraj Puthoor, Gokul Subramanian Ravi, Bradford Beckmann, Ashwin Aji
Dynamic kernel memory space allocation

Patent number: 11720993

Abstract: A processing unit includes one or more processor cores and a set of registers to store configuration information for the processing unit. The processing unit also includes a coprocessor configured to receive a request to modify a memory allocation for a kernel concurrently with the kernel executing on the at least one processor core. The coprocessor is configured to modify the memory allocation by modifying the configuration information stored in the set of registers. In some cases, initial configuration information is provided to the set of registers by a different processing unit. The initial configuration information is stored in the set of registers prior to the coprocessor modifying the configuration information.

Type: Grant

Filed: September 21, 2018

Date of Patent: August 8, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Anthony Gutierrez, Muhammad Amber Hassaan, Sooraj Puthoor

1 2 3 4 next