Patents by Inventor Ajay Simha Modugala

Ajay Simha Modugala has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Compute Kernel Parsing with Limits in one or more Dimensions

Publication number: 20240345892

Abstract: Techniques are disclosed relating to dispatching compute work from a compute stream. In some embodiments, a graphics processor executes instructions of compute kernels. Workload parser circuitry may determine, for distribution to the graphics processor circuitry, a set of workgroups from a compute kernel that includes workgroups organized in multiple dimensions, including a first number of workgroups in a first dimension and a second number of workgroups in a second dimension. This may include determining multiple sub-kernels for the compute kernel, wherein a first sub-kernel includes, in the first dimension, a limited number of workgroups that is smaller than the first number of workgroups. The parser circuitry may iterate through workgroups in both the first and second dimensions to generate the set of workgroups, proceeding through the first sub-kernel before iterating through any of the other sub-kernels. Disclosed techniques may provide desirable shapes for batches of workgroups.

Type: Application

Filed: May 24, 2024

Publication date: October 17, 2024

Inventors: Andrew M. Havlir, Ajay Simha Modugala, Karl D. Mann
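
The iteration order described in the abstract above can be illustrated with a minimal sketch. It is not the patented circuitry: the function name `iterate_workgroups` and the parameters `num_x`, `num_y`, and `limit_x` are hypothetical, and the sketch assumes the limit applies only to the first (x) dimension and that sub-kernels are visited left to right.

```python
from typing import Iterator, Tuple


def iterate_workgroups(num_x: int, num_y: int, limit_x: int) -> Iterator[Tuple[int, int]]:
    """Yield (x, y) workgroup coordinates one sub-kernel at a time.

    Assumption: each sub-kernel spans at most `limit_x` workgroups in x
    and the full `num_y` workgroups in y.
    """
    if limit_x <= 0:
        raise ValueError("limit_x must be positive")
    # Split the x dimension into sub-kernels no wider than limit_x.
    for start_x in range(0, num_x, limit_x):
        end_x = min(start_x + limit_x, num_x)
        # Finish this sub-kernel in both dimensions before moving on.
        for y in range(num_y):
            for x in range(start_x, end_x):
                yield (x, y)


if __name__ == "__main__":
    # A 4 x 2 kernel with an x limit of 2 yields two 2 x 2 sub-kernels.
    print(list(iterate_workgroups(num_x=4, num_y=2, limit_x=2)))
```

With these inputs, the first four coordinates all fall in columns 0 and 1, so consecutive workgroups form a compact 2 x 2 block rather than a single long row.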

Compute kernel parsing with limits in one or more dimensions with iterating through workgroups in the one or more dimensions for execution

Patent number: 12020075

Abstract: Techniques are disclosed relating to dispatching compute work from a compute stream. In some embodiments, a graphics processor executes instructions of compute kernels. Workload parser circuitry may determine, for distribution to the graphics processor circuitry, a set of workgroups from a compute kernel that includes workgroups organized in multiple dimensions, including a first number of workgroups in a first dimension and a second number of workgroups in a second dimension. This may include determining multiple sub-kernels for the compute kernel, wherein a first sub-kernel includes, in the first dimension, a limited number of workgroups that is smaller than the first number of workgroups. The parser circuitry may iterate through workgroups in both the first and second dimensions to generate the set of workgroups, proceeding through the first sub-kernel before iterating through any of the other sub-kernels. Disclosed techniques may provide desirable shapes for batches of workgroups.

Type: Grant

Filed: September 11, 2020

Date of Patent: June 25, 2024

Assignee: Apple Inc.

Inventors: Andrew M. Havlir, Ajay Simha Modugala, Karl D. Mann

Affinity-based Graphics Scheduling

Publication number: 20230047481

Abstract: Techniques are disclosed relating to affinity-based scheduling of graphics work. In disclosed embodiments, first and second groups of graphics processor sub-units may share respective first and second caches. Distribution circuitry may receive a software-specified set of graphics work and a software-indicated mapping of portions of the set of graphics work to groups of graphics processor sub-units. The distribution circuitry may assign subsets of the set of graphics work based on the mapping. This may improve cache efficiency, in some embodiments, by allowing graphics work that accesses the same memory areas to be assigned to the same group of sub-units that share a cache.

Type: Application

Filed: August 11, 2021

Publication date: February 16, 2023

Inventors: Andrew M. Havlir, Ajay Simha Modugala, Benjamin Bowman, Yunjun Zhang
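
The mapping-driven assignment described in the abstract above can be sketched in software as follows. This is only an illustration under stated assumptions: `assign_by_affinity`, `work_items`, and `affinity_map` are hypothetical names, the groups are modeled as integer indices, and the round-robin fallback for unmapped work is an invention of this sketch, not something the abstract states.

```python
from typing import Dict, Hashable, List, Sequence


def assign_by_affinity(
    work_items: Sequence[Hashable],
    affinity_map: Dict[Hashable, int],
    num_groups: int,
) -> Dict[int, List[Hashable]]:
    """Assign each work item to the sub-unit group named by the mapping."""
    assignments: Dict[int, List[Hashable]] = {g: [] for g in range(num_groups)}
    for index, item in enumerate(work_items):
        # Assumption: unmapped items fall back to round-robin placement.
        group = affinity_map.get(item, index % num_groups)
        if not 0 <= group < num_groups:
            raise ValueError(f"group {group} out of range for item {item!r}")
        assignments[group].append(item)
    return assignments


if __name__ == "__main__":
    work = ["tile0", "tile1", "tile2", "tile3"]
    # Tiles that touch the same memory region share a group, and so a cache.
    mapping = {"tile0": 0, "tile1": 0, "tile2": 1, "tile3": 1}
    print(assign_by_affinity(work, mapping, num_groups=2))
```

Keeping items with the same affinity in one group models the cache-sharing benefit the abstract mentions: work that reads the same memory lands on sub-units behind the same cache.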

Compute Kernel Parsing with Limits in one or more Dimensions

Publication number: 20220083377

Abstract: Techniques are disclosed relating to dispatching compute work from a compute stream. In some embodiments, a graphics processor executes instructions of compute kernels. Workload parser circuitry may determine, for distribution to the graphics processor circuitry, a set of workgroups from a compute kernel that includes workgroups organized in multiple dimensions, including a first number of workgroups in a first dimension and a second number of workgroups in a second dimension. This may include determining multiple sub-kernels for the compute kernel, wherein a first sub-kernel includes, in the first dimension, a limited number of workgroups that is smaller than the first number of workgroups. The parser circuitry may iterate through workgroups in both the first and second dimensions to generate the set of workgroups, proceeding through the first sub-kernel before iterating through any of the other sub-kernels. Disclosed techniques may provide desirable shapes for batches of workgroups.

Type: Application

Filed: September 11, 2020

Publication date: March 17, 2022

Inventors: Andrew M. Havlir, Ajay Simha Modugala, Karl D. Mann

Completion signaling techniques in distributed processor

Patent number: 11250538

Abstract: Techniques are disclosed relating to tracking compute workgroup completions in a distributed processor. In some embodiments, an apparatus includes a plurality of shader processors configured to perform operations for compute workgroups included in compute kernels, a master workload parser circuit, a plurality of distributed workload parser circuits, and a communications fabric connected to the plurality of distributed workload parser circuits and the master workload parser circuit.

Type: Grant

Filed: March 9, 2020

Date of Patent: February 15, 2022

Assignee: Apple Inc.

Inventors: Andrew M. Havlir, Ajay Simha Modugala

Completion Signaling Techniques in Distributed Processor

Publication number: 20210279832

Abstract: Techniques are disclosed relating to tracking compute workgroup completions in a distributed processor. In some embodiments, an apparatus includes a plurality of shader processors configured to perform operations for compute workgroups included in compute kernels, a master workload parser circuit, a plurality of distributed workload parser circuits, and a communications fabric connected to the plurality of distributed workload parser circuits and the master workload parser circuit.

Type: Application

Filed: March 9, 2020

Publication date: September 9, 2021

Inventors: Andrew M. Havlir, Ajay Simha Modugala
