Patents by Inventor James A. Valerio

James A. Valerio has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

THREAD SCHEDULING OVER COMPUTE BLOCKS FOR POWER OPTIMIZATION

Publication number: 20240362741

Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a processing array including multiple compute blocks, each compute block including multiple processing clusters and a thread dispatch unit to dispatch threads of a workload to the multiple compute blocks based on a parallelism metric, wherein the thread dispatch unit, based on the parallelism metric, is to perform one of a first operation and a second operation, the first operation to distribute threads across the multiple compute blocks and the second operation is to concentrate threads within one of the multiple compute blocks.

Type: Application

Filed: May 13, 2024

Publication date: October 31, 2024

Applicant: Intel Corporation

Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
Thread scheduling over compute blocks for power optimization

Patent number: 11995737

Abstract: Thread dispatch circuitry is configured to dispatch threads of a two-dimensional (2D) thread group based on data access locality associated with the threads. The thread dispatch circuitry can dispatch a first 2D sub-group of the 2D thread group to a compute block of the multiple compute blocks, the first 2D sub-group associated with a first 2D tile of memory and dispatch a second 2D sub-group of the 2D thread group to the compute block of the multiple compute blocks, the second 2D sub-group associated with a second 2D tile of memory.

Type: Grant

Filed: November 16, 2021

Date of Patent: May 28, 2024

Assignee: Intel Corporation

Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
Register spill/fill using shared local memory space

Patent number: 11508338

Abstract: A mechanism is described for facilitating using of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.

Type: Grant

Filed: October 5, 2020

Date of Patent: November 22, 2022

Assignee: Intel Corporation

Inventors: Joydeep Ray, Altug Koker, Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, James A. Valerio, Prasoonkumar Surti, Abhishek R. Appu, Vasanth Ranganathan, Kalyan K. Bhiravabhatla, Arthur D. Hunter, Jr., Wei-Yu Chen, Subramaniam M. Maiyuran
Memory-based software barriers

Patent number: 11494232

Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating converting thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.

Type: Grant

Filed: November 24, 2020

Date of Patent: November 8, 2022

Assignee: Intel Corporation

Inventors: Altug Koker, Joydeep Ray, Balaji Vembu, James A. Valerio, Abhishek R. Appu
Coarse grain coherency

Patent number: 11436695

Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.

Type: Grant

Filed: March 12, 2021

Date of Patent: September 6, 2022

Assignee: Intel Corporation

Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
Intelligent graphics dispatching mechanism

Patent number: 11354768

Abstract: An apparatus to facilitate data intelligent dispatching is disclosed. The apparatus includes one or more processing units including a plurality of execution units (EUs) to execute a plurality of processing threads and collection logic to collect statistics data for threads executed at the processing unit during execution of an application, and dispatch logic to dispatch the threads to be executed at a subset of the plurality of EUs during a subsequent execution of the application based on the statistics data.

Type: Grant

Filed: February 14, 2020

Date of Patent: June 7, 2022

Assignee: Intel Corporation

Inventors: Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, Subramaniam M. Maiyuran, Abhishek R. Appu, Joydeep Ray, Altug Koker, James A. Valerio, Eric J. Hoekstra, Arthur D. Hunter, Jr.
THREAD SCHEDULING OVER COMPUTE BLOCKS FOR POWER OPTIMIZATION

Publication number: 20220156875

Abstract: Thread dispatch circuitry is configured to dispatch threads of a two-dimensional (2D) thread group based on data access locality associated with the threads. The thread dispatch circuitry can dispatch a first 2D sub-group of the 2D thread group to a compute block of the multiple compute blocks, the first 2D sub-group associated with a first 2D tile of memory and dispatch a second 2D sub-group of the 2D thread group to the compute block of the multiple compute blocks, the second 2D sub-group associated with a second 2D tile of memory.

Type: Application

Filed: November 16, 2021

Publication date: May 19, 2022

Applicant: Intel Corporation

Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
Replacement policies for a hybrid hierarchical cache

Patent number: 11263152

Abstract: A hybrid hierarchical cache is implemented at the same level in the access pipeline, to get the faster access behavior of a smaller cache and, at the same time, a higher hit rate at lower power for a larger cache, in some embodiments. A split cache at the same level in the access pipeline includes two caches that work together. In the hybrid, split, low level cache (e.g., L1) evictions are coordinated locally between the two L1 portions, and on a miss to both L1 portions, a line is allocated from a larger L2 cache to the smallest L1 cache.

Type: Grant

Filed: May 22, 2020

Date of Patent: March 1, 2022

Assignee: Intel Corporation

Inventors: Abhishek R. Appu, Joydeep Ray, James A. Valerio, Altug Koker, Prasoonkumar Surti, Balaji Vembu, Wenyin Fu, Bhushan M. Borole, Kamal Sinha
Thread scheduling over compute blocks for power optimization

Patent number: 11227360

Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a processing array including multiple compute blocks, each compute block including multiple processing clusters and a thread dispatch unit to dispatch threads of a workload to the multiple compute blocks based on a parallelism metric, wherein the thread dispatch unit, based on the parallelism metric, is to perform one of a first operation and a second operation, the first operation to distribute threads across the multiple compute blocks and the second operation is to concentrate threads within one of the multiple compute blocks.

Type: Grant

Filed: December 16, 2019

Date of Patent: January 18, 2022

Assignee: Intel Corporation

Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
Switching crossbar for graphics pipeline

Patent number: 11176083

Abstract: A shared local memory data crossbar may be implemented in multiple stages. With this approach, the number of multiplexer cells can be reduced by fifty percent (50%) or more in some embodiments.

Type: Grant

Filed: April 14, 2020

Date of Patent: November 16, 2021

Assignee: Intel Corporation

Inventors: Joydeep Ray, James A. Valerio, Altug Koker, Abhishek R. Appu, Vasanth Ranganathan
COARSE GRAIN COHERENCY

Publication number: 20210272231

Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.

Type: Application

Filed: March 12, 2021

Publication date: September 2, 2021

Applicant: Intel Corporation

Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
MEMORY-BASED SOFTWARE BARRIERS

Publication number: 20210149724

Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating converting thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.

Type: Application

Filed: November 24, 2020

Publication date: May 20, 2021

Applicant: Intel Corporation

Inventors: Altug Koker, Joydeep Ray, Balaji Vembu, James A. Valerio, Abhishek R. Appu
REGISTER SPILL/FILL USING SHARED LOCAL MEMORY SPACE

Publication number: 20210125581

Abstract: A mechanism is described for facilitating using of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.

Type: Application

Filed: October 5, 2020

Publication date: April 29, 2021

Applicant: Intel Corporation

Inventors: Joydeep Ray, Altug Koker, Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, James A. Valerio, Prasoonkumar Surti, Abhishek R. Appu, Vasanth Ranganathan, Kalyan K. Bhiravabhatla, Arthur D. Hunter, JR., Wei-Yu Chen, Subramaniam M. Maiyuran
Coarse grain coherency

Patent number: 10949945

Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.

Type: Grant

Filed: May 11, 2020

Date of Patent: March 16, 2021

Assignee: Intel Corporation

Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
Optimizing memory address compression

Patent number: 10909037

Abstract: A mechanism is described for facilitating memory address compression at computing devices. A method of embodiments, as described herein, includes coalescing slot addresses across multiple messages received from an execution unit, where the slot addresses are coalesced in groups based on memory cacheline addresses such that each of a set of slot addresses in a group have a memory cacheline address in common between them. The method may further include outputting the memory cacheline addresses.

Type: Grant

Filed: April 21, 2017

Date of Patent: February 2, 2021

Assignee: INTEL CORPOR ATION

Inventors: Joydeep Ray, Abhishek R. Appu, Altug Koker, James A. Valerio, Prasoonkumar Surti
Independent and separate entity-based cache

Patent number: 10884932

Abstract: A mechanism is described for facilitating independent and separate entity-based graphics cache at computing devices. A method of embodiments, as described herein, includes facilitate hosting of a plurality of cache at a plurality of entities associated with a graphics processor, wherein each entity hosts at least one cache, and wherein an entity includes a dual sub-slice (DSS) or a streaming multiprocessor (SM).

Type: Grant

Filed: April 25, 2019

Date of Patent: January 5, 2021

Assignee: INTEL CORPORATION

Inventors: Altug Koker, Joydeep Ray, James A. Valerio, Abhishek R. Appu, Vasanth Ranganathan
Memory-based software barriers

Patent number: 10853132

Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating converting thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.

Type: Grant

Filed: April 9, 2019

Date of Patent: December 1, 2020

Assignee: INTEL CORPORATION

Inventors: Altug Koker, Joydeep Ray, Balaji Vembu, James A. Valerio, Abhishek R. Appu
Replacement Policies for a Hybrid Hierarchical Cache

Publication number: 20200349091

Abstract: A hybrid hierarchical cache is implemented at the same level in the access pipeline, to get the faster access behavior of a smaller cache and, at the same time, a higher hit rate at lower power for a larger cache, in some embodiments. A split cache at the same level in the access pipeline includes two caches that work together. In the hybrid, split, low level cache (e.g., L1) evictions are coordinated locally between the two L1 portions, and on a miss to both L1 portions, a line is allocated from a larger L2 cache to the smallest L1 cache.

Type: Application

Filed: May 22, 2020

Publication date: November 5, 2020

Inventors: Abhishek R. Appu, Joydeep Ray, James A. Valerio, Altug Koker, Prasoonkumar Surti, Balaji Vembu, Wenyin Fu, Bhushan M. Borole, Kamal Sinha
COARSE GRAIN COHERENCY

Publication number: 20200342564

Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.

Type: Application

Filed: May 11, 2020

Publication date: October 29, 2020

Applicant: Intel Corporation

Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
Switching Crossbar for Graphics Pipeline

Publication number: 20200341942

Abstract: A shared local memory data crossbar may be implemented in multiple stages. With this approach, the number of multiplexer cells can be reduced by fifty percent (50%) or more in some embodiments.

Type: Application

Filed: April 14, 2020

Publication date: October 29, 2020

Inventors: Joydeep Ray, James A. Valerio, Altug Koker, Abhishek R. Appu, Vasanth Ranganathan

1 2 3 next