Patents by Inventor James A. Valerio
James A. Valerio has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240362741
Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a processing array including multiple compute blocks, each compute block including multiple processing clusters, and a thread dispatch unit to dispatch threads of a workload to the multiple compute blocks based on a parallelism metric, wherein the thread dispatch unit, based on the parallelism metric, is to perform one of a first operation and a second operation, the first operation to distribute threads across the multiple compute blocks and the second operation to concentrate threads within one of the multiple compute blocks.
Type: Application
Filed: May 13, 2024
Publication date: October 31, 2024
Applicant: Intel Corporation
Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
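For illustration only, a minimal C++ sketch of the dispatch policy this abstract describes: a parallelism metric selects between distributing threads across compute blocks and concentrating them in one block. The threshold and all identifiers are hypothetical, not taken from the patent.

```cpp
// Minimal sketch (not Intel's implementation): pick between spreading a
// workload's threads across compute blocks or packing them into one block,
// driven by a parallelism metric. Threshold and names are hypothetical.
#include <cstdio>
#include <vector>

struct ComputeBlock { std::vector<int> threads; };

void dispatch(std::vector<ComputeBlock>& blocks, int num_threads, double parallelism_metric) {
    const double kThreshold = 0.5;  // hypothetical cutoff
    if (parallelism_metric >= kThreshold) {
        // First operation: distribute threads round-robin across all compute blocks.
        for (int t = 0; t < num_threads; ++t)
            blocks[static_cast<size_t>(t) % blocks.size()].threads.push_back(t);
    } else {
        // Second operation: concentrate threads within a single compute block.
        for (int t = 0; t < num_threads; ++t)
            blocks[0].threads.push_back(t);
    }
}

int main() {
    std::vector<ComputeBlock> blocks(4);
    dispatch(blocks, 16, 0.8);
    for (size_t b = 0; b < blocks.size(); ++b)
        std::printf("block %zu: %zu threads\n", b, blocks[b].threads.size());
}
```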
-
Patent number: 11995737
Abstract: Thread dispatch circuitry is configured to dispatch threads of a two-dimensional (2D) thread group based on data access locality associated with the threads. The thread dispatch circuitry can dispatch a first 2D sub-group of the 2D thread group to a compute block of the multiple compute blocks, the first 2D sub-group associated with a first 2D tile of memory, and dispatch a second 2D sub-group of the 2D thread group to the compute block of the multiple compute blocks, the second 2D sub-group associated with a second 2D tile of memory.
Type: Grant
Filed: November 16, 2021
Date of Patent: May 28, 2024
Assignee: Intel Corporation
Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
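A minimal sketch of the locality-aware idea, under the assumption that one 2D sub-group covers one 2D memory tile and that all threads of a sub-group land on the same compute block; the group, tile, and block sizes are made up for the example.

```cpp
// Minimal sketch (assumed behavior, not the patented circuitry): split a 2D
// thread group into 2D sub-groups, map each sub-group to one 2D memory tile,
// and send every thread of a sub-group to the same compute block.
#include <cstdio>

int main() {
    const int group_w = 8, group_h = 8;   // 2D thread group dimensions
    const int tile_w = 4, tile_h = 4;     // 2D tile of memory covered by one sub-group
    const int num_blocks = 4;

    for (int y = 0; y < group_h; ++y) {
        for (int x = 0; x < group_w; ++x) {
            int tile_x = x / tile_w, tile_y = y / tile_h;
            int subgroup = tile_y * (group_w / tile_w) + tile_x;  // one sub-group per tile
            int block = subgroup % num_blocks;                    // sub-group stays on one block
            std::printf("thread(%d,%d) -> sub-group %d -> compute block %d\n",
                        x, y, subgroup, block);
        }
    }
}
```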
-
Patent number: 11508338
Abstract: A mechanism is described for facilitating the use of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.
Type: Grant
Filed: October 5, 2020
Date of Patent: November 22, 2022
Assignee: Intel Corporation
Inventors: Joydeep Ray, Altug Koker, Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, James A. Valerio, Prasoonkumar Surti, Abhishek R. Appu, Vasanth Ranganathan, Kalyan K. Bhiravabhatla, Arthur D. Hunter, Jr., Wei-Yu Chen, Subramaniam M. Maiyuran
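A rough software analogy of the SLM spill/fill idea: a per-thread slice of shared local memory is reserved, and register values are spilled into it and filled back on demand. The reservation layout and sizes are assumptions for illustration.

```cpp
// Minimal sketch, assuming a software-managed view of the idea: reserve a
// slice of shared local memory (SLM) per thread and spill/fill "register"
// values to it. The layout and sizes here are hypothetical.
#include <array>
#include <cassert>
#include <cstdio>

constexpr int kThreads = 4;
constexpr int kSpillSlotsPerThread = 8;
// Reserved SLM region: one spill area per thread.
std::array<int, kThreads * kSpillSlotsPerThread> slm{};

void spill(int thread, int slot, int reg_value) {   // register -> SLM
    assert(slot < kSpillSlotsPerThread);
    slm[thread * kSpillSlotsPerThread + slot] = reg_value;
}

int fill(int thread, int slot) {                     // SLM -> register
    assert(slot < kSpillSlotsPerThread);
    return slm[thread * kSpillSlotsPerThread + slot];
}

int main() {
    spill(/*thread=*/2, /*slot=*/0, 42);   // thread 2 runs out of registers, spills a value
    std::printf("filled back: %d\n", fill(2, 0));
}
```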
-
Patent number: 11494232
Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating converting thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.
Type: Grant
Filed: November 24, 2020
Date of Patent: November 8, 2022
Assignee: Intel Corporation
Inventors: Altug Koker, Joydeep Ray, Balaji Vembu, James A. Valerio, Abhishek R. Appu
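A minimal sketch of the general technique of emulating a hardware barrier with a memory-based software barrier, written with ordinary C++ atomics rather than GPU thread scheduling; the counter-and-generation scheme is a common pattern, not necessarily the one the patent uses.

```cpp
// Minimal sketch: a software barrier built from shared-memory atomics, standing
// in for a hardware barrier. Plain C++ threads, not GPU hardware.
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

struct SoftwareBarrier {
    explicit SoftwareBarrier(int n) : count(n), waiting(0), generation(0) {}
    void wait() {
        int gen = generation.load();
        if (waiting.fetch_add(1) + 1 == count) {   // last arrival releases everyone
            waiting.store(0);
            generation.fetch_add(1);
        } else {
            while (generation.load() == gen) std::this_thread::yield();
        }
    }
    int count;
    std::atomic<int> waiting, generation;
};

int main() {
    SoftwareBarrier barrier(4);
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i)
        workers.emplace_back([&, i] {
            std::printf("thread %d before barrier\n", i);
            barrier.wait();                         // all threads rendezvous here
            std::printf("thread %d after barrier\n", i);
        });
    for (auto& w : workers) w.join();
}
```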
-
Patent number: 11436695
Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module to enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space is shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.
Type: Grant
Filed: March 12, 2021
Date of Patent: September 6, 2022
Assignee: Intel Corporation
Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
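A toy software model of the coherency idea: two caches over one shared virtual address space, where a write through either cache invalidates the stale copy in the other. The invalidate-on-write protocol here is a simplification for illustration, not the patented coherency module.

```cpp
// Toy model (software simulation, not the hardware coherency module): two
// caches share one address space; a write through one cache invalidates the
// other cache's copy so both always observe the same data.
#include <cstdio>
#include <unordered_map>

struct Memory { std::unordered_map<int, int> data; };

struct CoherentCache {
    std::unordered_map<int, int> lines;
    Memory* mem;
    CoherentCache* peer = nullptr;   // the other cache sharing the address space

    int read(int addr) {
        auto it = lines.find(addr);
        if (it == lines.end())
            it = lines.emplace(addr, mem->data[addr]).first;  // miss: fill from memory
        return it->second;
    }
    void write(int addr, int value) {
        lines[addr] = value;
        mem->data[addr] = value;
        if (peer) peer->lines.erase(addr);   // coherency action: invalidate peer's copy
    }
};

int main() {
    Memory mem;
    CoherentCache cpu{{}, &mem}, gpu{{}, &mem};
    cpu.peer = &gpu;
    gpu.peer = &cpu;

    gpu.write(0x100, 7);                              // GPU compute block updates shared data
    std::printf("CPU sees %d\n", cpu.read(0x100));    // CPU reads the coherent value
}
```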
-
Patent number: 11354768
Abstract: An apparatus to facilitate data intelligent dispatching is disclosed. The apparatus includes one or more processing units including a plurality of execution units (EUs) to execute a plurality of processing threads, collection logic to collect statistics data for threads executed at the processing unit during execution of an application, and dispatch logic to dispatch the threads to be executed at a subset of the plurality of EUs during a subsequent execution of the application based on the statistics data.
Type: Grant
Filed: February 14, 2020
Date of Patent: June 7, 2022
Assignee: Intel Corporation
Inventors: Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, Subramaniam M. Maiyuran, Abhishek R. Appu, Joydeep Ray, Altug Koker, James A. Valerio, Eric J. Hoekstra, Arthur D. Hunter, Jr.
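A minimal sketch of the statistics-driven policy, assuming the collected statistic is threads retired per EU and that the subset is chosen with a simple cutoff; both assumptions are illustrative only.

```cpp
// Minimal sketch (hypothetical policy): record per-EU statistics from one run
// of an application, then dispatch the next run's threads only to the subset
// of EUs that did useful work.
#include <cstdio>
#include <vector>

int main() {
    // Statistics collected during the first execution: threads retired per EU.
    std::vector<int> threads_retired = {120, 0, 95, 4, 130, 0, 88, 2};

    // Build the subset of EUs worth targeting next time (hypothetical cutoff).
    std::vector<int> active_subset;
    for (int eu = 0; eu < static_cast<int>(threads_retired.size()); ++eu)
        if (threads_retired[eu] > 10) active_subset.push_back(eu);

    // Subsequent execution: dispatch threads round-robin over that subset only.
    for (int t = 0; t < 8; ++t)
        std::printf("thread %d -> EU %d\n", t,
                    active_subset[static_cast<size_t>(t) % active_subset.size()]);
}
```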
-
Publication number: 20220156875
Abstract: Thread dispatch circuitry is configured to dispatch threads of a two-dimensional (2D) thread group based on data access locality associated with the threads. The thread dispatch circuitry can dispatch a first 2D sub-group of the 2D thread group to a compute block of the multiple compute blocks, the first 2D sub-group associated with a first 2D tile of memory, and dispatch a second 2D sub-group of the 2D thread group to the compute block of the multiple compute blocks, the second 2D sub-group associated with a second 2D tile of memory.
Type: Application
Filed: November 16, 2021
Publication date: May 19, 2022
Applicant: Intel Corporation
Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
-
Patent number: 11263152
Abstract: A hybrid hierarchical cache is implemented at the same level in the access pipeline, to get the faster access behavior of a smaller cache and, at the same time, a higher hit rate at lower power for a larger cache, in some embodiments. A split cache at the same level in the access pipeline includes two caches that work together. In the hybrid, split, low-level cache (e.g., L1), evictions are coordinated locally between the two L1 portions, and on a miss to both L1 portions, a line is allocated from a larger L2 cache to the smallest L1 cache.
Type: Grant
Filed: May 22, 2020
Date of Patent: March 1, 2022
Assignee: Intel Corporation
Inventors: Abhishek R. Appu, Joydeep Ray, James A. Valerio, Altug Koker, Prasoonkumar Surti, Balaji Vembu, Wenyin Fu, Bhushan M. Borole, Kamal Sinha
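A small software model of the split-cache behavior the abstract outlines: two L1 portions are probed at the same level, and a miss in both fills the line into the smaller portion. Capacities and the LRU policy are guesses for illustration.

```cpp
// Minimal sketch of the split-cache idea as a software model: a small fast L1
// portion and a larger L1 portion are probed at the same level; a miss in both
// allocates the line (from L2) into the smallest portion. Sizes are made up.
#include <algorithm>
#include <cstdio>
#include <deque>

struct Portion {
    std::deque<int> lines;   // most recently used at the front
    size_t capacity;
    bool hit(int addr) { return std::find(lines.begin(), lines.end(), addr) != lines.end(); }
    void allocate(int addr) {
        lines.push_front(addr);
        if (lines.size() > capacity) lines.pop_back();   // eviction handled locally
    }
};

int access(Portion& small_l1, Portion& big_l1, int addr) {
    if (small_l1.hit(addr) || big_l1.hit(addr)) return 1;  // hit in either portion
    small_l1.allocate(addr);   // miss in both: fill from L2 into the smallest L1
    return 0;
}

int main() {
    Portion small_l1{{}, 2}, big_l1{{}, 8};
    int hits = 0;
    for (int addr : {1, 2, 1, 3, 4, 2, 1}) hits += access(small_l1, big_l1, addr);
    std::printf("hits: %d\n", hits);
}
```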
-
Patent number: 11227360
Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a processing array including multiple compute blocks, each compute block including multiple processing clusters, and a thread dispatch unit to dispatch threads of a workload to the multiple compute blocks based on a parallelism metric, wherein the thread dispatch unit, based on the parallelism metric, is to perform one of a first operation and a second operation, the first operation to distribute threads across the multiple compute blocks and the second operation to concentrate threads within one of the multiple compute blocks.
Type: Grant
Filed: December 16, 2019
Date of Patent: January 18, 2022
Assignee: Intel Corporation
Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
-
Patent number: 11176083
Abstract: A shared local memory data crossbar may be implemented in multiple stages. With this approach, the number of multiplexer cells can be reduced by fifty percent (50%) or more in some embodiments.
Type: Grant
Filed: April 14, 2020
Date of Patent: November 16, 2021
Assignee: Intel Corporation
Inventors: Joydeep Ray, James A. Valerio, Altug Koker, Abhishek R. Appu, Vasanth Ranganathan
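A back-of-the-envelope illustration (not the patent's circuit) of why staging a crossbar can cut multiplexer cells: counting 2:1 mux cells for a single-stage N x N crossbar versus two stages built from sqrt(N)-sized crossbars.

```cpp
// My own counting illustration, not the patented design: compare 2:1-mux cell
// counts for a single-stage N x N crossbar against a two-stage version built
// from sqrt(N)-sized crossbars.
#include <cmath>
#include <cstdio>

// An M-to-1 selector costs (M - 1) two-input mux cells; an N x N crossbar
// needs one such selector per output.
long crossbar_cells(long n) { return n * (n - 1); }

int main() {
    long n = 64;                                        // lanes of the SLM crossbar (assumed)
    long g = static_cast<long>(std::sqrt(static_cast<double>(n)));  // per-stage crossbar size

    long single_stage = crossbar_cells(n);
    // Two stages of g x g crossbars, n/g crossbars per stage.
    long two_stage = 2 * (n / g) * crossbar_cells(g);

    std::printf("single stage: %ld mux cells\n", single_stage);
    std::printf("two stage   : %ld mux cells (%.0f%% reduction)\n",
                two_stage, 100.0 * (single_stage - two_stage) / single_stage);
}
```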
-
Publication number: 20210272231
Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module to enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space is shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.
Type: Application
Filed: March 12, 2021
Publication date: September 2, 2021
Applicant: Intel Corporation
Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
-
Publication number: 20210149724
Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating converting thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.
Type: Application
Filed: November 24, 2020
Publication date: May 20, 2021
Applicant: Intel Corporation
Inventors: Altug Koker, Joydeep Ray, Balaji Vembu, James A. Valerio, Abhishek R. Appu
-
Publication number: 20210125581
Abstract: A mechanism is described for facilitating the use of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.
Type: Application
Filed: October 5, 2020
Publication date: April 29, 2021
Applicant: Intel Corporation
Inventors: Joydeep Ray, Altug Koker, Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, James A. Valerio, Prasoonkumar Surti, Abhishek R. Appu, Vasanth Ranganathan, Kalyan K. Bhiravabhatla, Arthur D. Hunter, Jr., Wei-Yu Chen, Subramaniam M. Maiyuran
-
Patent number: 10949945
Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module to enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space is shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.
Type: Grant
Filed: May 11, 2020
Date of Patent: March 16, 2021
Assignee: Intel Corporation
Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
-
Patent number: 10909037
Abstract: A mechanism is described for facilitating memory address compression at computing devices. A method of embodiments, as described herein, includes coalescing slot addresses across multiple messages received from an execution unit, where the slot addresses are coalesced in groups based on memory cacheline addresses such that the slot addresses in each group have a memory cacheline address in common. The method may further include outputting the memory cacheline addresses.
Type: Grant
Filed: April 21, 2017
Date of Patent: February 2, 2021
Assignee: Intel Corporation
Inventors: Joydeep Ray, Abhishek R. Appu, Altug Koker, James A. Valerio, Prasoonkumar Surti
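A minimal sketch of the coalescing step under the assumption of 64-byte cachelines: slot addresses gathered from several messages are grouped by the cacheline address they share, and one cacheline address is emitted per group instead of one per slot.

```cpp
// Minimal sketch (assumed semantics, 64-byte lines are an assumption):
// coalesce per-slot addresses into groups sharing a cacheline address,
// then output one cacheline address per group.
#include <cstdint>
#include <cstdio>
#include <map>
#include <vector>

int main() {
    constexpr uint64_t kLineBytes = 64;
    // Slot addresses gathered across multiple execution-unit messages.
    std::vector<uint64_t> slot_addresses = {0x1000, 0x1008, 0x1030, 0x2040, 0x2078, 0x1010};

    // Group slots by the cacheline address they have in common.
    std::map<uint64_t, std::vector<uint64_t>> groups;
    for (uint64_t addr : slot_addresses)
        groups[addr / kLineBytes * kLineBytes].push_back(addr);

    // Output one memory cacheline address per group instead of one per slot.
    for (const auto& [line, slots] : groups)
        std::printf("cacheline 0x%llx covers %zu slot addresses\n",
                    static_cast<unsigned long long>(line), slots.size());
}
```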
-
Patent number: 10884932
Abstract: A mechanism is described for facilitating independent and separate entity-based graphics caches at computing devices. A method of embodiments, as described herein, includes facilitating hosting of a plurality of caches at a plurality of entities associated with a graphics processor, wherein each entity hosts at least one cache, and wherein an entity includes a dual sub-slice (DSS) or a streaming multiprocessor (SM).
Type: Grant
Filed: April 25, 2019
Date of Patent: January 5, 2021
Assignee: Intel Corporation
Inventors: Altug Koker, Joydeep Ray, James A. Valerio, Abhishek R. Appu, Vasanth Ranganathan
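A software analogy of the entity-based arrangement: each entity (e.g., a DSS or SM) hosts its own independent cache, and a fill in one entity's cache leaves the others untouched. Names and structures here are illustrative only.

```cpp
// Software analogy (not the hardware design): each graphics-processor entity,
// such as a dual sub-slice (DSS) or streaming multiprocessor (SM), hosts its
// own independent cache; lookups touch only the requesting entity's cache.
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

struct EntityCache {
    std::string entity_name;              // e.g. "DSS0" or "SM0"
    std::unordered_map<int, int> lines;   // this entity's private cache
};

int lookup(EntityCache& entity, int addr, int backing_value) {
    auto it = entity.lines.find(addr);
    if (it != entity.lines.end()) return it->second;  // hit in the entity's own cache
    entity.lines[addr] = backing_value;               // miss: fill locally, other entities unaffected
    return backing_value;
}

int main() {
    std::vector<EntityCache> entities = {{"DSS0", {}}, {"DSS1", {}}, {"SM0", {}}};
    lookup(entities[0], 0x40, 11);   // fills only DSS0's cache
    std::printf("DSS1 lines cached: %zu\n", entities[1].lines.size());  // stays 0
}
```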
-
Patent number: 10853132
Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating converting thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.
Type: Grant
Filed: April 9, 2019
Date of Patent: December 1, 2020
Assignee: Intel Corporation
Inventors: Altug Koker, Joydeep Ray, Balaji Vembu, James A. Valerio, Abhishek R. Appu
-
Publication number: 20200349091
Abstract: A hybrid hierarchical cache is implemented at the same level in the access pipeline, to get the faster access behavior of a smaller cache and, at the same time, a higher hit rate at lower power for a larger cache, in some embodiments. A split cache at the same level in the access pipeline includes two caches that work together. In the hybrid, split, low-level cache (e.g., L1), evictions are coordinated locally between the two L1 portions, and on a miss to both L1 portions, a line is allocated from a larger L2 cache to the smallest L1 cache.
Type: Application
Filed: May 22, 2020
Publication date: November 5, 2020
Inventors: Abhishek R. Appu, Joydeep Ray, James A. Valerio, Altug Koker, Prasoonkumar Surti, Balaji Vembu, Wenyin Fu, Bhushan M. Borole, Kamal Sinha
-
Publication number: 20200342564
Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module to enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space is shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.
Type: Application
Filed: May 11, 2020
Publication date: October 29, 2020
Applicant: Intel Corporation
Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
-
Publication number: 20200341942
Abstract: A shared local memory data crossbar may be implemented in multiple stages. With this approach, the number of multiplexer cells can be reduced by fifty percent (50%) or more in some embodiments.
Type: Application
Filed: April 14, 2020
Publication date: October 29, 2020
Inventors: Joydeep Ray, James A. Valerio, Altug Koker, Abhishek R. Appu, Vasanth Ranganathan