Patents by Inventor James A. Valerio

James A. Valerio has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240362741
    Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a processing array including multiple compute blocks, each compute block including multiple processing clusters, and a thread dispatch unit to dispatch threads of a workload to the multiple compute blocks based on a parallelism metric, wherein the thread dispatch unit, based on the parallelism metric, is to perform one of a first operation and a second operation, the first operation to distribute threads across the multiple compute blocks and the second operation to concentrate threads within one of the multiple compute blocks.
    Type: Application
    Filed: May 13, 2024
    Publication date: October 31, 2024
    Applicant: Intel Corporation
    Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
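    The dispatch policy described in the abstract above can be pictured with a minimal C++ sketch: a high parallelism metric spreads threads round-robin across compute blocks, while a low one packs them into a single block. The threshold value, the round-robin policy, and the data types are assumptions made for this example, not details taken from the publication.
        #include <vector>

        struct ComputeBlock { std::vector<int> threads; };  // thread IDs assigned to this block

        // Dispatch `numThreads` threads across `blocks` based on a parallelism metric.
        // High parallelism -> distribute round-robin; low parallelism -> concentrate in one block.
        void dispatch(std::vector<ComputeBlock>& blocks, int numThreads,
                      double parallelismMetric, double threshold = 0.5) {
            if (parallelismMetric >= threshold) {
                for (int t = 0; t < numThreads; ++t)              // first operation: distribute
                    blocks[t % blocks.size()].threads.push_back(t);
            } else {
                for (int t = 0; t < numThreads; ++t)              // second operation: concentrate
                    blocks[0].threads.push_back(t);
            }
        }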
  • Patent number: 11995737
    Abstract: Thread dispatch circuitry is configured to dispatch threads of a two-dimensional (2D) thread group based on data access locality associated with the threads. The thread dispatch circuitry can dispatch a first 2D sub-group of the 2D thread group to a compute block of the multiple compute blocks, the first 2D sub-group associated with a first 2D tile of memory, and dispatch a second 2D sub-group of the 2D thread group to the compute block of the multiple compute blocks, the second 2D sub-group associated with a second 2D tile of memory.
    Type: Grant
    Filed: November 16, 2021
    Date of Patent: May 28, 2024
    Assignee: Intel Corporation
    Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
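    A rough C++ sketch of locality-aware dispatch of a 2D thread group, assuming each sub-group corresponds to one 2D tile of memory and that threads whose data falls in the same tile should land on the same compute block; the tile dimensions and the tile-to-block mapping are illustrative assumptions, not taken from the patent.
        #include <vector>

        struct Thread2D { int x, y; };                 // thread coordinates within the 2D group

        // Map a 2D thread to a compute block so that threads whose data falls in the
        // same 2D memory tile (tileW x tileH elements) are dispatched together.
        int blockForThread(const Thread2D& t, int tileW, int tileH, int numBlocks) {
            int tileX = t.x / tileW;                   // which tile column the thread's data is in
            int tileY = t.y / tileH;                   // which tile row
            return (tileY * 31 + tileX) % numBlocks;   // same tile -> same compute block
        }

        // Partition a groupW x groupH thread group into per-block 2D sub-groups.
        std::vector<std::vector<Thread2D>> dispatch2D(int groupW, int groupH,
                                                      int tileW, int tileH, int numBlocks) {
            std::vector<std::vector<Thread2D>> perBlock(numBlocks);
            for (int y = 0; y < groupH; ++y)
                for (int x = 0; x < groupW; ++x)
                    perBlock[blockForThread({x, y}, tileW, tileH, numBlocks)].push_back({x, y});
            return perBlock;
        }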
  • Patent number: 11508338
    Abstract: A mechanism is described for facilitating use of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.
    Type: Grant
    Filed: October 5, 2020
    Date of Patent: November 22, 2022
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, James A. Valerio, Prasoonkumar Surti, Abhishek R. Appu, Vasanth Ranganathan, Kalyan K. Bhiravabhatla, Arthur D. Hunter, Jr., Wei-Yu Chen, Subramaniam M. Maiyuran
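    A simplified sketch of register spilling and filling backed by shared local memory, with a plain C++ array standing in for the SLM and a fixed per-thread reserved region; the region size and addressing scheme are assumptions for illustration only.
        #include <array>
        #include <cstdint>

        constexpr int kThreads    = 32;    // threads sharing the SLM
        constexpr int kSpillSlots = 8;     // spill slots reserved per thread
        constexpr int kSlmWords   = 1024;  // total SLM capacity in 32-bit words

        // A reserved window at the start of the SLM holds spill slots for every thread.
        std::array<uint32_t, kSlmWords> slm{};

        // Spill a register value into the calling thread's reserved SLM region.
        void spill(int threadId, int slot, uint32_t regValue) {
            slm[threadId * kSpillSlots + slot] = regValue;
        }

        // Fill (restore) a previously spilled value back into a register.
        uint32_t fill(int threadId, int slot) {
            return slm[threadId * kSpillSlots + slot];
        }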
  • Patent number: 11494232
    Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating conversion of thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: November 8, 2022
    Assignee: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Balaji Vembu, James A. Valerio, Abhishek R. Appu
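    A minimal sketch of the general idea of a barrier held in memory rather than in a hardware barrier resource: threads count arrivals and spin on a shared generation counter. This is a generic counter-and-generation barrier written with C++ atomics, not the patented scheme itself.
        #include <atomic>

        // A software barrier: threads spin on shared state in memory instead of
        // consuming a hardware barrier slot.
        class SoftwareBarrier {
        public:
            explicit SoftwareBarrier(int numThreads) : total_(numThreads) {}

            void wait() {
                int gen = generation_.load(std::memory_order_acquire);
                if (arrived_.fetch_add(1, std::memory_order_acq_rel) + 1 == total_) {
                    arrived_.store(0, std::memory_order_relaxed);        // last thread resets the count
                    generation_.fetch_add(1, std::memory_order_release); // and releases the waiters
                } else {
                    while (generation_.load(std::memory_order_acquire) == gen) { /* spin */ }
                }
            }

        private:
            const int total_;
            std::atomic<int> arrived_{0};
            std::atomic<int> generation_{0};
        };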
  • Patent number: 11436695
    Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module to enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space is shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.
    Type: Grant
    Filed: March 12, 2021
    Date of Patent: September 6, 2022
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
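    A toy C++ sketch of the coherency idea: two caches front the same shared virtual address space, and a write through one cache invalidates the matching line in the other so subsequent reads stay coherent. The 64-byte line size and the write-invalidate, write-through policy are assumptions chosen for brevity.
        #include <cstdint>
        #include <unordered_map>

        // A toy write-invalidate pair of caches over one shared (virtual) address space.
        struct ToyCache { std::unordered_map<uint64_t, uint32_t> lines; };  // line address -> data

        struct CoherencyModule {
            ToyCache gpuCache, cpuCache;
            std::unordered_map<uint64_t, uint32_t> memory;    // backing store, keyed by line address

            static uint64_t lineOf(uint64_t va) { return va >> 6; }   // 64-byte lines

            void write(ToyCache& writer, ToyCache& peer, uint64_t va, uint32_t value) {
                uint64_t line = lineOf(va);
                peer.lines.erase(line);        // invalidate the peer's copy to keep it coherent
                writer.lines[line] = value;    // update the writer's cache
                memory[line] = value;          // write through to memory
            }

            uint32_t read(ToyCache& reader, uint64_t va) {
                uint64_t line = lineOf(va);
                auto it = reader.lines.find(line);
                if (it != reader.lines.end()) return it->second;   // hit
                uint32_t v = memory[line];                         // miss: fetch from memory
                reader.lines[line] = v;
                return v;
            }
        };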
  • Patent number: 11354768
    Abstract: An apparatus to facilitate data intelligent dispatching is disclosed. The apparatus includes one or more processing units including a plurality of execution units (EUs) to execute a plurality of processing threads, collection logic to collect statistics data for threads executed at the processing unit during execution of an application, and dispatch logic to dispatch the threads to be executed at a subset of the plurality of EUs during a subsequent execution of the application based on the statistics data.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: June 7, 2022
    Assignee: Intel Corporation
    Inventors: Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, Subramaniam M. Maiyuran, Abhishek R. Appu, Joydeep Ray, Altug Koker, James A. Valerio, Eric J. Hoekstra, Arthur D. Hunter, Jr.
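    A small sketch of the profile-then-dispatch flow the abstract outlines: statistics gathered per execution unit during one run of an application select a subset of EUs for the next run. The statistic used (threads retired) and the busiest-first selection rule are illustrative assumptions.
        #include <algorithm>
        #include <cstddef>
        #include <vector>

        // Statistics collected per execution unit (EU) during one run of the application.
        struct EuStats { int euId; long threadsRetired; };

        // Pick the `subsetSize` busiest EUs from the previous run; a later run's threads
        // are dispatched only to this subset.
        std::vector<int> chooseEuSubset(std::vector<EuStats> stats, std::size_t subsetSize) {
            std::sort(stats.begin(), stats.end(),
                      [](const EuStats& a, const EuStats& b) { return a.threadsRetired > b.threadsRetired; });
            std::vector<int> subset;
            for (std::size_t i = 0; i < subsetSize && i < stats.size(); ++i)
                subset.push_back(stats[i].euId);
            return subset;
        }

        // Dispatch: thread t goes to one of the chosen EUs, round-robin.
        int dispatchThread(int t, const std::vector<int>& euSubset) {
            return euSubset[t % euSubset.size()];
        }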
  • Publication number: 20220156875
    Abstract: Thread dispatch circuitry is configured to dispatch threads of a two-dimensional (2D) thread group based on data access locality associated with the threads. The thread dispatch circuitry can dispatch a first 2D sub-group of the 2D thread group to a compute block of the multiple compute blocks, the first 2D sub-group associated with a first 2D tile of memory, and dispatch a second 2D sub-group of the 2D thread group to the compute block of the multiple compute blocks, the second 2D sub-group associated with a second 2D tile of memory.
    Type: Application
    Filed: November 16, 2021
    Publication date: May 19, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
  • Patent number: 11263152
    Abstract: A hybrid hierarchical cache is implemented at the same level in the access pipeline, to get the faster access behavior of a smaller cache and, at the same time, the higher hit rate at lower power of a larger cache, in some embodiments. A split cache at the same level in the access pipeline includes two caches that work together. In the hybrid, split, low-level cache (e.g., L1), evictions are coordinated locally between the two L1 portions, and on a miss to both L1 portions, a line is allocated from a larger L2 cache to the smaller L1 cache.
    Type: Grant
    Filed: May 22, 2020
    Date of Patent: March 1, 2022
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Joydeep Ray, James A. Valerio, Altug Koker, Prasoonkumar Surti, Balaji Vembu, Wenyin Fu, Bhushan M. Borole, Kamal Sinha
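    A condensed sketch of the split-L1 lookup path: a small L1 portion and a larger L1 portion are probed at the same level, and only when both miss is the line brought in from L2 and allocated into the small L1. The capacities and the placeholder eviction policy are simplifying assumptions.
        #include <cstddef>
        #include <cstdint>
        #include <unordered_set>

        // Two L1 portions probed at the same pipeline level, backed by a larger L2.
        struct SplitL1 {
            std::unordered_set<uint64_t> smallL1, bigL1, l2;   // sets of resident line addresses
            std::size_t smallCap = 64;

            // Returns true on an L1 hit (either portion). On a miss to both portions,
            // the line is allocated from L2 into the small L1.
            bool access(uint64_t line) {
                if (smallL1.count(line) || bigL1.count(line)) return true;  // hit in either portion
                l2.insert(line);                      // ensure the line exists at L2 (fill from memory)
                if (smallL1.size() >= smallCap)
                    smallL1.erase(smallL1.begin());   // evict something (placeholder policy)
                smallL1.insert(line);                 // allocate the missing line into the small L1
                return false;
            }
        };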
  • Patent number: 11227360
    Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a processing array including multiple compute blocks, each compute block including multiple processing clusters, and a thread dispatch unit to dispatch threads of a workload to the multiple compute blocks based on a parallelism metric, wherein the thread dispatch unit, based on the parallelism metric, is to perform one of a first operation and a second operation, the first operation to distribute threads across the multiple compute blocks and the second operation to concentrate threads within one of the multiple compute blocks.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: January 18, 2022
    Assignee: Intel Corporation
    Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
  • Patent number: 11176083
    Abstract: A shared local memory data crossbar may be implemented in multiple stages. With this approach, the number of multiplexer cells can be reduced by fifty percent (50%) or more in some embodiments.
    Type: Grant
    Filed: April 14, 2020
    Date of Patent: November 16, 2021
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, James A. Valerio, Altug Koker, Abhishek R. Appu, Vasanth Ranganathan
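    A back-of-the-envelope illustration of why staging reduces multiplexer count, assuming the crossbar is built from 2:1 mux cells; the 16-lane width is an arbitrary example size, and the two-stage arrangement shown (eight 4x4 crossbars) is one possible topology, not necessarily the patented one.
        #include <cstdio>

        // 2:1 mux cells needed for one N:1 selector.
        int muxCellsPerOutput(int n) { return n - 1; }

        int main() {
            const int lanes = 16;                                // illustrative crossbar width
            // Single-stage NxN crossbar: N outputs, each an N:1 selector.
            int singleStage = lanes * muxCellsPerOutput(lanes);  // 16 * 15 = 240 cells
            // Two-stage version: 4 input-side 4x4 crossbars feeding 4 output-side 4x4 crossbars.
            int small = 4 * muxCellsPerOutput(4);                // one 4x4 crossbar = 12 cells
            int twoStage = 8 * small;                            // 8 small crossbars = 96 cells
            std::printf("single-stage: %d cells, two-stage: %d cells (%.0f%% fewer)\n",
                        singleStage, twoStage,
                        100.0 * (singleStage - twoStage) / singleStage);
        }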
  • Publication number: 20210272231
    Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module to enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space is shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.
    Type: Application
    Filed: March 12, 2021
    Publication date: September 2, 2021
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
  • Publication number: 20210149724
    Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating conversion of thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 20, 2021
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Balaji Vembu, James A. Valerio, Abhishek R. Appu
  • Publication number: 20210125581
    Abstract: A mechanism is described for facilitating use of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.
    Type: Application
    Filed: October 5, 2020
    Publication date: April 29, 2021
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, James A. Valerio, Prasoonkumar Surti, Abhishek R. Appu, Vasanth Ranganathan, Kalyan K. Bhiravabhatla, Arthur D. Hunter, Jr., Wei-Yu Chen, Subramaniam M. Maiyuran
  • Patent number: 10949945
    Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module to enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space is shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: March 16, 2021
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
  • Patent number: 10909037
    Abstract: A mechanism is described for facilitating memory address compression at computing devices. A method of embodiments, as described herein, includes coalescing slot addresses across multiple messages received from an execution unit, where the slot addresses are coalesced in groups based on memory cacheline addresses such that the slot addresses in a group have a memory cacheline address in common. The method may further include outputting the memory cacheline addresses.
    Type: Grant
    Filed: April 21, 2017
    Date of Patent: February 2, 2021
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Abhishek R. Appu, Altug Koker, James A. Valerio, Prasoonkumar Surti
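    A short sketch of the coalescing step the abstract describes: slot addresses gathered from multiple messages are grouped by the cacheline they fall into, so each memory cacheline address is emitted once alongside the offsets that recover the original slots. The 64-byte line size and the output format are assumptions made for the example.
        #include <cstdint>
        #include <map>
        #include <vector>

        struct CoalescedGroup {
            uint64_t cachelineAddr;              // shared cacheline address for the group
            std::vector<uint8_t> slotOffsets;    // byte offsets of the original slot addresses
        };

        // Coalesce per-slot addresses (possibly spanning multiple messages) into groups
        // that share a 64-byte cacheline, emitting each cacheline address only once.
        std::vector<CoalescedGroup> coalesce(const std::vector<uint64_t>& slotAddrs) {
            std::map<uint64_t, std::vector<uint8_t>> byLine;
            for (uint64_t a : slotAddrs)
                byLine[a >> 6].push_back(static_cast<uint8_t>(a & 63));
            std::vector<CoalescedGroup> out;
            for (const auto& [line, offsets] : byLine)
                out.push_back({line << 6, offsets});   // cacheline base address + slot offsets
            return out;
        }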
  • Patent number: 10884932
    Abstract: A mechanism is described for facilitating independent and separate entity-based graphics caches at computing devices. A method of embodiments, as described herein, includes facilitating hosting of a plurality of caches at a plurality of entities associated with a graphics processor, wherein each entity hosts at least one cache, and wherein an entity includes a dual sub-slice (DSS) or a streaming multiprocessor (SM).
    Type: Grant
    Filed: April 25, 2019
    Date of Patent: January 5, 2021
    Assignee: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, James A. Valerio, Abhishek R. Appu, Vasanth Ranganathan
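    A brief sketch of per-entity caching, where each entity (for example a DSS or SM) owns an independent cache and lookups are directed to the owning entity's cache; the data types and the miss handling left to the caller are placeholder assumptions.
        #include <cstdint>
        #include <unordered_map>
        #include <vector>

        // Each entity hosts its own independent cache rather than sharing one.
        struct EntityCache { std::unordered_map<uint64_t, uint32_t> lines; };

        struct PerEntityCaches {
            std::vector<EntityCache> caches;                  // one cache per entity
            explicit PerEntityCaches(int numEntities) : caches(numEntities) {}

            uint32_t* lookup(int entityId, uint64_t lineAddr) {
                auto& c = caches[entityId].lines;
                auto it = c.find(lineAddr);
                return it == c.end() ? nullptr : &it->second; // miss handling left to the caller
            }
            void fill(int entityId, uint64_t lineAddr, uint32_t data) {
                caches[entityId].lines[lineAddr] = data;
            }
        };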
  • Patent number: 10853132
    Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating conversion of thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.
    Type: Grant
    Filed: April 9, 2019
    Date of Patent: December 1, 2020
    Assignee: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Balaji Vembu, James A. Valerio, Abhishek R. Appu
  • Publication number: 20200349091
    Abstract: A hybrid hierarchical cache is implemented at the same level in the access pipeline, to get the faster access behavior of a smaller cache and, at the same time, the higher hit rate at lower power of a larger cache, in some embodiments. A split cache at the same level in the access pipeline includes two caches that work together. In the hybrid, split, low-level cache (e.g., L1), evictions are coordinated locally between the two L1 portions, and on a miss to both L1 portions, a line is allocated from a larger L2 cache to the smaller L1 cache.
    Type: Application
    Filed: May 22, 2020
    Publication date: November 5, 2020
    Inventors: Abhishek R. Appu, Joydeep Ray, James A. Valerio, Altug Koker, Prasoonkumar Surti, Balaji Vembu, Wenyin Fu, Bhushan M. Borole, Kamal Sinha
  • Publication number: 20200342564
    Abstract: One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module to enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space is shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.
    Type: Application
    Filed: May 11, 2020
    Publication date: October 29, 2020
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, James A. Valerio, David Puffer, Abhishek R. Appu, Stephen Junkins
  • Publication number: 20200341942
    Abstract: A shared local memory data crossbar may be implemented in multiple stages. With this approach, the number of multiplexer cells can be reduced by fifty percent (50%) or more in some embodiments.
    Type: Application
    Filed: April 14, 2020
    Publication date: October 29, 2020
    Inventors: Joydeep Ray, James A. Valerio, Altug Koker, Abhishek R. Appu, Vasanth Ranganathan