Patents by Inventor Hema Chand Nalluri

Hema Chand Nalluri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240095201
    Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.
    Type: Application
    Filed: August 31, 2023
    Publication date: March 21, 2024
    Applicant: Intel Corporation
    Inventors: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala
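The abstract above centers on memory-based interrupt reporting, where the device records interrupt status in memory visible to each software domain rather than signaling every event over a dedicated line. Below is a minimal C sketch of that idea; the per-domain report page, its fields, and the function names are illustrative assumptions, not the patented hardware interface.

```c
/* Minimal model of memory-based interrupt reporting: the device sets a
 * status bit in a per-domain report page that the domain's driver reads
 * and clears. Layout and names are hypothetical illustrations only. */
#include <stdint.h>
#include <stdio.h>

#define MAX_DOMAINS 4

struct irq_report_page {
    volatile uint32_t pending;   /* bitmask of pending interrupt sources */
    volatile uint32_t serviced;  /* bitmask acknowledged by the domain   */
};

static struct irq_report_page report[MAX_DOMAINS];

/* Device side: post an interrupt source for a domain by setting its bit. */
static void device_post_interrupt(int domain, unsigned source_bit)
{
    report[domain].pending |= (1u << source_bit);
}

/* Driver side: read-and-clear the pending mask for this domain. */
static uint32_t driver_poll_interrupts(int domain)
{
    uint32_t mask = report[domain].pending;
    report[domain].pending &= ~mask;
    report[domain].serviced |= mask;
    return mask;
}

int main(void)
{
    device_post_interrupt(1, 0);   /* e.g. "work completed" */
    device_post_interrupt(1, 3);   /* e.g. "page fault"     */
    printf("domain 1 pending sources: 0x%x\n", driver_poll_interrupts(1));
    return 0;
}
```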
  • Publication number: 20240054595
    Abstract: Embodiments described herein provide a system of concurrent compute queues that enable the scheduling of a large number of compute contexts simultaneously on graphics processor hardware. One embodiment provides an apparatus comprising a system interface and a general-purpose graphics processor coupled with the system interface. The general-purpose graphics processor comprises a plurality of graphics processor hardware resources configured to be partitioned into a plurality of isolated partitions, each of the plurality of isolated partitions including a first command streamer, a second command streamer, and circuitry configured to schedule general-purpose graphics compute workloads submitted to a first plurality of command queues associated with the first command streamer and a second plurality of command queues associated with the second command streamer.
    Type: Application
    Filed: August 10, 2022
    Publication date: February 15, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Vasanth Ranganathan, James Valerio, Jeffery S. Boles, Hema Chand Nalluri, Aditya Navale, Ben J. Ashbaugh, Michal Mrozek, Murali Ramadoss, Hong Jiang, Ankur Shah
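The queue topology in the abstract above (per-partition command streamers, each fronting several command queues from which compute contexts are scheduled) can be pictured with a small C model. The structure, queue counts, and the shallowest-queue scheduling policy below are assumptions made for illustration, not details taken from the filing.

```c
/* Sketch of per-partition concurrent compute queues: each partition owns
 * two command streamers, each with several command queues. Counts and the
 * scheduling policy are illustrative assumptions. */
#include <stdio.h>

#define QUEUES_PER_STREAMER 4

struct command_queue {
    int depth;   /* number of queued compute contexts */
};

struct command_streamer {
    struct command_queue queues[QUEUES_PER_STREAMER];
};

struct partition {
    struct command_streamer streamer[2];   /* first and second streamer */
};

/* Stand-in for the scheduling circuitry: pick the shallowest queue
 * across both streamers of one partition. */
static struct command_queue *pick_queue(struct partition *p)
{
    struct command_queue *best = &p->streamer[0].queues[0];
    for (int s = 0; s < 2; s++)
        for (int q = 0; q < QUEUES_PER_STREAMER; q++)
            if (p->streamer[s].queues[q].depth < best->depth)
                best = &p->streamer[s].queues[q];
    return best;
}

int main(void)
{
    struct partition part = {0};
    for (int i = 0; i < 10; i++)      /* submit ten compute contexts */
        pick_queue(&part)->depth++;
    printf("queue 0 of streamer 0 now holds %d contexts\n",
           part.streamer[0].queues[0].depth);
    return 0;
}
```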
  • Publication number: 20230297440
    Abstract: Described herein is a partitionable graphics processor having a plurality of flexibly partitioned processing resources. One embodiment provides a graphics processor comprising a plurality of processing resources configurable to be flexibly partitioned into a plurality of resource partitions and circuitry to compose multiple graphics processor device partitions from the plurality of resource partitions. The multiple graphics processor device partitions are configurable to be asymmetrically composed of different types of functional units.
    Type: Application
    Filed: May 27, 2022
    Publication date: September 21, 2023
    Applicant: Intel Corporation
    Inventors: David Cowperthwaite, Kenneth Daxer, Jeffery S. Boles, Hema Chand Nalluri, Aditya Navale, Prasoonkumar Surti, Arthur Hunter, Vasanth Ranganathan, Joydeep Ray, David Puffer, Aravindh Anantaraman, Ankur Shah, Vidhya Krishnan, Kritika Bala
  • Publication number: 20230297526
    Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.
    Type: Application
    Filed: June 3, 2022
    Publication date: September 21, 2023
    Applicant: Intel Corporation
    Inventors: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala
  • Publication number: 20230298125
    Abstract: Described herein is a partitionable graphics processor having multiple render front ends. The partitions of the graphics processor maintain render functionality when partitioned and enable fault isolation and independent multi-client rendering.
    Type: Application
    Filed: May 27, 2022
    Publication date: September 21, 2023
    Applicant: Intel Corporation
    Inventors: Hema Chand Nalluri, Jeffery S. Boles, David Cowperthwaite, Aditya Navale, Prasoonkumar Surti, Arthur Hunter, Vasanth Ranganathan, Joydeep Ray, David Puffer, Ankur Shah, Vidhya Krishnan, Kritika Bala, Aravindh Anantaraman, Michael Apodaca, Kenneth Daxer
  • Publication number: 20230297421
    Abstract: Described herein is a partitionable graphics processor having multiple hard partitions with separate software execution and fault domains. One embodiment provides a graphics processor comprising a system interface and a plurality of graphics processing resources coupled with the system interface. The plurality of graphics processing resources is configurable to be partitioned into a plurality of isolated device partitions, each isolated device partition configured for fault isolation and independent concurrent execution of workloads associated with a plurality of clients, and the system interface is configured to present each of the plurality of isolated device partitions as a virtual function.
    Type: Application
    Filed: May 27, 2022
    Publication date: September 21, 2023
    Applicant: Intel Corporation
    Inventors: David Cowperthwaite, Kenneth Daxer, Aditya Navale, Prasoonkumar Surti, Arthur Hunter, Hema Chand Nalluri, Jeffery S. Boles, Vasanth Ranganathan, Joydeep Ray, David Puffer, Aravindh Anantaraman, Ankur Shah, Vidhya Krishnan, Kritika Bala, Michael Apodaca
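The distinctive point in the abstract above is that each isolated device partition is presented through the system interface as its own virtual function, so a fault stays contained to one partition. The short C model below illustrates that mapping; the partition count, VF numbering, and field names are assumptions for illustration only.

```c
/* Illustrative model of exposing each isolated device partition as a
 * virtual function (VF) with its own fault state. Numbering and fields
 * are assumptions, not the actual interface. */
#include <stdbool.h>
#include <stdio.h>

#define NUM_PARTITIONS 4

struct device_partition {
    int  vf_number;   /* virtual function the partition is exposed as */
    bool faulted;     /* fault state is isolated to this partition    */
};

int main(void)
{
    struct device_partition part[NUM_PARTITIONS];

    /* System interface enumerates one VF per isolated partition. */
    for (int i = 0; i < NUM_PARTITIONS; i++) {
        part[i].vf_number = i;       /* numbering is illustrative */
        part[i].faulted = false;
    }

    part[2].faulted = true;          /* a fault in partition 2 ... */

    for (int i = 0; i < NUM_PARTITIONS; i++)
        printf("partition %d -> VF%d, %s\n", i, part[i].vf_number,
               part[i].faulted ? "faulted (isolated)" : "healthy");
    return 0;
}
```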
  • Patent number: 11748283
    Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.
    Type: Grant
    Filed: June 3, 2022
    Date of Patent: September 5, 2023
    Assignee: Intel Corporation
    Inventors: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala
  • Publication number: 20230094002
    Abstract: Dynamic routing of texture loads in graphics processing is described. An example of an apparatus includes a graphics processor including a plurality of processing engines of a class of processing engines of the graphics processor; a set of queues for the plurality of processing engines; and a unified submit port for the plurality of processing engines, wherein the unified submit port is to notify a scheduler regarding availability of slots in the set of queues for receipt of workload contexts; and wherein, upon the unified submit port receiving a workload context for processing by the plurality of processing engines, the unified submit port is to detect an available processing engine of the plurality of processing engines and direct the received context to a slot of the set of queues for processing by the available processing engine.
    Type: Application
    Filed: September 24, 2021
    Publication date: March 30, 2023
    Applicant: Intel Corporation
    Inventors: Hema Chand Nalluri, Jeffery S. Boles, Joseph Koston, Ankur N. Shah, Vidhya Krishnan, Vasanth Ranganathan, Joydeep Ray, Aditya Navale, Murali Ramadoss, James Valerio
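The abstract above describes a single submit port that advertises free queue slots to a scheduler and steers each incoming workload context to an available engine of the class. A behavioral C sketch follows; the engine count, the busy flag, and the first-idle-engine policy are assumptions for illustration.

```c
/* Behavioral sketch of a unified submit port: report free slots to the
 * scheduler and route each incoming context to the first idle engine.
 * All names and the routing policy are illustrative assumptions. */
#include <stdio.h>

#define NUM_ENGINES 4

struct engine {
    int busy;           /* 1 while a context occupies the engine's slot */
    int last_context;   /* id of the context most recently accepted     */
};

static struct engine engines[NUM_ENGINES];

/* Unified submit port: how many slots the scheduler may fill. */
static int free_slots(void)
{
    int n = 0;
    for (int i = 0; i < NUM_ENGINES; i++)
        n += !engines[i].busy;
    return n;
}

/* Unified submit port: direct an incoming context to an available engine. */
static int submit_context(int context_id)
{
    for (int i = 0; i < NUM_ENGINES; i++) {
        if (!engines[i].busy) {
            engines[i].busy = 1;
            engines[i].last_context = context_id;
            return i;
        }
    }
    return -1;          /* no slot free; the scheduler must retry later */
}

int main(void)
{
    printf("slots advertised to scheduler: %d\n", free_slots());
    printf("context 42 routed to engine %d\n", submit_context(42));
    printf("context 43 routed to engine %d\n", submit_context(43));
    return 0;
}
```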
  • Publication number: 20220413704
    Abstract: An apparatus to facilitate a dynamically scalable and partitioned copy engine is disclosed.
    Type: Application
    Filed: June 25, 2021
    Publication date: December 29, 2022
    Applicant: Intel Corporation
    Inventors: Nilay Mistry, David Puffer, Prasoonkumar Surti, Hema Chand Nalluri
  • Publication number: 20220284539
    Abstract: Various embodiments enable loop processing in a command processing block of the graphics hardware. Such hardware may include a processor including a command buffer and a graphics command parser. The graphics command parser is to load graphics commands from the command buffer, parse a first graphics command, store a loop count value associated with the first graphics command, parse a second graphics command, and store a loop wrap address based on the second graphics command. The graphics command parser may execute a command sequence identified by the second graphics command, parse a third graphics command, the third graphics command identifying an end of the command sequence, set a new loop count value, and iteratively execute the command sequence using the loop wrap address based on the new loop count value.
    Type: Application
    Filed: January 18, 2022
    Publication date: September 8, 2022
    Inventors: Hema Chand Nalluri, Balaji Vembu, Peter Doyle, Michael Apodaca
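The parser behavior described above (a first command carrying a loop count, a second command whose address becomes the loop wrap address, and a third command that ends the sequence and triggers the jump back) maps naturally onto a small interpreter. The C sketch below models it; the opcode names and encoding are invented for illustration and are not the actual command format.

```c
/* Sketch of loop processing in a command parser: store the loop count,
 * record the wrap address at the start of the sequence, and jump back
 * from the end command until the count is exhausted. Opcodes are invented. */
#include <stdio.h>

enum opcode { OP_LOOP_COUNT, OP_LOOP_BEGIN, OP_DRAW, OP_LOOP_END, OP_HALT };

struct command { enum opcode op; int arg; };

static void parse(const struct command *buf)
{
    int pc = 0, loop_count = 0, wrap_addr = 0;

    for (;;) {
        struct command c = buf[pc];
        switch (c.op) {
        case OP_LOOP_COUNT:        /* first command: store the loop count  */
            loop_count = c.arg;
            pc++;
            break;
        case OP_LOOP_BEGIN:        /* second command: store wrap address   */
            wrap_addr = pc + 1;
            pc++;
            break;
        case OP_DRAW:              /* body of the looped command sequence  */
            printf("draw %d (iterations left: %d)\n", c.arg, loop_count);
            pc++;
            break;
        case OP_LOOP_END:          /* third command: iterate or fall through */
            if (--loop_count > 0)
                pc = wrap_addr;
            else
                pc++;
            break;
        case OP_HALT:
            return;
        }
    }
}

int main(void)
{
    const struct command buf[] = {
        { OP_LOOP_COUNT, 3 }, { OP_LOOP_BEGIN, 0 },
        { OP_DRAW, 7 }, { OP_LOOP_END, 0 }, { OP_HALT, 0 },
    };
    parse(buf);   /* executes the body three times */
    return 0;
}
```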
  • Patent number: 11321262
    Abstract: An apparatus to facilitate memory barriers is disclosed. The apparatus comprises an interconnect; a device memory; a plurality of processing resources, coupled to the device memory, to execute a plurality of execution threads as memory data producers and memory data consumers to the device memory and a system memory; and fence hardware to generate fence operations that enforce data ordering on memory operations issued to the device memory and the system memory coupled via the interconnect.
    Type: Grant
    Filed: September 8, 2020
    Date of Patent: May 3, 2022
    Assignee: Intel Corporation
    Inventors: Hema Chand Nalluri, Ankur Shah, Joydeep Ray, Aditya Navale, Altug Koker, Murali Ramadoss, Niranjan L. Cooray, Jeffery S. Boles, Aravindh Anantaraman, David Puffer, James Valerio, Vasanth Ranganathan
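The ordering contract described above (producer writes must become visible before consumers act on them) has a familiar software analogue. The C11 sketch below uses standard thread fences to show that contract; it assumes a platform with C11 thread support and is only an analogy for the hardware fence operations, not a model of the GPU mechanism itself.

```c
/* Software analogy for fence-enforced data ordering: the producer's data
 * write is ordered before its "done" flag, and the consumer orders its
 * flag read before reading the data. Requires C11 <threads.h> support. */
#include <stdatomic.h>
#include <stdio.h>
#include <threads.h>

static int payload;              /* stand-in for data in device/system memory */
static atomic_int ready = 0;

static int producer(void *arg)
{
    (void)arg;
    payload = 42;                                   /* data write            */
    atomic_thread_fence(memory_order_release);      /* fence after the write */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return 0;
}

static int consumer(void *arg)
{
    (void)arg;
    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;                                           /* wait for the flag     */
    atomic_thread_fence(memory_order_acquire);      /* fence before the read */
    printf("consumer observed payload = %d\n", payload);
    return 0;
}

int main(void)
{
    thrd_t p, c;
    thrd_create(&p, producer, NULL);
    thrd_create(&c, consumer, NULL);
    thrd_join(p, NULL);
    thrd_join(c, NULL);
    return 0;
}
```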
  • Patent number: 11288191
    Abstract: An apparatus to facilitate memory flushing is disclosed. The apparatus comprises a cache memory; one or more processing resources; tracker hardware to dispatch workloads for execution at the processing resources and to monitor the workloads to track completion of the execution; range-based flush (RBF) hardware to process RBF commands and generate a flush indication to flush data from the cache memory; and a flush controller to receive the flush indication and perform a flush operation to discard data from the cache memory at an address range provided in the flush indication.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: March 29, 2022
    Assignee: Intel Corporation
    Inventors: Hema Chand Nalluri, Aditya Navale, Altug Koker, Brandon Fliflet, Jeffery S. Boles, James Valerio, Vasanth Ranganathan, Anirban Kundu, Pattabhiraman K
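The flush controller described above discards only the cache data that falls inside the address range carried by the flush indication. A compact C model of that selective discard follows; the cache organisation, line size, and function names are illustrative assumptions.

```c
/* Minimal model of a range-based flush: invalidate every cache line whose
 * address overlaps the requested range. Sizes and names are assumptions. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_LINES 8
#define LINE_SIZE 64u

struct cache_line {
    uint64_t addr;     /* base address the line caches */
    bool     valid;
};

static struct cache_line cache[NUM_LINES];

/* Flush controller: discard lines overlapping [start, end). */
static int range_based_flush(uint64_t start, uint64_t end)
{
    int discarded = 0;
    for (int i = 0; i < NUM_LINES; i++) {
        if (cache[i].valid &&
            cache[i].addr + LINE_SIZE > start && cache[i].addr < end) {
            cache[i].valid = false;
            discarded++;
        }
    }
    return discarded;
}

int main(void)
{
    for (int i = 0; i < NUM_LINES; i++)
        cache[i] = (struct cache_line){ .addr = 0x1000 + i * LINE_SIZE,
                                        .valid = true };
    /* Flush the range covering the middle four lines only. */
    printf("discarded %d lines\n", range_based_flush(0x1080, 0x1180));
    return 0;
}
```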
  • Patent number: 11281837
    Abstract: In accordance with embodiments disclosed herein, systems and methods are provided for router-based transaction routing for toggle reduction. An integrated circuit includes a transmitter circuit, receiver circuits, and a multicast bus coupled between the transmitter circuit and the receiver circuits. The multicast bus includes a first flow router circuit to route a multicast signal to a first receiver circuit of the receiver circuits and not route the multicast signal to a second receiver circuit of the receiver circuits.
    Type: Grant
    Filed: December 18, 2017
    Date of Patent: March 22, 2022
    Assignee: Intel Corporation
    Inventors: Hema Chand Nalluri, Balaji Vembu, Santosh Tripathy, Altug Koker, Pattabhiraman K
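The toggle-reduction idea above is that the flow router drives a multicast transaction only onto branches that lead to a subscribed receiver, so wires toward uninterested receivers never switch. The C sketch below illustrates that selection; the subscription mask, receiver count, and names are assumptions for illustration.

```c
/* Sketch of router-based toggle reduction: forward the multicast payload
 * only to branches with a subscribed receiver. Mask and counts are
 * illustrative assumptions. */
#include <stdint.h>
#include <stdio.h>

#define NUM_RECEIVERS 4

/* Bit i set => receiver i subscribes to this multicast group. */
static const uint8_t group_mask = 0x05;   /* receivers 0 and 2 only */

static void flow_router_forward(uint32_t payload)
{
    for (int i = 0; i < NUM_RECEIVERS; i++) {
        if (group_mask & (1u << i))
            printf("drive branch %d with payload 0x%x\n", i, payload);
        else
            printf("branch %d left idle (no toggling)\n", i);
    }
}

int main(void)
{
    flow_router_forward(0xCAFE);
    return 0;
}
```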
  • Publication number: 20220075746
    Abstract: An apparatus to facilitate memory barriers is disclosed. The apparatus comprises an interconnect; a device memory; a plurality of processing resources, coupled to the device memory, to execute a plurality of execution threads as memory data producers and memory data consumers to the device memory and a system memory; and fence hardware to generate fence operations that enforce data ordering on memory operations issued to the device memory and the system memory coupled via the interconnect.
    Type: Application
    Filed: September 8, 2020
    Publication date: March 10, 2022
    Applicant: Intel Corporation
    Inventors: Hema Chand Nalluri, Ankur Shah, Joydeep Ray, Aditya Navale, Altug Koker, Murali Ramadoss, Niranjan L. Cooray, Jeffery S. Boles, Aravindh Anantaraman, David Puffer, James Valerio, Vasanth Ranganathan
  • Patent number: 11232531
    Abstract: Various embodiments enable loop processing in a command processing block of the graphics hardware. Such hardware may include a processor including a command buffer and a graphics command parser. The graphics command parser is to load graphics commands from the command buffer, parse a first graphics command, store a loop count value associated with the first graphics command, parse a second graphics command, and store a loop wrap address based on the second graphics command. The graphics command parser may execute a command sequence identified by the second graphics command, parse a third graphics command, the third graphics command identifying an end of the command sequence, set a new loop count value, and iteratively execute the command sequence using the loop wrap address based on the new loop count value.
    Type: Grant
    Filed: August 29, 2017
    Date of Patent: January 25, 2022
    Assignee: Intel Corporation
    Inventors: Hema Chand Nalluri, Balaji Vembu, Peter Doyle, Michael Apodaca
  • Patent number: 10613972
    Abstract: Graphics processing systems and methods are described. For example, one embodiment of a graphics processing apparatus comprises a graphics processing unit (GPU), the GPU including an on-die cache and cache configuration circuitry to dynamically configure the on-die cache for a plurality of contexts executed by the GPU. The cache configuration block (CCB) is to receive a cache configuration request, the cache configuration request including context-specific cache requirements for a new context, and determine a priority associated with the context-specific cache requirements. The CCB can compare the context-specific cache requirements with pre-existing cache requirements based on the priority, and reallocate the cache based on the context-specific cache requirements and the priority.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: April 7, 2020
    Assignee: Intel Corporation
    Inventors: Hema Chand Nalluri, Balaji Vembu, Pattabhiraman K, Altug Koker
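The core of the abstract above is a priority comparison: a new context's cache request is granted at the expense of the existing allocation only when its priority is higher. The C sketch below illustrates that decision; the way counts, priority scale, and names are assumptions, not details from the patent.

```c
/* Sketch of priority-based cache reallocation: grant the new context's
 * request outright if it fits, take ways from the current owner only if
 * the new priority is higher. Numbers and names are illustrative. */
#include <stdio.h>

#define TOTAL_WAYS 16

struct cache_alloc {
    int ways;       /* ways currently reserved        */
    int priority;   /* priority of the owning context */
};

static struct cache_alloc current = { .ways = 12, .priority = 2 };

/* Returns the number of ways granted to the new context. */
static int reconfigure_cache(int requested_ways, int priority)
{
    int free_ways = TOTAL_WAYS - current.ways;
    if (requested_ways <= free_ways)
        return requested_ways;                 /* fits without evicting        */
    if (priority > current.priority) {         /* higher priority wins         */
        current.ways = TOTAL_WAYS - requested_ways;
        return requested_ways;
    }
    return free_ways;                          /* lower priority gets the rest */
}

int main(void)
{
    printf("low-priority context granted %d ways\n", reconfigure_cache(8, 1));
    printf("high-priority context granted %d ways\n", reconfigure_cache(8, 3));
    return 0;
}
```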
  • Publication number: 20190066255
    Abstract: Various embodiments enable loop processing in a command processing block of the graphics hardware. Such hardware may include a processor including a command buffer and a graphics command parser. The graphics command parser is to load graphics commands from the command buffer, parse a first graphics command, store a loop count value associated with the first graphics command, parse a second graphics command, and store a loop wrap address based on the second graphics command. The graphics command parser may execute a command sequence identified by the second graphics command, parse a third graphics command, the third graphics command identifying an end of the command sequence, set a new loop count value, and iteratively execute the command sequence using the loop wrap address based on the new loop count value.
    Type: Application
    Filed: August 29, 2017
    Publication date: February 28, 2019
    Inventors: Hema Chand Nalluri, Balaji Vembu, Peter Doyle, Michael Apodaca
  • Publication number: 20190034326
    Abstract: Graphics processing systems and methods are described. For example, one embodiment of a graphics processing apparatus comprises a graphics processing unit (GPU), the GPU including an on-die cache and cache configuration circuitry to dynamically configure the on-die cache for a plurality of contexts executed by the GPU. The cache configuration block (CCB) is to receive a cache configuration request, the cache configuration request including context-specific cache requirements for a new context, and determine a priority associated with the context-specific cache requirements. The CCB can compare the context-specific cache requirements with pre-existing cache requirements based on the priority, and reallocate the cache based on the context-specific cache requirements and the priority.
    Type: Application
    Filed: December 29, 2017
    Publication date: January 31, 2019
    Inventors: Hema Chand Nalluri, Balaji Vembu, Pattabhiraman K, Altug Koker
  • Publication number: 20190034576
    Abstract: In accordance with embodiments disclosed herein, systems and methods are provided for router-based transaction routing for toggle reduction. An integrated circuit includes a transmitter circuit, receiver circuits, and a multicast bus coupled between the transmitter circuit and the receiver circuits. The multicast bus includes a first flow router circuit to route a multicast signal to a first receiver circuit of the receiver circuits and not route the multicast signal to a second receiver circuit of the receiver circuits.
    Type: Application
    Filed: December 18, 2017
    Publication date: January 31, 2019
    Inventors: Hema Chand Nalluri, Balaji Vembu, Santosh Tripathy, Altug Koker, Pattabhiraman K
  • Patent number: 10078879
    Abstract: Memory-based semaphores are described that are useful for synchronizing processes between different processing engines. In one example, operations include executing a first process at a first processing engine, the executing including updating a memory register, sending a signal from the first processing engine to a second processing engine that the memory register has been updated, the signal including a memory register address to identify the updated memory register, inline data, and a dataword, fetching data from the memory register by the second processing engine, comparing the fetched data to the received dataword, and conditionally executing a next command of a second process at the second processing engine based on the comparison.
    Type: Grant
    Filed: April 22, 2015
    Date of Patent: September 18, 2018
    Assignee: Intel Corporation
    Inventors: Hema Chand Nalluri, Aditya Navale
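The handshake described above (one engine updates a memory register and signals a second engine with the register's address and a dataword; the second engine fetches the register, compares it against the dataword, and only then runs its next command) is sketched in C below. The signal layout and the greater-or-equal comparison are illustrative assumptions.

```c
/* Behavioral sketch of a memory-based semaphore between two engines.
 * Names, the signal structure, and the comparison rule are assumptions. */
#include <stdint.h>
#include <stdio.h>

struct semaphore_signal {
    volatile uint32_t *reg_addr;  /* identifies the updated memory register */
    uint32_t           dataword;  /* value the waiting engine compares against */
};

static volatile uint32_t sem_register;

/* First engine: update the memory register, then emit the signal. */
static struct semaphore_signal engine_a_signal(uint32_t new_value)
{
    sem_register = new_value;
    return (struct semaphore_signal){ &sem_register, new_value };
}

/* Second engine: fetch, compare, and conditionally run the next command. */
static void engine_b_wait_and_execute(struct semaphore_signal s)
{
    uint32_t fetched = *s.reg_addr;
    if (fetched >= s.dataword)
        printf("comparison passed (0x%x >= 0x%x): executing next command\n",
               fetched, s.dataword);
    else
        printf("comparison failed: second engine keeps waiting\n");
}

int main(void)
{
    struct semaphore_signal s = engine_a_signal(0x10);
    engine_b_wait_and_execute(s);
    return 0;
}
```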