Patents by Inventor Hema Chand Nalluri
Hema Chand Nalluri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240095201
Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.
Type: Application
Filed: August 31, 2023
Publication date: March 21, 2024
Applicant: Intel Corporation
Inventors: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala
-
Publication number: 20240054595
Abstract: Embodiments described herein provide a system of concurrent compute queues that enable the scheduling of a large number of compute contexts simultaneously on graphics processor hardware. One embodiment provides an apparatus comprising a system interface and a general-purpose graphics processor coupled with the system interface. The general-purpose graphics processor comprises a plurality of graphics processor hardware resources configured to be partitioned into a plurality of isolated partitions, each of the plurality of isolated partitions including a first command streamer, a second command streamer, and circuitry configured to schedule general-purpose graphics compute workloads submitted to a first plurality of command queues associated with the first command streamer and a second plurality of command queues associated with the second command streamer.
Type: Application
Filed: August 10, 2022
Publication date: February 15, 2024
Applicant: Intel Corporation
Inventors: Joydeep Ray, Vasanth Ranganathan, James Valerio, Jeffery S. Boles, Hema Chand Nalluri, Aditya Navale, Ben J. Ashbaugh, Michal Mrozek, Murali Ramadoss, Hong Jiang, Ankur Shah
-
Publication number: 20230297440
Abstract: Described herein is a partitionable graphics processor having a plurality of flexibly partitioned processing resources. One embodiment provides a graphics processor comprising a plurality of processing resources configurable to be flexibly partitioned into a plurality of resource partitions and circuitry to compose multiple graphics processor device partitions from the plurality of resource partitions. The multiple graphics processor device partitions are configurable to be asymmetrically composed of different types of functional units.
Type: Application
Filed: May 27, 2022
Publication date: September 21, 2023
Applicant: Intel Corporation
Inventors: David Cowperthwaite, Kenneth Daxer, Jeffery S. Boles, Hema Chand Nalluri, Aditya Navale, Prasoonkumar Surti, Arthur Hunter, Vasanth Ranganathan, Joydeep Ray, David Puffer, Aravindh Anantaraman, Ankur Shah, Vidhya Krishnan, Kritika Bala
-
Publication number: 20230297526
Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.
Type: Application
Filed: June 3, 2022
Publication date: September 21, 2023
Applicant: Intel Corporation
Inventors: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala
-
Publication number: 20230298125
Abstract: Described herein is a partitionable graphics processor having multiple render front ends. The partitions of the graphics processor maintain render functionality when partitioned and enable fault isolation and independent multi-client rendering.
Type: Application
Filed: May 27, 2022
Publication date: September 21, 2023
Applicant: Intel Corporation
Inventors: Hema Chand Nalluri, Jeffery S. Boles, David Cowperthwaite, Aditya Navale, Prasoonkumar Surti, Arthur Hunter, Vasanth Ranganathan, Joydeep Ray, David Puffer, Ankur Shah, Vidhya Krishnan, Kritika Bala, Aravindh Anantaraman, Michael Apodaca, Kenneth Daxer
-
Publication number: 20230297421
Abstract: Described herein is a partitionable graphics processor having multiple hard partitions with separate software execution and fault domains. One embodiment provides a graphics processor comprising a system interface and a plurality of graphics processing resources coupled with the system interface. The plurality of graphics processing resources is configurable to be partitioned into a plurality of isolated device partitions, each isolated device partition configured for fault isolation and independent concurrent execution of workloads associated with a plurality of clients, and the system interface is configured to present each of the plurality of isolated device partitions as a virtual function.
Type: Application
Filed: May 27, 2022
Publication date: September 21, 2023
Applicant: Intel Corporation
Inventors: David Cowperthwaite, Kenneth Daxer, Aditya Navale, Prasoonkumar Surti, Arthur Hunter, Hema Chand Nalluri, Jeffery S. Boles, Vasanth Ranganathan, Joydeep Ray, David Puffer, Aravindh Anantaraman, Ankur Shah, Vidhya Krishnan, Kritika Bala, Michael Apodaca
-
Patent number: 11748283
Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.
Type: Grant
Filed: June 3, 2022
Date of Patent: September 5, 2023
Assignee: Intel Corporation
Inventors: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala
-
Publication number: 20230094002
Abstract: Dynamic routing of texture-load in graphics processing is described. An example of an apparatus includes a graphics processor including a plurality of processing engines of a class of processing engines of the graphics processor; a set of queues for the plurality of processing engines; and a unified submit port for the plurality of processing engines, wherein the unified submit port is to notify a scheduler regarding availability of slots in the set of queues for receipt of workload contexts; and wherein, upon the unified submit port receiving a workload context for processing by the plurality of processing engines, the unified submit port is to detect an available processing engine of the plurality of processing engines and direct the received context to a slot of the set of queues for processing by the available processing engine.
Type: Application
Filed: September 24, 2021
Publication date: March 30, 2023
Applicant: Intel Corporation
Inventors: Hema Chand Nalluri, Jeffery S. Boles, Joseph Koston, Ankur N. Shah, Vidhya Krishnan, Vasanth Ranganathan, Joydeep Ray, Aditya Navale, Murali Ramadoss, James Valerio
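The submit-port behavior in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the class name `UnifiedSubmitPort`, the engine names, the per-engine slot count, and the "least-loaded engine" selection rule are all hypothetical choices made for illustration.

```python
class UnifiedSubmitPort:
    """Toy model of one submit port fronting several engines of the same class."""

    def __init__(self, engine_names, slots_per_engine=2):
        # One queue of context slots per engine (hypothetical layout).
        self.queues = {name: [] for name in engine_names}
        self.slots_per_engine = slots_per_engine

    def free_slots(self):
        # What the port would report to the scheduler: how many slots are open.
        return sum(self.slots_per_engine - len(q) for q in self.queues.values())

    def submit(self, context):
        # Assumption: "available" means the engine with the fewest queued contexts.
        name = min(self.queues, key=lambda n: len(self.queues[n]))
        if len(self.queues[name]) >= self.slots_per_engine:
            return None  # every slot is occupied; the scheduler must retry later
        self.queues[name].append(context)
        return name
```

In this model the scheduler never picks an engine itself; it only sees the free-slot count and hands contexts to the port, which is the division of labor the abstract describes.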
-
Publication number: 20220413704
Abstract: An apparatus to facilitate a dynamically scalable and partitioned copy engine is disclosed.
Type: Application
Filed: June 25, 2021
Publication date: December 29, 2022
Applicant: Intel Corporation
Inventors: Nilay Mistry, David Puffer, Prasoonkumar Surti, Hema Chand Nalluri
-
Publication number: 20220284539
Abstract: Various embodiments enable loop processing in a command processing block of the graphics hardware. Such hardware may include a processor including a command buffer, and a graphics command parser. The graphics command parser to load graphics commands from the command buffer, parse a first graphics command, store a loop count value associated with the first graphics command, parse a second graphics command and store a loop wrap address based on the second graphics command. The graphics command parser may execute a command sequence identified by the second graphics command, parse a third graphics command, the third graphics command identifying an end of the command sequence, set a new loop count value, and iteratively execute the command sequence using the loop wrap address based on the new loop count value.
Type: Application
Filed: January 18, 2022
Publication date: September 8, 2022
Inventors: Hema Chand Nalluri, Balaji Vembu, Peter Doyle, Michael Apodaca
-
Patent number: 11321262
Abstract: An apparatus to facilitate memory barriers is disclosed. The apparatus comprises an interconnect, a device memory, a plurality of processing resources, coupled to the device memory, to execute a plurality of execution threads as memory data producers and memory data consumers to a device memory and a system memory and fence hardware to generate fence operations to enforce data ordering on memory operations issued to the device memory and a system memory coupled via the interconnect.
Type: Grant
Filed: September 8, 2020
Date of Patent: May 3, 2022
Assignee: Intel Corporation
Inventors: Hema Chand Nalluri, Ankur Shah, Joydeep Ray, Aditya Navale, Altug Koker, Murali Ramadoss, Niranjan L. Cooray, Jeffery S. Boles, Aravindh Anantaraman, David Puffer, James Valerio, Vasanth Ranganathan
-
Patent number: 11288191
Abstract: An apparatus to facilitate memory flushing is disclosed. The apparatus comprises a cache memory, one or more processing resources, tracker hardware to dispatch workloads for execution at the processing resources and to monitor the workloads to track completion of the execution, range based flush (RBF) hardware to process RBF commands and generate a flush indication to flush data from the cache memory and a flush controller to receive the flush indication and perform a flush operation to discard data from the cache memory at an address range provided in the flush indication.
Type: Grant
Filed: December 23, 2020
Date of Patent: March 29, 2022
Assignee: Intel Corporation
Inventors: Hema Chand Nalluri, Aditya Navale, Altug Koker, Brandon Fliflet, Jeffery S. Boles, James Valerio, Vasanth Ranganathan, Anirban Kundu, Pattabhiraman K
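A range-based flush of the kind this abstract describes might be modeled as follows. This is a rough sketch, not the patented design: the class `RangeFlushCache`, the half-open `[start, end)` range, and the rule that a flush is deferred while tracked workloads are still running are all assumptions introduced here for illustration.

```python
class RangeFlushCache:
    """Toy cache that discards only the lines inside a requested address range."""

    def __init__(self):
        self.lines = {}       # address -> cached data
        self.pending = set()  # ids of dispatched workloads not yet completed

    def dispatch(self, workload_id):
        # Tracker role: remember that this workload is in flight.
        self.pending.add(workload_id)

    def complete(self, workload_id):
        # Tracker role: record completion of the workload.
        self.pending.discard(workload_id)

    def flush_range(self, start, end):
        # Assumption: the flush waits until all tracked workloads have finished.
        if self.pending:
            return None
        # Flush-controller role: discard lines in [start, end) and report them.
        victims = sorted(a for a in self.lines if start <= a < end)
        for address in victims:
            del self.lines[address]
        return victims
```

The point of flushing by range is visible in the model: lines outside the range stay cached, so unrelated data does not have to be refetched after the flush.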
-
Patent number: 11281837
Abstract: In accordance with embodiments disclosed herein, there is provided systems and methods for router-based transaction routing for toggle reduction. An integrated circuit includes a transmitter circuit, receiver circuits, and a multicast bus coupled between the transmitter circuit and the receiver circuits. The multicast bus includes a first flow router circuit to route a multicast signal to a first receiver circuit of the plurality of receiver circuits and not route the multicast signal to a second receiver circuit of the plurality of receiver circuits.
Type: Grant
Filed: December 18, 2017
Date of Patent: March 22, 2022
Assignee: Intel Corporation
Inventors: Hema Chand Nalluri, Balaji Vembu, Santosh Tripathy, Altug Koker, Pattabhiraman K
-
Publication number: 20220075746
Abstract: An apparatus to facilitate memory barriers is disclosed. The apparatus comprises an interconnect, a device memory, a plurality of processing resources, coupled to the device memory, to execute a plurality of execution threads as memory data producers and memory data consumers to a device memory and a system memory and fence hardware to generate fence operations to enforce data ordering on memory operations issued to the device memory and a system memory coupled via the interconnect.
Type: Application
Filed: September 8, 2020
Publication date: March 10, 2022
Applicant: Intel Corporation
Inventors: Hema Chand Nalluri, Ankur Shah, Joydeep Ray, Aditya Navale, Altug Koker, Murali Ramadoss, Niranjan L. Cooray, Jeffery S. Boles, Aravindh Anantaraman, David Puffer, James Valerio, Vasanth Ranganathan
-
Patent number: 11232531
Abstract: Various embodiments enable loop processing in a command processing block of the graphics hardware. Such hardware may include a processor including a command buffer, and a graphics command parser. The graphics command parser to load graphics commands from the command buffer, parse a first graphics command, store a loop count value associated with the first graphics command, parse a second graphics command and store a loop wrap address based on the second graphics command. The graphics command parser may execute a command sequence identified by the second graphics command, parse a third graphics command, the third graphics command identifying an end of the command sequence, set a new loop count value, and iteratively execute the command sequence using the loop wrap address based on the new loop count value.
Type: Grant
Filed: August 29, 2017
Date of Patent: January 25, 2022
Assignee: Intel Corporation
Inventors: Hema Chand Nalluri, Balaji Vembu, Peter Doyle, Michael Apodaca
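The three-command loop scheme in this abstract can be sketched as a tiny interpreter. The opcode names `LOOP_COUNT`, `LOOP_BEGIN`, and `LOOP_END`, the tuple encoding of commands, and the decrement-by-one update of the loop count are hypothetical; the abstract only says that a new loop count is set and that iteration uses the stored wrap address.

```python
def run_command_buffer(commands):
    """Interpret a list of (opcode, operand) tuples and return the commands executed."""
    executed = []
    loop_count = 0
    wrap_address = None  # index of the first command in the loop body
    pc = 0               # parser position in the command buffer
    while pc < len(commands):
        opcode, operand = commands[pc]
        if opcode == "LOOP_COUNT":
            # First command: store the loop count value.
            loop_count = operand
        elif opcode == "LOOP_BEGIN":
            # Second command: store the loop wrap address (start of the sequence).
            wrap_address = pc + 1
        elif opcode == "LOOP_END":
            # Third command: set a new loop count, then wrap if iterations remain.
            loop_count -= 1
            if loop_count > 0 and wrap_address is not None:
                pc = wrap_address
                continue
        else:
            executed.append((opcode, operand))
        pc += 1
    return executed
```

Because the parser itself jumps back to the wrap address, the command sequence is written to the buffer once rather than being unrolled by software for every iteration.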
-
Patent number: 10613972
Abstract: Graphics processing systems and methods are described. For example, one embodiment of a graphics processing apparatus comprises a graphics processing unit (GPU), the GPU including an on-die cache and a cache configuration circuitry to dynamically configure the on-die cache for a plurality of contexts executed by the GPU. The cache configuration block is to receive a cache configuration request, the cache configuration request including context-specific cache requirements for a new context, and determine a priority associated with the context-specific cache requirements. The CCB can compare the context-specific cache requirements with pre-existing cache requirements based on the priority, and reallocate the cache based on the context-specific cache requirements and the priority.
Type: Grant
Filed: December 29, 2017
Date of Patent: April 7, 2020
Assignee: Intel Corporation
Inventors: Hema Chand Nalluri, Balaji Vembu, Pattabhiraman K, Altug Koker
-
Publication number: 20190066255
Abstract: Various embodiments enable loop processing in a command processing block of the graphics hardware. Such hardware may include a processor including a command buffer, and a graphics command parser. The graphics command parser to load graphics commands from the command buffer, parse a first graphics command, store a loop count value associated with the first graphics command, parse a second graphics command and store a loop wrap address based on the second graphics command. The graphics command parser may execute a command sequence identified by the second graphics command, parse a third graphics command, the third graphics command identifying an end of the command sequence, set a new loop count value, and iteratively execute the command sequence using the loop wrap address based on the new loop count value.
Type: Application
Filed: August 29, 2017
Publication date: February 28, 2019
Inventors: Hema Chand Nalluri, Balaji Vembu, Peter Doyle, Michael Apodaca
-
Publication number: 20190034326
Abstract: Graphics processing systems and methods are described. For example, one embodiment of a graphics processing apparatus comprises a graphics processing unit (GPU), the GPU including an on-die cache and a cache configuration circuitry to dynamically configure the on-die cache for a plurality of contexts executed by the GPU. The cache configuration block is to receive a cache configuration request, the cache configuration request including context-specific cache requirements for a new context, and determine a priority associated with the context-specific cache requirements. The CCB can compare the context-specific cache requirements with pre-existing cache requirements based on the priority, and reallocate the cache based on the context-specific cache requirements and the priority.
Type: Application
Filed: December 29, 2017
Publication date: January 31, 2019
Inventors: Hema Chand Nalluri, Balaji Vembu, Pattabhiraman K, Altug Koker
-
Publication number: 20190034576
Abstract: In accordance with embodiments disclosed herein, there is provided systems and methods for router-based transaction routing for toggle reduction. An integrated circuit includes a transmitter circuit, receiver circuits, and a multicast bus coupled between the transmitter circuit and the receiver circuits. The multicast bus includes a first flow router circuit to route a multicast signal to a first receiver circuit of the plurality of receiver circuits and not route the multicast signal to a second receiver circuit of the plurality of receiver circuits.
Type: Application
Filed: December 18, 2017
Publication date: January 31, 2019
Inventors: Hema Chand Nalluri, Balaji Vembu, Santosh Tripathy, Altug Koker, Pattabhiraman K
-
Patent number: 10078879
Abstract: Memory-based semaphores are described that are useful for synchronizing processes between different processing engines. In one example, operations include executing a first process at a first processing engine, the executing including updating a memory register, sending a signal from the first processing engine to a second processing engine that the memory register has been updated, the signal including a memory register address to identify the updated memory register inline data and a dataword, fetching data from the memory register by the second processing engine, comparing the fetched data to the received dataword, and conditionally executing a next command of a second process at the second processing engine based on the comparison.
Type: Grant
Filed: April 22, 2015
Date of Patent: September 18, 2018
Assignee: Intel Corporation
Inventors: Hema Chand Nalluri, Aditya Navale
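The signal-then-compare handshake in this abstract can be modeled in a few lines. This is an illustrative sketch only: the names `SemaphoreSignal` and `Engine`, the use of a dictionary as shared memory, and the choice of an equality test as the comparison are assumptions, since the abstract does not say which comparison is used.

```python
from dataclasses import dataclass, field


@dataclass
class SemaphoreSignal:
    address: int   # identifies the memory register that was updated
    dataword: int  # value the receiving engine compares against


@dataclass
class Engine:
    name: str
    executed: list = field(default_factory=list)

    def produce(self, memory, address, value):
        # First engine: update the memory register, then build the signal.
        memory[address] = value
        return SemaphoreSignal(address=address, dataword=value)

    def on_signal(self, memory, signal, next_command):
        # Second engine: fetch the register and compare it to the dataword.
        fetched = memory.get(signal.address)
        if fetched == signal.dataword:  # assumed comparison: equality
            self.executed.append(next_command)
            return True
        return False  # comparison failed, so the next command is not executed
```

Carrying the address and dataword in the signal means the waiting engine reads memory once when notified instead of polling the register.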