Patents by Inventor Prasoonkumar Surti
Prasoonkumar Surti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12293431Abstract: Embodiments described herein include, software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. Embodiment described herein provided techniques to detect zero value elements within a vector or a set of packed data elements output by a processing resource and generate metadata to indicate a location of the zero value elements within the plurality of data elements.Type: GrantFiled: May 2, 2023Date of Patent: May 6, 2025Assignee: Intel CorporationInventors: Joydeep Ray, Scott Janus, Varghese George, Subramaniam Maiyuran, Altug Koker, Abhishek Appu, Prasoonkumar Surti, Vasanth Ranganathan, Valentin Andrei, Ashutosh Garg, Yoav Harel, Arthur Hunter, Jr., SungYe Kim, Mike Macpherson, Elmoustapha Ould-Ahmed-Vall, William Sadler, Lakshminarayanan Striramassarma, Vikranth Vemulapalli
-
Patent number: 12293430Abstract: Methods, systems and apparatuses provide for graphics processor technology that routes untyped unordered access view (UAV) messages to a next level memory cache, routes typed UAV messages and render target messages to a pixel pipeline, and processes, via the pixel pipeline, the typed UAV messages. The technology can also provide for the pixel pipeline to perform a format conversion of one or more pixels associated with a typed UAV message based on a surface format of a UAV resource, calculate a memory address for each pixel associated with the typed UAV message, and collect a plurality of fragments from processed typed UAV messages.Type: GrantFiled: June 23, 2021Date of Patent: May 6, 2025Assignee: Intel CorporationInventors: Jay Jardosh, Prasoonkumar Surti, Abhishek R. Appu
-
Patent number: 12288283Abstract: Methods, systems and apparatuses may provide for technology that determines that a state of a plurality of primitives is associated with out-of-order execution. The plurality of primitives is associated with a raster order. The technology reorders the plurality of primitives from a raster order, and distributes one or more of pixel processing operations or rasterization operations associated with the plurality of primitives to load balance across one or more of a plurality of execution units of a graphics processor or a graphics pipeline of the graphics processor.Type: GrantFiled: June 24, 2021Date of Patent: April 29, 2025Assignee: Intel CorporationInventors: Prasoonkumar Surti, Jorge Garcia Pabon, John Gierach
-
Patent number: 12288287Abstract: An embodiment of an electronic processing system may include an application processor, persistent storage media communicatively coupled to the application processor, and a graphics subsystem communicatively coupled to the application processor. The graphics subsystem may include a first graphics engine to process a graphics workload, and a second graphics engine to offload at least a portion of the graphics workload from the first graphics engine. The second graphics engine may include a low precision compute engine. The system may further include a wearable display housing the second graphics engine. Other embodiments are disclosed and claimed.Type: GrantFiled: February 8, 2024Date of Patent: April 29, 2025Assignee: Intel CorporationInventors: Atsuo Kuwahara, Deepak S. Vembar, Chandrasekaran Sakthivel, Radhakrishnan Venkataraman, Brent E. Insko, Anupreet S. Kalra, Hugues Labbe, Abhishek R. Appu, Ankur N. Shah, Joydeep Ray, Elmoustapha Ould-Ahmed-Vall, Prasoonkumar Surti, Murali Ramadoss
-
Patent number: 12282809Abstract: A graphics processing apparatus includes graphics processors connected by a network connection, where the graphics processors pass compressed data. A first graphics processor stores data blocks as compressed data in a memory. The compressed data has data blocks of variable size, where a size of a block of compressed data depends on a compression ratio of the block of compressed data. A second graphics processor also stores data blocks as compressed data. The first graphics processor concatenates a variable number of blocks of compressed data into a packet of fixed size to send to the second graphics processor. The packet has a variable number of blocks of compressed data depending on the compression ratios of the multiple blocks of compressed data.Type: GrantFiled: September 24, 2021Date of Patent: April 22, 2025Assignee: Intel CorporationInventors: Karol A. Szerszen, Prasoonkumar Surti, Abhishek R. Appu
-
Publication number: 20250117356Abstract: Methods and apparatus relating to techniques for multi-tile memory management. In an example, a graphics processor includes an interposer, a first chiplet coupled with the interposer, the first chiplet including a graphics processing resource and an interconnect network coupled with the graphics processing resource, cache circuitry coupled with the graphics processing resource via the interconnect network, and a second chiplet coupled with the first chiplet via the interposer, the second chiplet including a memory-side cache and a memory controller coupled with the memory-side cache. The memory controller is configured to enable access to a high-bandwidth memory (HBM) device, the memory-side cache is configured to cache data associated with a memory access performed via the memory controller, and the cache circuitry is logically positioned between the graphics processing resource and a chiplet interface.Type: ApplicationFiled: October 15, 2024Publication date: April 10, 2025Applicant: Intel CorporationInventors: Abhishek R. Appu, Altug Koker, Aravindh Anantaraman, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Mike Macpherson, Subramaniam Maiyuran, Joydeep Ray, Lakshminarayanan Striramassarma, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Prasoonkumar Surti, David Puffer, James Valerio, Ankur N. Shah
-
Publication number: 20250117060Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to collect user information for a user of a data processing device, generate a user profile for the user of the data processing device from the user information, and set a power profile a processor in the data processing device using the user profile. Other embodiments are also disclosed and claimed.Type: ApplicationFiled: September 12, 2024Publication date: April 10, 2025Applicant: INTEL CORPORATIONInventors: Altug Koker, Abhishek R. Appu, Kiran C. Veernapu, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Eric J. Hoekstra, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy
-
Publication number: 20250103548Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.Type: ApplicationFiled: November 14, 2024Publication date: March 27, 2025Applicant: Intel CorporationInventors: Altug Koker, Joydeep Ray, Ben Ashbaugh, Jonathan Pearce, Abhishek Appu, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Elmoustapha Ould-Ahmed-Vall, Aravindh Anantaraman, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Yoav Harel, Arthur Hunter, JR., Brent Insko, Scott Janus, Pattabhiraman K, Mike Macpherson, Subramaniam Maiyuran, Marian Alin Petre, Murali Ramadoss, Shailesh Shah, Kamal Sinha, Prasoonkumar Surti, Vikranth Vemulapalli
-
Publication number: 20250104326Abstract: One embodiment provides a graphics processor comprising an interface to a system interconnect and a graphics processor coupled to the interface, the graphics processor comprising circuitry configured to compact sample data for multiple sample locations of a pixel, map the multiple sample locations to memory locations that store compacted sample data, the memory locations in a memory of the graphics processor, apply lossless compression to the compacted sample data, and update a compression control surface associated with the memory locations, the compression control surface to specify a compression status for the memory locationsType: ApplicationFiled: September 11, 2024Publication date: March 27, 2025Applicant: Intel CorporationInventors: Abhishek R. Appu, Prasoonkumar Surti, Joydeep Ray, Michael J. Norris
-
Publication number: 20250103546Abstract: Embodiments are generally directed to cache structure and utilization. An embodiment of an apparatus includes one or more processors including a graphics processor; a memory for storage of data for processing by the one or more processors; and a cache to cache data from the memory; wherein the apparatus is to provide for dynamic overfetching of cache lines for the cache, including receiving a read request and accessing the cache for the requested data, and upon a miss in the cache, overfetching data from memory or a higher level cache in addition to fetching the requested data, wherein the overfetching of data is based at least in part on a current overfetch boundary, and provides for data is to be prefetched extending to the current overfetch boundary.Type: ApplicationFiled: October 4, 2024Publication date: March 27, 2025Applicant: Intel CorporationInventors: Altug Koker, Lakshminarayanan Striramassarma, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Sean Coleman, Varghese George, Pattabhiraman K, Mike MacPherson, Subramaniam Maiyuran, ElMoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, Joydeep Ray, Jayakrishna P S, Prasoonkumar Surti
-
Publication number: 20250103547Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides techniques to optimize training and inference on a systolic array when using sparse data. One embodiment provides techniques to use decompression information when performing sparse compute operations. One embodiment enables the disaggregation of special function compute arrays via a shared reg file. One embodiment enables packed data compress and expand operations on a GPGPU. One embodiment provides techniques to exploit block sparsity within the cache hierarchy of a GPGPU.Type: ApplicationFiled: October 4, 2024Publication date: March 27, 2025Applicant: INTEL CORPORATIONInventors: Prasoonkumar Surti, Subramaniam Maiyuran, Valentin Andrei, Abhishek Appu, Varghese George, Altug Koker, Mike Macpherson, Elmoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, Joydeep Ray, Lakshminarayanan Striramassarma, SungYe Kim
-
Publication number: 20250103343Abstract: Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive data dependencies for one or more tasks comprising one or more producer tasks executing on the first processing resource and one or more consumer tasks executing on the second processing resource and move a data output from one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource. Other embodiments may be described and claimed.Type: ApplicationFiled: November 21, 2024Publication date: March 27, 2025Applicant: INTEL CORPORATIONInventors: Christopher J. HUGHES, Prasoonkumar SURTI, Guei-Yuan LUEH, Adam T. LAKE, Jill BOYCE, Subramaniam MAIYURAN, Lidong XU, James M. HOLLAND, Vasanth RANGANATHAN, Nikos KABURLASOS, Altug KOKER, Abhishek R. Appu
-
Publication number: 20250095099Abstract: One embodiment provides a graphics processor comprising a system interface and circuitry coupled with the system interface. The circuitry includes an execution resource and a preemption status register. The execution resource is configured to execute an instruction. During execution of the instruction, the execution resource is to receive a request to preempt execution of a thread associated with the instruction and, based on a value stored in the preemption status register, execute at least one additional instruction after receipt of the request to preempt execution of the thread.Type: ApplicationFiled: September 23, 2024Publication date: March 20, 2025Applicant: Intel CorporationInventors: Altug Koker, Ingo Wald, David Puffer, Subramaniam M. Maiyuran, Prasoonkumar Surti, Balaji Vembu, Guei-Yuan Lueh, Murali Ramadoss, Abhishek R. Appu, Joydeep Ray
-
Patent number: 12254526Abstract: Apparatuses including general-purpose graphics processing units having on chip dense memory for temporal buffering are disclosed. In one embodiment, a graphics multiprocessor includes a plurality of compute engines to perform first computations to generate a first set of data, cache for storing data, and a high density memory that is integrated on chip with the plurality of compute engines and the cache. The high density memory to receive the first set of data, to temporarily store the first set of data, and to provide the first set of data to the cache during a first time period that is prior to a second time period when the plurality of compute engines will use the first set of data for second computations.Type: GrantFiled: March 15, 2019Date of Patent: March 18, 2025Assignee: Intel CorporationInventors: Varghese George, Altug Koker, Aravindh Anantaraman, Subramaniam Maiyuran, SungYe Kim, Valentin Andrei, Elmoustapha Ould-Ahmed-Vall, Joydeep Ray, Abhishek R. Appu, Nicolas C. Galoppo von Borries, Prasoonkumar Surti, Mike Macpherson
-
Patent number: 12243125Abstract: Systems, apparatuses and methods may provide for technology that determines a stencil value and uses the stencil value to control, via a stencil buffer, a coarse pixel size of a graphics pipeline. Additionally, the stencil value may include a first range of bits defining a first dimension of the coarse pixel size and a second range of bits defining a second dimension of the coarse pixel size. In one example, the coarse pixel size is controlled for a plurality of pixels on a per pixel basis.Type: GrantFiled: November 22, 2023Date of Patent: March 4, 2025Assignee: Intel CorporationInventors: Karthik Vaidyanathan, Prasoonkumar Surti, Hugues Labbe, Atsuo Kuwahara, Sameer KP, Jonathan Kennedy, Murali Ramadoss, Michael Apodaca, Abhishek Venkatesh
-
Patent number: 12243155Abstract: Methods, systems and apparatuses may provide for technology that identifies first graphics data that is associated with spatially proximate positions. The technology identifies second graphics data that is associated with spatially proximate positions, and interleaves the first and the second graphics data across a plurality of storage tiles.Type: GrantFiled: June 24, 2021Date of Patent: March 4, 2025Assignee: Intel CorporationInventors: Prasoonkumar Surti, Ronald Silvas, Karol A. Szerszen
-
Patent number: 12236498Abstract: Generation and storage of compressed z-planes in graphics processing is described. An example of a processor includes a rasterizer to generate a fragment of pixel data including blocks of pixel data; a depth pipeline to receive the fragment, the pipeline including a first and second depth test hardware, the first depth test hardware to perform a coarse depth test including determining minimum and maximum depths for each block; and a depth buffer, wherein the processor is to determine whether the fragment meets requirements that the fragment fully covers a tile of pixel data and passes a first depth test, and that each of the minimum and maximum depths of the fragment has a same sign and exponent, and, upon determining that the fragment meets the requirements, to generate a compressed depth plane utilizing the first depth test and update the depth buffer with the compressed depth plane.Type: GrantFiled: May 27, 2021Date of Patent: February 25, 2025Assignee: INTEL CORPORATIONInventors: Saikat Mandal, Karol Szerszen, Vasanth Ranganathan, Altug Koker, Michael Norris, Prasoonkumar Surti, Takahiro Murata
-
Patent number: 12229871Abstract: An embodiment of an electronic processing system may include an application processor, persistent storage media communicatively coupled to the application processor, a graphics subsystem communicatively coupled to the application processor, a sense engine communicatively coupled to the graphics subsystem to provide sensed information, a focus engine communicatively coupled to the sense engine and the graphics subsystem to provide focus information, a motion engine communicatively coupled to the sense engine, the focus engine, and the graphics subsystem to provide motion information, and a motion biased foveated renderer communicatively coupled to the motion engine, the focus engine, the sense engine to adjust one or more parameters of the graphics subsystem based on one or more of the sense information, the focus information, and the motion information. Other embodiments are disclosed and claimed.Type: GrantFiled: September 22, 2021Date of Patent: February 18, 2025Assignee: Intel CorporationInventors: Prasoonkumar Surti, Karthik Vaidyanathan, Atsuo Kuwahara, Hugues Labbe, Sameer K P, Jonathan Kennedy, Joydeep Ray, Travis T. Schluessler, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Tomer Bar-On, Carsten Benthin, Adam T. Lake, Vasanth Ranganathan, Abhishek R. Appu
-
Patent number: 12229867Abstract: One embodiment provides a graphics processor comprising a block of execution resources, a cache memory, a cache memory prefetcher, and circuitry including a programmable neural network unit, the programmable neural network unit comprising a network hardware block including circuitry to perform neural network operations and activation operations for a layer of a neural network, the programmable neural network unit addressable by cores within the block of graphics cores and the neural network hardware block configured to perform operations associated with a neural network configured to determine a prefetch pattern for the cache memory prefetcher.Type: GrantFiled: May 1, 2023Date of Patent: February 18, 2025Assignee: Intel CorporationInventors: Hugues Labbe, Darrel Palke, Sherine Abdelhak, Jill Boyce, Varghese George, Scott Janus, Adam Lake, Zhijun Lei, Zhengmin Li, Mike MacPherson, Carl Marshall, Selvakumar Panneer, Prasoonkumar Surti, Karthik Veeramani, Deepak Vembar, Vallabhajosyula Srinivasa Somayazulu
-
Publication number: 20250053452Abstract: Apparatus and method for stack access throttling for synchronous ray tracing. For example, one embodiment of an apparatus comprises: ray tracing acceleration hardware to manage active ray tracing stack allocations to ensure that a size of the active ray tracing stack allocations remains within a threshold; and an execution unit to execute a thread to explicitly request a new ray tracing stack allocation from the ray tracing acceleration hardware, the ray tracing acceleration hardware to permit the new ray tracing stack allocation if the size of the active ray tracing stack allocations will remain within the threshold after permitting the new ray tracing stack allocation.Type: ApplicationFiled: July 16, 2024Publication date: February 13, 2025Inventors: Pawel MAJEWSKI, Prasoonkumar SURTI, Karthik VAIDYANATHAN, Joshua BARCZAK, Vasanth RANGANATHAN, Vikranth VEMULAPALLI