Patents by Inventor Prasoonkumar Surti

Prasoonkumar Surti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11922557
    Abstract: An apparatus and method for merging primitives and coordinating between vertex and ray transformations on a shared transformation unit. For example, one embodiment of a graphics processor comprises: a queue comprising a plurality of entries; ordering circuitry/logic to order triangles front to back within the queue; pairing circuitry/logic to identify triangles in the queue sharing an edge and to merge the triangles sharing an edge to produce merged triangle pairs; and shared transformation circuitry to alternate between performing vertex transformations on vertices of the merged triangle pairs and performing ray transformations on ray direction/origin data.
    Type: Grant
    Filed: May 17, 2022
    Date of Patent: March 5, 2024
    Assignee: Intel Corporation
    Inventors: Sven Woop, Prasoonkumar Surti, Karthik Vaidyanathan, Carsten Benthin, Joshua Barczak, Saikat Mandal
  • Patent number: 11922535
    Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.
    Type: Grant
    Filed: February 13, 2023
    Date of Patent: March 5, 2024
    Assignee: Intel Corporation
    Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
  • Publication number: 20240070926
    Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.
    Type: Application
    Filed: September 13, 2023
    Publication date: February 29, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Ben Ashbaugh, Prasoonkumar Surti, Pradeep Ramani, Rama Harihara, Jerin C. Justin, Jing Huang, Xiaoming Cui, Timothy B. Costa, Ting Gong, Elmoustapha Ould-ahmed-vall, Kumar Balasubramanian, Anil Thomas, Oguz H. Elibol, Jayaram Bobba, Guozhong Zhuang, Bhavani Subramanian, Gokce Keskin, Chandrasekaran Sakthivel, Rajesh Poornachandran
  • Patent number: 11915459
    Abstract: Apparatus and method for context-aware compression. For example, one embodiment of an apparatus comprises: ray traversal/intersection circuitry to traverse rays through a hierarchical acceleration data structure to identify intersections between rays and primitives of a graphics scene; matrix compression circuitry/logic to compress hierarchical transformation matrices to generate compressed hierarchical transformation matrices by quantizing N-bit floating point data elements associated with child transforms of the hierarchical transformation matrices to variable-bit floating point numbers or integers comprising offsets from a parent transform of the child transform; and an instance processor to generate a plurality of instances of one or more base geometric objects in accordance with the compressed hierarchical transformation matrices.
    Type: Grant
    Filed: May 10, 2022
    Date of Patent: February 27, 2024
    Assignee: INTEL CORPORATION
    Inventors: Carson Brownlee, Carsten Benthin, Joshua Barczak, Kai Xiao, Michael Apodaca, Prasoonkumar Surti, Thomas Raoux
  • Patent number: 11915357
    Abstract: Apparatus and method for stack throttling.
    Type: Grant
    Filed: March 16, 2020
    Date of Patent: February 27, 2024
    Assignee: Intel Corporation
    Inventors: Karthik Vaidyanathan, Abhishek Appu, Vasanth Ranganathan, Joydeep Ray, Prasoonkumar Surti
  • Patent number: 11900498
    Abstract: Apparatus and method for stable and short latency sorting.
    Type: Grant
    Filed: March 19, 2020
    Date of Patent: February 13, 2024
    Assignee: Intel Corporation
    Inventors: Saikat Mandal, Prasoonkumar Surti, Sven Woop
  • Patent number: 11892950
    Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
    Type: Grant
    Filed: July 15, 2022
    Date of Patent: February 6, 2024
    Assignee: INTEL CORPORATION
    Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Jr., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
  • Patent number: 11880934
    Abstract: An apparatus and method are described for performing an early depth test on graphics data. For example, one embodiment of a graphics processing apparatus comprises: early depth test circuitry to perform an early depth test on blocks of pixels to determine whether all pixels in the block of pixels can be resolved by the early depth test; a plurality of execution circuits to execute pixel shading operations on the blocks of pixels; and a scheduler circuit to schedule the blocks of pixels for the pixel shading operations, the scheduler circuit to prioritize the blocks of pixels in accordance with the determination as to whether all pixels in the block of pixels can be resolved by the early depth test.
    Type: Grant
    Filed: November 2, 2021
    Date of Patent: January 23, 2024
    Assignee: INTEL CORPORATION
    Inventors: Brent E. Insko, Prasoonkumar Surti
  • Patent number: 11880928
    Abstract: Apparatus and method for a hierarchical beam tracer.
    Type: Grant
    Filed: April 19, 2022
    Date of Patent: January 23, 2024
    Assignee: INTEL CORPORATION
    Inventors: Scott Janus, Prasoonkumar Surti, Karthik Vaidyanathan, Alexey Supikov, Gabor Liktor, Carsten Benthin, Philip Laws, Michael Doyle
  • Publication number: 20240020911
    Abstract: Apparatus and method for routing data from ray tracing cache banks. For example, one embodiment of an apparatus comprises: ray traversal hardware logic to perform traversal operations to traverse rays through a bounding volume hierarchy (BVH) comprising a plurality of BVH nodes, the ray traversal hardware logic comprising a plurality of traversal storage banks to store traversal data associated with the BVH nodes and/or the rays as the ray traversal hardware logic performs the traversal operations; and a cache comprising a plurality of cache banks to store the traversal data prior to being moved into the traversal storage banks for processing by the ray traversal hardware logic; and an inter-bank interconnect comprising: a point-to-point switch matrix to couple any of the cache banks to any of the traversal storage banks; an arbiter/allocator to control the point-to-point switch matrix to establish a particular group of interconnections between the cache banks and the traversal storage banks in a given clock c
    Type: Application
    Filed: May 26, 2022
    Publication date: January 18, 2024
    Inventors: Michael Norris, Abhishek R. Appu, Prasoonkumar Surti, Karthik Vaidyanathan
  • Publication number: 20240012767
    Abstract: An apparatus to facilitate efficient data sharing for graphics data processing operations is disclosed. The apparatus includes a processing resource to generate a stream of instructions, an L1 cache communicably coupled to the processing resource and comprising an on-page detector circuit to determine that a set of memory requests in the stream of instructions access a same memory page; and set a marker in a first request of the set of memory requests; and arbitration circuitry communicably coupled to the L1 cache, the arbitration circuitry to route the set of memory requests to memory comprising the memory page and to, in response to receiving the first request with the marker set, remain with the processing resource to process the set of memory requests.
    Type: Application
    Filed: July 25, 2023
    Publication date: January 11, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Michael Macpherson, Aravindh V. Anantaraman, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Varghese George, Abhishek Appu, Prasoonkumar Surti
  • Publication number: 20240013337
    Abstract: A mechanism is described for detecting, at training time, information related to one or more tasks to be performed by the one or more processors according to a training dataset for a neural network, analyzing the information to determine one or more portions of hardware of a processor of the one or more processors that is configurable to support the one or more tasks, configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks, and monitoring utilization of the hardware via a hardware unit of the graphics processor and, via a scheduler of the graphics processor, adjusting allocation of the one or more tasks to the one or more portions of the hardware based on the utilization.
    Type: Application
    Filed: July 13, 2023
    Publication date: January 11, 2024
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, Dukhwan Kim
  • Patent number: 11868264
    Abstract: One embodiment provides circuitry coupled with cache memory and a memory interface, the circuitry to compress compute data at multiple cache line granularity, and a processing resource coupled with the memory interface and the cache memory. The processing resource is configured to perform a general-purpose compute operation on compute data associated with multiple cache lines of the cache memory. The circuitry is configured to compress the compute data before a write of the compute data via the memory interface to the memory bus, in association with a read of the compute data associated with the multiple cache lines via the memory interface, decompress the compute data, and provide the decompressed compute data to the processing resource.
    Type: Grant
    Filed: February 13, 2023
    Date of Patent: January 9, 2024
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, David Puffer, Prasoonkumar Surti, Lakshminarayanan Striramassarma, Vasanth Ranganathan, Kiran C. Veernapu, Balaji Vembu, Pattabhiraman K
  • Patent number: 11871142
    Abstract: Systems, apparatuses and methods may provide for technology that determines a frame rate of video content, sets a blend amount parameter based on the frame rate, and temporally anti-aliases the video content based on the blend amount parameter. Additionally, the technology may detect a coarse pixel (CP) shading condition with respect to one or more frames in the video content and select, in response to the CP shading condition, a per frame jitter pattern that jitters across pixels, wherein the video content is temporally anti-aliased based on the per frame jitter pattern. The CP shading condition may also cause the technology to apply a gradient to a plurality of color planes on a per color plane basis and discard pixel level samples associated with a CP if all mip data corresponding to the CP is transparent or shadowed out.
    Type: Grant
    Filed: December 14, 2021
    Date of Patent: January 9, 2024
    Assignee: Intel Corporation
    Inventors: Travis T. Schluessler, Joydeep Ray, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Karthik Vaidyanathan, Prasoonkumar Surti, Michael Apodaca, Murali Ramadoss, Abhishek Venkatesh
  • Patent number: 11869119
    Abstract: Systems, apparatuses and methods may provide for technology that determines a stencil value and uses the stencil value to control, via a stencil buffer, a coarse pixel size of a graphics pipeline. Additionally, the stencil value may include a first range of bits defining a first dimension of the coarse pixel size and a second range of bits defining a second dimension of the coarse pixel size. In one example, the coarse pixel size is controlled for a plurality of pixels on a per pixel basis.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: January 9, 2024
    Assignee: Intel Corporation
    Inventors: Karthik Vaidyanathan, Prasoonkumar Surti, Hugues Labbe, Atsuo Kuwahara, Sameer Kp, Jonathan Kennedy, Murali Ramadoss, Michael Apodaca, Abhishek Venkatesh
  • Publication number: 20240004833
    Abstract: In an example, an apparatus comprises a plurality of execution units, and a first memory communicatively coupled to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: July 10, 2023
    Publication date: January 4, 2024
    Applicant: Intel Corporation
    Inventors: Altug Koker, Prasoonkumar Surti, David Puffer, Subramaniam Maiyuran, Guei-Yuan Lueh, Abhishek R. Appu, Joydeep Ray, Balaji Vembu, Tomer Bar-On, Andrew T. Lauritzen, Hugues Labbe, John G. Gierach, Gabor Liktor
  • Publication number: 20240004713
    Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: August 1, 2023
    Publication date: January 4, 2024
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Balaji Vembu, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Kiran C. Veernapu, Subramaniam Maiyuran, Sanjeev S. Jahagirdar, Eric J. Asperheim, Guei-Yuan Lueh, David Puffer, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Josh B. Mastronarde, Linda L. Hurd, Travis T. Schluessler, Tomasz Janczak, Abhishek Venkatesh, Kai Xiao, Slawomir Grajewski
  • Publication number: 20230421738
    Abstract: A mechanism is described for facilitating adaptive resolution and viewpoint-prediction for immersive media in computing environments. An apparatus of embodiments, as described herein, includes one or more processors to receive viewing positions associated with a user with respect to a display, and analyze relevance of media contents based on the viewing positions, where the media content includes immersive videos of scenes captured by one or more cameras. The one or more processors are further to predict portions of the media contents as relevant portions based on the viewing positions and transmit the relevant portions to be rendered and displayed.
    Type: Application
    Filed: July 5, 2023
    Publication date: December 28, 2023
    Applicant: Intel Corporation
    Inventors: Mayuresh Varerkar, Stanley Baran, Michael Apodaca, Prasoonkumar Surti, Atsuo Kuwahara, Narayan Biswal, Jill Boyce, Yi-Jen Chiu, Gokcen Cilingir, Barnan Das, Atul Divekar, Srikanth Potluri, Nilesh Shah, Archie Sharma
  • Publication number: 20230418355
    Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to collect user information for a user of a data processing device, generate a user profile for the user of the data processing device from the user information, and set a power profile for a processor in the data processing device using the user profile. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: June 22, 2023
    Publication date: December 28, 2023
    Applicant: INTEL CORPORATION
    Inventors: Altug Koker, Abhishek R. Appu, Kiran C. Veernapu, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Eric J. Hoekstra, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy
  • Publication number: 20230418617
    Abstract: Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive data dependencies for one or more tasks comprising one or more producer tasks executing on the first processing resource and one or more consumer tasks executing on the second processing resource and move a data output from one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource. Other embodiments may be described and claimed.
    Type: Application
    Filed: June 22, 2023
    Publication date: December 28, 2023
    Applicant: INTEL CORPORATION
    Inventors: Christopher J. Hughes, Prasoonkumar Surti, Guei-Yuan Lueh, Adam T. Lake, Jill Boyce, Subramaniam Maiyuran, Lidong Xu, James M. Holland, Vasanth Ranganathan, Nikos Kaburlasos, Altug Koker, Abhishek R. Appu