Patents by Inventor Karthik Vaidyanathan

Karthik Vaidyanathan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11688122
    Abstract: An embodiment of an electronic processing system may include an application processor, persistent storage media communicatively coupled to the application processor, and a graphics subsystem communicatively coupled to the application processor. The system may include one or more of a draw call re-orderer communicatively coupled to the application processor and the graphics subsystem to re-order two or more draw calls, a workload re-orderer communicatively coupled to the application processor and the graphics subsystem to re-order two or more work items in an order independent mode, a queue primitive included in at least one of the two or more draw calls to define a producer stage and a consumer stage, and an order-independent executor communicatively coupled to the application processor and the graphics subsystem to provide tile-based order independent execution of a compute stage. Other embodiments are disclosed and claimed.
    Type: Grant
    Filed: February 2, 2022
    Date of Patent: June 27, 2023
    Assignee: Intel Corporation
    Inventors: Devan Burke, Adam T. Lake, Jeffery S. Boles, John H. Feit, Karthik Vaidyanathan, Abhishek R. Appu, Joydeep Ray, Subramaniam Maiyuran, Altug Koker, Balaji Vembu, Murali Ramadoss, Prasoonkumar Surti, Eric J. Hoekstra, Gabor Liktor, Jonathan Kennedy, Slawomir Grajewski, Elmoustapha Ould-Ahmed-Vall
  • Patent number: 11670037
    Abstract: Apparatus and method for efficient BVH construction. For example, one embodiment of an apparatus comprises: a memory to store graphics data for a scene including a plurality of primitives at a first precision; a geometry quantizer to read vertices of the primitives at the first precision and to adaptively quantize the vertices of the primitives to a second precision associated with a first local coordinate grid of a first BVH node positioned within a global coordinate grid, the second precision lower than the first precision; a BVH builder to determine coordinates of child nodes of the first BVH node by performing non-spatial-split binning or spatial-split binning for the first BVH node using primitives associated with the first BVH node, the BVH builder to determine final coordinates for the child nodes based, at least in part, on an evaluation of surface areas of different bounding boxes generated for each of the child nodes.
    Type: Grant
    Filed: May 3, 2022
    Date of Patent: June 6, 2023
    Assignee: Intel Corporation
    Inventors: Michael Doyle, Karthik Vaidyanathan
  • Patent number: 11663777
    Abstract: Apparatus and method for processing motion blur operations. For example, one embodiment of a graphics processing apparatus comprises: a bounding volume hierarchy (BVH) generator to build a BVH comprising hierarchically-arranged BVH nodes based on input primitives, at least one BVH node comprising one or more child nodes; and motion blur processing hardware logic to determine motion values for a quantization grid based on motion values of the one or more child nodes of the at least one BVH node and to map linear bounds of each of the child nodes to the quantization grid.
    Type: Grant
    Filed: March 15, 2020
    Date of Patent: May 30, 2023
    Assignee: Intel Corporation
    Inventors: Sven Woop, Carsten Benthin, Karthik Vaidyanathan
  • Patent number: 11663774
    Abstract: Systems, apparatuses and methods may provide a way to render edges of an object defined by multiple tessellation triangles. More particularly, systems, apparatuses and methods may provide a way to perform anti-aliasing at the edges of the object based on a coarse pixel rate, where the coarse pixels may be based on a coarse Z value indicating a resolution or granularity of detail of the coarse pixel. The systems, apparatuses and methods may use a shader dispatch engine to dispatch raster rules to a pixel shader to direct the pixel shader to include, in a tile and/or tessellation triangle, one or more finer coarse pixels based on a percent of coverage provided by a finer coarse pixel of a tessellation triangle at or along the edge of the object.
    Type: Grant
    Filed: March 2, 2022
    Date of Patent: May 30, 2023
    Assignee: Intel Corporation
    Inventors: Prasoonkumar Surti, Karthik Vaidyanathan, Murali Ramadoss, Michael Apodaca, Abhishek Venkatesh, Joydeep Ray, Abhishek R. Appu
  • Publication number: 20230143192
    Abstract: Input filtering and sampler acceleration for supersampling is described. An example of a graphics processor comprises a set of processing resources configured to perform a supersampling operation via a convolutional neural network, the set of processing resources including circuitry configured to receive input data for supersampling processing, the input data including data sampled according to a jitter pattern that varies locations for data samples; apply an image filter to the received input data, wherein the image filter includes weighting for pixels that is based at least in part on the jitter pattern; process the input data to generate upsampled data; and apply supersampling processing to the upsampled data.
    Type: Application
    Filed: September 26, 2022
    Publication date: May 11, 2023
    Applicant: Intel Corporation
    Inventors: Gabor Liktor, Karthik Vaidyanathan
  • Publication number: 20230146259
    Abstract: Sampling across multiple views in supersampling operation is described. An example of an apparatus includes one or more processing resources configured to perform a supersampling operation for image data generated for multiple views utilizing one or more neural networks, the processing resources including at least a first circuitry to process a first current frame including first image data for a first view, and a second circuitry to process a second current frame including second image data for a second view, the first view and second view being displaced from each other, the processing resources to receive for processing the first current frame and the second current frame, and perform supersampling processing utilizing the one or more neural networks based on at least the first current frame and the second current frame and one or more prior generated frames for each of the views.
    Type: Application
    Filed: November 3, 2022
    Publication date: May 11, 2023
    Applicant: Intel Corporation
    Inventors: Gabor Liktor, Karthik Vaidyanathan, Tobias Zirr
  • Publication number: 20230148225
    Abstract: Joint denoising and supersampling of graphics data is described. An example of a graphics processor includes multiple processing resources, including at least a first processing resource including a pipeline to perform a supersampling operation; and the pipeline including circuitry to jointly perform denoising and supersampling of received ray tracing input data, the circuitry including first circuitry to receive input data associated with an input block for a neural network, second circuitry to perform operations associated with a feature extraction and kernel prediction network of the neural network, and third circuitry to perform operations associated with a filtering block of the neural network.
    Type: Application
    Filed: September 30, 2022
    Publication date: May 11, 2023
    Applicant: Intel Corporation
    Inventors: Manu Mathew Thomas, Karthik Vaidyanathan, Anton Kaplanyan, SungYe Kim, Gabor Liktor
  • Publication number: 20230137438
    Abstract: An apparatus and method to execute ray tracing instructions. For example, one embodiment of an apparatus comprises execution circuitry to execute a dequantize instruction to convert a plurality of quantized data values to a plurality of dequantized data values, the dequantize instruction including a first source operand to identify a plurality of packed quantized data values in a source register and a destination operand to identify a destination register in which to store a plurality of packed dequantized data values, wherein the execution circuitry is to convert each packed quantized data value in the source register to a floating point value, to multiply the floating point value by a first value to generate a first product and to add the first product to a second value to generate a dequantized data value, and to store the dequantized data value in a packed data element location in the destination register.
    Type: Application
    Filed: December 29, 2022
    Publication date: May 4, 2023
    Inventors: Karthik VAIDYANATHAN, Michael APODACA, Thomas RAOUX, Carsten BENTHIN, Kai XIAO, Carson BROWNLEE, Joshua BARCZAK
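    Illustrative sketch (not taken from the patent): the arithmetic the abstract describes for the dequantize instruction, that is, converting each packed quantized value to floating point, multiplying by a first value, and adding a second value, can be modeled in a few lines of Python. The function name `dequantize` and the parameter names `scale` and `offset` are assumptions made for this example; the abstract only calls them a "first value" and a "second value", and it does not specify bit widths or register layout.

```python
from typing import List, Sequence


def dequantize(packed: Sequence[int], scale: float, offset: float) -> List[float]:
    """Model of the per-element arithmetic described in the abstract.

    `packed` stands in for the packed quantized values in the source register.
    `scale` and `offset` are hypothetical names for the abstract's
    "first value" and "second value".
    """
    result = []
    for q in packed:
        as_float = float(q)            # convert the quantized value to floating point
        product = as_float * scale     # multiply by the first value
        result.append(product + offset)  # add the second value
    return result


if __name__ == "__main__":
    # Three 8-bit style quantized values mapped back to a float range.
    print(dequantize([0, 1, 255], scale=0.5, offset=-1.0))
```

    In hardware the operation would be applied to every packed element in parallel; the loop here only shows the per-element math.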
  • Patent number: 11636567
    Abstract: An embodiment of a graphics command coordinator apparatus may include a commonality identifier to identify a commonality between a first graphics command corresponding to a first frame and a second graphics command corresponding to a second frame, a commonality analyzer communicatively coupled to the commonality identifier to determine if the first graphics command and the second graphics command can be processed together based on the commonality identified by the commonality identifier, and a commonality indicator communicatively coupled to the commonality analyzer to provide an indication that the first graphics command and the second graphics command are to be processed together. Other embodiments are disclosed and claimed.
    Type: Grant
    Filed: September 27, 2021
    Date of Patent: April 25, 2023
    Assignee: Intel Corporation
    Inventors: Abhishek Venkatesh, Karthik Vaidyanathan, Murali Ramadoss, Michael Apodaca, Prasoonkumar Surti
  • Publication number: 20230090973
    Abstract: One embodiment provides a graphics processor including a processing resource including a register file, memory, a cache memory, and load/store/cache circuitry to process load, store, and prefetch messages from the processing resource. The circuitry includes support for an immediate address offset that will be used to adjust the address supplied for a memory access to be requested by the circuitry. Including support for the immediate address offset removes the need to execute additional instructions to adjust the address to be accessed prior to execution of the memory access instruction.
    Type: Application
    Filed: September 21, 2021
    Publication date: March 23, 2023
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Abhishek R. Appu, Timothy R. Bauer, James Valerio, Weiyu Chen, Subramaniam Maiyuran, Prasoonkumar Surti, Karthik Vaidyanathan, Carsten Benthin, Sven Woop, Jiasheng Chen
  • Publication number: 20230066626
    Abstract: One embodiment provides a graphics processor comprising a set of processing resources configured to perform a supersampling operation via a mixed precision convolutional neural network, the set of processing resources including circuitry configured to receive, at an input block of a neural network model, history data, velocity data, and current frame data, pre-process the history data, velocity data, and current frame data to generate pre-processed data, provide the pre-processed data to a feature extraction network of the neural network model, process the pre-processed data at the feature extraction network via one or more encoder stages and one or more decoder stages, and generate an output image via an output block of the neural network model via direct reconstruction or kernel prediction.
    Type: Application
    Filed: November 1, 2021
    Publication date: March 2, 2023
    Applicant: Intel Corporation
    Inventors: SungYe Kim, Karthik Vaidyanathan, Gabor Liktor, Manu Mathew Thomas
  • Publication number: 20230062540
    Abstract: Examples described herein relate to a manner of determining a number of bits to encode compression data. Some examples include: compressing pixel data of a region of pixels in a frame; determining a number of bits associated with at least two partitions; utilizing the determined number of bits to encode residual values generated from the compressing the pixel data; and storing the encoded residual values. In some examples, the at least two partitions comprise a first partition and a second partition. Some examples include: encoding residuals in the first partition using a number of bits associated with the first partition and encoding residuals in the second partition using a number of bits associated with the second partition.
    Type: Application
    Filed: August 18, 2021
    Publication date: March 2, 2023
    Inventors: Prasoonkumar SURTI, Abhishek R. APPU, Karol A. SZERSZEN, Karthik VAIDYANATHAN, Sreenivas KOTHANDARAMAN, Mohamed FAROOK
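    Illustrative sketch (not taken from the patent): one way to see why per-partition bit widths can help is to compute, for each partition of residuals, the smallest signed width that holds every value in that partition. The helper names `bits_needed` and `encode_partitions`, the two's-complement width rule, and the fixed split point are all assumptions for this example; the abstract does not say how the number of bits is determined or how the residuals are produced.

```python
from typing import List, Tuple


def bits_needed(residuals: List[int]) -> int:
    """Smallest two's-complement width that can hold every residual."""
    if not residuals:
        return 0
    lo, hi = min(residuals), max(residuals)
    bits = 1
    # Widen until the signed range [-2^(bits-1), 2^(bits-1) - 1] covers lo..hi.
    while lo < -(1 << (bits - 1)) or hi > (1 << (bits - 1)) - 1:
        bits += 1
    return bits


def encode_partitions(residuals: List[int], split: int) -> Tuple[int, int, int]:
    """Return (bits for first partition, bits for second partition, total bits).

    `split` is a hypothetical boundary index between the two partitions.
    """
    first, second = residuals[:split], residuals[split:]
    b1, b2 = bits_needed(first), bits_needed(second)
    total = b1 * len(first) + b2 * len(second)
    return b1, b2, total


if __name__ == "__main__":
    residuals = [1, 0, -1, 0, 40, -35, 20, 5]
    print(encode_partitions(residuals, split=4))
    # Compare with one shared width for the whole block.
    print(bits_needed(residuals) * len(residuals))
```

    With these sample residuals, the small values in the first partition need far fewer bits than the large values in the second, so the partitioned total is smaller than using a single width for the whole region.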
  • Patent number: 11593910
    Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.
    Type: Grant
    Filed: May 11, 2022
    Date of Patent: February 28, 2023
    Assignee: Intel Corporation
    Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
  • Patent number: 11574382
    Abstract: Examples described herein relate to a decompression engine that can request compressed data to be transferred over a memory bus. In some cases, the memory bus is a width that requires multiple data transfers to transfer the requested data. In a case that requested data is to be presented in-order to the decompression engine, a re-order buffer can be used to store entries of data. When a head-of-line entry is received, the entry can be provided to the decompression engine. When a last entry in a group of one or more entries is received, all entries in the group are presented in-order to the decompression engine. In some examples, a decompression engine can borrow memory resources allocated for use by another memory client to expand a size of re-order buffer available for use. For example, a memory client with excess capacity and a slowest growth rate can be chosen to borrow memory resources from.
    Type: Grant
    Filed: September 3, 2021
    Date of Patent: February 7, 2023
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Eric G. Liskay, Prasoonkumar Surti, Sudhakar Kamma, Karthik Vaidyanathan, Rajasekhar Pantangi, Altug Koker, Abhishek Rhisheekesan, Shashank Lakshminarayana, Priyanka Ladda, Karol A. Szerszen
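    Illustrative sketch (not taken from the patent): the in-order delivery behavior of a re-order buffer can be modeled as a map of pending entries plus a counter for the next expected entry. The class name `ReorderBuffer`, the integer sequence numbers, and the `receive` method are assumptions for this example; the abstract does not describe the buffer's interface, and the borrowing of memory from other clients is not modeled here.

```python
from typing import Any, Dict, List


class ReorderBuffer:
    """Holds out-of-order entries and releases them in sequence order."""

    def __init__(self) -> None:
        self._pending: Dict[int, Any] = {}  # entries waiting for earlier ones
        self._next_seq = 0                  # sequence number of the head-of-line entry

    def receive(self, seq: int, data: Any) -> List[Any]:
        """Store one arriving entry and return every entry that is now deliverable."""
        self._pending[seq] = data
        released = []
        # Drain the head of line for as long as the next expected entry is present.
        while self._next_seq in self._pending:
            released.append(self._pending.pop(self._next_seq))
            self._next_seq += 1
        return released


if __name__ == "__main__":
    rob = ReorderBuffer()
    print(rob.receive(1, "b"))  # entry 0 has not arrived yet, so nothing is released
    print(rob.receive(0, "a"))  # head of line arrives, so both entries are released in order
```

    A consumer such as a decompression engine would only ever see the lists returned by `receive`, so it observes data in request order regardless of arrival order.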
  • Patent number: 11568591
    Abstract: An apparatus and method to execute ray tracing instructions. For example, one embodiment of an apparatus comprises execution circuitry to execute a dequantize instruction to convert a plurality of quantized data values to a plurality of dequantized data values, the dequantize instruction including a first source operand to identify a plurality of packed quantized data values in a source register and a destination operand to identify a destination register in which to store a plurality of packed dequantized data values, wherein the execution circuitry is to convert each packed quantized data value in the source register to a floating point value, to multiply the floating point value by a first value to generate a first product and to add the first product to a second value to generate a dequantized data value, and to store the dequantized data value in a packed data element location in the destination register.
    Type: Grant
    Filed: August 18, 2020
    Date of Patent: January 31, 2023
    Assignee: Intel Corporation
    Inventors: Karthik Vaidyanathan, Michael Apodaca, Thomas Raoux, Carsten Benthin, Kai Xiao, Carson Brownlee, Joshua Barczak
  • Patent number: 11562461
    Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes one or more processing units to provide a first set of shader operations associated with a shader stage of a graphics pipeline, a scheduler to schedule shader threads for processing, and a field-programmable gate array (FPGA) dynamically configured to provide a second set of shader operations associated with the shader stage of the graphics pipeline.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: January 24, 2023
    Assignee: Intel Corporation
    Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
  • Patent number: 11556511
    Abstract: Embodiments are generally directed to compression for sparse data structures utilizing mode search approximation. An embodiment of an apparatus includes one or more processors including a graphics processor to process data; and a memory for storage of data, including compressed data. The one or more processors are to provide for compression of a data structure, including identification of a mode in the data structure, the data structure including a plurality of values and the mode being a most repeated value in the data structure, wherein identification of the mode includes application of a mode approximation operation, and encoding of an output vector to include the identified mode, a significance map to indicate locations at which the mode is present in the data structure, and remaining uncompressed data from the data structure.
    Type: Grant
    Filed: April 1, 2019
    Date of Patent: January 17, 2023
    Assignee: Intel Corporation
    Inventors: Prasoonkumar Surti, Abhishek R. Appu, Karol Szerszen, Eric Liskay, Karthik Vaidyanathan
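    Illustrative sketch (not taken from the patent): the output layout named in the abstract (the mode, a significance map marking where the mode occurs, and the remaining values) can be shown with a small round-trip example. Note that this sketch finds the exact mode with `collections.Counter` rather than the mode approximation operation the abstract refers to, whose details are not given; the function names `compress` and `decompress` are likewise assumptions.

```python
from collections import Counter
from typing import List, Tuple


def compress(values: List[int]) -> Tuple[int, List[bool], List[int]]:
    """Encode values as (mode, significance map, remaining non-mode values)."""
    if not values:
        raise ValueError("cannot compress an empty data structure")
    # Exact mode search; a stand-in for the approximate search in the abstract.
    mode = Counter(values).most_common(1)[0][0]
    significance = [v == mode for v in values]   # True where the mode is present
    remaining = [v for v in values if v != mode]  # stored uncompressed, in order
    return mode, significance, remaining


def decompress(mode: int, significance: List[bool], remaining: List[int]) -> List[int]:
    """Rebuild the original values from the three encoded parts."""
    rest = iter(remaining)
    return [mode if is_mode else next(rest) for is_mode in significance]


if __name__ == "__main__":
    data = [0, 0, 7, 0, 3, 0]
    encoded = compress(data)
    print(encoded)
    print(decompress(*encoded) == data)
```

    The scheme pays off when the data is sparse, because each occurrence of the mode costs one map bit instead of a full value.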
  • Publication number: 20220414970
    Abstract: Apparatus and method for speculative execution of hit and intersection shaders on programmable ray tracing architectures. For example, one embodiment of an apparatus comprises: single-instruction multiple-data (SIMD) or single-instruction multiple-thread (SIMT) execution units (EUs) to execute shaders; and ray tracing circuitry to execute a ray traversal thread, the ray tracing engine comprising: traversal/intersection circuitry, responsive to the traversal thread, to traverse a ray through an acceleration data structure comprising a plurality of hierarchically arranged nodes and to intersect the ray with a primitive contained within at least one of the nodes; and shader deferral circuitry to defer and aggregate multiple shader invocations resulting from the traversal thread until a particular triggering event is detected, wherein the multiple shaders are to be dispatched on the EUs in a single shader batch upon detection of the triggering event.
    Type: Application
    Filed: July 19, 2022
    Publication date: December 29, 2022
    Inventors: Gabor LIKTOR, Karthik VAIDYANATHAN, Jefferson AMSTUTZ, Atsuo KUWAHARA, Michael DOYLE, Travis SCHLUESSLER
  • Publication number: 20220366634
    Abstract: An apparatus and method for merging primitives and coordinating between vertex and ray transformations on a shared transformation unit. For example, one embodiment of a graphics processor comprises: a queue comprising a plurality of entries; ordering circuitry/logic to order triangles front to back within the queue; pairing circuitry/logic to identify triangles in the queue sharing an edge and to merge the triangles sharing an edge to produce merged triangle pairs; and shared transformation circuitry to alternate between performing vertex transformations on vertices of the merged triangle pairs and performing ray transformations on ray direction/origin data.
    Type: Application
    Filed: May 17, 2022
    Publication date: November 17, 2022
    Applicant: Intel Corporation
    Inventors: Sven Woop, Prasoonkumar Surti, Karthik Vaidyanathan, Carsten Benthin, Joshua Barczak, Saikat Mandal
  • Publication number: 20220335562
    Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.
    Type: Application
    Filed: May 11, 2022
    Publication date: October 20, 2022
    Applicant: Intel Corporation
    Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu