Patents by Inventor Vineet Goel
Vineet Goel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210374607Abstract: A device is disclosed. The device includes a machine learning die including a memory and one or more machine learning accelerators; and a processing core die stacked with the machine learning die, the processing core die being configured to execute shader programs for controlling operations on the machine learning die, wherein the memory is configurable as either or both of a cache and a directly accessible memory.Type: ApplicationFiled: December 21, 2020Publication date: December 2, 2021Applicant: Advanced Micro Devices, Inc.Inventors: Maxim V. Kazakov, Swapnil P. Sakharshete, Milind N. Nemlekar, Vineet Goel
-
Patent number: 11158106Abstract: Techniques for performing shader operations are provided. The techniques include, performing pixel shading at a shading rate defined by pixel shader variable rate shading (“VRS”) data, updating the pixel VRS data that indicates one or more shading rates for one or more tiles based on whether the tiles of the one or more tiles include triangle edges or do not include triangle edges, to generate updated VRS data, and writing a VRS rate feedback buffer based on the updated VRS data.Type: GrantFiled: December 20, 2019Date of Patent: October 26, 2021Assignee: Advanced Micro Devices, Inc.Inventors: Skyler Jonathon Saleh, Vineet Goel, Pazhani Pillai, Ruijin Wu, Christopher J. Brennan, Andrew S. Pomianowski
-
Publication number: 20210225060Abstract: A processing device and a method of tiled rendering of an image for display is provided. The processing device includes memory and a processor. The processor is configured to receive the image comprising one or more three dimensional (3D) objects, divide the image into tiles, execute coarse level tiling for the tiles of the image and execute fine level tiling for the tiles of the image. The processing device also includes same fixed function hardware used to execute the coarse level tiling and the fine level tiling. The processor is also configured to determine visibility information for a first one of the tiles. The visibility information is divided into draw call visibility information and triangle visibility information for each remaining tile of the image.Type: ApplicationFiled: September 25, 2020Publication date: July 22, 2021Applicant: Advanced Micro Devices, Inc.Inventors: Mika Tuomi, Kiia Kallio, Ruijin Wu, Anirudh R. Acharya, Vineet Goel
-
Publication number: 20210192827Abstract: Techniques for performing shader operations are provided. The techniques include, performing pixel shading at a shading rate defined by pixel shader variable rate shading (“VRS”) data, updating the pixel VRS data that indicates one or more shading rates for one or more tiles based on whether the tiles of the one or more tiles include triangle edges or do not include triangle edges, to generate updated VRS data, and writing a VRS rate feedback buffer based on the updated VRS data.Type: ApplicationFiled: December 20, 2019Publication date: June 24, 2021Applicant: Advanced Micro Devices, Inc.Inventors: Skyler Jonathon Saleh, Vineet Goel, Pazhani Pillai, Ruijin Wu, Christopher J. Brennan, Andrew S. Pomianowski
-
Publication number: 20210150669Abstract: A processing device is provided which includes memory and a processor. The processor is configured to receive an input image having a first resolution, generate linear down-sampled versions of the input image by down-sampling the input image via a linear upscaling network and generate non-linear down-sampled versions of the input image by down-sampling the input image via a non-linear upscaling network.Type: ApplicationFiled: November 18, 2019Publication date: May 20, 2021Applicant: Advanced Micro Devices, Inc.Inventors: Alexander M. Potapov, Skyler Jonathon Saleh, Swapnil P. Sakharshete, Vineet Goel
-
Publication number: 20210089423Abstract: A technique for operating a processor that includes multiple cores is provided. The technique includes determining a number of active applications, selecting a processor configuration for the processor based on the number of active applications, configuring the processor according to the selected processor configuration, and executing the active applications with the configured processor.Type: ApplicationFiled: June 26, 2020Publication date: March 25, 2021Applicant: Advanced Micro Devices, Inc.Inventors: Ruijin Wu, Skyler Jonathon Saleh, Vineet Goel
-
Publication number: 20210026686Abstract: Techniques for performing machine learning operations are provided. The techniques include configuring a first portion of a first chiplet as a cache; performing caching operations via the first portion; configuring at least a first sub-portion of the first portion of the chiplet as directly-accessible memory; and performing machine learning operations with the first sub-portion by a machine learning accelerator within the first chiplet.Type: ApplicationFiled: July 20, 2020Publication date: January 28, 2021Applicant: Advanced Micro Devices, Inc.Inventors: Swapnil P. Sakharshete, Andrew S. Pomianowski, Maxim V. Kazakov, Vineet Goel, Milind N. Nemlekar, Skyler Jonathon Saleh
-
Publication number: 20200184002Abstract: A processing device is provided which includes memory configured to store data and a processor configured to determine, based on convolutional parameters associated with an image, a virtual general matrix-matrix multiplication (GEMM) space of a virtual GEMM space output matrix and generate, in the virtual GEMM space output matrix, a convolution result by matrix multiplying the data corresponding to a virtual GEMM space input matrix with the data corresponding to a virtual GEMM space filter matrix. The processing device also includes convolutional mapping hardware configured to map, based on the convolutional parameters, positions of the virtual GEMM space input matrix to positions of an image space of the image.Type: ApplicationFiled: August 30, 2019Publication date: June 11, 2020Applicant: Advanced Micro Devices, Inc.Inventors: Swapnil P. Sakharshete, Samuel Lawrence Wasmundt, Maxim V. Kazakov, Vineet Goel
-
Publication number: 20200118328Abstract: Aspects of this disclosure relate to a process for rendering graphics that includes performing, with a hardware unit of a graphics processing unit (GPU) designated for vertex shading, a vertex shading operation to shade input vertices so as to output vertex shaded vertices, wherein the hardware unit adheres to an interface that receives a single vertex as an input and generates a single vertex as an output. The process also includes performing, with the hardware unit of the GPU designated for vertex shading, a hull shading operation to generate one or more control points based on one or more of the vertex shaded vertices, wherein the one or more hull shading operations operate on at least one of the one or more vertex shaded vertices to output the one or more control points.Type: ApplicationFiled: December 11, 2019Publication date: April 16, 2020Inventors: Vineet Goel, Andrew Evan Gruber, Donghyun Kim
-
Publication number: 20200098169Abstract: Described herein are techniques for improving the effectiveness of depth culling. In a first technique, a binner is used to sort primitives into depth bins. Each depth bin covers a range of depths. The binner transmits the depth bins to the screen space pipeline for processing in near-to-far order. Processing the near bins first results in the depth buffer being updated, allowing fragments for the primitives in the farther bins to be culled more aggressively than if the depth binning did not occur. In a second technique, a buffer is used to initiate two-pass processing through the screen space pipeline. In the first pass, primitives are sent down to update the depth block and are then culled. The fragments are processed normally in the second pass, with the benefit of the updated depth values.Type: ApplicationFiled: September 21, 2018Publication date: March 26, 2020Applicant: Advanced Micro Devices, Inc.Inventors: Ruijin Wu, Young In Yeo, Sagar S. Bhandare, Vineet Goel, Martin G. Sarov, Christopher J. Brennan
-
Patent number: 10600142Abstract: A compute unit accesses a chunk of bits that represent indices of vertices of a graphics primitive. The compute unit sets values of a first bit to indicate whether the chunk is monotonic or ordinary, second bits to define an offset that is determined based on values of indices in the chunk, and sets of third bits that determine values of the indices in the chunk based on the offset defined by the second bits. The compute unit writes a compressed chunk represented by the first bit, the second bits, and the sets of third bits to a memory. The compressed chunk is decompressed and the decompressed indices are written to an index buffer. In some embodiments, the indices are decompressed based on metadata that includes offsets that are determined based on values of the indices and bitfields that indicate characteristics of the indices.Type: GrantFiled: December 5, 2017Date of Patent: March 24, 2020Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Usame Ceylan, Young In Yeo, Todd Martin, Vineet Goel
-
Patent number: 10559123Abstract: Aspects of this disclosure relate to a process for rendering graphics that includes designating a hardware shading unit of a graphics processing unit (GPU) to perform first shading operations associated with a first shader stage of a rendering pipeline. The process also includes switching operational modes of the hardware shading unit upon completion of the first shading operations. The process also includes performing, with the hardware shading unit of the GPU designated to perform the first shading operations, second shading operations associated with a second, different shader stage of the rendering pipeline.Type: GrantFiled: March 14, 2013Date of Patent: February 11, 2020Assignee: QUALCOMM IncorporatedInventors: Vineet Goel, Andrew Evan Gruber
-
Patent number: 10535185Abstract: Aspects of this disclosure relate to a process for rendering graphics that includes performing, with a hardware unit of a graphics processing unit (GPU) designated for vertex shading, a vertex shading operation to shade input vertices so as to output vertex shaded vertices, wherein the hardware unit adheres to an interface that receives a single vertex as an input and generates a single vertex as an output. The process also includes performing, with the hardware unit of the GPU designated for vertex shading, a hull shading operation to generate one or more control points based on one or more of the vertex shaded vertices, wherein the one or more hull shading operations operate on at least one of the one or more vertex shaded vertices to output the one or more control points.Type: GrantFiled: March 14, 2013Date of Patent: January 14, 2020Assignee: QUALCOMM IncorporatedInventors: Vineet Goel, Andrew Evan Gruber, Donghyun Kim
-
Patent number: 10453243Abstract: Processing of non-real-time and real-time workloads is performed using discrete pipelines. A first pipeline includes a first shader and one or more fixed function hardware blocks. A second pipeline includes a second shader that is configured to emulate the at least one fixed function hardware block. First and second memory elements store first state information for the first pipeline and second state information for the second pipeline, respectively. A non-real-time workload executing in the first pipeline is preempted at a primitive boundary in response to a real-time workload being dispatched for execution in the second pipeline. The first memory element retains the first state information in response to preemption of the non-real-time workload. The first pipeline is configured to resume processing the subsequent primitive on the basis of the first state information stored in the first memory element.Type: GrantFiled: January 3, 2019Date of Patent: October 22, 2019Assignee: Advanced Micro Devices, Inc.Inventors: Anirudh R. Acharya, Swapnil Sakharshete, Michael Mantor, Mangesh P. Nijasure, Todd Martin, Vineet Goel
-
Patent number: 10417791Abstract: Techniques are described for using a texture unit to perform operations of a shader processor. Some operations of a shader processor are repeatedly executed until a condition is satisfied, and in each execution iteration, the shader processor accesses the texture unit. Techniques are described for the texture unit to perform such operations until the condition is satisfied.Type: GrantFiled: June 12, 2018Date of Patent: September 17, 2019Assignee: QUALCOMM IncorporatedInventors: Usame Ceylan, Vineet Goel, Juraj Obert, Liang Li
-
Publication number: 20190197761Abstract: A texture processor based ray tracing accelerator method and system are described. The system includes a shader, texture processor (TP) and cache, which are interconnected. The TP includes a texture address unit (TA), a texture cache processor (TCP), a filter pipeline unit and a ray intersection engine. The shader sends a texture instruction which contains ray data and a pointer to a bounded volume hierarchy (BVH) node to the TA. The TCP uses an address provided by the TA to fetch BVH node data from the cache. The ray intersection engine performs ray-BVH node type intersection testing using the ray data and the BVH node data. The intersection testing results and indications for BVH traversal are returned to the shader via a texture data return path. The shader reviews the intersection results and the indications to decide how to traverse to the next BVH node.Type: ApplicationFiled: December 22, 2017Publication date: June 27, 2019Applicant: Advanced Micro Devices, Inc.Inventors: Skyler Jonathon Saleh, Maxim V. Kazakov, Vineet Goel
-
Publication number: 20190172173Abstract: A compute unit accesses a chunk of bits that represent indices of vertices of a graphics primitive. The compute unit sets values of a first bit to indicate whether the chunk is monotonic or ordinary, second bits to define an offset that is determined based on values of indices in the chunk, and sets of third bits that determine values of the indices in the chunk based on the offset defined by the second bits. The compute unit writes a compressed chunk represented by the first bit, the second bits, and the sets of third bits to a memory. The compressed chunk is decompressed and the decompressed indices are written to an index buffer. In some embodiments, the indices are decompressed based on metadata that includes offsets that are determined based on values of the indices and bitfields that indicate characteristics of the indices.Type: ApplicationFiled: December 5, 2017Publication date: June 6, 2019Inventors: Usame Ceylan, Young In Yeo, Todd Martin, Vineet Goel
-
Publication number: 20190164328Abstract: Processing of non-real-time and real-time workloads is performed using discrete pipelines. A first pipeline includes a first shader and one or more fixed function hardware blocks. A second pipeline includes a second shader that is configured to emulate the at least one fixed function hardware block. First and second memory elements store first state information for the first pipeline and second state information for the second pipeline, respectively. A non-real-time workload executing in the first pipeline is preempted at a primitive boundary in response to a real-time workload being dispatched for execution in the second pipeline. The first memory element retains the first state information in response to preemption of the non-real-time workload. The first pipeline is configured to resume processing the subsequent primitive on the basis of the first state information stored in the first memory element.Type: ApplicationFiled: January 3, 2019Publication date: May 30, 2019Inventors: Anirudh R. ACHARYA, Swapnil SAKHARSHETE, Michael MANTOR, Mangesh P. NIJASURE, Todd MARTIN, Vineet GOEL
-
Patent number: 10210593Abstract: A graphics processing unit (GPU) may dispatch a first set of commands for execution on one or more processing units of the GPU. The GPU may receive notification from a host device indicating that a second set of commands are ready to execute on the GPU. In response, the GPU may issue a first preemption command at a first preemption granularity to the one or more processing units. In response to the GPU failing to preempt execution of the first set of commands within an elapsed time period after issuing the first preemption command, the GPU may issue a second preemption command at a second preemption granularity to the one or more processing units, where the second preemption granularity is finer-grained than the first preemption granularity.Type: GrantFiled: January 28, 2016Date of Patent: February 19, 2019Assignee: QUALCOMM IncorporatedInventors: Anirudh Rajendra Acharya, Alexei Vladimirovich Bourd, David Rigel Garcia Garcia, Milind Nilkanth Nemlekar, Vineet Goel
-
Patent number: 10210650Abstract: Processing of non-real-time and real-time workloads is performed using discrete pipelines. A first pipeline includes a first shader and one or more fixed function hardware blocks. A second pipeline includes a second shader that is configured to emulate the at least one fixed function hardware block. First and second memory elements store first state information for the first pipeline and second state information for the second pipeline, respectively. A non-real-time workload executing in the first pipeline is preempted at a primitive boundary in response to a real-time workload being dispatched for execution in the second pipeline. The first memory element retains the first state information in response to preemption of the non-real-time workload. The first pipeline is configured to resume processing the subsequent primitive on the basis of the first state information stored in the first memory element.Type: GrantFiled: November 30, 2017Date of Patent: February 19, 2019Assignee: Advanced Micro Devices, Inc.Inventors: Anirudh R. Acharya, Swapnil Sakharshete, Michael Mantor, Mangesh P. Nijasure, Todd Martin, Vineet Goel