Patents by Inventor Vineet Goel

Vineet Goel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Patched shading in graphics processing

Patent number: 11200733

Abstract: Aspects of this disclosure relate to a process for rendering graphics that includes performing, with a hardware unit of a graphics processing unit (GPU) designated for vertex shading, a vertex shading operation to shade input vertices so as to output vertex shaded vertices, wherein the hardware unit adheres to an interface that receives a single vertex as an input and generates a single vertex as an output. The process also includes performing, with the hardware unit of the GPU designated for vertex shading, a hull shading operation to generate one or more control points based on one or more of the vertex shaded vertices, wherein the one or more hull shading operations operate on at least one of the one or more vertex shaded vertices to output the one or more control points.

Type: Grant

Filed: December 11, 2019

Date of Patent: December 14, 2021

Assignee: QUALCOMM Incorporated

Inventors: Vineet Goel, Andrew Evan Gruber, Donghyun Kim

Method and system for depth pre-processing and geometry sorting using binning hardware

Patent number: 11195326

Abstract: Described herein are techniques for improving the effectiveness of depth culling. In a first technique, a binner is used to sort primitives into depth bins. Each depth bin covers a range of depths. The binner transmits the depth bins to the screen space pipeline for processing in near-to-far order. Processing the near bins first results in the depth buffer being updated, allowing fragments for the primitives in the farther bins to be culled more aggressively than if the depth binning did not occur. In a second technique, a buffer is used to initiate two-pass processing through the screen space pipeline. In the first pass, primitives are sent down to update the depth block and are then culled. The fragments are processed normally in the second pass, with the benefit of the updated depth values.

Type: Grant

Filed: September 21, 2018

Date of Patent: December 7, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Ruijin Wu, Young In Yeo, Sagar S. Bhandare, Vineet Goel, Martin G. Sarov, Christopher J. Brennan
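As an illustration of the first technique in the abstract above, the following minimal Python sketch models depth binning on a one-dimensional "screen". It is not taken from the patent: the primitive layout `(depth, start, end)`, the function names, and the bin count are all assumptions chosen for the example. It counts how many fragments survive the depth test when primitives are drawn in submission order and when they are first sorted into depth bins and drawn near-to-far.

```python
from collections import defaultdict

def shade_count(primitives, width):
    """Count fragments that pass the depth test when drawn in the given order."""
    depth_buffer = [float("inf")] * width      # smaller depth means nearer
    shaded = 0
    for depth, start, end in primitives:       # primitive covers pixels [start, end)
        for x in range(start, end):
            if depth < depth_buffer[x]:        # fragment survives depth culling
                depth_buffer[x] = depth
                shaded += 1
    return shaded

def bin_by_depth(primitives, num_bins, max_depth):
    """Sort primitives into depth bins (hypothetical layout) and emit bins near-to-far."""
    bins = defaultdict(list)
    for prim in primitives:
        index = min(int(prim[0] / max_depth * num_bins), num_bins - 1)
        bins[index].append(prim)
    ordered = []
    for index in sorted(bins):                 # nearest bin first
        ordered.extend(bins[index])
    return ordered

# Three fully overlapping primitives submitted far-to-near (the worst case).
scene = [(0.9, 0, 8), (0.5, 0, 8), (0.1, 0, 8)]
unsorted_count = shade_count(scene, width=8)
binned_count = shade_count(bin_by_depth(scene, num_bins=4, max_depth=1.0), width=8)
print(unsorted_count, binned_count)
```

In this toy scene, every fragment passes the depth test in submission order, while after binning only the fragments of the nearest primitive pass and the rest are culled.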
STACKED DIES FOR MACHINE LEARNING ACCELERATOR

Publication number: 20210374607

Abstract: A device is disclosed. The device includes a machine learning die including a memory and one or more machine learning accelerators; and a processing core die stacked with the machine learning die, the processing core die being configured to execute shader programs for controlling operations on the machine learning die, wherein the memory is configurable as either or both of a cache and a directly accessible memory.

Type: Application

Filed: December 21, 2020

Publication date: December 2, 2021

Applicant: Advanced Micro Devices, Inc.

Inventors: Maxim V. Kazakov, Swapnil P. Sakharshete, Milind N. Nemlekar, Vineet Goel

VRS rate feedback

Patent number: 11158106

Abstract: Techniques for performing shader operations are provided. The techniques include, performing pixel shading at a shading rate defined by pixel shader variable rate shading (“VRS”) data, updating the pixel VRS data that indicates one or more shading rates for one or more tiles based on whether the tiles of the one or more tiles include triangle edges or do not include triangle edges, to generate updated VRS data, and writing a VRS rate feedback buffer based on the updated VRS data.

Type: Grant

Filed: December 20, 2019

Date of Patent: October 26, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Skyler Jonathon Saleh, Vineet Goel, Pazhani Pillai, Ruijin Wu, Christopher J. Brennan, Andrew S. Pomianowski
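The update step in the abstract above can be pictured with a small Python sketch. It is illustrative only: the abstract does not say which rate is chosen for tiles with or without triangle edges, so the policy here (fine rate where an edge is present, coarse rate elsewhere) is an assumption, as are the names `FINE`, `COARSE`, and `update_vrs`.

```python
# Hypothetical shading rates, expressed as pixels covered per shaded sample (x, y).
FINE, COARSE = (1, 1), (2, 2)

def update_vrs(vrs_data, tiles_with_edges):
    """Return updated per-tile rates based on whether each tile contains a triangle edge.

    Assumed policy: tiles with an edge get the fine rate, all others the coarse rate.
    """
    return {
        tile: (FINE if tile in tiles_with_edges else COARSE)
        for tile in vrs_data
    }

# Per-tile VRS data used for the current pass, keyed by tile coordinate.
vrs_data = {(0, 0): COARSE, (1, 0): COARSE, (0, 1): FINE, (1, 1): FINE}
tiles_with_edges = {(1, 0), (1, 1)}

# The updated data is what would be written to the VRS rate feedback buffer.
feedback_buffer = update_vrs(vrs_data, tiles_with_edges)
print(feedback_buffer)
```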
HYBRID BINNING

Publication number: 20210225060

Abstract: A processing device and a method of tiled rendering of an image for display is provided. The processing device includes memory and a processor. The processor is configured to receive the image comprising one or more three dimensional (3D) objects, divide the image into tiles, execute coarse level tiling for the tiles of the image and execute fine level tiling for the tiles of the image. The processing device also includes same fixed function hardware used to execute the coarse level tiling and the fine level tiling. The processor is also configured to determine visibility information for a first one of the tiles. The visibility information is divided into draw call visibility information and triangle visibility information for each remaining tile of the image.

Type: Application

Filed: September 25, 2020

Publication date: July 22, 2021

Applicant: Advanced Micro Devices, Inc.

Inventors: Mika Tuomi, Kiia Kallio, Ruijin Wu, Anirudh R. Acharya, Vineet Goel

VRS RATE FEEDBACK

Publication number: 20210192827

Abstract: Techniques for performing shader operations are provided. The techniques include, performing pixel shading at a shading rate defined by pixel shader variable rate shading (“VRS”) data, updating the pixel VRS data that indicates one or more shading rates for one or more tiles based on whether the tiles of the one or more tiles include triangle edges or do not include triangle edges, to generate updated VRS data, and writing a VRS rate feedback buffer based on the updated VRS data.

Type: Application

Filed: December 20, 2019

Publication date: June 24, 2021

Applicant: Advanced Micro Devices, Inc.

Inventors: Skyler Jonathon Saleh, Vineet Goel, Pazhani Pillai, Ruijin Wu, Christopher J. Brennan, Andrew S. Pomianowski

GAMING SUPER RESOLUTION

Publication number: 20210150669

Abstract: A processing device is provided which includes memory and a processor. The processor is configured to receive an input image having a first resolution, generate linear down-sampled versions of the input image by down-sampling the input image via a linear upscaling network and generate non-linear down-sampled versions of the input image by down-sampling the input image via a non-linear upscaling network.

Type: Application

Filed: November 18, 2019

Publication date: May 20, 2021

Applicant: Advanced Micro Devices, Inc.

Inventors: Alexander M. Potapov, Skyler Jonathon Saleh, Swapnil P. Sakharshete, Vineet Goel

FLEXIBLE MULTI-USER GRAPHICS ARCHITECTURE

Publication number: 20210089423

Abstract: A technique for operating a processor that includes multiple cores is provided. The technique includes determining a number of active applications, selecting a processor configuration for the processor based on the number of active applications, configuring the processor according to the selected processor configuration, and executing the active applications with the configured processor.

Type: Application

Filed: June 26, 2020

Publication date: March 25, 2021

Applicant: Advanced Micro Devices, Inc.

Inventors: Ruijin Wu, Skyler Jonathon Saleh, Vineet Goel

CHIPLET-INTEGRATED MACHINE LEARNING ACCELERATORS

Publication number: 20210026686

Abstract: Techniques for performing machine learning operations are provided. The techniques include configuring a first portion of a first chiplet as a cache; performing caching operations via the first portion; configuring at least a first sub-portion of the first portion of the chiplet as directly-accessible memory; and performing machine learning operations with the first sub-portion by a machine learning accelerator within the first chiplet.

Type: Application

Filed: July 20, 2020

Publication date: January 28, 2021

Applicant: Advanced Micro Devices, Inc.

Inventors: Swapnil P. Sakharshete, Andrew S. Pomianowski, Maxim V. Kazakov, Vineet Goel, Milind N. Nemlekar, Skyler Jonathon Saleh

HARDWARE ACCELERATED CONVOLUTION

Publication number: 20200184002

Abstract: A processing device is provided which includes memory configured to store data and a processor configured to determine, based on convolutional parameters associated with an image, a virtual general matrix-matrix multiplication (GEMM) space of a virtual GEMM space output matrix and generate, in the virtual GEMM space output matrix, a convolution result by matrix multiplying the data corresponding to a virtual GEMM space input matrix with the data corresponding to a virtual GEMM space filter matrix. The processing device also includes convolutional mapping hardware configured to map, based on the convolutional parameters, positions of the virtual GEMM space input matrix to positions of an image space of the image.

Type: Application

Filed: August 30, 2019

Publication date: June 11, 2020

Applicant: Advanced Micro Devices, Inc.

Inventors: Swapnil P. Sakharshete, Samuel Lawrence Wasmundt, Maxim V. Kazakov, Vineet Goel
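The idea of mapping positions in a virtual GEMM input matrix back to image space can be sketched in plain Python. This is a simplified model rather than the patented hardware: it assumes a single filter, unit stride, no padding, and a row-major layout in which each matrix row corresponds to one output pixel and each column to one filter tap. The function names are hypothetical.

```python
def gemm_to_image(row, col, out_w, k_h, k_w, stride=1):
    """Map (row, col) of the virtual GEMM input matrix to (channel, y, x) in the image."""
    out_y, out_x = divmod(row, out_w)        # row selects an output pixel
    channel, rem = divmod(col, k_h * k_w)    # col selects a channel and filter tap
    k_y, k_x = divmod(rem, k_w)
    return channel, out_y * stride + k_y, out_x * stride + k_x

def conv_via_virtual_gemm(image, filt):
    """Single-filter convolution computed as a product over the virtual matrix."""
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    k_h, k_w = len(filt[0]), len(filt[0][0])
    out_h, out_w = height - k_h + 1, width - k_w + 1
    flat_filter = [filt[c][y][x]
                   for c in range(channels) for y in range(k_h) for x in range(k_w)]
    output = []
    for row in range(out_h * out_w):
        acc = 0
        for col, weight in enumerate(flat_filter):
            c, y, x = gemm_to_image(row, col, out_w, k_h, k_w)
            acc += image[c][y][x] * weight   # fetched on demand; the matrix is never built
        output.append(acc)
    return [output[r * out_w:(r + 1) * out_w] for r in range(out_h)]

image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]  # 1 channel, 3x3
filt = [[[1, 0], [0, 1]]]                    # 1 channel, 2x2
result = conv_via_virtual_gemm(image, filt)
print(result)  # [[6, 8], [12, 14]]
```

The point of the sketch is that the expanded input matrix exists only as an index mapping, so no copy of the image data has to be materialized before the multiplication.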
PATCHED SHADING IN GRAPHICS PROCESSING

Publication number: 20200118328

Abstract: Aspects of this disclosure relate to a process for rendering graphics that includes performing, with a hardware unit of a graphics processing unit (GPU) designated for vertex shading, a vertex shading operation to shade input vertices so as to output vertex shaded vertices, wherein the hardware unit adheres to an interface that receives a single vertex as an input and generates a single vertex as an output. The process also includes performing, with the hardware unit of the GPU designated for vertex shading, a hull shading operation to generate one or more control points based on one or more of the vertex shaded vertices, wherein the one or more hull shading operations operate on at least one of the one or more vertex shaded vertices to output the one or more control points.

Type: Application

Filed: December 11, 2019

Publication date: April 16, 2020

Inventors: Vineet Goel, Andrew Evan Gruber, Donghyun Kim

METHOD AND SYSTEM FOR DEPTH PRE-PROCESSING AND GEOMETRY SORTING USING BINNING HARDWARE

Publication number: 20200098169

Abstract: Described herein are techniques for improving the effectiveness of depth culling. In a first technique, a binner is used to sort primitives into depth bins. Each depth bin covers a range of depths. The binner transmits the depth bins to the screen space pipeline for processing in near-to-far order. Processing the near bins first results in the depth buffer being updated, allowing fragments for the primitives in the farther bins to be culled more aggressively than if the depth binning did not occur. In a second technique, a buffer is used to initiate two-pass processing through the screen space pipeline. In the first pass, primitives are sent down to update the depth block and are then culled. The fragments are processed normally in the second pass, with the benefit of the updated depth values.

Type: Application

Filed: September 21, 2018

Publication date: March 26, 2020

Applicant: Advanced Micro Devices, Inc.

Inventors: Ruijin Wu, Young In Yeo, Sagar S. Bhandare, Vineet Goel, Martin G. Sarov, Christopher J. Brennan

Compression and decompression of indices in a graphics pipeline

Patent number: 10600142

Abstract: A compute unit accesses a chunk of bits that represent indices of vertices of a graphics primitive. The compute unit sets values of a first bit to indicate whether the chunk is monotonic or ordinary, second bits to define an offset that is determined based on values of indices in the chunk, and sets of third bits that determine values of the indices in the chunk based on the offset defined by the second bits. The compute unit writes a compressed chunk represented by the first bit, the second bits, and the sets of third bits to a memory. The compressed chunk is decompressed and the decompressed indices are written to an index buffer. In some embodiments, the indices are decompressed based on metadata that includes offsets that are determined based on values of the indices and bitfields that indicate characteristics of the indices.

Type: Grant

Filed: December 5, 2017

Date of Patent: March 24, 2020

Assignee: Advanced Micro Devices, Inc.

Inventors: Usame Ceylan, Young In Yeo, Todd Martin, Vineet Goel
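A rough Python sketch of the flag, offset, and per-index fields described in the abstract above follows. The abstract does not give the bit layout, so everything specific here is an assumption made for illustration: the offset is taken to be the minimum index, a "monotonic" chunk is taken to be a run of consecutive indices, and the tuple `(flag, offset, payload)` stands in for the packed bits.

```python
def compress_chunk(indices):
    """Encode a chunk as (flag, offset, payload); the field layout is hypothetical."""
    offset = min(indices)                                   # assumed offset choice
    monotonic = all(b - a == 1 for a, b in zip(indices, indices[1:]))
    if monotonic:
        return (1, offset, len(indices))                    # consecutive run: offset and count
    return (0, offset, [i - offset for i in indices])       # ordinary: deltas from the offset

def decompress_chunk(chunk):
    """Rebuild the original indices from a compressed chunk."""
    flag, offset, payload = chunk
    if flag == 1:
        return [offset + k for k in range(payload)]
    return [offset + delta for delta in payload]

def bits_per_delta(chunk):
    """Bits needed to store each delta of an ordinary chunk (at least one)."""
    return max(1, max(chunk[2]).bit_length())

run = compress_chunk([100, 101, 102])
ordinary = compress_chunk([1000, 1003, 1001])
index_buffer = decompress_chunk(run) + decompress_chunk(ordinary)
print(run, ordinary, bits_per_delta(ordinary), index_buffer)
```

In this example the ordinary chunk needs only two bits per index once the offset has been factored out.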
Patched shading in graphics processing

Patent number: 10559123

Abstract: Aspects of this disclosure relate to a process for rendering graphics that includes designating a hardware shading unit of a graphics processing unit (GPU) to perform first shading operations associated with a first shader stage of a rendering pipeline. The process also includes switching operational modes of the hardware shading unit upon completion of the first shading operations. The process also includes performing, with the hardware shading unit of the GPU designated to perform the first shading operations, second shading operations associated with a second, different shader stage of the rendering pipeline.

Type: Grant

Filed: March 14, 2013

Date of Patent: February 11, 2020

Assignee: QUALCOMM Incorporated

Inventors: Vineet Goel, Andrew Evan Gruber

Patched shading in graphics processing

Patent number: 10535185

Abstract: Aspects of this disclosure relate to a process for rendering graphics that includes performing, with a hardware unit of a graphics processing unit (GPU) designated for vertex shading, a vertex shading operation to shade input vertices so as to output vertex shaded vertices, wherein the hardware unit adheres to an interface that receives a single vertex as an input and generates a single vertex as an output. The process also includes performing, with the hardware unit of the GPU designated for vertex shading, a hull shading operation to generate one or more control points based on one or more of the vertex shaded vertices, wherein the one or more hull shading operations operate on at least one of the one or more vertex shaded vertices to output the one or more control points.

Type: Grant

Filed: March 14, 2013

Date of Patent: January 14, 2020

Assignee: QUALCOMM Incorporated

Inventors: Vineet Goel, Andrew Evan Gruber, Donghyun Kim

Primitive level preemption using discrete non-real-time and real time pipelines

Patent number: 10453243

Abstract: Processing of non-real-time and real-time workloads is performed using discrete pipelines. A first pipeline includes a first shader and one or more fixed function hardware blocks. A second pipeline includes a second shader that is configured to emulate the at least one fixed function hardware block. First and second memory elements store first state information for the first pipeline and second state information for the second pipeline, respectively. A non-real-time workload executing in the first pipeline is preempted at a primitive boundary in response to a real-time workload being dispatched for execution in the second pipeline. The first memory element retains the first state information in response to preemption of the non-real-time workload. The first pipeline is configured to resume processing the subsequent primitive on the basis of the first state information stored in the first memory element.

Type: Grant

Filed: January 3, 2019

Date of Patent: October 22, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Anirudh R. Acharya, Swapnil Sakharshete, Michael Mantor, Mangesh P. Nijasure, Todd Martin, Vineet Goel

Multi-step texture processing with feedback in texture unit

Patent number: 10417791

Abstract: Techniques are described for using a texture unit to perform operations of a shader processor. Some operations of a shader processor are repeatedly executed until a condition is satisfied, and in each execution iteration, the shader processor accesses the texture unit. Techniques are described for the texture unit to perform such operations until the condition is satisfied.

Type: Grant

Filed: June 12, 2018

Date of Patent: September 17, 2019

Assignee: QUALCOMM Incorporated

Inventors: Usame Ceylan, Vineet Goel, Juraj Obert, Liang Li

TEXTURE PROCESSOR BASED RAY TRACING ACCELERATION METHOD AND SYSTEM

Publication number: 20190197761

Abstract: A texture processor based ray tracing accelerator method and system are described. The system includes a shader, texture processor (TP) and cache, which are interconnected. The TP includes a texture address unit (TA), a texture cache processor (TCP), a filter pipeline unit and a ray intersection engine. The shader sends a texture instruction which contains ray data and a pointer to a bounded volume hierarchy (BVH) node to the TA. The TCP uses an address provided by the TA to fetch BVH node data from the cache. The ray intersection engine performs ray-BVH node type intersection testing using the ray data and the BVH node data. The intersection testing results and indications for BVH traversal are returned to the shader via a texture data return path. The shader reviews the intersection results and the indications to decide how to traverse to the next BVH node.

Type: Application

Filed: December 22, 2017

Publication date: June 27, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Skyler Jonathon Saleh, Maxim V. Kazakov, Vineet Goel
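The division of labor in the abstract above, where the shader drives traversal and a separate engine tests one BVH node per request, can be modelled in software. The sketch below is an assumption-laden illustration and not the patented design: the node dictionary layout, the function names, and the use of a standard slab test for ray-box intersection are all choices made for the example.

```python
def ray_hits_box(origin, direction, box_min, box_max):
    """Standard slab test: does the ray hit the axis-aligned box?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            if o < lo or o > hi:          # parallel to this slab and outside it
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def intersection_engine(ray, node):
    """Stand-in for the intersection engine: test one node, return results and hints."""
    if not ray_hits_box(ray[0], ray[1], node["min"], node["max"]):
        return False, [], None
    return True, node.get("children", []), node.get("leaf")

def traverse(ray, nodes, root=0):
    """Shader-side loop: issue one node test at a time and decide where to go next."""
    stack, hits = [root], []
    while stack:
        hit, children, leaf = intersection_engine(ray, nodes[stack.pop()])
        if leaf is not None:
            hits.append(leaf)
        stack.extend(children)            # traversal decision stays with the shader
    return hits

nodes = {
    0: {"min": (0, 0, 0), "max": (4, 4, 4), "children": [1, 2]},
    1: {"min": (0, 0, 0), "max": (1, 1, 1), "leaf": "A"},
    2: {"min": (3, 3, 3), "max": (4, 4, 4), "leaf": "B"},
}
ray = ((0.5, 0.5, -1.0), (0.0, 0.0, 1.0))   # (origin, direction)
found = traverse(ray, nodes)
print(found)  # ['A']
```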
COMPRESSION AND DECOMPRESSION OF INDICES IN A GRAPHICS PIPELINE

Publication number: 20190172173

Abstract: A compute unit accesses a chunk of bits that represent indices of vertices of a graphics primitive. The compute unit sets values of a first bit to indicate whether the chunk is monotonic or ordinary, second bits to define an offset that is determined based on values of indices in the chunk, and sets of third bits that determine values of the indices in the chunk based on the offset defined by the second bits. The compute unit writes a compressed chunk represented by the first bit, the second bits, and the sets of third bits to a memory. The compressed chunk is decompressed and the decompressed indices are written to an index buffer. In some embodiments, the indices are decompressed based on metadata that includes offsets that are determined based on values of the indices and bitfields that indicate characteristics of the indices.

Type: Application

Filed: December 5, 2017

Publication date: June 6, 2019

Inventors: Usame Ceylan, Young In Yeo, Todd Martin, Vineet Goel

PRIMITIVE LEVEL PREEMPTION USING DISCRETE NON-REAL-TIME AND REAL TIME PIPELINES

Publication number: 20190164328

Abstract: Processing of non-real-time and real-time workloads is performed using discrete pipelines. A first pipeline includes a first shader and one or more fixed function hardware blocks. A second pipeline includes a second shader that is configured to emulate the at least one fixed function hardware block. First and second memory elements store first state information for the first pipeline and second state information for the second pipeline, respectively. A non-real-time workload executing in the first pipeline is preempted at a primitive boundary in response to a real-time workload being dispatched for execution in the second pipeline. The first memory element retains the first state information in response to preemption of the non-real-time workload. The first pipeline is configured to resume processing the subsequent primitive on the basis of the first state information stored in the first memory element.

Type: Application

Filed: January 3, 2019

Publication date: May 30, 2019

Inventors: Anirudh R. Acharya, Swapnil Sakharshete, Michael Mantor, Mangesh P. Nijasure, Todd Martin, Vineet Goel
