Patents by Inventor Andrei Khodakovsky

Andrei Khodakovsky has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Binding constants at runtime for improved resource utilization

Patent number: 10877757

Abstract: A just-in-time (JIT) compiler binds constants to specific memory locations at runtime. The JIT compiler parses program code derived from a multithreaded application and identifies an instruction that references a uniform constant. The JIT compiler then determines a chain of pointers that originates within a root table specified in the multithreaded application and terminates at the uniform constant. The JIT compiler generates additional instructions for traversing the chain of pointers and inserts these instructions into the program code. A parallel processor executes this compiled code and, in doing so, causes a thread to traverse the chain of pointers and bind the uniform constant to a uniform register at runtime. Each thread in a group of threads executing on the parallel processor may then access the uniform constant.

Type: Grant

Filed: February 14, 2018

Date of Patent: December 29, 2020

Assignee: NVIDIA Corporation

Inventors: Ajay Tirumala, Jack Choquette, Manan Patel, Shirish Gadre, Praveen Kaushik, Amanpreet Grewal, Shekhar Divekar, Andrei Khodakovsky
Multi-GPU frame rendering

Patent number: 10430915

Abstract: One or more copy commands are scheduled for locating one or more pages of data in a local memory of a graphics processing unit (GPU) for more efficient access to the pages of data during rendering. A first processing unit that is coupled to a first GPU receives a notification that an access request count has reached a specified threshold. The first processing unit schedules a copy command to copy the first page of data to a first memory circuit of the first GPU from a second memory circuit of the second GPU. The copy command is included within a GPU command stream.

Type: Grant

Filed: January 24, 2018

Date of Patent: October 1, 2019

Assignee: NVIDIA Corporation

Inventors: Andrei Khodakovsky, Kirill A. Dmitriev, Rouslan L. Dimitrov, Tzyywei Hwang, Wishwesh Anil Gandhi, Lacky Vasant Shah
Multi-GPU frame rendering

Patent number: 10402937

Abstract: A method for rendering graphics frames allocates rendering work to multiple graphics processing units (GPUs) that are configured to allow access to pages of data stored in locally attached memory of a peer GPU. The method includes the steps of generating, by a first GPU coupled to a first memory circuit, one or more first memory access requests to render a first primitive for a first frame, where at least one of the first memory access requests targets a first page of data that physically resides within a second memory circuit coupled to a second GPU. The first GPU requests the first page of data through a first data link coupling the first GPU to the second GPU and a register circuit within the first GPU accumulates an access request count for the first page of data. The first GPU notifies a driver that the access request count has reached a specified threshold.

Type: Grant

Filed: December 28, 2017

Date of Patent: September 3, 2019

Assignee: NVIDIA Corporation

Inventors: Rouslan L. Dimitrov, Kirill A. Dmitriev, Andrei Khodakovsky, Tzyywei Hwang, Wishwesh Anil Gandhi, Lacky Vasant Shah
MULTI-GPU FRAME RENDERING

Publication number: 20190206018

Abstract: One or more copy commands are scheduled for locating one or more pages of data in a local memory of a graphics processing unit (GPU) for more efficient access to the pages of data during rendering. A first processing unit that is coupled to a first GPU receives a notification that an access request count has reached a specified threshold. The first processing unit schedules a copy command to copy the first page of data to a first memory circuit of the first GPU from a second memory circuit of the second GPU. The copy command is included within a GPU command stream.

Type: Application

Filed: January 24, 2018

Publication date: July 4, 2019

Inventors: Andrei Khodakovsky, Kirill A. Dmitriev, Rouslan L. Dimitrov, Tzyywei Hwang, Wishwesh Anil Gandhi, Lacky Vasant Shah
MULTI-GPU FRAME RENDERING

Publication number: 20190206023

Abstract: A method for rendering graphics frames allocates rendering work to multiple graphics processing units (GPUs) that are configured to allow access to pages of data stored in locally attached memory of a peer GPU. The method includes the steps of generating, by a first GPU coupled to a first memory circuit, one or more first memory access requests to render a first primitive for a first frame, where at least one of the first memory access requests targets a first page of data that physically resides within a second memory circuit coupled to a second GPU. The first GPU requests the first page of data through a first data link coupling the first GPU to the second GPU and a register circuit within the first GPU accumulates an access request count for the first page of data. The first GPU notifies a driver that the access request count has reached a specified threshold.

Type: Application

Filed: December 28, 2017

Publication date: July 4, 2019

Inventors: Rouslan L. Dimitrov, Kirill A. Dmitriev, Andrei Khodakovsky, Tzyywei Hwang, Wishwesh Anil Gandhi, Lacky Vasant Shah
BINDING CONSTANTS AT RUNTIME FOR IMPROVED RESOURCE UTILIZATION

Publication number: 20190146817

Abstract: A just-in-time (JIT) compiler binds constants to specific memory locations at runtime. The JIT compiler parses program code derived from a multithreaded application and identifies an instruction that references a uniform constant. The JIT compiler then determines a chain of pointers that originates within a root table specified in the multithreaded application and terminates at the uniform constant. The JIT compiler generates additional instructions for traversing the chain of pointers and inserts these instructions into the program code. A parallel processor executes this compiled code and, in doing so, causes a thread to traverse the chain of pointers and bind the uniform constant to a uniform register at runtime. Each thread in a group of threads executing on the parallel processor may then access the uniform constant.

Type: Application

Filed: February 14, 2018

Publication date: May 16, 2019

Inventors: Ajay TIRUMALA, Jack CHOQUETTE, Manan PATEL, Shirish GADRE, Praveen KAUSHIK, Amanpreet GREWAL, Shekhar DIVEKAR, Andrei KHODAKOVSKY
Techniques for managing graphics processing resources in a tile-based architecture

Patent number: 10083036

Abstract: One embodiment of the present invention sets forth a technique for managing graphics processing resources in a tile-based architecture. The technique includes storing a release packet associated with a graphics processing resource in a buffer and initiating a replay of graphics primitives stored in the buffer and associated with the graphics processing resource. The technique further includes, for each tile included in a plurality of tiles and processed during the replay, reading the release packet and determining whether the tile is a last tile processed during the replay. The technique further includes determining not to transmit the release packet to a screen-space pipeline and continuing to read graphics data stored in the buffer if the tile is not the last tile to be processed during the replay, or transmitting the release packet to the screen-space pipeline if the tile is the last tile to be processed during the replay.

Type: Grant

Filed: October 3, 2013

Date of Patent: September 25, 2018

Assignee: NVIDIA CORPORATION

Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Andrei Khodakovsky, Jeffrey A. Bolz
Managing deferred contexts in a cache tiling architecture

Patent number: 10032242

Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.

Type: Grant

Filed: October 1, 2013

Date of Patent: July 24, 2018

Assignee: NVIDIA CORPORATION

Inventors: Ziyad S. Hakura, Jeffrey A. Bolz, Amanpreet Grewal, Matthew Johnson, Andrei Khodakovsky
Caching of adaptively sized cache tiles in a unified L2 cache with surface compression

Patent number: 9734548

Abstract: One embodiment of the present invention includes techniques for adaptively sizing cache tiles in a graphics system. A device driver associated with a graphics system sets a cache tile size associated with a cache tile to a first size. The detects a change from a first render target configuration that includes a first set of render targets to a second render target configuration that includes a second set of render targets. The device driver sets the cache tile size to a second size based on the second render target configuration. One advantage of the disclosed approach is that the cache tile size is adaptively sized, resulting in fewer cache tiles for less complex render target configurations. Adaptively sizing cache tiles leads to more efficient processor utilization and reduced power requirements. In addition, a unified L2 cache tile allows dynamic partitioning of cache memory between cache tile data and other data.

Type: Grant

Filed: August 28, 2013

Date of Patent: August 15, 2017

Assignee: NVIDIA Corporation

Inventors: Ziyad S. Hakura, Rouslan Dimitrov, Emmett M. Kilgariff, Andrei Khodakovsky
MANAGING DEFERRED CONTEXTS IN A CACHE TILING ARCHITECTURE

Publication number: 20170206623

Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.

Type: Application

Filed: October 1, 2013

Publication date: July 20, 2017

Applicant: NVIDIA CORPORATION

Inventors: Ziyad S. HAKURA, Jeffrey A. BOLZ, Amanpreet GREWAL, Matthew JOHNSON, Andrei KHODAKOVSKY
System, method, and computer program product for using compression with programmable sample locations

Patent number: 9230362

Abstract: A system, method, and computer program product enable compression with programmable sample locations, where the compression is a function of the programmable sample locations. The method includes the steps of storing a first value specifying a programmed sample location within a pixel in a sample pattern table and storing, in a memory, geometric surface parameters corresponding to a first attribute at the programmed sample location within a first pixel of a display surface. An instruction to store a second value specifying the programmed sample location within the pixel in the sample pattern table is received. The attribute is reconstructed based on the geometric surface parameters and the first value.

Type: Grant

Filed: September 11, 2013

Date of Patent: January 5, 2016

Assignee: NVIDIA Corporation

Inventors: Eric B. Lum, Jeffrey Alan Bolz, Rui Manuel Bastos, Andrei Khodakovsky, Christian Johannes Amsinck, Bengt-Olaf Schneider
System, method, and computer program product for using compression with programmable sample locations

Patent number: 9230363

Abstract: A system, method, and computer program product enable compression with programmable sample locations, where the compression is a function of the programmable sample locations. The method includes the steps of storing a first value specifying a programmed sample location within a pixel in a first sample pattern table that is associated with a first display surface and storing, in a memory, geometric surface parameters corresponding to a first attribute at the programmed sample location within a first pixel of the first display surface. A second value specifying the programmed sample location within the pixel in a second sample pattern table that is associated with a second display surface is also stored and the first attribute is reconstructed based on the geometric surface parameters and the first value.

Type: Grant

Filed: September 11, 2013

Date of Patent: January 5, 2016

Assignee: NVIDIA Corporation

Inventors: Eric B. Lum, Jeffrey Alan Bolz, Rui Manuel Bastos, Andrei Khodakovsky, Christian Johannes Amsinck, Bengt-Olaf Schneider
SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR MAPPING TILES TO PHYSICAL MEMORY LOCATIONS

Publication number: 20150109315

Abstract: A system, method, and computer program product are provided for mapping tiles to physical memory locations. In use, a plurality of virtual tiles associated with a texture is identified. Additionally, a request to perform a mapping of the plurality of virtual tiles to one or more physical memory locations is received. Further, the plurality of virtual tiles is mapped to the one or more physical memory locations, utilizing a page table.

Type: Application

Filed: October 23, 2013

Publication date: April 23, 2015

Applicant: NVIDIA Corporation

Inventors: Amanpreet Grewal, Andrei Khodakovsky, Yu Denny Dong, Henry Packard Moreton, Naveen Leekha
SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR USING COMPRESSION WITH PROGRAMMABLE SAMPLE LOCATIONS

Publication number: 20150070380

Abstract: A system, method, and computer program product are provided for using compression with programmable sample locations, where the compression is a function of the programmable sample locations. The method includes the steps of storing a first value specifying a programmed sample location within a pixel in a sample pattern table and storing, in a memory, geometric surface parameters corresponding to a first attribute at the programmed sample location within a first pixel of a display surface. An instruction to store a second value specifying the programmed sample location within the pixel in the sample pattern table is received. The attribute is reconstructed based on the geometric surface parameters and the first value.

Type: Application

Filed: September 11, 2013

Publication date: March 12, 2015

Applicant: NVIDIA Corporation

Inventors: Eric B. Lum, Jeffrey Alan Bolz, Rui Manuel Bastos, Andrei Khodakovsky, Christian Johannes Amsinck, Bengt-Olaf Schneider
SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR USING COMPRESSION WITH PROGRAMMABLE SAMPLE LOCATIONS

Publication number: 20150070381

Abstract: A system, method, and computer program product are provided for using compression with programmable sample locations, where the compression is a function of the programmable sample locations. The method includes the steps of storing a first value specifying a programmed sample location within a pixel in a first sample pattern table that is associated with a first display surface and storing, in a memory, geometric surface parameters corresponding to a first attribute at the programmed sample location within a first pixel of the first display surface. A second value specifying the programmed sample location within the pixel in a second sample pattern table that is associated with a second display surface is also stored and the first attribute is reconstructed based on the geometric surface parameters and the first value.

Type: Application

Filed: September 11, 2013

Publication date: March 12, 2015

Applicant: NVIDIA Corporation

Inventors: Eric B. Lum, Jeffrey Alan Bolz, Rui Manuel Bastos, Andrei Khodakovsky, Christian Johannes Amsinck, Bengt-Olaf Schneider
CACHING OF ADAPTIVELY SIZED CACHE TILES IN A UNIFIED L2 CACHE WITH SURFACE COMPRESSION

Publication number: 20140118379

Abstract: One embodiment of the present invention includes techniques for adaptively sizing cache tiles in a graphics system. A device driver associated with a graphics system sets a cache tile size associated with a cache tile to a first size. The detects a change from a first render target configuration that includes a first set of render targets to a second render target configuration that includes a second set of render targets. The device driver sets the cache tile size to a second size based on the second render target configuration. One advantage of the disclosed approach is that the cache tile size is adaptively sized, resulting in fewer cache tiles for less complex render target configurations. Adaptively sizing cache tiles leads to more efficient processor utilization and reduced power requirements. In addition, a unified L2 cache tile allows dynamic partitioning of cache memory between cache tile data and other data.

Type: Application

Filed: August 28, 2013

Publication date: May 1, 2014

Applicant: NVIDIA CORPORATION

Inventors: Ziyad S. HAKURA, Rouslan DIMITROV, Emmett M. KILGARIFF, Andrei KHODAKOVSKY
MANAGING DEFERRED CONTEXTS IN A CACHE TILING ARCHITECTURE

Publication number: 20140118363

Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.

Type: Application

Filed: October 1, 2013

Publication date: May 1, 2014

Applicant: NVIDIA CORPORATION

Inventors: Ziyad S. HAKURA, Jeffrey A. BOLZ, Amanpreet GREWAL, Matthew JOHNSON, Andrei KHODAKOVSKY
TECHNIQUES FOR MANAGINGGRAPHICS PROCESSING RESOURCES IN A TILE-BASED ARCHITECTURE

Publication number: 20140118373

Abstract: One embodiment of the present invention sets forth a technique for managing graphics processing resources in a tile-based architecture. The technique includes storing a release packet associated with a graphics processing resource in a buffer and initiating a replay of graphics primitives stored in the buffer and associated with the graphics processing resource. The technique further includes, for each tile included in a plurality of tiles and processed during the replay, reading the release packet and determining whether the tile is a last tile processed during the replay. The technique further includes determining not to transmit the release packet to a screen-space pipeline and continuing to read graphics data stored in the buffer if the tile is not the last tile to be processed during the replay, or transmitting the release packet to the screen-space pipeline if the tile is the last tile to be processed during the replay.

Type: Application

Filed: October 3, 2013

Publication date: May 1, 2014

Applicant: NVIDIA CORPORATION

Inventors: Ziyad S. HAKURA, Cynthia Ann Edgeworth ALLISON, Dale L. KIRKLAND, Andrei KHODAKOVSKY, Jeffrey A. BOLZ
System and method for temporal load balancing across GPUs

Patent number: 8427474

Abstract: One embodiment of the present invention sets forth a method for dynamically load balancing rendering operations across an IGPU and a DGPU. For each frame, the graphics driver configures the IGPU to pre-compute Z-values for a portion of the display surface and to write feedback data to the system memory indicating the time that the IGPU used to process the frame. The graphics driver then configures the DGPU to use the pre-computed Z-values while rendering to the complete display surface and to write feedback data to the system memory indicating the time that the DGPU used to process the frame. The graphics driver uses the feedback data from the IGPU and DGPU in conjunction with the percentage of the display surface that the IGPU Z-rendered for the frame to scale the portion of the display surface that the IGPU Z-renders for one or more subsequent frames. In this fashion, overall processing within the graphics pipeline is optimized across the IGPU and DGPU.

Type: Grant

Filed: October 3, 2008

Date of Patent: April 23, 2013

Assignee: Nvidia Corporation

Inventors: Andrei Khodakovsky, Franck R. Diard
System and method for temporal load balancing across GPUs

Patent number: 8228337

Abstract: One embodiment of the present invention sets forth a method for dynamically load balancing rendering operations across an IGPU and a DGPU. For each frame, the graphics driver configures the IGPU to pre-compute Z-values for a portion of the display surface and to write feedback data to the system memory indicating the time that the IGPU used to process the frame. The graphics driver then configures the DGPU to use the pre-computed Z-values while rendering to the complete display surface and to write feedback data to the system memory indicating the time that the DGPU used to process the frame. The graphics driver uses the feedback data from the IGPU and DGPU in conjunction with the percentage of the display surface that the IGPU Z-rendered for the frame to scale the portion of the display surface that the IGPU Z-renders for one or more subsequent frames. In this fashion, overall processing within the graphics pipeline is optimized across the IGPU and DGPU.

Type: Grant

Filed: October 3, 2008

Date of Patent: July 24, 2012

Assignee: NVIDIA Corporation

Inventors: Andrei Khodakovsky, Franck R. Diard

1 2 next