Patents by Inventor Andrei Khodakovsky

Andrei Khodakovsky has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10877757
    Abstract: A just-in-time (JIT) compiler binds constants to specific memory locations at runtime. The JIT compiler parses program code derived from a multithreaded application and identifies an instruction that references a uniform constant. The JIT compiler then determines a chain of pointers that originates within a root table specified in the multithreaded application and terminates at the uniform constant. The JIT compiler generates additional instructions for traversing the chain of pointers and inserts these instructions into the program code. A parallel processor executes this compiled code and, in doing so, causes a thread to traverse the chain of pointers and bind the uniform constant to a uniform register at runtime. Each thread in a group of threads executing on the parallel processor may then access the uniform constant.
    Type: Grant
    Filed: February 14, 2018
    Date of Patent: December 29, 2020
    Assignee: NVIDIA Corporation
    Inventors: Ajay Tirumala, Jack Choquette, Manan Patel, Shirish Gadre, Praveen Kaushik, Amanpreet Grewal, Shekhar Divekar, Andrei Khodakovsky
  • Patent number: 10430915
    Abstract: One or more copy commands are scheduled for locating one or more pages of data in a local memory of a graphics processing unit (GPU) for more efficient access to the pages of data during rendering. A first processing unit that is coupled to a first GPU receives a notification that an access request count has reached a specified threshold. The first processing unit schedules a copy command to copy the first page of data to a first memory circuit of the first GPU from a second memory circuit of the second GPU. The copy command is included within a GPU command stream.
    Type: Grant
    Filed: January 24, 2018
    Date of Patent: October 1, 2019
    Assignee: NVIDIA Corporation
    Inventors: Andrei Khodakovsky, Kirill A. Dmitriev, Rouslan L. Dimitrov, Tzyywei Hwang, Wishwesh Anil Gandhi, Lacky Vasant Shah
  • Patent number: 10402937
    Abstract: A method for rendering graphics frames allocates rendering work to multiple graphics processing units (GPUs) that are configured to allow access to pages of data stored in locally attached memory of a peer GPU. The method includes the steps of generating, by a first GPU coupled to a first memory circuit, one or more first memory access requests to render a first primitive for a first frame, where at least one of the first memory access requests targets a first page of data that physically resides within a second memory circuit coupled to a second GPU. The first GPU requests the first page of data through a first data link coupling the first GPU to the second GPU and a register circuit within the first GPU accumulates an access request count for the first page of data. The first GPU notifies a driver that the access request count has reached a specified threshold.
    Type: Grant
    Filed: December 28, 2017
    Date of Patent: September 3, 2019
    Assignee: NVIDIA Corporation
    Inventors: Rouslan L. Dimitrov, Kirill A. Dmitriev, Andrei Khodakovsky, Tzyywei Hwang, Wishwesh Anil Gandhi, Lacky Vasant Shah
  • Publication number: 20190206018
    Abstract: One or more copy commands are scheduled for locating one or more pages of data in a local memory of a graphics processing unit (GPU) for more efficient access to the pages of data during rendering. A first processing unit that is coupled to a first GPU receives a notification that an access request count has reached a specified threshold. The first processing unit schedules a copy command to copy the first page of data to a first memory circuit of the first GPU from a second memory circuit of the second GPU. The copy command is included within a GPU command stream.
    Type: Application
    Filed: January 24, 2018
    Publication date: July 4, 2019
    Inventors: Andrei Khodakovsky, Kirill A. Dmitriev, Rouslan L. Dimitrov, Tzyywei Hwang, Wishwesh Anil Gandhi, Lacky Vasant Shah
  • Publication number: 20190206023
    Abstract: A method for rendering graphics frames allocates rendering work to multiple graphics processing units (GPUs) that are configured to allow access to pages of data stored in locally attached memory of a peer GPU. The method includes the steps of generating, by a first GPU coupled to a first memory circuit, one or more first memory access requests to render a first primitive for a first frame, where at least one of the first memory access requests targets a first page of data that physically resides within a second memory circuit coupled to a second GPU. The first GPU requests the first page of data through a first data link coupling the first GPU to the second GPU and a register circuit within the first GPU accumulates an access request count for the first page of data. The first GPU notifies a driver that the access request count has reached a specified threshold.
    Type: Application
    Filed: December 28, 2017
    Publication date: July 4, 2019
    Inventors: Rouslan L. Dimitrov, Kirill A. Dmitriev, Andrei Khodakovsky, Tzyywei Hwang, Wishwesh Anil Gandhi, Lacky Vasant Shah
  • Publication number: 20190146817
    Abstract: A just-in-time (JIT) compiler binds constants to specific memory locations at runtime. The JIT compiler parses program code derived from a multithreaded application and identifies an instruction that references a uniform constant. The JIT compiler then determines a chain of pointers that originates within a root table specified in the multithreaded application and terminates at the uniform constant. The JIT compiler generates additional instructions for traversing the chain of pointers and inserts these instructions into the program code. A parallel processor executes this compiled code and, in doing so, causes a thread to traverse the chain of pointers and bind the uniform constant to a uniform register at runtime. Each thread in a group of threads executing on the parallel processor may then access the uniform constant.
    Type: Application
    Filed: February 14, 2018
    Publication date: May 16, 2019
    Inventors: Ajay TIRUMALA, Jack CHOQUETTE, Manan PATEL, Shirish GADRE, Praveen KAUSHIK, Amanpreet GREWAL, Shekhar DIVEKAR, Andrei KHODAKOVSKY
  • Patent number: 10083036
    Abstract: One embodiment of the present invention sets forth a technique for managing graphics processing resources in a tile-based architecture. The technique includes storing a release packet associated with a graphics processing resource in a buffer and initiating a replay of graphics primitives stored in the buffer and associated with the graphics processing resource. The technique further includes, for each tile included in a plurality of tiles and processed during the replay, reading the release packet and determining whether the tile is a last tile processed during the replay. The technique further includes determining not to transmit the release packet to a screen-space pipeline and continuing to read graphics data stored in the buffer if the tile is not the last tile to be processed during the replay, or transmitting the release packet to the screen-space pipeline if the tile is the last tile to be processed during the replay.
    Type: Grant
    Filed: October 3, 2013
    Date of Patent: September 25, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Andrei Khodakovsky, Jeffrey A. Bolz
  • Patent number: 10032242
    Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.
    Type: Grant
    Filed: October 1, 2013
    Date of Patent: July 24, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Jeffrey A. Bolz, Amanpreet Grewal, Matthew Johnson, Andrei Khodakovsky
  • Patent number: 9734548
    Abstract: One embodiment of the present invention includes techniques for adaptively sizing cache tiles in a graphics system. A device driver associated with a graphics system sets a cache tile size associated with a cache tile to a first size. The detects a change from a first render target configuration that includes a first set of render targets to a second render target configuration that includes a second set of render targets. The device driver sets the cache tile size to a second size based on the second render target configuration. One advantage of the disclosed approach is that the cache tile size is adaptively sized, resulting in fewer cache tiles for less complex render target configurations. Adaptively sizing cache tiles leads to more efficient processor utilization and reduced power requirements. In addition, a unified L2 cache tile allows dynamic partitioning of cache memory between cache tile data and other data.
    Type: Grant
    Filed: August 28, 2013
    Date of Patent: August 15, 2017
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Rouslan Dimitrov, Emmett M. Kilgariff, Andrei Khodakovsky
  • Publication number: 20170206623
    Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.
    Type: Application
    Filed: October 1, 2013
    Publication date: July 20, 2017
    Applicant: NVIDIA CORPORATION
    Inventors: Ziyad S. HAKURA, Jeffrey A. BOLZ, Amanpreet GREWAL, Matthew JOHNSON, Andrei KHODAKOVSKY
  • Patent number: 9230362
    Abstract: A system, method, and computer program product enable compression with programmable sample locations, where the compression is a function of the programmable sample locations. The method includes the steps of storing a first value specifying a programmed sample location within a pixel in a sample pattern table and storing, in a memory, geometric surface parameters corresponding to a first attribute at the programmed sample location within a first pixel of a display surface. An instruction to store a second value specifying the programmed sample location within the pixel in the sample pattern table is received. The attribute is reconstructed based on the geometric surface parameters and the first value.
    Type: Grant
    Filed: September 11, 2013
    Date of Patent: January 5, 2016
    Assignee: NVIDIA Corporation
    Inventors: Eric B. Lum, Jeffrey Alan Bolz, Rui Manuel Bastos, Andrei Khodakovsky, Christian Johannes Amsinck, Bengt-Olaf Schneider
  • Patent number: 9230363
    Abstract: A system, method, and computer program product enable compression with programmable sample locations, where the compression is a function of the programmable sample locations. The method includes the steps of storing a first value specifying a programmed sample location within a pixel in a first sample pattern table that is associated with a first display surface and storing, in a memory, geometric surface parameters corresponding to a first attribute at the programmed sample location within a first pixel of the first display surface. A second value specifying the programmed sample location within the pixel in a second sample pattern table that is associated with a second display surface is also stored and the first attribute is reconstructed based on the geometric surface parameters and the first value.
    Type: Grant
    Filed: September 11, 2013
    Date of Patent: January 5, 2016
    Assignee: NVIDIA Corporation
    Inventors: Eric B. Lum, Jeffrey Alan Bolz, Rui Manuel Bastos, Andrei Khodakovsky, Christian Johannes Amsinck, Bengt-Olaf Schneider
  • Publication number: 20150109315
    Abstract: A system, method, and computer program product are provided for mapping tiles to physical memory locations. In use, a plurality of virtual tiles associated with a texture is identified. Additionally, a request to perform a mapping of the plurality of virtual tiles to one or more physical memory locations is received. Further, the plurality of virtual tiles is mapped to the one or more physical memory locations, utilizing a page table.
    Type: Application
    Filed: October 23, 2013
    Publication date: April 23, 2015
    Applicant: NVIDIA Corporation
    Inventors: Amanpreet Grewal, Andrei Khodakovsky, Yu Denny Dong, Henry Packard Moreton, Naveen Leekha
  • Publication number: 20150070380
    Abstract: A system, method, and computer program product are provided for using compression with programmable sample locations, where the compression is a function of the programmable sample locations. The method includes the steps of storing a first value specifying a programmed sample location within a pixel in a sample pattern table and storing, in a memory, geometric surface parameters corresponding to a first attribute at the programmed sample location within a first pixel of a display surface. An instruction to store a second value specifying the programmed sample location within the pixel in the sample pattern table is received. The attribute is reconstructed based on the geometric surface parameters and the first value.
    Type: Application
    Filed: September 11, 2013
    Publication date: March 12, 2015
    Applicant: NVIDIA Corporation
    Inventors: Eric B. Lum, Jeffrey Alan Bolz, Rui Manuel Bastos, Andrei Khodakovsky, Christian Johannes Amsinck, Bengt-Olaf Schneider
  • Publication number: 20150070381
    Abstract: A system, method, and computer program product are provided for using compression with programmable sample locations, where the compression is a function of the programmable sample locations. The method includes the steps of storing a first value specifying a programmed sample location within a pixel in a first sample pattern table that is associated with a first display surface and storing, in a memory, geometric surface parameters corresponding to a first attribute at the programmed sample location within a first pixel of the first display surface. A second value specifying the programmed sample location within the pixel in a second sample pattern table that is associated with a second display surface is also stored and the first attribute is reconstructed based on the geometric surface parameters and the first value.
    Type: Application
    Filed: September 11, 2013
    Publication date: March 12, 2015
    Applicant: NVIDIA Corporation
    Inventors: Eric B. Lum, Jeffrey Alan Bolz, Rui Manuel Bastos, Andrei Khodakovsky, Christian Johannes Amsinck, Bengt-Olaf Schneider
  • Publication number: 20140118379
    Abstract: One embodiment of the present invention includes techniques for adaptively sizing cache tiles in a graphics system. A device driver associated with a graphics system sets a cache tile size associated with a cache tile to a first size. The detects a change from a first render target configuration that includes a first set of render targets to a second render target configuration that includes a second set of render targets. The device driver sets the cache tile size to a second size based on the second render target configuration. One advantage of the disclosed approach is that the cache tile size is adaptively sized, resulting in fewer cache tiles for less complex render target configurations. Adaptively sizing cache tiles leads to more efficient processor utilization and reduced power requirements. In addition, a unified L2 cache tile allows dynamic partitioning of cache memory between cache tile data and other data.
    Type: Application
    Filed: August 28, 2013
    Publication date: May 1, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Ziyad S. HAKURA, Rouslan DIMITROV, Emmett M. KILGARIFF, Andrei KHODAKOVSKY
  • Publication number: 20140118363
    Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.
    Type: Application
    Filed: October 1, 2013
    Publication date: May 1, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Ziyad S. HAKURA, Jeffrey A. BOLZ, Amanpreet GREWAL, Matthew JOHNSON, Andrei KHODAKOVSKY
  • Publication number: 20140118373
    Abstract: One embodiment of the present invention sets forth a technique for managing graphics processing resources in a tile-based architecture. The technique includes storing a release packet associated with a graphics processing resource in a buffer and initiating a replay of graphics primitives stored in the buffer and associated with the graphics processing resource. The technique further includes, for each tile included in a plurality of tiles and processed during the replay, reading the release packet and determining whether the tile is a last tile processed during the replay. The technique further includes determining not to transmit the release packet to a screen-space pipeline and continuing to read graphics data stored in the buffer if the tile is not the last tile to be processed during the replay, or transmitting the release packet to the screen-space pipeline if the tile is the last tile to be processed during the replay.
    Type: Application
    Filed: October 3, 2013
    Publication date: May 1, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Ziyad S. HAKURA, Cynthia Ann Edgeworth ALLISON, Dale L. KIRKLAND, Andrei KHODAKOVSKY, Jeffrey A. BOLZ
  • Patent number: 8427474
    Abstract: One embodiment of the present invention sets forth a method for dynamically load balancing rendering operations across an IGPU and a DGPU. For each frame, the graphics driver configures the IGPU to pre-compute Z-values for a portion of the display surface and to write feedback data to the system memory indicating the time that the IGPU used to process the frame. The graphics driver then configures the DGPU to use the pre-computed Z-values while rendering to the complete display surface and to write feedback data to the system memory indicating the time that the DGPU used to process the frame. The graphics driver uses the feedback data from the IGPU and DGPU in conjunction with the percentage of the display surface that the IGPU Z-rendered for the frame to scale the portion of the display surface that the IGPU Z-renders for one or more subsequent frames. In this fashion, overall processing within the graphics pipeline is optimized across the IGPU and DGPU.
    Type: Grant
    Filed: October 3, 2008
    Date of Patent: April 23, 2013
    Assignee: Nvidia Corporation
    Inventors: Andrei Khodakovsky, Franck R. Diard
  • Patent number: 8228337
    Abstract: One embodiment of the present invention sets forth a method for dynamically load balancing rendering operations across an IGPU and a DGPU. For each frame, the graphics driver configures the IGPU to pre-compute Z-values for a portion of the display surface and to write feedback data to the system memory indicating the time that the IGPU used to process the frame. The graphics driver then configures the DGPU to use the pre-computed Z-values while rendering to the complete display surface and to write feedback data to the system memory indicating the time that the DGPU used to process the frame. The graphics driver uses the feedback data from the IGPU and DGPU in conjunction with the percentage of the display surface that the IGPU Z-rendered for the frame to scale the portion of the display surface that the IGPU Z-renders for one or more subsequent frames. In this fashion, overall processing within the graphics pipeline is optimized across the IGPU and DGPU.
    Type: Grant
    Filed: October 3, 2008
    Date of Patent: July 24, 2012
    Assignee: NVIDIA Corporation
    Inventors: Andrei Khodakovsky, Franck R. Diard