Patents by Inventor Dale L. Kirkland

Dale L. Kirkland has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11663767
    Abstract: Attributes of graphics objects are processed in a plurality of graphics processing pipelines. A streaming multiprocessor (SM) retrieves a first set of parameters associated with a set of graphics objects from a first set of buffers. The SM performs a first set of operations on the first set of parameters according to a first phase of processing to produce a second set of parameters stored in a second set of buffers. The SM performs a second set of operations on the second set of parameters according to a second phase of processing to produce a third set of parameters stored in a third set of buffers. One advantage of the disclosed techniques is that work is redistributed from the first phase to the second phase of graphics processing without having to copy the attributes to, and retrieve them from, cache or system memory, resulting in reduced power consumption.
    Type: Grant
    Filed: February 20, 2013
    Date of Patent: May 30, 2023
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Dale L. Kirkland
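
The two-phase flow described in the abstract above (parameters read from one buffer set, results written into the next, so work moves between phases without a round trip through cache or system memory) can be pictured with a small host-side model. This is a minimal sketch under assumed Params and BufferSet types, not NVIDIA's implementation:

```cpp
#include <vector>

// Hypothetical per-object parameter record (stand-in for vertex/primitive attributes).
struct Params { float x, y, z, w; };

// A "buffer set" is modeled here as a simple vector; on the GPU these would be
// on-chip buffers owned by the streaming multiprocessor (SM).
using BufferSet = std::vector<Params>;

// Phase 1: read the first buffer set, transform, and write the second buffer set.
BufferSet phase1(const BufferSet& in) {
    BufferSet out;
    out.reserve(in.size());
    for (const Params& p : in)
        out.push_back({p.x * 2.0f, p.y * 2.0f, p.z, p.w});   // placeholder transform
    return out;
}

// Phase 2: read the second buffer set directly (no copy back to system memory)
// and produce the third buffer set.
BufferSet phase2(const BufferSet& in) {
    BufferSet out;
    out.reserve(in.size());
    for (const Params& p : in)
        out.push_back({p.x + p.y, p.y, p.z, 1.0f});          // placeholder transform
    return out;
}

int main() {
    BufferSet first = {{1, 2, 3, 1}, {4, 5, 6, 1}};   // first set of parameters
    BufferSet second = phase1(first);                  // second set of buffers
    BufferSet third  = phase2(second);                 // third set of buffers
    return third.empty() ? 1 : 0;
}
```
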
  • Publication number: 20230007920
    Abstract: Graphics processing unit (GPU) performance and power efficiency are improved using machine learning to tune operating parameters based on performance monitor values and application information. Performance monitor values are processed using machine learning techniques to generate model parameters, which are used by a control unit within the GPU to provide real-time updates to the operating parameters. In one embodiment, a neural network processes the performance monitor values to generate operating parameters in real-time.
    Type: Application
    Filed: September 15, 2022
    Publication date: January 12, 2023
    Inventors: Rouslan L. Dimitrov, Dale L. Kirkland, Emmett M. Kilgariff, Sachin Satish Idgunji, Siddharth Sharma
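
As a rough illustration of the control loop described in the abstract above, the sketch below pushes a vector of performance-monitor values through a tiny one-layer network to produce operating parameters such as a clock scale and a voltage scale. The network shape, the weights, and the parameter names are assumptions made for illustration, not the patented design:

```cpp
#include <cmath>
#include <cstdio>

// Illustrative performance-monitor readings (normalized to [0, 1]).
struct PerfMonitors { float sm_occupancy, mem_bw_util, l2_hit_rate; };

// Illustrative operating parameters the control unit would update in real time.
struct OperatingParams { float clock_scale, voltage_scale; };

// A tiny one-layer "model": in practice the weights and biases would come from
// offline training on (performance monitor, preferred parameter) pairs.
OperatingParams infer(const PerfMonitors& m) {
    const float W[2][3] = {{0.6f, 0.3f, 0.1f}, {0.5f, 0.4f, 0.1f}};  // assumed weights
    const float b[2]    = {0.2f, 0.2f};                              // assumed biases
    const float in[3]   = {m.sm_occupancy, m.mem_bw_util, m.l2_hit_rate};

    float out[2];
    for (int i = 0; i < 2; ++i) {
        float acc = b[i];
        for (int j = 0; j < 3; ++j) acc += W[i][j] * in[j];
        out[i] = 1.0f / (1.0f + std::exp(-acc));   // sigmoid keeps scales in (0, 1)
    }
    return {out[0], out[1]};
}

int main() {
    PerfMonitors sample{0.8f, 0.4f, 0.9f};          // one sampling interval
    OperatingParams p = infer(sample);               // real-time parameter update
    std::printf("clock %.2f voltage %.2f\n", p.clock_scale, p.voltage_scale);
    return 0;
}
```
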
  • Patent number: 11481950
    Abstract: Graphics processing unit (GPU) performance and power efficiency are improved using machine learning to tune operating parameters based on performance monitor values and application information. Performance monitor values are processed using machine learning techniques to generate model parameters, which are used by a control unit within the GPU to provide real-time updates to the operating parameters. In one embodiment, a neural network processes the performance monitor values to generate operating parameters in real-time.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: October 25, 2022
    Assignee: NVIDIA Corporation
    Inventors: Rouslan L. Dimitrov, Dale L. Kirkland, Emmett M. Kilgariff, Sachin Satish Idgunji, Siddharth Sharma
  • Publication number: 20210174569
    Abstract: Graphics processing unit (GPU) performance and power efficiency are improved using machine learning to tune operating parameters based on performance monitor values and application information. Performance monitor values are processed using machine learning techniques to generate model parameters, which are used by a control unit within the GPU to provide real-time updates to the operating parameters. In one embodiment, a neural network processes the performance monitor values to generate operating parameters in real-time.
    Type: Application
    Filed: January 29, 2021
    Publication date: June 10, 2021
    Inventors: Rouslan L. Dimitrov, Dale L. Kirkland, Emmett M. Kilgariff, Sachin Satish Idgunji, Siddharth Sharma
  • Patent number: 10909738
    Abstract: Graphics processing unit (GPU) performance and power efficiency are improved using machine learning to tune operating parameters based on performance monitor values and application information. Performance monitor values are processed using machine learning techniques to generate model parameters, which are used by a control unit within the GPU to provide real-time updates to the operating parameters. In one embodiment, a neural network processes the performance monitor values to generate operating parameters in real-time.
    Type: Grant
    Filed: January 5, 2018
    Date of Patent: February 2, 2021
    Assignee: NVIDIA Corporation
    Inventors: Rouslan L. Dimitrov, Dale L. Kirkland, Emmett M. Kilgariff, Sachin Satish Idgunji, Siddharth Sharma
  • Patent number: 10861230
    Abstract: A graphics processing pipeline includes three architectural features that allow a fragment shader to efficiently calculate per-sample attribute values using barycentric coordinates and per-vertex attributes. The first feature is barycentric coordinate injection to provide barycentric coordinates to the fragment shader. The second feature is an attribute qualifier that allows an attribute of a graphics primitive to be processed without conventional fixed-function interpolation. The third feature is a direct access path from the fragment shader to triangle data storage hardware resources where vertex attribute data and/or plane equation coefficients are stored. Allowing the fragment shader to calculate per-sample attribute values in this way advantageously increases system flexibility while reducing workload associated with triangle plane equation setup.
    Type: Grant
    Filed: February 6, 2019
    Date of Patent: December 8, 2020
    Assignee: NVIDIA Corporation
    Inventors: David Patrick, Dale L. Kirkland, Henry Packard Moreton, Ziyad Sami Hakura, Yury Uralsky
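
The flexibility described above comes from letting the fragment shader interpolate attributes itself: given barycentric coordinates for a sample and direct access to the per-vertex attributes, the per-sample value is just a weighted sum. A minimal host-side sketch of that calculation, with assumed Vec2/Triangle types, is shown below; on hardware this would run in shader code against hardware-provided barycentrics:

```cpp
#include <array>
#include <cstdio>

struct Vec2 { float x, y; };

// Per-vertex attribute for one triangle (e.g., a texture coordinate).
struct Triangle {
    std::array<Vec2, 3> attr;   // attribute at vertices v0, v1, v2
};

// Barycentric coordinates (i, j, k) for a sample; they sum to 1 inside the triangle.
struct Barycentrics { float i, j, k; };

// Per-sample attribute = weighted sum of the per-vertex attributes.
Vec2 interpolate(const Triangle& t, const Barycentrics& b) {
    return { b.i * t.attr[0].x + b.j * t.attr[1].x + b.k * t.attr[2].x,
             b.i * t.attr[0].y + b.j * t.attr[1].y + b.k * t.attr[2].y };
}

int main() {
    Triangle tri{{{ {0.0f, 0.0f}, {1.0f, 0.0f}, {0.0f, 1.0f} }}};
    Barycentrics sample{0.25f, 0.25f, 0.5f};     // a point inside the triangle
    Vec2 uv = interpolate(tri, sample);
    std::printf("per-sample attribute: (%.2f, %.2f)\n", uv.x, uv.y);
    return 0;
}
```
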
  • Publication number: 20200043228
    Abstract: A graphics processing pipeline includes three architectural features that allow a fragment shader to efficiently calculate per-sample attribute values using barycentric coordinates and per-vertex attributes. The first feature is barycentric coordinate injection to provide barycentric coordinates to the fragment shader. The second feature is an attribute qualifier that allows an attribute of a graphics primitive to be processed without conventional fixed-function interpolation. The third feature is a direct access path from the fragment shader to triangle data storage hardware resources where vertex attribute data and/or plane equation coefficients are stored. Allowing the fragment shader to calculate per-sample attribute values in this way advantageously increases system flexibility while reducing workload associated with triangle plane equation setup.
    Type: Application
    Filed: February 6, 2019
    Publication date: February 6, 2020
    Inventors: David Patrick, Dale L. Kirkland, Henry Packard Moreton, Ziyad Sami Hakura, Yury Uralsky
  • Publication number: 20190213775
    Abstract: Graphics processing unit (GPU) performance and power efficiency are improved using machine learning to tune operating parameters based on performance monitor values and application information. Performance monitor values are processed using machine learning techniques to generate model parameters, which are used by a control unit within the GPU to provide real-time updates to the operating parameters. In one embodiment, a neural network processes the performance monitor values to generate operating parameters in real-time.
    Type: Application
    Filed: January 5, 2018
    Publication date: July 11, 2019
    Inventors: Rouslan L. Dimitrov, Dale L. Kirkland, Emmett M. Kilgariff, Sachin Satish Idgunji, Siddharth Sharma
  • Patent number: 10282803
    Abstract: One embodiment of the present invention includes a graphics subsystem that includes a tiling unit, a crossbar unit, and a screen-space pipeline. The crossbar unit is configured to transmit primitives interleaved with state change commands to the tiling unit. The tiling unit is configured to record an initial state associated with the primitives and to transmit to the screen-space pipeline one or more of the primitives that overlap a first cache tile. The tiling unit is further configured to transmit the initial state to the screen-space pipeline and to transmit to the screen-space pipeline one or more of the primitives that overlap a second cache tile. The tiling unit includes a state filter block configured to determine that a first state change in the state change commands is followed by a second state change, without an intervening primitive, and to forego transmitting the first state change in response.
    Type: Grant
    Filed: September 3, 2013
    Date of Patent: May 7, 2019
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Pierre Souillot, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Walter R. Steiner
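
The state filter block described above drops a state change that is immediately superseded by another state change with no primitive in between, since the dropped change could never affect a rendered primitive. Below is a minimal sketch of that filtering rule over a mixed command stream; the Command representation is an assumption used for illustration:

```cpp
#include <cstdio>
#include <vector>

// A command in the stream is either a graphics primitive or a state change.
struct Command {
    enum Kind { Primitive, StateChange } kind;
    int payload;   // primitive id or state value (illustrative)
};

// Forward only those state changes that are followed by at least one primitive
// before the next state change; a state change immediately superseded by
// another is dropped.
std::vector<Command> filterState(const std::vector<Command>& in) {
    std::vector<Command> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i].kind == Command::StateChange) {
            // Peek ahead: if the very next command is also a state change,
            // this one can never affect a primitive, so skip it.
            if (i + 1 < in.size() && in[i + 1].kind == Command::StateChange)
                continue;
        }
        out.push_back(in[i]);
    }
    return out;
}

int main() {
    std::vector<Command> stream = {
        {Command::StateChange, 1},   // superseded by the next state change: dropped
        {Command::StateChange, 2},
        {Command::Primitive,   0},
        {Command::StateChange, 3},
        {Command::Primitive,   1},
    };
    std::vector<Command> filtered = filterState(stream);
    std::printf("%zu of %zu commands forwarded\n", filtered.size(), stream.size());
    return 0;
}
```
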
  • Patent number: 10083036
    Abstract: One embodiment of the present invention sets forth a technique for managing graphics processing resources in a tile-based architecture. The technique includes storing a release packet associated with a graphics processing resource in a buffer and initiating a replay of graphics primitives stored in the buffer and associated with the graphics processing resource. The technique further includes, for each tile included in a plurality of tiles and processed during the replay, reading the release packet and determining whether the tile is a last tile processed during the replay. The technique further includes determining not to transmit the release packet to a screen-space pipeline and continuing to read graphics data stored in the buffer if the tile is not the last tile to be processed during the replay, or transmitting the release packet to the screen-space pipeline if the tile is the last tile to be processed during the replay.
    Type: Grant
    Filed: October 3, 2013
    Date of Patent: September 25, 2018
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Andrei Khodakovsky, Jeffrey A. Bolz
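
A simplified model of the per-tile decision described above: during replay, the buffer is walked once per tile, and the release packet that frees the graphics processing resource is forwarded to the screen-space pipeline only on the last tile. The Packet and pipeline types below are assumptions for illustration, not the hardware interface:

```cpp
#include <cstdio>
#include <vector>

struct Packet {
    enum Kind { Primitive, Release } kind;
    int id;
};

// Stand-in for the screen-space pipeline: just counts what it receives.
struct ScreenSpacePipeline {
    int primitives = 0, releases = 0;
    void submit(const Packet& p) {
        if (p.kind == Packet::Primitive) ++primitives; else ++releases;
    }
};

// Replay the buffered packets once per tile.  A Release packet is transmitted
// only while processing the last tile; on earlier tiles it is read and skipped.
void replay(const std::vector<Packet>& buffer, int numTiles, ScreenSpacePipeline& ss) {
    for (int tile = 0; tile < numTiles; ++tile) {
        bool lastTile = (tile == numTiles - 1);
        for (const Packet& p : buffer) {
            if (p.kind == Packet::Release && !lastTile)
                continue;                 // resource is still needed by later tiles
            ss.submit(p);
        }
    }
}

int main() {
    std::vector<Packet> buffer = {{Packet::Primitive, 0},
                                  {Packet::Primitive, 1},
                                  {Packet::Release,   7}};   // frees resource 7
    ScreenSpacePipeline ss;
    replay(buffer, /*numTiles=*/4, ss);
    std::printf("primitives=%d releases=%d\n", ss.primitives, ss.releases);  // 8 and 1
    return 0;
}
```
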
  • Patent number: 10032243
    Abstract: One embodiment of the present invention sets forth a graphics subsystem configured to implement distributed cache tiling. The graphics subsystem includes one or more world-space pipelines, one or more screen-space pipelines, one or more tiling units, and a crossbar unit. Each world-space pipeline is implemented in a different processing entity and is coupled to a different tiling unit. Each screen-space pipeline is implemented in a different processing entity and is coupled to the crossbar unit. The tiling units are configured to receive primitives from the world-space pipelines, generate cache tile batches based on the primitives, and transmit the primitives to the screen-space pipelines. One advantage of the disclosed approach is that primitives are processed in application-programming-interface order in a highly parallel tiling architecture. Another advantage is that primitives are processed in cache tile order, which reduces memory bandwidth consumption and improves cache memory utilization.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: July 24, 2018
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Walter R. Steiner
  • Patent number: 9792122
    Abstract: One embodiment of the present invention includes a technique for processing graphics primitives in a tile-based architecture. The technique includes storing, in a buffer, a first plurality of graphics primitives and a first plurality of state bundles received from the world-space pipeline. The technique further includes determining, based on a first condition, that the first plurality of graphics primitives should be replayed from the buffer, and, in response, replaying the first plurality of graphics primitives against a first tile included in a first plurality of tiles. Replaying the first plurality of graphics primitives includes comparing each graphics primitive against the first tile to determine whether the graphics primitive intersects the first tile, determining that one or more graphics primitives intersects the first tile, and transmitting the one or more graphics primitives and one or more associated state bundles to a screen-space pipeline for processing.
    Type: Grant
    Filed: October 4, 2013
    Date of Patent: October 17, 2017
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Walter R. Steiner, Cynthia Ann Edgeworth Allison, Rouslan Dimitrov, Karim M. Abdalla, Dale L. Kirkland, Emmett M. Kilgariff
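
The replay step described above amounts to scanning the buffered primitives once per tile and forwarding only those whose screen-space extent intersects the tile, along with their associated state. The sketch below uses an axis-aligned bounding-box overlap test, which is an assumed simplification:

```cpp
#include <cstdio>
#include <vector>

// Axis-aligned screen-space rectangle used both for tiles and for the
// bounding box of a buffered primitive.
struct Rect { int x0, y0, x1, y1; };

bool overlaps(const Rect& a, const Rect& b) {
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

struct BufferedPrimitive {
    Rect bbox;       // screen-space bounds of the primitive
    int  stateId;    // id of the state bundle that applies to it (illustrative)
};

// Replay the buffered primitives against one tile: primitives that intersect
// the tile are forwarded (together with their state) to the screen-space
// pipeline, modeled here as the returned vector.
std::vector<BufferedPrimitive> replayAgainstTile(
        const std::vector<BufferedPrimitive>& buffer, const Rect& tile) {
    std::vector<BufferedPrimitive> forwarded;
    for (const BufferedPrimitive& prim : buffer)
        if (overlaps(prim.bbox, tile))
            forwarded.push_back(prim);
    return forwarded;
}

int main() {
    std::vector<BufferedPrimitive> buffer = {
        {{0,  0, 20, 20}, 1},     // intersects the tile
        {{50, 50, 60, 60}, 1},    // does not
    };
    Rect tile{0, 0, 32, 32};      // first tile of the first plurality of tiles
    std::printf("%zu primitive(s) forwarded\n", replayAgainstTile(buffer, tile).size());
    return 0;
}
```
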
  • Patent number: 9710874
    Abstract: One embodiment of the present invention sets forth a technique for mid-primitive execution preemption. When preemption is initiated, no new instructions are issued, in-flight instructions progress to an execution unit boundary, and the execution state is unloaded from the processing pipeline. The execution units within the processing pipeline, including the coarse rasterization unit, complete execution of in-flight instructions and become idle. However, rasterization of a triangle may be preempted at a coarse raster region boundary. The amount of context state to be stored is reduced because the execution units are idle. Preempting at the mid-primitive level during rasterization reduces the time from when preemption is initiated to when another process can execute because the entire triangle is not rasterized.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: July 18, 2017
    Assignee: NVIDIA Corporation
    Inventors: Gregory Scott Palmer, Ziyad S. Hakura, Emmett M. Kilgariff, Dale L. Kirkland, Lacky V. Shah
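
The idea above is that rasterization of a triangle need not finish before a context switch: the rasterizer checks for a pending preemption request at each coarse raster region boundary and, if one is pending, records how far through the triangle it got so rasterization can resume later. The sketch below models that check with an atomic flag and a saved region index; all names and the resume mechanism are illustrative assumptions:

```cpp
#include <atomic>
#include <cstdio>

// Saved context for a preempted triangle: which coarse raster region to resume at.
struct RasterContext {
    int triangleId = -1;
    int nextRegion = 0;
};

std::atomic<bool> preemptRequested{false};

// Rasterize one triangle as a sequence of coarse raster regions.  At every region
// boundary the preemption flag is checked; if set, the position is saved and the
// function returns early so another context can run.
bool rasterizeTriangle(int triangleId, int numRegions, RasterContext& ctx) {
    int start = (ctx.triangleId == triangleId) ? ctx.nextRegion : 0;
    for (int region = start; region < numRegions; ++region) {
        if (preemptRequested.load(std::memory_order_acquire)) {
            ctx.triangleId = triangleId;          // small amount of state to save,
            ctx.nextRegion = region;              // since execution units are idle
            return false;                         // preempted mid-primitive
        }
        std::printf("triangle %d: rasterized coarse region %d\n", triangleId, region);
    }
    ctx = RasterContext{};                        // triangle finished
    return true;
}

int main() {
    RasterContext ctx;
    preemptRequested = true;                      // preemption arrives immediately
    rasterizeTriangle(0, 8, ctx);                 // stops at a region boundary
    preemptRequested = false;
    rasterizeTriangle(0, 8, ctx);                 // resumes from the saved region
    return 0;
}
```
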
  • Patent number: 9639366
    Abstract: One embodiment of the present invention sets forth a technique for managing buffer table entries in a tile-based architecture. The technique includes binding a plurality of shader registers to a buffer table entry. The technique further includes processing at least one tile by reading a buffer table index stored in the shader register to access the buffer table entry, reading a buffer address stored in the buffer table entry, accessing data associated with the buffer address, and unbinding the shader register from the buffer table entry. The technique further includes determining that none of the shader registers is still bound to the buffer table entry and, in response, causing a release packet to be inserted into an instruction stream. The technique further includes determining that a last tile has been processed and, in response, transmitting the release packet to cause the buffer table entry to be released.
    Type: Grant
    Filed: October 3, 2013
    Date of Patent: May 2, 2017
    Assignee: NVIDIA Corporation
    Inventors: Karim M. Abdalla, Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland
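
The lifetime rule described above can be viewed as reference counting: each bound shader register holds a reference to the buffer table entry, a release packet is queued once the last register unbinds, and the entry is freed only after the last tile has been processed. The sketch below shows that bookkeeping with assumed structures; it is not the hardware mechanism itself:

```cpp
#include <cstdio>
#include <vector>

struct BufferTableEntry {
    unsigned long long bufferAddress = 0;
    int                bindings      = 0;     // shader registers currently bound
    bool               releaseQueued = false; // release packet inserted in stream
};

struct BufferTable {
    std::vector<BufferTableEntry> entries;

    void bind(int index)   { ++entries[index].bindings; }

    void unbind(int index) {
        BufferTableEntry& e = entries[index];
        if (--e.bindings == 0 && !e.releaseQueued) {
            e.releaseQueued = true;           // release packet goes into the stream
            std::printf("release packet queued for entry %d\n", index);
        }
    }

    // Called when the last tile has been processed and the queued release
    // packet is finally transmitted.
    void release(int index) {
        entries[index] = BufferTableEntry{};  // entry can now be reused
        std::printf("entry %d released\n", index);
    }
};

int main() {
    BufferTable table;
    table.entries.push_back({0x2000ull, 0, false});
    table.bind(0);     // two shader registers reference the entry
    table.bind(0);
    table.unbind(0);   // still bound once: no release packet yet
    table.unbind(0);   // last binding gone: release packet queued
    table.release(0);  // transmitted after the last tile
    return 0;
}
```
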
  • Patent number: 9633458
    Abstract: In a graphics processing pipeline, a processing unit establishes a bounding box around a polygon in order to identify sample points that are covered by the polygon. For a given sample point included within the bounding box, the processing unit constructs a set of lines that intersect at the sample point, where each line in the set of lines is parallel to at least one side of the polygon. When all vertices of the polygon reside on one side of at least one line in the set of lines, the processing unit may reduce the size of the bounding box to exclude the sample point.
    Type: Grant
    Filed: January 23, 2012
    Date of Patent: April 25, 2017
    Assignee: NVIDIA Corporation
    Inventors: Walter R. Steiner, Eric Lum, Dale L. Kirkland, Steven James Heinrich, David Charles Patrick
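
The coverage test described above can be restated as: through a candidate sample point, draw a line parallel to each side of the polygon; if all of the polygon's vertices lie strictly on one side of any such line, the polygon cannot cover the sample and the bounding box can be tightened past it. Below is a small 2-D sketch of that test for a triangle; the edge-function formulation is an assumption used for illustration:

```cpp
#include <array>
#include <cstdio>

struct Vec2 { float x, y; };

// Cross product of (b - a) and (c - a): the sign tells which side of line ab the point c is on.
float cross(const Vec2& a, const Vec2& b, const Vec2& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Returns true if the sample can be excluded: for some line through the sample
// that is parallel to a side of the triangle, all three vertices lie strictly
// on one side of that line, so the triangle cannot cover the sample.
bool sampleExcluded(const std::array<Vec2, 3>& tri, const Vec2& sample) {
    for (int e = 0; e < 3; ++e) {
        Vec2 dir{tri[(e + 1) % 3].x - tri[e].x, tri[(e + 1) % 3].y - tri[e].y};
        Vec2 lineEnd{sample.x + dir.x, sample.y + dir.y};   // line through the sample
        int positive = 0, negative = 0;
        for (const Vec2& v : tri) {
            float s = cross(sample, lineEnd, v);
            if (s > 0) ++positive;
            if (s < 0) ++negative;
        }
        if (positive == 3 || negative == 3)
            return true;   // all vertices strictly on one side: shrink the box past it
    }
    return false;
}

int main() {
    std::array<Vec2, 3> tri = {{{0, 0}, {4, 0}, {0, 4}}};
    std::printf("inside sample excluded: %d\n", sampleExcluded(tri, {1, 1}));  // 0
    std::printf("corner sample excluded: %d\n", sampleExcluded(tri, {5, 5}));  // 1
    return 0;
}
```
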
  • Patent number: 9594599
    Abstract: A work distribution unit distributes work batches to general processing clusters (GPCs) based on the number of streaming multiprocessors included in each GPC. Advantageously, each GPC receives an amount of work that is proportional to the amount of processing power afforded by the GPC. Embodiments include a method for distributing batches of processing tasks to two or more general processing clusters (GPCs), including the steps of updating a counter value for each of the two or more GPCs based on the number of enabled parallel processing units within each of the two or more GPCs, and distributing a batch of processing tasks to a first GPC of the two or more GPCs based on a counter value associated with the first GPC and based on a load signal received from the first GPC.
    Type: Grant
    Filed: October 14, 2009
    Date of Patent: March 14, 2017
    Assignee: NVIDIA Corporation
    Inventors: Philip Browning Johnson, Dale L. Kirkland, Karim M. Abdalla
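
The distribution rule above can be modeled by crediting each GPC in proportion to its number of enabled streaming multiprocessors and sending the next batch to a GPC that both has credit and is signaling that it can accept work. The sketch below is a simplified software model of that policy; the exact counter-update rule is an assumption:

```cpp
#include <cstdio>
#include <vector>

struct GPC {
    int enabledSMs;    // number of enabled parallel processing units
    int counter = 0;   // accumulated credit
    bool canAcceptWork() const { return true; }   // stand-in for the load signal
};

// Pick the GPC with the highest credit among those whose load signal allows more
// work, charge it one batch, and re-credit every GPC in proportion to its SM count.
int distributeBatch(std::vector<GPC>& gpcs, int batchCost) {
    int target = -1;
    for (int i = 0; i < static_cast<int>(gpcs.size()); ++i)
        if (gpcs[i].canAcceptWork() &&
            (target < 0 || gpcs[i].counter > gpcs[target].counter))
            target = i;
    if (target < 0) return -1;                     // every GPC reported "busy"
    gpcs[target].counter -= batchCost;
    for (GPC& g : gpcs) g.counter += g.enabledSMs; // proportional credit update
    return target;
}

int main() {
    std::vector<GPC> gpcs = {{4}, {2}};            // the 4-SM GPC should get ~2x the work
    int sentTo[2] = {0, 0};
    for (int batch = 0; batch < 12; ++batch)
        ++sentTo[distributeBatch(gpcs, /*batchCost=*/6)];
    std::printf("GPC0: %d batches, GPC1: %d batches\n", sentTo[0], sentTo[1]);  // 8 and 4
    return 0;
}
```
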
  • Patent number: 9542189
    Abstract: One embodiment of the present invention includes a technique for processing graphics primitives in a tile-based architecture. The technique includes storing, in a buffer, a first plurality of graphics primitives and a first plurality of state bundles received from a world-space pipeline, and transmitting the first plurality of graphics primitives to a screen-space pipeline for processing while a tiling function is enabled. The technique further includes storing, in the buffer, a second plurality of graphics primitives and a second plurality of state bundles received from the world-space pipeline. The technique further includes determining, based on a first condition, that the tiling function should be disabled and that the second plurality of graphics primitives should be flushed from the buffer, and transmitting the second plurality of graphics primitives to the screen-space pipeline for processing while the tiling function is disabled.
    Type: Grant
    Filed: October 4, 2013
    Date of Patent: January 10, 2017
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Joseph Cavanaugh, Dale L. Kirkland, Emmett M. Kilgariff
  • Patent number: 9536341
    Abstract: One embodiment of the present invention sets forth a technique for parallel distribution of primitives to multiple rasterizers. Multiple, independent geometry units perform geometry processing concurrently on different graphics primitives. A primitive distribution scheme delivers primitives from the multiple geometry units concurrently to multiple rasterizers at rates of multiple primitives per clock. The multiple, independent rasterizer units perform rasterization concurrently on one or more graphics primitives, enabling the rendering of multiple primitives per system clock.
    Type: Grant
    Filed: October 19, 2009
    Date of Patent: January 3, 2017
    Assignee: NVIDIA Corporation
    Inventors: Johnny S. Rhoades, Emmett M. Kilgariff, Michael C. Shebanow, Ziyad S. Hakura, Dale L. Kirkland, James Daniel Kelly
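
One simple way to picture the concurrent distribution described above is to give each rasterizer ownership of an interleaved set of screen-space regions and to route every primitive to the rasterizers whose regions its bounding box touches, so several primitives can be in flight to different rasterizers at once. The region-interleaving scheme below is an illustrative assumption, not the patented distribution scheme:

```cpp
#include <cstdio>
#include <set>
#include <vector>

struct BBox { int x0, y0, x1, y1; };   // screen-space bounds of a primitive, in pixels

// The screen is divided into fixed-size regions; region (rx, ry) is owned by
// rasterizer (rx + ry) % numRasterizers, which interleaves ownership.
std::set<int> rasterizersFor(const BBox& b, int regionSize, int numRasterizers) {
    std::set<int> owners;
    for (int ry = b.y0 / regionSize; ry <= (b.y1 - 1) / regionSize; ++ry)
        for (int rx = b.x0 / regionSize; rx <= (b.x1 - 1) / regionSize; ++rx)
            owners.insert((rx + ry) % numRasterizers);
    return owners;
}

int main() {
    const int regionSize = 16, numRasterizers = 4;
    std::vector<BBox> primitives = {{0, 0, 8, 8},      // small: one region, one rasterizer
                                    {0, 0, 64, 16}};   // wide: spans several rasterizers
    for (std::size_t i = 0; i < primitives.size(); ++i) {
        std::printf("primitive %zu ->", i);
        for (int r : rasterizersFor(primitives[i], regionSize, numRasterizers))
            std::printf(" rasterizer %d", r);
        std::printf("\n");
    }
    return 0;
}
```
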
  • Patent number: 9483270
    Abstract: One embodiment of the present invention sets forth a graphics subsystem configured to implement distributed tiled caching. The graphics subsystem includes one or more world-space pipelines, one or more screen-space pipelines, one or more tiling units, and a crossbar unit. Each world-space pipeline is implemented in a different processing entity and is coupled to a different tiling unit. Each screen-space pipeline is implemented in a different processing entity and is coupled to the crossbar unit. The tiling units are configured to receive primitives from the world-space pipelines, generate cache tile batches based on the primitives, and transmit the primitives to the screen-space pipelines. One advantage of the disclosed approach is that primitives are processed in application-programming-interface order in a highly parallel tiling architecture. Another advantage is that primitives are processed in cache tile order, which reduces memory bandwidth consumption and improves cache memory utilization.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: November 1, 2016
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Walter R. Steiner
  • Patent number: 9448804
    Abstract: One embodiment of the present invention sets forth a technique for managing buffer entries in a tile-based architecture. The technique includes receiving a first plurality of graphics primitives and a first buffer address at which attributes associated with the first plurality of graphics primitives are stored. The technique further includes, for each tile included in a plurality of tiles, transmitting the first plurality of graphics primitives and the first buffer address to a screen space pipeline and receiving an acknowledgement from the screen space pipeline indicating that processing the first plurality of graphics primitives has completed. The technique further includes determining that processing the first plurality of graphics primitives has completed for a last tile included in the plurality of tiles and that the acknowledgement has been received for each tile included in the plurality of tiles, and, in response, releasing a buffer entry associated with the first buffer address.
    Type: Grant
    Filed: October 3, 2013
    Date of Patent: September 20, 2016
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland
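
The release condition described above is that the buffer entry holding the attributes is freed only after the batch has been transmitted for every tile and an acknowledgement has come back from the screen-space pipeline for every tile. The sketch below tracks those acknowledgements with a simple counter; the structures are assumptions for illustration:

```cpp
#include <cstdio>

// Tracks one buffer entry holding attributes shared by a batch of primitives.
struct BufferEntryTracker {
    unsigned long long bufferAddress;
    int tilesTotal;
    int acksReceived = 0;

    // Called when the screen-space pipeline acknowledges that it has finished
    // processing the batch for one tile.
    void acknowledge() {
        ++acksReceived;
        if (acksReceived == tilesTotal) {
            // Last tile processed and every acknowledgement received:
            // the entry (and the attributes it points to) can be released.
            std::printf("releasing buffer entry at 0x%llx\n", bufferAddress);
        }
    }
};

int main() {
    const int numTiles = 4;
    BufferEntryTracker entry{0x1000ull, numTiles};

    for (int tile = 0; tile < numTiles; ++tile) {
        // Transmit the batch of primitives plus the buffer address for this tile,
        // then (eventually) receive the acknowledgement back.
        std::printf("tile %d: primitives + buffer address sent\n", tile);
        entry.acknowledge();
    }
    return 0;
}
```
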