Patents by Inventor Dale L. Kirkland

Dale L. Kirkland has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11663767
    Abstract: Attributes of graphics objects are processed in a plurality of graphics processing pipelines. A streaming multiprocessor (SM) retrieves a first set of parameters associated with a set of graphics objects from a first set of buffers. The SM performs a first set of operations on the first set of parameters according to a first phase of processing to produce a second set of parameters stored in a second set of buffers. The SM performs a second set of operations on the second set of parameters according to a second phase of processing to produce a third set of parameters stored in a third set of buffers. One advantage of the disclosed techniques is that work is redistributed from the first phase to the second phase of graphics processing without having to copy the attributes to, and retrieve them from, cache or system memory, resulting in reduced power consumption.
    Type: Grant
    Filed: February 20, 2013
    Date of Patent: May 30, 2023
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Dale L. Kirkland
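
The two-phase flow described in the abstract above (parameters read from one buffer set, results written into the next, so work moves between phases without a round trip through cache or system memory) can be pictured with a small host-side model. This is a minimal sketch under assumed Params and BufferSet types, not NVIDIA's implementation:

```cpp
#include <vector>

// Hypothetical per-object parameter record (stand-in for vertex/primitive attributes).
struct Params { float x, y, z, w; };

// A "buffer set" is modeled here as a simple vector; on the GPU these would be
// on-chip buffers owned by the streaming multiprocessor (SM).
using BufferSet = std::vector<Params>;

// Phase 1: read the first buffer set, transform, and write the second buffer set.
BufferSet phase1(const BufferSet& in) {
    BufferSet out;
    out.reserve(in.size());
    for (const Params& p : in)
        out.push_back({p.x * 2.0f, p.y * 2.0f, p.z, p.w});   // placeholder transform
    return out;
}

// Phase 2: read the second buffer set directly (no copy back to system memory)
// and produce the third buffer set.
BufferSet phase2(const BufferSet& in) {
    BufferSet out;
    out.reserve(in.size());
    for (const Params& p : in)
        out.push_back({p.x + p.y, p.y, p.z, 1.0f});          // placeholder transform
    return out;
}

int main() {
    BufferSet first = {{1, 2, 3, 1}, {4, 5, 6, 1}};   // first set of parameters
    BufferSet second = phase1(first);                  // second set of buffers
    BufferSet third  = phase2(second);                 // third set of buffers
    return third.empty() ? 1 : 0;
}
```
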
  • Publication number: 20230007920
    Abstract: Graphics processing unit (GPU) performance and power efficiency are improved using machine learning to tune operating parameters based on performance monitor values and application information. Performance monitor values are processed using machine learning techniques to generate model parameters, which are used by a control unit within the GPU to provide real-time updates to the operating parameters. In one embodiment, a neural network processes the performance monitor values to generate operating parameters in real-time.
    Type: Application
    Filed: September 15, 2022
    Publication date: January 12, 2023
    Inventors: Rouslan L. Dimitrov, Dale L. Kirkland, Emmett M. Kilgariff, Sachin Satish Idgunji, Siddharth Sharma
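
As a rough illustration of the control loop described in the abstract above, the sketch below pushes a vector of performance-monitor values through a tiny one-layer network to produce operating parameters such as a clock scale and a voltage scale. The network shape, the weights, and the parameter names are assumptions made for illustration, not the patented design:

```cpp
#include <cmath>
#include <cstdio>

// Illustrative performance-monitor readings (normalized to [0, 1]).
struct PerfMonitors { float sm_occupancy, mem_bw_util, l2_hit_rate; };

// Illustrative operating parameters the control unit would update in real time.
struct OperatingParams { float clock_scale, voltage_scale; };

// A tiny one-layer "model": in practice the weights and biases would come from
// offline training on (performance monitor, preferred parameter) pairs.
OperatingParams infer(const PerfMonitors& m) {
    const float W[2][3] = {{0.6f, 0.3f, 0.1f}, {0.5f, 0.4f, 0.1f}};  // assumed weights
    const float b[2]    = {0.2f, 0.2f};                              // assumed biases
    const float in[3]   = {m.sm_occupancy, m.mem_bw_util, m.l2_hit_rate};

    float out[2];
    for (int i = 0; i < 2; ++i) {
        float acc = b[i];
        for (int j = 0; j < 3; ++j) acc += W[i][j] * in[j];
        out[i] = 1.0f / (1.0f + std::exp(-acc));   // sigmoid keeps scales in (0, 1)
    }
    return {out[0], out[1]};
}

int main() {
    PerfMonitors sample{0.8f, 0.4f, 0.9f};          // one sampling interval
    OperatingParams p = infer(sample);               // real-time parameter update
    std::printf("clock %.2f voltage %.2f\n", p.clock_scale, p.voltage_scale);
    return 0;
}
```
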
  • Patent number: 11481950
    Abstract: Graphics processing unit (GPU) performance and power efficiency are improved using machine learning to tune operating parameters based on performance monitor values and application information. Performance monitor values are processed using machine learning techniques to generate model parameters, which are used by a control unit within the GPU to provide real-time updates to the operating parameters. In one embodiment, a neural network processes the performance monitor values to generate operating parameters in real-time.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: October 25, 2022
    Assignee: NVIDIA Corporation
    Inventors: Rouslan L. Dimitrov, Dale L. Kirkland, Emmett M. Kilgariff, Sachin Satish Idgunji, Siddharth Sharma
  • Publication number: 20210174569
    Abstract: Graphics processing unit (GPU) performance and power efficiency are improved using machine learning to tune operating parameters based on performance monitor values and application information. Performance monitor values are processed using machine learning techniques to generate model parameters, which are used by a control unit within the GPU to provide real-time updates to the operating parameters. In one embodiment, a neural network processes the performance monitor values to generate operating parameters in real-time.
    Type: Application
    Filed: January 29, 2021
    Publication date: June 10, 2021
    Inventors: Rouslan L. Dimitrov, Dale L. Kirkland, Emmett M. Kilgariff, Sachin Satish Idgunji, Siddharth Sharma
  • Patent number: 10909738
    Abstract: Graphics processing unit (GPU) performance and power efficiency are improved using machine learning to tune operating parameters based on performance monitor values and application information. Performance monitor values are processed using machine learning techniques to generate model parameters, which are used by a control unit within the GPU to provide real-time updates to the operating parameters. In one embodiment, a neural network processes the performance monitor values to generate operating parameters in real-time.
    Type: Grant
    Filed: January 5, 2018
    Date of Patent: February 2, 2021
    Assignee: NVIDIA Corporation
    Inventors: Rouslan L. Dimitrov, Dale L. Kirkland, Emmett M. Kilgariff, Sachin Satish Idgunji, Siddharth Sharma
  • Patent number: 10861230
    Abstract: A graphics processing pipeline includes three architectural features that allow a fragment shader to efficiently calculate per-sample attribute values using barycentric coordinates and per-vertex attributes. The first feature is barycentric coordinate injection to provide barycentric coordinates to the fragment shader. The second feature is an attribute qualifier that allows an attribute of a graphics primitive to be processed without conventional fixed-function interpolation. The third feature is a direct access path from the fragment shader to triangle data storage hardware resources where vertex attribute data and/or plane equation coefficients are stored. Allowing the fragment shader to calculate per-sample attribute values in this way advantageously increases system flexibility while reducing workload associated with triangle plane equation setup.
    Type: Grant
    Filed: February 6, 2019
    Date of Patent: December 8, 2020
    Assignee: NVIDIA Corporation
    Inventors: David Patrick, Dale L. Kirkland, Henry Packard Moreton, Ziyad Sami Hakura, Yury Uralsky
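
The flexibility described above comes from letting the fragment shader interpolate attributes itself: given barycentric coordinates for a sample and direct access to the per-vertex attributes, the per-sample value is just a weighted sum. A minimal host-side sketch of that calculation, with assumed Vec2/Triangle types, is shown below; on hardware this would run in shader code against hardware-provided barycentrics:

```cpp
#include <array>
#include <cstdio>

struct Vec2 { float x, y; };

// Per-vertex attribute for one triangle (e.g., a texture coordinate).
struct Triangle {
    std::array<Vec2, 3> attr;   // attribute at vertices v0, v1, v2
};

// Barycentric coordinates (i, j, k) for a sample; they sum to 1 inside the triangle.
struct Barycentrics { float i, j, k; };

// Per-sample attribute = weighted sum of the per-vertex attributes.
Vec2 interpolate(const Triangle& t, const Barycentrics& b) {
    return { b.i * t.attr[0].x + b.j * t.attr[1].x + b.k * t.attr[2].x,
             b.i * t.attr[0].y + b.j * t.attr[1].y + b.k * t.attr[2].y };
}

int main() {
    Triangle tri{{{ {0.0f, 0.0f}, {1.0f, 0.0f}, {0.0f, 1.0f} }}};
    Barycentrics sample{0.25f, 0.25f, 0.5f};     // a point inside the triangle
    Vec2 uv = interpolate(tri, sample);
    std::printf("per-sample attribute: (%.2f, %.2f)\n", uv.x, uv.y);
    return 0;
}
```
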
  • Publication number: 20200043228
    Abstract: A graphics processing pipeline includes three architectural features that allow a fragment shader to efficiently calculate per-sample attribute values using barycentric coordinates and per-vertex attributes. The first feature is barycentric coordinate injection to provide barycentric coordinates to the fragment shader. The second feature is an attribute qualifier that allows an attribute of a graphics primitive to be processed without conventional fixed-function interpolation. The third feature is a direct access path from the fragment shader to triangle data storage hardware resources where vertex attribute data and/or plane equation coefficients are stored. Allowing the fragment shader to calculate per-sample attribute values in this way advantageously increases system flexibility while reducing workload associated with triangle plane equation setup.
    Type: Application
    Filed: February 6, 2019
    Publication date: February 6, 2020
    Inventors: David Patrick, Dale L. Kirkland, Henry Packard Moreton, Ziyad Sami Hakura, Yury Uralsky
  • Publication number: 20190213775
    Abstract: Graphics processing unit (GPU) performance and power efficiency are improved using machine learning to tune operating parameters based on performance monitor values and application information. Performance monitor values are processed using machine learning techniques to generate model parameters, which are used by a control unit within the GPU to provide real-time updates to the operating parameters. In one embodiment, a neural network processes the performance monitor values to generate operating parameters in real-time.
    Type: Application
    Filed: January 5, 2018
    Publication date: July 11, 2019
    Inventors: Rouslan L. Dimitrov, Dale L. Kirkland, Emmett M. Kilgariff, Sachin Satish Idgunji, Siddharth Sharma
  • Patent number: 10282803
    Abstract: One embodiment of the present invention includes a graphics subsystem that includes a tiling unit, a crossbar unit, and a screen-space pipeline. The crossbar unit is configured to transmit primitives interleaved with state change commands to the tiling unit. The tiling unit is configured to record an initial state associated with the primitives and to transmit to the screen-space pipeline one or more of the primitives that overlap a first cache tile. The tiling unit is further configured to transmit the initial state to the screen-space pipeline and to transmit to the screen-space pipeline one or more of the primitives that overlap a second cache tile. The tiling unit includes a state filter block configured to determine that a first state change in the state change commands is followed by a second state change, without an intervening primitive, and to forego transmitting the first state change in response.
    Type: Grant
    Filed: September 3, 2013
    Date of Patent: May 7, 2019
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Pierre Souillot, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Walter R. Steiner
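
The state filter block described above drops a state change that is immediately superseded by another state change with no primitive in between, since the dropped change could never affect a rendered primitive. Below is a minimal sketch of that filtering rule over a mixed command stream; the Command representation is an assumption used for illustration:

```cpp
#include <cstdio>
#include <vector>

// A command in the stream is either a graphics primitive or a state change.
struct Command {
    enum Kind { Primitive, StateChange } kind;
    int payload;   // primitive id or state value (illustrative)
};

// Forward only those state changes that are followed by at least one primitive
// before the next state change; a state change immediately superseded by
// another is dropped.
std::vector<Command> filterState(const std::vector<Command>& in) {
    std::vector<Command> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i].kind == Command::StateChange) {
            // Peek ahead: if the very next command is also a state change,
            // this one can never affect a primitive, so skip it.
            if (i + 1 < in.size() && in[i + 1].kind == Command::StateChange)
                continue;
        }
        out.push_back(in[i]);
    }
    return out;
}

int main() {
    std::vector<Command> stream = {
        {Command::StateChange, 1},   // superseded by the next state change: dropped
        {Command::StateChange, 2},
        {Command::Primitive,   0},
        {Command::StateChange, 3},
        {Command::Primitive,   1},
    };
    std::vector<Command> filtered = filterState(stream);
    std::printf("%zu of %zu commands forwarded\n", filtered.size(), stream.size());
    return 0;
}
```
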
  • Patent number: 10083036
    Abstract: One embodiment of the present invention sets forth a technique for managing graphics processing resources in a tile-based architecture. The technique includes storing a release packet associated with a graphics processing resource in a buffer and initiating a replay of graphics primitives stored in the buffer and associated with the graphics processing resource. The technique further includes, for each tile included in a plurality of tiles and processed during the replay, reading the release packet and determining whether the tile is a last tile processed during the replay. The technique further includes determining not to transmit the release packet to a screen-space pipeline and continuing to read graphics data stored in the buffer if the tile is not the last tile to be processed during the replay, or transmitting the release packet to the screen-space pipeline if the tile is the last tile to be processed during the replay.
    Type: Grant
    Filed: October 3, 2013
    Date of Patent: September 25, 2018
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Andrei Khodakovsky, Jeffrey A. Bolz
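
A simplified model of the per-tile decision described above: during replay, the buffer is walked once per tile, and the release packet that frees the graphics processing resource is forwarded to the screen-space pipeline only on the last tile. The Packet and pipeline types below are assumptions for illustration, not the hardware interface:

```cpp
#include <cstdio>
#include <vector>

struct Packet {
    enum Kind { Primitive, Release } kind;
    int id;
};

// Stand-in for the screen-space pipeline: just counts what it receives.
struct ScreenSpacePipeline {
    int primitives = 0, releases = 0;
    void submit(const Packet& p) {
        if (p.kind == Packet::Primitive) ++primitives; else ++releases;
    }
};

// Replay the buffered packets once per tile.  A Release packet is transmitted
// only while processing the last tile; on earlier tiles it is read and skipped.
void replay(const std::vector<Packet>& buffer, int numTiles, ScreenSpacePipeline& ss) {
    for (int tile = 0; tile < numTiles; ++tile) {
        bool lastTile = (tile == numTiles - 1);
        for (const Packet& p : buffer) {
            if (p.kind == Packet::Release && !lastTile)
                continue;                 // resource is still needed by later tiles
            ss.submit(p);
        }
    }
}

int main() {
    std::vector<Packet> buffer = {{Packet::Primitive, 0},
                                  {Packet::Primitive, 1},
                                  {Packet::Release,   7}};   // frees resource 7
    ScreenSpacePipeline ss;
    replay(buffer, /*numTiles=*/4, ss);
    std::printf("primitives=%d releases=%d\n", ss.primitives, ss.releases);  // 8 and 1
    return 0;
}
```
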
  • Patent number: 10032243
    Abstract: One embodiment of the present invention sets forth a graphics subsystem configured to implement distributed cache tiling. The graphics subsystem includes one or more world-space pipelines, one or more screen-space pipelines, one or more tiling units, and a crossbar unit. Each world-space pipeline is implemented in a different processing entity and is coupled to a different tiling unit. Each screen-space pipeline is implemented in a different processing entity and is coupled to the crossbar unit. The tiling units are configured to receive primitives from the world-space pipelines, generate cache tile batches based on the primitives, and transmit the primitives to the screen-space pipelines. One advantage of the disclosed approach is that primitives are processed in application-programming-interface order in a highly parallel tiling architecture. Another advantage is that primitives are processed in cache tile order, which reduces memory bandwidth consumption and improves cache memory utilization.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: July 24, 2018
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Walter R. Steiner
  • Patent number: 9792122
    Abstract: One embodiment of the present invention includes a technique for processing graphics primitives in a tile-based architecture. The technique includes storing, in a buffer, a first plurality of graphics primitives and a first plurality of state bundles received from the world-space pipeline. The technique further includes determining, based on a first condition, that the first plurality of graphics primitives should be replayed from the buffer, and, in response, replaying the first plurality of graphics primitives against a first tile included in a first plurality of tiles. Replaying the first plurality of graphics primitives includes comparing each graphics primitive against the first tile to determine whether the graphics primitive intersects the first tile, determining that one or more graphics primitives intersects the first tile, and transmitting the one or more graphics primitives and one or more associated state bundles to a screen-space pipeline for processing.
    Type: Grant
    Filed: October 4, 2013
    Date of Patent: October 17, 2017
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Walter R. Steiner, Cynthia Ann Edgeworth Allison, Rouslan Dimitrov, Karim M. Abdalla, Dale L. Kirkland, Emmett M. Kilgariff
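
The replay step described above amounts to scanning the buffered primitives once per tile and forwarding only those whose screen-space extent intersects the tile, along with their associated state. The sketch below uses an axis-aligned bounding-box overlap test, which is an assumed simplification:

```cpp
#include <cstdio>
#include <vector>

// Axis-aligned screen-space rectangle used both for tiles and for the
// bounding box of a buffered primitive.
struct Rect { int x0, y0, x1, y1; };

bool overlaps(const Rect& a, const Rect& b) {
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

struct BufferedPrimitive {
    Rect bbox;       // screen-space bounds of the primitive
    int  stateId;    // id of the state bundle that applies to it (illustrative)
};

// Replay the buffered primitives against one tile: primitives that intersect
// the tile are forwarded (together with their state) to the screen-space
// pipeline, modeled here as the returned vector.
std::vector<BufferedPrimitive> replayAgainstTile(
        const std::vector<BufferedPrimitive>& buffer, const Rect& tile) {
    std::vector<BufferedPrimitive> forwarded;
    for (const BufferedPrimitive& prim : buffer)
        if (overlaps(prim.bbox, tile))
            forwarded.push_back(prim);
    return forwarded;
}

int main() {
    std::vector<BufferedPrimitive> buffer = {
        {{0,  0, 20, 20}, 1},     // intersects the tile
        {{50, 50, 60, 60}, 1},    // does not
    };
    Rect tile{0, 0, 32, 32};      // first tile of the first plurality of tiles
    std::printf("%zu primitive(s) forwarded\n", replayAgainstTile(buffer, tile).size());
    return 0;
}
```
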
  • Patent number: 9710874
    Abstract: One embodiment of the present invention sets forth a technique for mid-primitive execution preemption. When preemption is initiated, no new instructions are issued, in-flight instructions progress to an execution unit boundary, and the execution state is unloaded from the processing pipeline. The execution units within the processing pipeline, including the coarse rasterization unit, complete execution of in-flight instructions and become idle. However, rasterization of a triangle may be preempted at a coarse raster region boundary. The amount of context state to be stored is reduced because the execution units are idle. Preempting at the mid-primitive level during rasterization reduces the time from when preemption is initiated to when another process can execute because the entire triangle is not rasterized.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: July 18, 2017
    Assignee: NVIDIA Corporation
    Inventors: Gregory Scott Palmer, Ziyad S. Hakura, Emmett M. Kilgariff, Dale L. Kirkland, Lacky V. Shah
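
The idea above is that rasterization of a triangle need not finish before a context switch: the rasterizer checks for a pending preemption request at each coarse raster region boundary and, if one is pending, records how far through the triangle it got so rasterization can resume later. The sketch below models that check with an atomic flag and a saved region index; all names and the resume mechanism are illustrative assumptions:

```cpp
#include <atomic>
#include <cstdio>

// Saved context for a preempted triangle: which coarse raster region to resume at.
struct RasterContext {
    int triangleId = -1;
    int nextRegion = 0;
};

std::atomic<bool> preemptRequested{false};

// Rasterize one triangle as a sequence of coarse raster regions.  At every region
// boundary the preemption flag is checked; if set, the position is saved and the
// function returns early so another context can run.
bool rasterizeTriangle(int triangleId, int numRegions, RasterContext& ctx) {
    int start = (ctx.triangleId == triangleId) ? ctx.nextRegion : 0;
    for (int region = start; region < numRegions; ++region) {
        if (preemptRequested.load(std::memory_order_acquire)) {
            ctx.triangleId = triangleId;          // small amount of state to save,
            ctx.nextRegion = region;              // since execution units are idle
            return false;                         // preempted mid-primitive
        }
        std::printf("triangle %d: rasterized coarse region %d\n", triangleId, region);
    }
    ctx = RasterContext{};                        // triangle finished
    return true;
}

int main() {
    RasterContext ctx;
    preemptRequested = true;                      // preemption arrives immediately
    rasterizeTriangle(0, 8, ctx);                 // stops at a region boundary
    preemptRequested = false;
    rasterizeTriangle(0, 8, ctx);                 // resumes from the saved region
    return 0;
}
```
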
  • Patent number: 9639366
    Abstract: One embodiment of the present invention sets forth a technique for managing buffer table entries in a tile-based architecture. The technique includes binding a plurality of shader registers to a buffer table entry. The technique further includes processing at least one tile by reading a buffer table index stored in the shader register to access the buffer table entry, reading a buffer address stored in the buffer table entry, accessing data associated with the buffer address, and unbinding the shader register from the buffer table entry. The technique further includes determining that none of the shader registers is still bound to the buffer table entry and, in response, causing a release packet to be inserted into an instruction stream. The technique further includes determining that a last tile has been processed and, in response, transmitting the release packet to cause the buffer table entry to be released.
    Type: Grant
    Filed: October 3, 2013
    Date of Patent: May 2, 2017
    Assignee: NVIDIA Corporation
    Inventors: Karim M. Abdalla, Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland
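
The lifetime rule described above can be viewed as reference counting: each bound shader register holds a reference to the buffer table entry, a release packet is queued once the last register unbinds, and the entry is freed only after the last tile has been processed. The sketch below shows that bookkeeping with assumed structures; it is not the hardware mechanism itself:

```cpp
#include <cstdio>
#include <vector>

struct BufferTableEntry {
    unsigned long long bufferAddress = 0;
    int                bindings      = 0;     // shader registers currently bound
    bool               releaseQueued = false; // release packet inserted in stream
};

struct BufferTable {
    std::vector<BufferTableEntry> entries;

    void bind(int index)   { ++entries[index].bindings; }

    void unbind(int index) {
        BufferTableEntry& e = entries[index];
        if (--e.bindings == 0 && !e.releaseQueued) {
            e.releaseQueued = true;           // release packet goes into the stream
            std::printf("release packet queued for entry %d\n", index);
        }
    }

    // Called when the last tile has been processed and the queued release
    // packet is finally transmitted.
    void release(int index) {
        entries[index] = BufferTableEntry{};  // entry can now be reused
        std::printf("entry %d released\n", index);
    }
};

int main() {
    BufferTable table;
    table.entries.push_back({0x2000ull, 0, false});
    table.bind(0);     // two shader registers reference the entry
    table.bind(0);
    table.unbind(0);   // still bound once: no release packet yet
    table.unbind(0);   // last binding gone: release packet queued
    table.release(0);  // transmitted after the last tile
    return 0;
}
```
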
  • Patent number: 9633458
    Abstract: In a graphics processing pipeline, a processing unit establishes a bounding box around a polygon in order to identify sample points that are covered by the polygon. For a given sample point included within the bounding box, the processing unit constructs a set of lines that intersect at the sample point, where each line in the set of lines is parallel to at least one side of the polygon. When all vertices of the polygon reside on one side of at least one line in the set of lines, the processing unit may reduce the size of the bounding box to exclude the sample point.
    Type: Grant
    Filed: January 23, 2012
    Date of Patent: April 25, 2017
    Assignee: NVIDIA Corporation
    Inventors: Walter R. Steiner, Eric Lum, Dale L. Kirkland, Steven James Heinrich, David Charles Patrick
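
The coverage test described above can be restated as: through a candidate sample point, draw a line parallel to each side of the polygon; if all of the polygon's vertices lie strictly on one side of any such line, the polygon cannot cover the sample and the bounding box can be tightened past it. Below is a small 2-D sketch of that test for a triangle; the edge-function formulation is an assumption used for illustration:

```cpp
#include <array>
#include <cstdio>

struct Vec2 { float x, y; };

// Cross product of (b - a) and (c - a): the sign tells which side of line ab the point c is on.
float cross(const Vec2& a, const Vec2& b, const Vec2& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Returns true if the sample can be excluded: for some line through the sample
// that is parallel to a side of the triangle, all three vertices lie strictly
// on one side of that line, so the triangle cannot cover the sample.
bool sampleExcluded(const std::array<Vec2, 3>& tri, const Vec2& sample) {
    for (int e = 0; e < 3; ++e) {
        Vec2 dir{tri[(e + 1) % 3].x - tri[e].x, tri[(e + 1) % 3].y - tri[e].y};
        Vec2 lineEnd{sample.x + dir.x, sample.y + dir.y};   // line through the sample
        int positive = 0, negative = 0;
        for (const Vec2& v : tri) {
            float s = cross(sample, lineEnd, v);
            if (s > 0) ++positive;
            if (s < 0) ++negative;
        }
        if (positive == 3 || negative == 3)
            return true;   // all vertices strictly on one side: shrink the box past it
    }
    return false;
}

int main() {
    std::array<Vec2, 3> tri = {{{0, 0}, {4, 0}, {0, 4}}};
    std::printf("inside sample excluded: %d\n", sampleExcluded(tri, {1, 1}));  // 0
    std::printf("corner sample excluded: %d\n", sampleExcluded(tri, {5, 5}));  // 1
    return 0;
}
```
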
  • Patent number: 9594599
    Abstract: A work distribution unit distributes work batches to general processing clusters (GPCs) based on the number of streaming multiprocessors included in each GPC. Advantageously, each GPC receives an amount of work that is proportional to the amount of processing power afforded by the GPC. Embodiments include a method for distributing batches of processing tasks to two or more general processing clusters (GPCs), including the steps of updating a counter value for each of the two or more GPCs based on the number of enabled parallel processing units within each of the two or more GPCs, and distributing a batch of processing tasks to a first GPC of the two or more GPCs based on a counter value associated with the first GPC and based on a load signal received from the first GPC.
    Type: Grant
    Filed: October 14, 2009
    Date of Patent: March 14, 2017
    Assignee: NVIDIA Corporation
    Inventors: Philip Browning Johnson, Dale L. Kirkland, Karim M. Abdalla
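
The distribution rule above can be modeled by crediting each GPC in proportion to its number of enabled streaming multiprocessors and sending the next batch to a GPC that both has credit and is signaling that it can accept work. The sketch below is a simplified software model of that policy; the exact counter-update rule is an assumption:

```cpp
#include <cstdio>
#include <vector>

struct GPC {
    int enabledSMs;    // number of enabled parallel processing units
    int counter = 0;   // accumulated credit
    bool canAcceptWork() const { return true; }   // stand-in for the load signal
};

// Pick the GPC with the highest credit among those whose load signal allows more
// work, charge it one batch, and re-credit every GPC in proportion to its SM count.
int distributeBatch(std::vector<GPC>& gpcs, int batchCost) {
    int target = -1;
    for (int i = 0; i < static_cast<int>(gpcs.size()); ++i)
        if (gpcs[i].canAcceptWork() &&
            (target < 0 || gpcs[i].counter > gpcs[target].counter))
            target = i;
    if (target < 0) return -1;                     // every GPC reported "busy"
    gpcs[target].counter -= batchCost;
    for (GPC& g : gpcs) g.counter += g.enabledSMs; // proportional credit update
    return target;
}

int main() {
    std::vector<GPC> gpcs = {{4}, {2}};            // the 4-SM GPC should get ~2x the work
    int sentTo[2] = {0, 0};
    for (int batch = 0; batch < 12; ++batch)
        ++sentTo[distributeBatch(gpcs, /*batchCost=*/6)];
    std::printf("GPC0: %d batches, GPC1: %d batches\n", sentTo[0], sentTo[1]);  // 8 and 4
    return 0;
}
```
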
  • Patent number: 9542189
    Abstract: One embodiment of the present invention includes a technique for processing graphics primitives in a tile-based architecture. The technique includes storing, in a buffer, a first plurality of graphics primitives and a first plurality of state bundles received from a world-space pipeline, and transmitting the first plurality of graphics primitives to a screen-space pipeline for processing while a tiling function is enabled. The technique further includes storing, in the buffer, a second plurality of graphics primitives and a second plurality of state bundles received from the world-space pipeline. The technique further includes determining, based on a first condition, that the tiling function should be disabled and that the second plurality of graphics primitives should be flushed from the buffer, and transmitting the second plurality of graphics primitives to the screen-space pipeline for processing while the tiling function is disabled.
    Type: Grant
    Filed: October 4, 2013
    Date of Patent: January 10, 2017
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Joseph Cavanaugh, Dale L. Kirkland, Emmett M. Kilgariff
  • Patent number: 9536341
    Abstract: One embodiment of the present invention sets forth a technique for parallel distribution of primitives to multiple rasterizers. Multiple, independent geometry units perform geometry processing concurrently on different graphics primitives. A primitive distribution scheme delivers primitives from the multiple geometry units concurrently to multiple rasterizers at rates of multiple primitives per clock. The multiple, independent rasterizer units perform rasterization concurrently on one or more graphics primitives, enabling the rendering of multiple primitives per system clock.
    Type: Grant
    Filed: October 19, 2009
    Date of Patent: January 3, 2017
    Assignee: NVIDIA Corporation
    Inventors: Johnny S. Rhoades, Emmett M. Kilgariff, Michael C. Shebanow, Ziyad S. Hakura, Dale L. Kirkland, James Daniel Kelly
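
One simple way to picture the concurrent distribution described above is to give each rasterizer ownership of an interleaved set of screen-space regions and to route every primitive to the rasterizers whose regions its bounding box touches, so several primitives can be in flight to different rasterizers at once. The region-interleaving scheme below is an illustrative assumption, not the patented distribution scheme:

```cpp
#include <cstdio>
#include <set>
#include <vector>

struct BBox { int x0, y0, x1, y1; };   // screen-space bounds of a primitive, in pixels

// The screen is divided into fixed-size regions; region (rx, ry) is owned by
// rasterizer (rx + ry) % numRasterizers, which interleaves ownership.
std::set<int> rasterizersFor(const BBox& b, int regionSize, int numRasterizers) {
    std::set<int> owners;
    for (int ry = b.y0 / regionSize; ry <= (b.y1 - 1) / regionSize; ++ry)
        for (int rx = b.x0 / regionSize; rx <= (b.x1 - 1) / regionSize; ++rx)
            owners.insert((rx + ry) % numRasterizers);
    return owners;
}

int main() {
    const int regionSize = 16, numRasterizers = 4;
    std::vector<BBox> primitives = {{0, 0, 8, 8},      // small: one region, one rasterizer
                                    {0, 0, 64, 16}};   // wide: spans several rasterizers
    for (std::size_t i = 0; i < primitives.size(); ++i) {
        std::printf("primitive %zu ->", i);
        for (int r : rasterizersFor(primitives[i], regionSize, numRasterizers))
            std::printf(" rasterizer %d", r);
        std::printf("\n");
    }
    return 0;
}
```
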
  • Patent number: 9483270
    Abstract: One embodiment of the present invention sets forth a graphics subsystem configured to implement distributed tiled caching. The graphics subsystem includes one or more world-space pipelines, one or more screen-space pipelines, one or more tiling units, and a crossbar unit. Each world-space pipeline is implemented in a different processing entity and is coupled to a different tiling unit. Each screen-space pipeline is implemented in a different processing entity and is coupled to the crossbar unit. The tiling units are configured to receive primitives from the world-space pipelines, generate cache tile batches based on the primitives, and transmit the primitives to the screen-space pipelines. One advantage of the disclosed approach is that primitives are processed in application-programming-interface order in a highly parallel tiling architecture. Another advantage is that primitives are processed in cache tile order, which reduces memory bandwidth consumption and improves cache memory utilization.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: November 1, 2016
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Walter R. Steiner
  • Patent number: 9448804
    Abstract: One embodiment of the present invention sets forth a technique for managing buffer entries in a tile-based architecture. The technique includes receiving a first plurality of graphics primitives and a first buffer address at which attributes associated with the first plurality of graphics primitives are stored. The technique further includes, for each tile included in a plurality of tiles, transmitting the first plurality of graphics primitives and the first buffer address to a screen space pipeline and receiving an acknowledgement from the screen space pipeline indicating that processing the first plurality of graphics primitives has completed. The technique further includes determining that processing the first plurality of graphics primitives has completed for a last tile included in the plurality of tiles and that the acknowledgement has been received for each tile included in the plurality of tiles, and, in response, releasing a buffer entry associated with the first buffer address.
    Type: Grant
    Filed: October 3, 2013
    Date of Patent: September 20, 2016
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland
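
The release condition described above is that the buffer entry holding the attributes is freed only after the batch has been transmitted for every tile and an acknowledgement has come back from the screen-space pipeline for every tile. The sketch below tracks those acknowledgements with a simple counter; the structures are assumptions for illustration:

```cpp
#include <cstdio>

// Tracks one buffer entry holding attributes shared by a batch of primitives.
struct BufferEntryTracker {
    unsigned long long bufferAddress;
    int tilesTotal;
    int acksReceived = 0;

    // Called when the screen-space pipeline acknowledges that it has finished
    // processing the batch for one tile.
    void acknowledge() {
        ++acksReceived;
        if (acksReceived == tilesTotal) {
            // Last tile processed and every acknowledgement received:
            // the entry (and the attributes it points to) can be released.
            std::printf("releasing buffer entry at 0x%llx\n", bufferAddress);
        }
    }
};

int main() {
    const int numTiles = 4;
    BufferEntryTracker entry{0x1000ull, numTiles};

    for (int tile = 0; tile < numTiles; ++tile) {
        // Transmit the batch of primitives plus the buffer address for this tile,
        // then (eventually) receive the acknowledgement back.
        std::printf("tile %d: primitives + buffer address sent\n", tile);
        entry.acknowledge();
    }
    return 0;
}
```
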