Patents by Inventor Ziyad S. Hakura

Ziyad S. Hakura has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11663767
    Abstract: Attributes of graphics objects are processed in a plurality of graphics processing pipelines. A streaming multiprocessor (SM) retrieves a first set of parameters associated with a set of graphics objects from a first set of buffers. The SM performs a first set of operations on the first set of parameters according to a first phase of processing to produce a second set of parameters stored in a second set of buffers. The SM performs a second set of operations on the second set of parameters according to a second phase of processing to produce a third set of parameters stored in a third set of buffers. One advantage of the disclosed techniques is that work is redistributed from a first phase to a second phase of graphics processing without having to copy the attributes to and retrieve the attributes from the cache or system memory, resulting in reduced power consumption.
    Type: Grant
    Filed: February 20, 2013
    Date of Patent: May 30, 2023
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Dale L. Kirkland
  • Patent number: 11107176
    Abstract: A tile-based system for processing graphics data. The tile-based system includes a first screen-space pipeline, a cache unit, and a first tiling unit. The first tiling unit is configured to transmit a first set of primitives that overlap a first cache tile and a first prefetch command to the first screen-space pipeline for processing, and transmit a second set of primitives that overlap a second cache tile to the first screen-space pipeline for processing. The first prefetch command is configured to cause the cache unit to fetch data associated with the second cache tile from an external memory unit. The first tiling unit may also be configured to transmit a first flush command to the screen-space pipeline for processing with the first set of primitives. The first flush command is configured to cause the cache unit to flush data associated with the first cache tile.
    Type: Grant
    Filed: October 1, 2013
    Date of Patent: August 31, 2021
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Rouslan Dimitrov
  • Patent number: 10489875
    Abstract: One embodiment of the present invention includes a method for tracking which cache tiles included in a plurality of cache tiles are intersected by a plurality of bounding boxes. The method includes receiving the plurality of bounding boxes, wherein each bounding box is associated with one or more graphics primitives being rendered to a render surface, and wherein the render surface is divided into the plurality of cache tiles. The method further includes, for each bounding box included in the plurality of bounding boxes, determining one or more cache tiles included in the plurality of cache tiles that are intersected by the bounding box, and storing a result in an array for each cache tile that is intersected by the bounding box. Finally, the method includes determining not to process a cache tile included in the plurality of cache tiles based on the results stored in the array.
    Type: Grant
    Filed: August 14, 2013
    Date of Patent: November 26, 2019
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Cynthia Allison
  • Patent number: 10438314
    Abstract: One embodiment of the present invention sets forth a graphics processing system. The graphics processing system includes a screen-space pipeline and a tiling unit. The screen-space pipeline is configured to perform visibility testing and fragment shading. The tiling unit is configured to determine that a first set of primitives overlaps a first cache tile. The tiling unit is also configured to first transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a z-only mode, and then transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a normal mode. In the z-only mode, at least some fragment shading operations are disabled in the screen-space pipeline. In the normal mode, fragment shading operations are enabled.
    Type: Grant
    Filed: April 23, 2018
    Date of Patent: October 8, 2019
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Jerome F. Duluk, Jr.
  • Publication number: 20190243652
    Abstract: One embodiment of the present invention sets forth a graphics processing system. The graphics processing system includes a screen-space pipeline and a tiling unit. The screen-space pipeline is configured to perform visibility testing and fragment shading. The tiling unit is configured to determine that a first set of primitives overlaps a first cache tile. The tiling unit is also configured to first transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a z-only mode, and then transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a normal mode. In the z-only mode, at least some fragment shading operations are disabled in the screen-space pipeline. In the normal mode, fragment shading operations are enabled.
    Type: Application
    Filed: April 23, 2018
    Publication date: August 8, 2019
    Inventors: Ziyad S. HAKURA, Jerome F. DULUK, JR.
  • Patent number: 10282803
    Abstract: One embodiment of the present invention includes a graphics subsystem that includes a tiling unit, a crossbar unit, and a screen-space pipeline. The crossbar unit is configured to transmit primitives interleaved with state change commands to the tiling unit. The tiling unit is configured to record an initial state associated with the primitives and to transmit to the screen-space pipeline one or more primitives in the primitives that overlap a first cache tile. The tiling unit is further configured to transmit the initial state to the screen-space pipeline and to transmit to the screen-space pipeline one or more primitives in the primitives that overlap a second cache tile. The tiling unit includes a state filter block configured to determine that a first state change in the state change commands is followed by a second state change, without an intervening primitive, and to forego transmitting the first state change in response.
    Type: Grant
    Filed: September 3, 2013
    Date of Patent: May 7, 2019
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Pierre Souillot, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Walter R. Steiner
  • Patent number: 10223122
    Abstract: One embodiment of the present invention sets forth a graphics processing system configured to track event counts in a tile-based architecture. The graphics processing system includes a screen-space pipeline and a tiling unit. The screen-space pipeline includes a first unit, a count memory associated with the first unit, and an accumulating memory associated with the first unit. The first unit is configured to detect an event type and increment the count memory. The tiling unit is configured to cause the screen-space pipeline to update an external memory address to reflect a first value stored in the count memory when the first unit completes processing of a first set of primitives. The tiling unit is also configured to cause the screen-space pipeline to update the accumulating memory to reflect a second value stored in the count memory when the first unit completes processing of a second set of primitives.
    Type: Grant
    Filed: April 9, 2017
    Date of Patent: March 5, 2019
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Jerome F. Duluk, Jr.
  • Publication number: 20180307490
    Abstract: One embodiment of the present invention sets forth a graphics processing system. The graphics processing system includes a screen-space pipeline and a tiling unit. The screen-space pipeline is configured to perform visibility testing and fragment shading. The tiling unit is configured to determine that a first set of primitives overlaps a first cache tile. The tiling unit is also configured to first transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a z-only mode, and then transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a normal mode. In the z-only mode, at least some fragment shading operations are disabled in the screen-space pipeline. In the normal mode, fragment shading operations are enabled.
    Type: Application
    Filed: April 23, 2018
    Publication date: October 25, 2018
    Inventors: Ziyad S. HAKURA, Jerome F. DULUK, JR.
  • Patent number: 10083036
    Abstract: One embodiment of the present invention sets forth a technique for managing graphics processing resources in a tile-based architecture. The technique includes storing a release packet associated with a graphics processing resource in a buffer and initiating a replay of graphics primitives stored in the buffer and associated with the graphics processing resource. The technique further includes, for each tile included in a plurality of tiles and processed during the replay, reading the release packet and determining whether the tile is a last tile processed during the replay. The technique further includes determining not to transmit the release packet to a screen-space pipeline and continuing to read graphics data stored in the buffer if the tile is not the last tile to be processed during the replay, or transmitting the release packet to the screen-space pipeline if the tile is the last tile to be processed during the replay.
    Type: Grant
    Filed: October 3, 2013
    Date of Patent: September 25, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Andrei Khodakovsky, Jeffrey A. Bolz
  • Patent number: 10032242
    Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.
    Type: Grant
    Filed: October 1, 2013
    Date of Patent: July 24, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Jeffrey A. Bolz, Amanpreet Grewal, Matthew Johnson, Andrei Khodakovsky
  • Patent number: 10032243
    Abstract: One embodiment of the present invention sets forth a graphics subsystem configured to implement distributed cache tiling. The graphics subsystem includes one or more world-space pipelines, one or more screen-space pipelines, one or more tiling units, and a crossbar unit. Each world-space pipeline is implemented in a different processing entity and is coupled to a different tiling unit. Each screen-space pipeline is implemented in a different processing entity and is coupled to the crossbar unit. The tiling units are configured to receive primitives from the world-space pipelines, generate cache tile batches based on the primitives, and transmit the primitives to the screen-space pipelines. One advantage of the disclosed approach is that primitives are processed in application-programming-interface order in a highly parallel tiling architecture. Another advantage is that primitives are processed in cache tile order, which reduces memory bandwidth consumption and improves cache memory utilization.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: July 24, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Walter R. Steiner
  • Patent number: 9952868
    Abstract: One embodiment of the present invention sets forth a graphics processing system. The graphics processing system includes a screen-space pipeline and a tiling unit. The screen-space pipeline is configured to perform visibility testing and fragment shading. The tiling unit is configured to determine that a first set of primitives overlaps a first cache tile. The tiling unit is also configured to first transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a z-only mode, and then transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a normal mode. In the z-only mode, at least some fragment shading operations are disabled in the screen-space pipeline. In the normal mode, fragment shading operations are enabled.
    Type: Grant
    Filed: October 1, 2013
    Date of Patent: April 24, 2018
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Jerome F. Duluk, Jr.
  • Patent number: 9830741
    Abstract: Techniques are disclosed for processing graphics objects in a stage of a graphics processing pipeline. The techniques include receiving a graphics primitive associated with the graphics object, and determining a plurality of attributes corresponding to one or more vertices associated with the graphics primitive. The techniques further include determining values for one or more state parameters associated with a downstream stage of the graphics processing pipeline based on a visual effect associated with the graphics primitive. The techniques further include transmitting the state parameter values to the downstream stage of the graphics processing pipeline. One advantage of the disclosed techniques is that visual effects are flexibly and efficiently performed.
    Type: Grant
    Filed: November 7, 2012
    Date of Patent: November 28, 2017
    Assignee: NVIDIA Corporation
    Inventors: Emmett M. Kilgariff, Morgan McGuire, Yury Y. Uralsky, Ziyad S. Hakura
  • Patent number: 9792122
    Abstract: One embodiment of the present invention includes a technique for processing graphics primitives in a tile-based architecture. The technique includes storing, in a buffer, a first plurality of graphics primitives and a first plurality of state bundles received from the world-space pipeline. The technique further includes determining, based on a first condition, that the first plurality of graphics primitives should be replayed from the buffer, and, in response, replaying the first plurality of graphics primitives against a first tile included in a first plurality of tiles. Replaying the first plurality of graphics primitives includes comparing each graphics primitive against the first tile to determine whether the graphics primitive intersects the first tile, determining that one or more graphics primitives intersects the first tile, and transmitting the one or more graphics primitives and one or more associated state bundles to a screen-space pipeline for processing.
    Type: Grant
    Filed: October 4, 2013
    Date of Patent: October 17, 2017
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Walter R. Steiner, Cynthia Ann Edgeworth Allison, Rouslan Dimitrov, Karim M. Abdalla, Dale L. Kirkland, Emmett M. Kilgariff
  • Patent number: 9779533
    Abstract: One embodiment of the present invention includes a method for processing graphics objects. The method includes receiving a first draw-call and a second draw-call. The method also includes dividing the first draw-call into a first set of sub-draw-calls and the second draw-call into a second set of sub-draw-calls. The method further includes identifying a first screen tile. The method also includes identifying a first group of sub-draw-calls included in the first set of sub-draw-calls that overlap the first screen tile and a second group of sub-draw-calls included in the second set of sub-draw-calls that overlap the first screen tile. The method further includes causing the first group of sub-draw-calls and the second group of sub-draw-calls to be processed together.
    Type: Grant
    Filed: January 27, 2014
    Date of Patent: October 3, 2017
    Assignee: NVIDIA Corporation
    Inventors: Rouslan Dimitrov, Ziyad S. Hakura
  • Patent number: 9734548
    Abstract: One embodiment of the present invention includes techniques for adaptively sizing cache tiles in a graphics system. A device driver associated with a graphics system sets a cache tile size associated with a cache tile to a first size. The device driver detects a change from a first render target configuration that includes a first set of render targets to a second render target configuration that includes a second set of render targets. The device driver sets the cache tile size to a second size based on the second render target configuration. One advantage of the disclosed approach is that the cache tile size is adaptively sized, resulting in fewer cache tiles for less complex render target configurations. Adaptively sizing cache tiles leads to more efficient processor utilization and reduced power requirements. In addition, a unified L2 cache tile allows dynamic partitioning of cache memory between cache tile data and other data.
    Type: Grant
    Filed: August 28, 2013
    Date of Patent: August 15, 2017
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Rouslan Dimitrov, Emmett M. Kilgariff, Andrei Khodakovsky
  • Patent number: 9720842
    Abstract: A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy.
    Type: Grant
    Filed: February 20, 2013
    Date of Patent: August 1, 2017
    Assignee: NVIDIA Corporation
    Inventors: Rouslan Dimitrov, Rui Bastos, Ziyad S. Hakura, Eric B. Lum
  • Publication number: 20170213313
    Abstract: One embodiment of the present invention sets forth a graphics processing system configured to track event counts in a tile-based architecture. The graphics processing system includes a screen-space pipeline and a tiling unit. The screen-space pipeline includes a first unit, a count memory associated with the first unit, and an accumulating memory associated with the first unit. The first unit is configured to detect an event type and increment the count memory. The tiling unit is configured to cause the screen-space pipeline to update an external memory address to reflect a first value stored in the count memory when the first unit completes processing of a first set of primitives. The tiling unit is also configured to cause the screen-space pipeline to update the accumulating memory to reflect a second value stored in the count memory when the first unit completes processing of a second set of primitives.
    Type: Application
    Filed: April 9, 2017
    Publication date: July 27, 2017
    Inventors: Ziyad S. HAKURA, Jerome F. DULUK
  • Publication number: 20170206623
    Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.
    Type: Application
    Filed: October 1, 2013
    Publication date: July 20, 2017
    Applicant: NVIDIA CORPORATION
    Inventors: Ziyad S. HAKURA, Jeffrey A. BOLZ, Amanpreet GREWAL, Matthew JOHNSON, Andrei KHODAKOVSKY
  • Patent number: 9710874
    Abstract: One embodiment of the present invention sets forth a technique for mid-primitive execution preemption. When preemption is initiated, no new instructions are issued, in-flight instructions progress to an execution unit boundary, and the execution state is unloaded from the processing pipeline. The execution units within the processing pipeline, including the coarse rasterization unit, complete execution of in-flight instructions and become idle. However, rasterization of a triangle may be preempted at a coarse raster region boundary. The amount of context state to be stored is reduced because the execution units are idle. Preempting at the mid-primitive level during rasterization reduces the time from when preemption is initiated to when another process can execute because the entire triangle is not rasterized.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: July 18, 2017
    Assignee: NVIDIA Corporation
    Inventors: Gregory Scott Palmer, Ziyad S. Hakura, Emmett M. Kilgariff, Dale L. Kirkland, Lacky V. Shah
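
Illustrative sketch: the abstract of patent 10489875 above describes recording, in an array, which cache tiles are intersected by the bounding boxes of the primitives being rendered, and then skipping tiles that no bounding box touches. The Python below is a minimal software sketch of that idea only, not the patented hardware or driver implementation. The names `BoundingBox`, `mark_touched_tiles`, and `tiles_to_process`, the inclusive pixel coordinates, and the uniform tile grid are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical screen-space box; all four edges are inclusive pixel coordinates."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def mark_touched_tiles(boxes, surface_w, surface_h, tile_w, tile_h):
    """Return a 2D array (rows of booleans) with one entry per cache tile.

    An entry is True when at least one bounding box intersects that tile.
    """
    tiles_x = -(-surface_w // tile_w)  # ceiling division
    tiles_y = -(-surface_h // tile_h)
    touched = [[False] * tiles_x for _ in range(tiles_y)]

    for box in boxes:
        # Clamp the box to the render surface; skip boxes entirely off-surface.
        x0, y0 = max(box.x_min, 0), max(box.y_min, 0)
        x1, y1 = min(box.x_max, surface_w - 1), min(box.y_max, surface_h - 1)
        if x0 > x1 or y0 > y1:
            continue
        # Every tile whose index range overlaps the clamped box is marked.
        for ty in range(y0 // tile_h, y1 // tile_h + 1):
            for tx in range(x0 // tile_w, x1 // tile_w + 1):
                touched[ty][tx] = True
    return touched


def tiles_to_process(touched):
    """List (tile_x, tile_y) for marked tiles; unmarked tiles are skipped."""
    return [
        (tx, ty)
        for ty, row in enumerate(touched)
        for tx, hit in enumerate(row)
        if hit
    ]


if __name__ == "__main__":
    boxes = [BoundingBox(10, 10, 70, 20), BoundingBox(200, 100, 300, 200)]
    touched = mark_touched_tiles(boxes, surface_w=256, surface_h=128,
                                 tile_w=64, tile_h=64)
    print(tiles_to_process(touched))
```

In this sketch a 256x128 surface with 64x64 tiles yields a 4x2 grid, and only the tiles overlapped by one of the two boxes are returned for processing; the remaining tiles are never visited.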