Patents by Inventor Ziyad S. Hakura

Ziyad S. Hakura has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Distributed clip, cull, viewport transform and perspective correction

Patent number: 8704835

Abstract: A parallel processing subsystem includes a plurality of general processing clusters (GPCs). Each GPC includes one or more clipping, culling, viewport transformation, and perspective correction engines (VPC). Because the VPCs are distributed per GPC, each VPC can process graphics primitives in parallel with the other VPCs.

Type: Grant

Filed: October 8, 2009

Date of Patent: April 22, 2014

Assignee: Nvidia Corporation

Inventors: Ziyad S. Hakura, Emmett M. Kilgariff
Distributing primitives to multiple rasterizers

Patent number: 8704836

Abstract: One embodiment of the present invention sets forth a technique for parallel distribution of primitives to multiple rasterizers. Multiple, independent geometry units perform geometry processing concurrently on different graphics primitives. A primitive distribution scheme delivers primitives from the multiple geometry units concurrently to multiple rasterizers at rates of multiple primitives per clock. The multiple, independent rasterizer units perform rasterization concurrently on one or more graphics primitives, enabling the rendering of multiple primitives per system clock.

Type: Grant

Filed: October 19, 2009

Date of Patent: April 22, 2014

Assignee: NVIDIA Corporation

Inventors: Johnny S. Rhoades, Steven E. Molnar, Emmett M. Kilgariff, Michael C. Shebanow, Ziyad S. Hakura, Dale L. Kirkland, James Daniel Kelly
Calculation of plane equations after determination of Z-buffer visibility

Patent number: 8692829

Abstract: One embodiment of the present invention sets forth a technique for computing plane equations for primitive shading after non-visible pixels are removed by z culling operations and pixel coverage has been determined. The z plane equations are computed before the plane equations for non-z primitive attributes are computed. The z plane equations are then used to perform screen-space z culling of primitives during and following rasterization. Culling of primitives is also performed based on pixel sample coverage. Consequently, primitives that have visible pixels after z culling operations reach the primitive shading unit. The non-z plane equations are only computed for geometry that is visible after the z culling operations. The primitive shading unit does not need to fetch vertex attributes from memory and does not need to compute non-z plane equations for the culled primitives.

Type: Grant

Filed: September 7, 2010

Date of Patent: April 8, 2014

Assignee: Nvidia Corporation

Inventors: Ziyad S. Hakura, Emmett M. Kilgariff
Order-preserving distributed rasterizer

Patent number: 8587581

Abstract: One embodiment of the present invention sets forth a technique for rendering graphics primitives in parallel while maintaining the API primitive ordering. Multiple, independent geometry units perform geometry processing concurrently on different graphics primitives. A primitive distribution scheme delivers primitives concurrently to multiple rasterizers at rates of multiple primitives per clock while maintaining the primitive ordering for each pixel. The multiple, independent rasterizer units perform rasterization concurrently on one or more graphics primitives, enabling the rendering of multiple primitives per system clock.

Type: Grant

Filed: October 15, 2009

Date of Patent: November 19, 2013

Assignee: Nvidia Corporation

Inventors: Steven E. Molnar, Emmett M. Kilgariff, Johnny S. Rhoades, Timothy John Purcell, Sean J. Treichler, Ziyad S. Hakura, Franklin C. Crow, James C. Bowman
Cull before vertex attribute fetch and vertex lighting

Patent number: 8564616

Abstract: One embodiment of the invention sets forth a mechanism for compiling a vertex shader program into two portions, a culling portion and a shading portion. The culling portion of the compiled vertex shader program specifies vertex attributes and instructions of the vertex shader program needed to determine whether early vertex culling operations should be performed on a batch of vertices associated with one or more primitives of a graphics scene. The shading portion of the compiled vertex shader program specifies the remaining vertex attributes and instructions of the vertex shader program for performing vertex lighting and performing other operations on the vertices in the batch of vertices. When the compiled vertex shader program is executed by graphics processing hardware, the shading portion of the compiled vertex shader is executed only when early vertex culling operations are not performed on the batch of vertices.

Type: Grant

Filed: July 17, 2009

Date of Patent: October 22, 2013

Assignee: Nvidia Corporation

Inventors: Ziyad S. Hakura, John Erik Lindholm, Emmett M. Kilgariff, Robert Ohannessian, Scott R. Whitman, James C. Bowman, Patrick R. Brown, Ross A. Cunniff
Cull before vertex attribute fetch and vertex lighting

Patent number: 8542247

Abstract: One embodiment of the invention sets forth a mechanism for compiling a vertex shader program into two portions, a culling portion and a shading portion. The culling portion of the compiled vertex shader program specifies vertex attributes and instructions of the vertex shader program needed to determine whether early vertex culling operations should be performed on a batch of vertices associated with one or more primitives of a graphics scene. The shading portion of the compiled vertex shader program specifies the remaining vertex attributes and instructions of the vertex shader program for performing vertex lighting and performing other operations on the vertices in the batch of vertices. When the compiled vertex shader program is executed by graphics processing hardware, the shading portion of the compiled vertex shader is executed only when early vertex culling operations are not performed on the batch of vertices.

Type: Grant

Filed: July 17, 2009

Date of Patent: September 24, 2013

Assignee: Nvidia Corporation

Inventors: Ziyad S. Hakura, John Erik Lindholm, Emmett M. Kilgariff, Robert Ohannessian, Scott R. Whitman, James C. Bowman, Patrick R. Brown, Ross A. Cunniff
Generating clip state for a batch of vertices

Patent number: 8384736

Abstract: One embodiment of the present invention sets forth a technique for generating a batch clip state stored in a clip state machine (CSM) associated with a batch of vertices. Per-vertex clip state is generated for each vertex in the batch of vertices based on the position of each vertex relative to each clip plane. For a given vertex, per-vertex clip state indicates whether the vertex is inside or outside each of the one or more clip planes. The per-vertex clip states of all the vertices in the batch of vertices are coalesced into a batch clip state by determining whether each vertex in the batch of vertices is inside every clip plane, each vertex is outside at least one clip plane, or neither. The batch clip state is stored in the CSM associated with the thread group that processes the batch of vertices, where it can be accessed by further stages of the graphics pipeline.

Type: Grant

Filed: October 14, 2009

Date of Patent: February 26, 2013

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Ziyad S. Hakura
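
The coalescing step described in the abstract of patent 8384736 can be sketched as follows. This is a minimal illustration and not the patented implementation: the names `vertex_clip_state`, `batch_clip_state`, and `BatchClipState`, the bitmask encoding, and the plane convention (a vertex is inside a plane when `a*x + b*y + c*z + d*w >= 0`) are all assumptions made for the example.

```python
from enum import Enum


class BatchClipState(Enum):
    # Hypothetical names for the three outcomes listed in the abstract.
    ALL_INSIDE = "every vertex is inside every clip plane"
    ALL_OUTSIDE = "every vertex is outside at least one clip plane"
    NEITHER = "neither"


def vertex_clip_state(position, planes):
    """Return a bitmask; bit i is set when the vertex is outside plane i."""
    x, y, z, w = position
    mask = 0
    for i, (a, b, c, d) in enumerate(planes):
        # Assumed convention: negative signed distance means "outside".
        if a * x + b * y + c * z + d * w < 0:
            mask |= 1 << i
    return mask


def batch_clip_state(positions, planes):
    """Coalesce the per-vertex clip states of a batch into one batch state."""
    masks = [vertex_clip_state(p, planes) for p in positions]
    if all(m == 0 for m in masks):
        return BatchClipState.ALL_INSIDE
    if all(m != 0 for m in masks):
        return BatchClipState.ALL_OUTSIDE
    return BatchClipState.NEITHER
```

With the two planes `x >= -w` and `x <= w` written as `(1, 0, 0, 1)` and `(-1, 0, 0, 1)`, a batch whose vertices all lie between them coalesces to `ALL_INSIDE`, while a batch that straddles a plane coalesces to `NEITHER`.
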
TIME SLICE PROCESSING OF TESSELLATION AND GEOMETRY SHADERS

Publication number: 20130038620

Abstract: One embodiment of the present invention sets forth a technique for redistributing geometric primitives generated by tessellation and geometry shaders for processing by multiple graphics pipelines. Geometric primitives that are generated in a first processing cycle are collected and redistributed more evenly and in smaller tasks to the multiple graphics pipelines for vertex processing in a second processing cycle. The smaller tasks do not exceed the resource limits of a graphics pipeline and the per-vertex processing workloads of the graphics pipelines in the second cycle are balanced and make full use of resources. Therefore, the performance of the tessellation and geometry shaders is improved.

Type: Application

Filed: August 11, 2011

Publication date: February 14, 2013

Inventors: Ziyad S. Hakura, Emmett M. Kilgariff, Dale L. Kirkland, Johnny S. Rhoades, Cynthia Ann Edgeworth Allison, Karim M. Abdalla
Distributed calculation of plane equations

Patent number: 8310482

Abstract: A system for distributed plane equation calculations. A work distribution unit is configured to receive a set of vertex data that includes meta data associated with each vertex in a modeled three-dimensional scene, to divide the set of vertex data into a plurality of batches of vertices, and to distribute the plurality of batches of vertices to one or more general processing clusters (GPCs). A processing cluster array includes the one or more GPCs, where each GPC includes one or more shader-primitive-controller units (SPMs), and each SPM is configured to calculate plane equation coefficients for a subset of the vertices included in a batch of vertices. Advantageously, a distributed configuration of multiple plane equation calculation units decreases the size of the data bus that carries plane equation coefficients and increases overall processing throughput.

Type: Grant

Filed: December 1, 2008

Date of Patent: November 13, 2012

Assignee: NVIDIA Corporation

Inventors: Ziyad S. Hakura, Emmett M. Kilgariff
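
For readers unfamiliar with the term, the plane equation coefficients mentioned in the abstract of patent 8310482 can be illustrated with the standard screen-space formulation: an attribute that varies linearly over a triangle satisfies `value = A*x + B*y + C`. The sketch below solves for `(A, B, C)` from three vertices. The function name `plane_equation` and the `(x, y, value)` vertex layout are assumptions for this example; the abstract does not specify how each SPM performs the calculation.

```python
def plane_equation(v0, v1, v2):
    """Solve value = A*x + B*y + C from three (x, y, value) vertices."""
    (x0, y0, a0), (x1, y1, a1), (x2, y2, a2) = v0, v1, v2
    # Twice the signed area of the triangle; zero means it is degenerate.
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("degenerate triangle has no unique plane equation")
    # Cramer's rule on the two edge equations.
    A = ((a1 - a0) * (y2 - y0) - (a2 - a0) * (y1 - y0)) / det
    B = ((x1 - x0) * (a2 - a0) - (x2 - x0) * (a1 - a0)) / det
    C = a0 - A * x0 - B * y0
    return A, B, C
```

The same routine applies to depth or to any other interpolated attribute, which is why the three coefficients per attribute are what has to travel over the data bus the abstract refers to.
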
Screen compression

Patent number: 7965895

Abstract: Methods, circuits, and apparatus for reducing memory bandwidth used by a graphics processor. Uncompressed tiles are read from a display buffer portion of a graphics memory and received by an encoder. The uncompressed tiles are compressed and written back to the graphics memory. When a tile is needed again before it has been modified, the compressed version is read from memory, decompressed, and displayed. To reduce the number of unnecessary writes of compressed tiles to memory, a tile is only written to memory if it has remained static for some number of refresh cycles. Also, to prevent a large number of compressed tiles being written to the display buffer in one refresh cycle, the encoder can be throttled after a number of tiles have been written. Validity information can be stored for use by a CRTC. If a tile is updated, the validity information is updated such that invalid compressed data is not read from memory and displayed.

Type: Grant

Filed: August 10, 2007

Date of Patent: June 21, 2011

Assignee: NVIDIA Corporation

Inventors: John M. Danskin, Ziyad S. Hakura, Edward L. Riegelsberger, Jason M. Musicer, Stephen D. Lew
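
The write policy in the abstract of patent 7965895 (compress a tile only after it has stayed static for some number of refresh cycles, throttle the encoder per refresh, and track validity) can be sketched in software as below. This is an illustrative model only: the class `TileCompressor`, its parameters `static_threshold` and `max_writes_per_refresh`, and the use of `zlib` as a stand-in codec are assumptions, not details taken from the patent.

```python
import zlib


class TileCompressor:
    """Toy model of the static-tile compression policy (hypothetical API)."""

    def __init__(self, static_threshold=3, max_writes_per_refresh=2):
        self.static_threshold = static_threshold
        self.max_writes_per_refresh = max_writes_per_refresh
        self.static_count = {}  # tile id -> refreshes seen without an update
        self.compressed = {}    # tile id -> compressed bytes
        self.valid = set()      # tile ids whose compressed copy is current

    def update_tile(self, tile_id):
        """A tile was modified: invalidate its compressed copy."""
        self.valid.discard(tile_id)
        self.static_count[tile_id] = 0

    def refresh(self, tiles):
        """Run one refresh cycle over {tile id: raw bytes}; return shown data."""
        writes = 0
        shown = {}
        for tile_id, raw in tiles.items():
            if tile_id in self.valid:
                # Read the smaller compressed copy instead of the raw tile.
                shown[tile_id] = zlib.decompress(self.compressed[tile_id])
                continue
            shown[tile_id] = raw
            self.static_count[tile_id] = self.static_count.get(tile_id, 0) + 1
            is_static = self.static_count[tile_id] >= self.static_threshold
            if is_static and writes < self.max_writes_per_refresh:
                self.compressed[tile_id] = zlib.compress(raw)
                self.valid.add(tile_id)
                writes += 1  # throttle: bounded compressed writes per refresh
        return shown
```

A caller would invoke `update_tile` whenever the renderer touches a tile and `refresh` once per display refresh; tiles that keep changing never reach the threshold and so are never compressed.
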
DISTRIBUTED STREAM OUTPUT IN A PARALLEL PROCESSING UNIT

Publication number: 20110141122

Abstract: A technique for performing stream output operations in a parallel processing system is disclosed. A stream synchronization unit is provided that enables the parallel processing unit to track batches of vertices being processed in a graphics processing pipeline. A plurality of stream output units is also provided, where each stream output unit writes vertex attribute data to one or more stream output buffers for a portion of the batches of vertices. A messaging protocol is implemented between the stream synchronization unit and the plurality of stream output units that ensures that each of the stream output units writes vertex attribute data for the particular batch of vertices distributed to that particular stream output unit in the same order in the stream output buffers as the order in which the batch of vertices was received from a device driver by the parallel processing unit.

Type: Application

Filed: September 29, 2010

Publication date: June 16, 2011

Inventors: Ziyad S. Hakura, Rohit Gupta, Michael C. Shebanow, Emmett M. Kilgariff
VERTEX ATTRIBUTE BUFFER FOR INLINE IMMEDIATE ATTRIBUTES AND CONSTANTS

Publication number: 20110102448

Abstract: One embodiment of the present invention sets forth a technique for providing primitives and vertex attributes to the graphics pipeline. A primitive distribution unit constructs the batches of primitives and writes inline attributes and constants to a vertex attribute buffer (VAB) rather than passing the inline attributes directly to the graphics pipeline. A batch includes indices to attributes, where the attributes for each vertex are stored in a different VAB. The same VAB may be referenced by all of the vertices in a batch or different VABs may be referenced by different vertices in one or more batches. The batches are routed to the different processing engines in the graphics pipeline and each of the processing engines reads the VABs as needed to process the primitives. The number of parallel processing engines may be changed without changing the width or speed of the interconnect used to write the VABs.

Type: Application

Filed: September 30, 2010

Publication date: May 5, 2011

Inventors: Ziyad S. Hakura, James C. Bowman, Jimmy Earl Chambers, Philip Browning Johnson, Philip Payman Shirvani
ORDER-PRESERVING DISTRIBUTED RASTERIZER

Publication number: 20110090220

Abstract: One embodiment of the present invention sets forth a technique for rendering graphics primitives in parallel while maintaining the API primitive ordering. Multiple, independent geometry units perform geometry processing concurrently on different graphics primitives. A primitive distribution scheme delivers primitives concurrently to multiple rasterizers at rates of multiple primitives per clock while maintaining the primitive ordering for each pixel. The multiple, independent rasterizer units perform rasterization concurrently on one or more graphics primitives, enabling the rendering of multiple primitives per system clock.

Type: Application

Filed: October 15, 2009

Publication date: April 21, 2011

Inventors: Steven E. Molnar, Emmett M. Kilgariff, Johnny S. Rhoades, Timothy John Purcell, Sean J. Treichler, Ziyad S. Hakura, Franklin C. Crow, James C. Bowman
CALCULATION OF PLANE EQUATIONS AFTER DETERMINATION OF Z-BUFFER VISIBILITY

Publication number: 20110080406

Abstract: One embodiment of the present invention sets forth a technique for computing plane equations for primitive shading after non-visible pixels are removed by z culling operations and pixel coverage has been determined. The z plane equations are computed before the plane equations for non-z primitive attributes are computed. The z plane equations are then used to perform screen-space z culling of primitives during and following rasterization. Culling of primitives is also performed based on pixel sample coverage. Consequently, primitives that have visible pixels after z culling operations reach the primitive shading unit. The non-z plane equations are only computed for geometry that is visible after the z culling operations. The primitive shading unit does not need to fetch vertex attributes from memory and does not need to compute non-z plane equations for the culled primitives.

Type: Application

Filed: September 7, 2010

Publication date: April 7, 2011

Inventors: Ziyad S. Hakura, Emmett M. Kilgariff
Redistribution Of Generated Geometric Primitives

Publication number: 20110080404

Abstract: One embodiment of the present invention sets forth a technique for redistributing geometric primitives generated by tessellation and geometry shaders for per-vertex processing by multiple graphics pipelines. Geometric primitives that are generated in a first processing stage are collected and redistributed more evenly and in smaller batches to the multiple graphics pipelines for vertex processing in a second processing stage. The smaller batches do not exceed the resource limits of a graphics pipeline and the per-vertex processing workloads of the graphics pipelines in the second stage are balanced. Therefore, the performance of the tessellation and geometry shaders is improved.

Type: Application

Filed: October 4, 2010

Publication date: April 7, 2011

Inventors: Johnny S. Rhoades, Ziyad S. Hakura, Emmett M. Kilgariff, Dale L. Kirkland, Cynthia Ann Edgeworth Allison, Karl M. Wurstner, Karim M. Abdalla
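
The redistribution idea in the abstract of publication 20110080404 (collect generated primitives, split them into smaller batches that respect a resource limit, and spread the batches evenly across pipelines) can be sketched as follows. The function name `redistribute`, the `max_batch_size` parameter standing in for a pipeline's resource limit, and the round-robin assignment are assumptions for illustration; the abstract does not say which balancing scheme is used.

```python
def redistribute(primitives, num_pipelines, max_batch_size):
    """Split primitives into bounded batches and deal them out round-robin."""
    if num_pipelines < 1 or max_batch_size < 1:
        raise ValueError("num_pipelines and max_batch_size must be positive")
    # No batch exceeds the (assumed) per-pipeline resource limit.
    batches = [
        primitives[i:i + max_batch_size]
        for i in range(0, len(primitives), max_batch_size)
    ]
    # Round-robin keeps the batch count per pipeline within one of each other.
    assignment = [[] for _ in range(num_pipelines)]
    for index, batch in enumerate(batches):
        assignment[index % num_pipelines].append(batch)
    return assignment
```

Because every batch is bounded in size, a single primitive-heavy first-stage task cannot overload one pipeline in the second stage.
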
Apparatus, system, and method for Z-culling

Patent number: 7755624

Abstract: A processor generates Z-cull information for tiles and groups of tiles. In one embodiment the processor includes an on-chip cache to coalesce Z information for tiles to identify occluded tiles. In a coprocessor embodiment, the processor provides Z-culling information to a graphics processor.

Type: Grant

Filed: November 7, 2008

Date of Patent: July 13, 2010

Assignee: Nvidia Corporation

Inventors: Ziyad S. Hakura, Michael Brian Cox, Brian K. Langendorf, Brad W. Simeral
Automated generation of theoretical performance analysis based upon workload and design configuration

Publication number: 20090125854

Abstract: A method of more efficiently, easily and cost-effectively analyzing the performance of a device model is disclosed. Embodiments enable automated generation of theoretical performance analysis for a device model based upon a workload associated with rendering graphical data and a configuration of the device model. The workload may be independent of design configuration, thereby enabling determination of the workload without simulating the device model. Additionally, the design configuration may be updated or changed without re-determining the workload. Accordingly, the graphical data may comprise a general or random test which is relatively large in size and covers a relatively large operational scope of the design. Additionally, the workload may comprise graphical information determined based upon the graphical data. Further, the theoretical performance analysis may indicate a graphics pipeline unit of the device model causing a bottleneck in a graphics pipeline of the device model.

Type: Application

Filed: November 8, 2007

Publication date: May 14, 2009

Applicant: NVIDIA Corporation

Inventors: Ziyad S. Hakura, John D. Tynefield, Thomas S. Green
System, apparatus and method for generating nonsequential predictions to access a memory

Patent number: 7461211

Abstract: A system, apparatus, and method are disclosed for storing and prioritizing predictions to anticipate nonsequential accesses to a memory. In one embodiment, an exemplary apparatus is configured as a prefetcher for predicting accesses to a memory. The prefetcher includes a prediction generator configured to generate a prediction that is unpatternable to an address. The prefetcher can also include a target cache coupled to the prediction generator to maintain the prediction in a manner that determines a priority for the prediction. In another embodiment, the prefetcher can also include a priority adjuster. The priority adjuster sets a priority for a prediction relative to other predictions. In some cases, the placement of the prediction is indicative of the priority relative to priorities for the other predictions. In yet another embodiment, the prediction generator uses the priority to determine that the prediction is to be generated before other predictions.

Type: Grant

Filed: August 17, 2004

Date of Patent: December 2, 2008

Assignee: Nvidia Corporation

Inventors: Ziyad S. Hakura, Brian Keith Langendorf, Stefano A. Pescador, Radoslav Danilak, Brad W. Simeral
Apparatus, system, and method for Z-culling

Patent number: 7450120

Abstract: A processor generates Z-cull information for tiles and groups of tiles. In one embodiment the processor includes an on-chip cache to coalesce Z information for tiles to identify occluded tiles. In a coprocessor embodiment, the processor provides Z-culling information to a graphics processor.

Type: Grant

Filed: December 19, 2003

Date of Patent: November 11, 2008

Assignee: Nvidia Corporation

Inventors: Ziyad S. Hakura, Michael Brian Cox, Brian K. Langendorf, Brad W. Simeral
System, apparatus and method for issuing predictions from an inventory to access a memory

Patent number: 7441087

Abstract: A system, apparatus, and method are disclosed for managing predictive accesses to memory. In one embodiment, an exemplary apparatus is configured as a prediction inventory that stores predictions in a number of queues. Each queue is configured to maintain predictions until a subset of the predictions is either issued to access a memory or filtered out as redundant. In another embodiment, an exemplary prefetcher predicts accesses to a memory. The prefetcher comprises a speculator for generating a number of predictions and a prediction inventory, which includes queues each configured to maintain a group of items. The group of items typically includes a triggering address that corresponds to the group. Each item of the group is of one type of prediction. Also, the prefetcher includes an inventory filter configured to compare the number of predictions against one of the queues having either the same prediction type as, or a different prediction type from, the number of predictions.

Type: Grant

Filed: August 17, 2004

Date of Patent: October 21, 2008

Assignee: NVIDIA Corporation

Inventors: Ziyad S. Hakura, Brian Keith Langendorf, Stefano A. Pescador, Radoslav Danilak, Brad W. Simeral
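
The prediction inventory in the abstract of patent 7441087 (per-type queues that hold groups of predictions keyed by a triggering address until they are issued or filtered out as redundant) can be modeled roughly as below. The class `PredictionInventory`, its methods, and the choice to filter against every queue are assumptions made for this sketch; the patent describes hardware, and the abstract leaves the filtering details open.

```python
from collections import deque


class PredictionInventory:
    """Toy model: one FIFO queue of prediction groups per prediction type."""

    def __init__(self, prediction_types):
        self.queues = {ptype: deque() for ptype in prediction_types}

    def add(self, prediction_type, trigger_address, predictions):
        """Drop predictions already pending anywhere, then queue the rest."""
        pending = {
            address
            for queue in self.queues.values()
            for _, addresses in queue
            for address in addresses
        }
        fresh = [a for a in predictions if a not in pending]
        if fresh:
            self.queues[prediction_type].append((trigger_address, fresh))
        return fresh

    def issue(self, prediction_type):
        """Pop the oldest group of the given type, or None if it is empty."""
        queue = self.queues[prediction_type]
        return queue.popleft() if queue else None
```

Filtering before enqueueing means a redundant prediction never consumes a queue slot or a memory access, which is the benefit the abstract attributes to the inventory filter.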
