Patents by Inventor Jeffrey Bolz

Jeffrey Bolz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Techniques for render pass dependencies in an API

Patent number: 9727392

Abstract: Techniques for passing dependencies in an application programming interface API includes identifying a plurality of passes of execution commands. For each set of passes, wherein one pass is a destination pass and the other pass is a source pass to each other, one or more dependencies, of one or more dependency types, are determined between the execution commands of the destination pass and the source pass. Pass objects are then created for each identified pass, wherein each pass object records the execution commands and dependencies between the corresponding destination and source passes.

Type: Grant

Filed: September 16, 2015

Date of Patent: August 8, 2017

Assignee: NVIDIA CORPORATION

Inventor: Jeffrey Bolz
MANAGING DEFERRED CONTEXTS IN A CACHE TILING ARCHITECTURE

Publication number: 20170206623

Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.

Type: Application

Filed: October 1, 2013

Publication date: July 20, 2017

Applicant: NVIDIA CORPORATION

Inventors: Ziyad S. HAKURA, Jeffrey A. BOLZ, Amanpreet GREWAL, Matthew JOHNSON, Andrei KHODAKOVSKY
MULTI-PASS RENDERING IN A SCREEN SPACE PIPELINE

Publication number: 20170148203

Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.

Type: Application

Filed: November 25, 2015

Publication date: May 25, 2017

Inventors: Ziyad HAKURA, Cynthia ALLISON, Dale KIRKLAND, Jeffrey BOLZ, Yury URALSKY, Jonah ALBEN
MULTI-PASS RENDERING IN A SCREEN SPACE PIPELINE

Publication number: 20170148204

Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.

Type: Application

Filed: November 25, 2015

Publication date: May 25, 2017

Inventors: Ziyad HAKURA, Cynthia ALLISON, Dale KIRKLAND, Jeffrey BOLZ, Yury URALSKY, Jonah ALBEN
TECHNIQUES FOR MAINTAINING ATOMICITY AND ORDERING FOR PIXEL SHADER OPERATIONS

Publication number: 20170116700

Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.

Type: Application

Filed: October 27, 2015

Publication date: April 27, 2017

Inventors: ZIYAD HAKURA, ERIC LUM, DALE KIRKLAND, JACK CHOQUETTE, PATRICK R. BROWN, YURY Y. URALSKY, JEFFREY BOLZ
TECHNIQUES FOR MAINTAINING ATOMICITY AND ORDERING FOR PIXEL SHADER OPERATIONS

Publication number: 20170116699

Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.

Type: Application

Filed: October 27, 2015

Publication date: April 27, 2017

Inventors: ZIYAD HAKURA, ERIC LUM, DALE KIRKLAND, JACK CHOQUETTE, PATRICK R. BROWN, YURY Y. URALSKY, JEFFREY BOLZ
TECHNIQUES FOR MAINTAINING ATOMICITY AND ORDERING FOR PIXEL SHADER OPERATIONS

Publication number: 20170116698

Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.

Type: Application

Filed: October 27, 2015

Publication date: April 27, 2017

Inventors: ZIYAD HAKURA, ERIC LUM, DALE KIRKLAND, JACK CHOQUETTE, PATRICK R. BROWN, YURY Y. URALSKY, JEFFREY BOLZ
Load/store operations in texture hardware

Patent number: 9595075

Abstract: Approaches are disclosed for performing memory access operations in a texture processing pipeline having a first portion configured to process texture memory access operations and a second portion configured to process non-texture memory access operations. A texture unit receives a memory access request. The texture unit determines whether the memory access request includes a texture memory access operation. If the memory access request includes a texture memory access operation, then the texture unit processes the memory access request via at least the first portion of the texture processing pipeline, otherwise, the texture unit processes the memory access request via at least the second portion of the texture processing pipeline. One advantage of the disclosed approach is that the same processing and cache memory may be used for both texture operations and load/store operations to various other address spaces, leading to reduced surface area and power consumption.

Type: Grant

Filed: September 26, 2013

Date of Patent: March 14, 2017

Assignee: NVIDIA Corporation

Inventors: Steven J. Heinrich, Eric T. Anderson, Jeffrey A. Bolz, Jonathan Dunaisky, Ramesh Jandhyala, Joel McCormack, Alexander L. Minkin, Bryon S. Nordquist, Poornachandra Rao
CONTROLLING MULTI-PASS RENDERING SEQUENCES IN A CACHE TILING ARCHITECTURE

Publication number: 20170053375

Abstract: In one embodiment of the present invention a driver configures a graphics pipeline implemented in a cache tiling architecture to perform dynamically-defined multi-pass rendering sequences. In operation, based on sequence-specific configuration data, the driver determines an optimized tile size and, for each pixel in each pass, the set of pixels in each previous pass that influence the processing of the pixel. The driver then configures the graphics pipeline to perform per-tile rendering operations in a region that is translated by a pass-specific offset backward—vertically and/or horizontally—along a tiled caching traversal line. Notably, the offset ensures that the required pixel data from previous passes is available. The driver further configures the graphics pipeline to store the rendered data in cache lines.

Type: Application

Filed: August 18, 2015

Publication date: February 23, 2017

Inventor: Jeffrey A. BOLZ
TECHNIQUE FOR PERFORMING VARIABLE WIDTH DATA COMPRESSION USING A PALETTE OF ENCODINGS

Publication number: 20170053376

Abstract: A subsystem configured to encode an RGBA8 data stream assembles sequences of four-byte groups from the data stream. The subsystem decorrelates the red and blue channels, and computes a difference between each four-byte group and an anchor value. The anchor is encoded at full value. The subsystem then assigns each group a five-bit header based on the number and location of non-zero bytes and on the data content of the non-zero bytes within the group. The subsystem favors zero valued bytes. Thus, when a group includes only zero valued bytes, the header is sufficient to encode the group; no data bits are necessary. Further, two successive groups of zero-valued bytes may be encoded as a single header with no data bits, achieving further data reduction. Finally, the subsystem concatenates all the headers with associated data to yield the source data stream compressed to some ratio, e.g. four-to-one.

Type: Application

Filed: August 20, 2015

Publication date: February 23, 2017

Inventors: Jeffrey A. BOLZ, Jeffrey POOL
Optimizing triangle topology for path rendering

Patent number: 9558573

Abstract: A technique for efficiently rendering path images tessellates path contours into triangle tans comprising a set of representative triangles. Topology of the set of representative triangles is then optimized for greater rasterization efficiency by applying a flip operator to selected triangle pairs within the set of representative triangles. The optimized triangle pairs are then rendered using a path rendering technique, such as stencil and cover.

Type: Grant

Filed: December 17, 2012

Date of Patent: January 31, 2017

Assignee: NVIDIA Corporation

Inventors: Jeffrey A. Bolz, Mark J. Kilgard
STENCIL-THEN-COVER PATH RENDERING WITH SHARED EDGES

Publication number: 20170024897

Abstract: One embodiment of the present invention includes techniques for rasterizing primitives that include edges shared between paths. For each edge, a rasterizer unit selects and applies a sample rule from multiple sample rules. If the edge is shared, then the selected sample rule causes each group of coverage samples associated with a single color sample to be considered as either fully inside or fully outside the edge. Consequently, conflation artifacts caused when the number of coverage samples per pixel exceeds the number of color samples per pixel may be reduced. In prior-art techniques, reducing such conflation artifacts typically involves increasing the number of color samples per pixel to equal the number of coverage samples per pixel. Advantageously, the disclosed techniques enable rendering using algorithms that reduce the ratio of color to coverage samples, thereby decreasing memory consumption and memory bandwidth use, without causing conflation artifacts associated with shared edges.

Type: Application

Filed: October 10, 2016

Publication date: January 26, 2017

Inventors: Mark J. KILGARD, Jeffrey A. BOLZ
Work-queue-based graphics processing unit work creation

Patent number: 9489245

Abstract: One embodiment of the present invention enables threads executing on a processor to locally generate and execute work within that processor by way of work queues and command blocks. A device driver, as an initialization procedure for establishing memory objects that enable the threads to locally generate and execute work, generates a work queue, and sets a GP_GET pointer of the work queue to the first entry in the work queue. The device driver also, during the initialization procedure, sets a GP_PUT pointer of the work queue to the last free entry included in the work queue, thereby establishing a range of entries in the work queue into which new work generated by the threads can be loaded and subsequently executed by the processor. The threads then populate command blocks with generated work and point entries in the work queue to the command blocks to effect processor execution of the work stored in the command blocks.

Type: Grant

Filed: October 26, 2012

Date of Patent: November 8, 2016

Assignee: NVIDIA Corporation

Inventors: Ignacio Llamas, Craig Ross Duttweiler, Jeffrey A. Bolz, Daniel Elliot Wexler
SYSTEM AND METHOD FOR CREATING ALIASED MAPPINGS TO MINIMIZE IMPACT OF CACHE INVALIDATION

Publication number: 20160321773

Abstract: A parallel processor and a method of reducing texture cache invalidation are disclosed. In one embodiment, the parallel processor includes a cache configured to receive lines of data; and a parallel execution unit associated with the cache and configured to execute parallel counterparts of an operation. The parallel counterparts, when executed, are configured to create, in the cache, corresponding aliases of a line of data pertaining to the operation such that the parallel counterparts are operable to invalidate only the corresponding aliases.

Type: Application

Filed: April 28, 2015

Publication date: November 3, 2016

Inventor: Jeffrey Bolz
Stencil then cover path rendering with shared edges

Patent number: 9466115

Abstract: One embodiment of the present invention includes techniques for rasterizing primitives that include edges shared between paths. For each edge, a rasterizer unit selects and applies a sample rule from multiple sample rules. If the edge is shared, then the selected sample rule causes each group of coverage samples associated with a single color sample to be considered as either fully inside or fully outside the edge. Consequently, conflation artifacts caused when the number of coverage samples per pixel exceeds the number of color samples per pixel may be reduced. In prior-art techniques, reducing such conflation artifacts typically involves increasing the number of color samples per pixel to equal the number of coverage samples per pixel. Advantageously, the disclosed techniques enable rendering using algorithms that reduce the ratio of color to coverage samples, thereby decreasing memory consumption and memory bandwidth use, without causing conflation artifacts associated with shared edges.

Type: Grant

Filed: September 16, 2013

Date of Patent: October 11, 2016

Assignee: NVIDIA Corporation

Inventors: Mark J. Kilgard, Jeffrey A. Bolz
Stencil data compression system and method and graphics processing unit incorporating the same

Patent number: 9437025

Abstract: A system and method for compressing stencil data attendant to rendering an image. In one embodiment, the method includes: (1) selecting a base stencil value for a particular group, (2) selecting a single-bit delta value for each sample in the particular group and (3) storing the stencil base value and the delta values in a frame buffer.

Type: Grant

Filed: July 12, 2012

Date of Patent: September 6, 2016

Assignee: NVIDIA CORPORATION

Inventor: Jeffrey A. Bolz
Stencil then cover path rendering with shared edges

Patent number: 9418437

Abstract: One embodiment of the present invention includes techniques for rasterizing primitives that include edges shared between paths. For each edge, a rasterizer unit selects and applies a sample rule from multiple sample rules. If the edge is shared, then the selected sample rule causes each group of coverage samples associated with a single color sample to be considered as either fully inside or fully outside the edge. Consequently, conflation artifacts caused when the number of coverage samples per pixel exceeds the number of color samples per pixel may be reduced. In prior-art techniques, reducing such conflation artifacts typically involves increasing the number of color samples per pixel to equal the number of coverage samples per pixel. Advantageously, the disclosed techniques enable rendering using algorithms that reduce the ratio of color to coverage samples, thereby decreasing memory consumption and memory bandwidth use, without causing conflation artifacts associated with shared edges.

Type: Grant

Filed: September 16, 2013

Date of Patent: August 16, 2016

Assignee: NVIDIA CORPORATION

Inventors: Mark J. Kilgard, Jeffrey A. Bolz
Stencil buffer data compression

Patent number: 9390464

Abstract: A raster operations (ROP) unit is configured to compress stencil values included in a stencil buffer. The ROP unit divides the stencil values into groups, subdivides each group into two halves, and selects an anchor value for each half. If the difference between each of the stencil values and the corresponding anchor lies within an offset range, and the difference between the two anchors lies within a delta range, then the group is compressible. For a compressible group, the ROP unit encodes the anchor value, offsets from anchors, and an anchor delta. This encoding enables the ROP unit to operate on the compressed group instead of the uncompressed stencil values, reducing the number of memory and computational operations associated with the stencil values. Consequently, the ROP unit reduces memory bandwidth use, reduces power consumption, and increases rendering rate compared to conventional ROP units that implement less flexible compression techniques.

Type: Grant

Filed: December 4, 2013

Date of Patent: July 12, 2016

Assignee: NVIDIA Corporation

Inventors: Christian Amsinck, Bengt-olaf Schneider, Jeffrey A. Bolz
Efficient setup and evaluation of filled cubic bezier paths

Patent number: 9384570

Abstract: A graphics processing system includes a central processing unit that processes a cubic Bezier curve corresponding to a filled cubic Bezier path. Additionally, the graphics processing system includes a cubic preprocessor coupled to the central processing unit that formats the cubic Bezier curve to provide a formatted cubic Bezier curve having quadrilateral control points corresponding to a mathematically simple cubic curve. The graphics processing system further includes a graphics processing unit coupled to the cubic preprocessor that employs the formatted cubic Bezier curve in rendering the filled cubic Bezier path. A rendering unit and a display cubic Bezier path filling method are also provided.

Type: Grant

Filed: September 16, 2013

Date of Patent: July 5, 2016

Assignee: NVIDIA CORPORATION

Inventor: Jeffrey A Bolz
Bindless texture and image API

Patent number: 9349154

Abstract: One embodiment of the present invention sets for a method for accessing data objects stored in a memory that is accessible by a graphics processing unit (GPU). The method comprises the steps of creating a data object in the memory based on a command received from an application program, wherein the data object is organized non-linearly in the memory, transmitting a first handle associated with the data object to the application program such that data associated with different draw commands can be accessed by the GPU, wherein the first handle includes an address related to the location of the data object in the memory, receiving a first draw command as well as the first handle from the application program, and transmitting the first draw command and the first handle to the GPU for processing.

Type: Grant

Filed: March 31, 2011

Date of Patent: May 24, 2016

Assignee: NVIDIA Corporation

Inventors: Jeffrey A. Bolz, Patrick R. Brown

prev 1 2 3 4 5 next