Patents by Inventor Jeffrey A. Bolz
Jeffrey A. Bolz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10607390Abstract: A device driver is configured to identify a group of compute shaders to be executed in multiple traversals of a graphics processing pipeline. Each such compute shader accesses a compute tile of data having particular dimensions. The device driver interoperates with a tiling unit to determines dimension for a cache tile so that an integer multiple of each compute tile will fit evenly within the cache tile. Thus, when executing compute shaders in different traversals of the graphics processing pipeline, the data processed by those compute shaders can be cached in the cache tile between passes.Type: GrantFiled: December 14, 2016Date of Patent: March 31, 2020Assignee: NVIDIA CorporationInventor: Jeffrey A. Bolz
-
Patent number: 10535114Abstract: In one embodiment of the present invention a driver configures a graphics pipeline implemented in a cache tiling architecture to perform dynamically-defined multi-pass rendering sequences. In operation, based on sequence-specific configuration data, the driver determines an optimized tile size and, for each pixel in each pass, the set of pixels in each previous pass that influence the processing of the pixel. The driver then configures the graphics pipeline to perform per-tile rendering operations in a region that is translated by a pass-specific offset backward—vertically and/or horizontally—along a tiled caching traversal line. Notably, the offset ensures that the required pixel data from previous passes is available. The driver further configures the graphics pipeline to store the rendered data in cache lines.Type: GrantFiled: August 18, 2015Date of Patent: January 14, 2020Assignee: NVIDIA CORPORATIONInventor: Jeffrey A. Bolz
-
Patent number: 10453168Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.Type: GrantFiled: August 17, 2018Date of Patent: October 22, 2019Assignee: NVIDIA CORPORATIONInventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
-
Patent number: 10430989Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.Type: GrantFiled: November 25, 2015Date of Patent: October 1, 2019Assignee: NVIDIA CORPORATIONInventors: Ziyad Hakura, Cynthia Allison, Dale Kirkland, Jeffrey Bolz, Yury Uralsky, Jonah Alben
-
Patent number: 10417990Abstract: A method of binding graphics resources is provided that includes: (1) identifying graphics resources for binding, (2) generating a bind group for the graphics resources, (3) organizing the bind group into a bind group memory using a bind group layout and (4) providing bind group control for processing of the bind group. A method of organizing graphics resources and a resource organizing unit are also provided.Type: GrantFiled: September 16, 2015Date of Patent: September 17, 2019Assignee: Nvidia CorporationInventor: Jeffrey A. Bolz
-
Patent number: 10187663Abstract: A subsystem configured to encode an RGBA8 data stream assembles sequences of four-byte groups from the data stream. The subsystem decorrelates the red and blue channels, and computes a difference between each four-byte group and an anchor value. The anchor is encoded at full value. The subsystem then assigns each group a five-bit header based on the number and location of non-zero bytes and on the data content of the non-zero bytes within the group. The subsystem favors zero valued bytes. Thus, when a group includes only zero valued bytes, the header is sufficient to encode the group; no data bits are necessary. Further, two successive groups of zero-valued bytes may be encoded as a single header with no data bits, achieving further data reduction. Finally, the subsystem concatenates all the headers with associated data to yield the source data stream compressed to some ratio, e.g. four-to-one.Type: GrantFiled: August 20, 2015Date of Patent: January 22, 2019Assignee: NVIDIA CORPORATIONInventors: Jeffrey A. Bolz, Jeffrey Pool
-
Publication number: 20180374185Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.Type: ApplicationFiled: August 17, 2018Publication date: December 27, 2018Inventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
-
Patent number: 10147222Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.Type: GrantFiled: November 25, 2015Date of Patent: December 4, 2018Assignee: NVIDIA CORPORATIONInventors: Ziyad Hakura, Cynthia Allison, Dale Kirkland, Jeffrey Bolz, Yury Uralsky, Jonah Alben
-
Patent number: 10134169Abstract: One embodiment of the present invention sets forth a method for accessing texture objects stored within a texture memory. The method comprises the steps of receiving a texture bind request from an application program, wherein the texture bind request includes an object identifier that identifies a first texture object stored in the texture memory and an image identifier that identifies a first image unit, binding the first texture object to the first image unit based on the texture bind request, receiving, within a shader engine, a first shading program command from the application program for performing a first memory access operation on the first texture object, wherein the memory access operation is a store operation or atomic operation to an arbitrary location in the image, and performing, within the shader engine, the first memory access operation on the first texture object via the first image unit.Type: GrantFiled: August 12, 2010Date of Patent: November 20, 2018Assignee: NVIDIA CORPORATIONInventors: Jeffrey A. Bolz, Patrick R. Brown
-
Patent number: 10121220Abstract: A parallel processor and a method of reducing texture cache invalidation are disclosed. In one embodiment, the parallel processor includes a cache configured to receive lines of data; and a parallel execution unit associated with the cache and configured to execute parallel counterparts of an operation. The parallel counterparts, when executed, are configured to create, in the cache, corresponding aliases of a line of data pertaining to the operation such that the parallel counterparts are operable to invalidate only the corresponding aliases.Type: GrantFiled: April 28, 2015Date of Patent: November 6, 2018Assignee: Nvidia CorporationInventor: Jeffrey Bolz
-
Patent number: 10083036Abstract: One embodiment of the present invention sets forth a technique for managing graphics processing resources in a tile-based architecture. The technique includes storing a release packet associated with a graphics processing resource in a buffer and initiating a replay of graphics primitives stored in the buffer and associated with the graphics processing resource. The technique further includes, for each tile included in a plurality of tiles and processed during the replay, reading the release packet and determining whether the tile is a last tile processed during the replay. The technique further includes determining not to transmit the release packet to a screen-space pipeline and continuing to read graphics data stored in the buffer if the tile is not the last tile to be processed during the replay, or transmitting the release packet to the screen-space pipeline if the tile is the last tile to be processed during the replay.Type: GrantFiled: October 3, 2013Date of Patent: September 25, 2018Assignee: NVIDIA CORPORATIONInventors: Ziyad S. Hakura, Cynthia Ann Edgeworth Allison, Dale L. Kirkland, Andrei Khodakovsky, Jeffrey A. Bolz
-
Patent number: 10083514Abstract: One embodiment of the present invention includes techniques for rasterizing primitives that include edges shared between paths. For each edge, a rasterizer unit selects and applies a sample rule from multiple sample rules. If the edge is shared, then the selected sample rule causes each group of coverage samples associated with a single color sample to be considered as either fully inside or fully outside the edge. Consequently, conflation artifacts caused when the number of coverage samples per pixel exceeds the number of color samples per pixel may be reduced. In prior-art techniques, reducing such conflation artifacts typically involves increasing the number of color samples per pixel to equal the number of coverage samples per pixel. Advantageously, the disclosed techniques enable rendering using algorithms that reduce the ratio of color to coverage samples, thereby decreasing memory consumption and memory bandwidth use, without causing conflation artifacts associated with shared edges.Type: GrantFiled: October 10, 2016Date of Patent: September 25, 2018Assignee: NVIDIA CORPORATIONInventors: Mark J. Kilgard, Jeffrey A. Bolz
-
Patent number: 10055806Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.Type: GrantFiled: October 27, 2015Date of Patent: August 21, 2018Assignee: NVIDIA CORPORATIONInventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
-
Patent number: 10032242Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.Type: GrantFiled: October 1, 2013Date of Patent: July 24, 2018Assignee: NVIDIA CORPORATIONInventors: Ziyad S. Hakura, Jeffrey A. Bolz, Amanpreet Grewal, Matthew Johnson, Andrei Khodakovsky
-
Patent number: 10032245Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.Type: GrantFiled: October 27, 2015Date of Patent: July 24, 2018Assignee: NVIDIA CORPORATIONInventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
-
Patent number: 10019776Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.Type: GrantFiled: October 27, 2015Date of Patent: July 10, 2018Assignee: NVIDIA CORPORATIONInventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
-
Publication number: 20180165787Abstract: A device driver is configured to identify a group of compute shaders to be executed in multiple traversals of a graphics processing pipeline. Each such compute shader accesses a compute tile of data having particular dimensions. The device driver interoperates with a tiling unit to determines dimension for a cache tile so that an integer multiple of each compute tile will fit evenly within the cache tile. Thus, when executing compute shaders in different traversals of the graphics processing pipeline, the data processed by those compute shaders can be cached in the cache tile between passes.Type: ApplicationFiled: December 14, 2016Publication date: June 14, 2018Inventor: Jeffrey A. Bolz
-
Patent number: 9773341Abstract: One embodiment of the present invention includes techniques for rasterizing geometries. First, a processing unit defines a bounding primitive that covers the geometry and does not include any internal edges. If the bounding primitive intersects any enabled clip plane, then the processing unit generates fragments to fill a current viewport. Alternatively, the processing unit generates fragments to fill the bounding primitive. Because the rasterized region includes no internal edges, conflation artifacts caused when the number of coverage samples per pixel exceeds the number of color samples per pixel may be reduced. In prior-art techniques, reducing such conflation artifacts typically involves increasing the number of color samples per pixel to equal the number of coverage samples per pixel.Type: GrantFiled: August 20, 2013Date of Patent: September 26, 2017Assignee: NVIDIA CorporationInventors: Jeffrey A. Bolz, Mark J. Kilgard
-
Patent number: 9767600Abstract: A graphics processing pipeline within a parallel processing unit (PPU) is configured to perform path rendering by generating a collection of graphics primitives that represent each path to be rendered. The graphics processing pipeline determines the coverage of each primitive at a number of stencil sample locations within each different pixel. Then, the graphics processing pipeline reduces the number of stencil samples down to a smaller number of color samples, for each pixel. The graphics processing pipeline is configured to modulate a given color sample associated with a given pixel based on the color values of any graphics primitives that cover the stencil samples from which the color sample was reduced. The final color of the pixel is determined by downsampling the color samples associated with the pixel.Type: GrantFiled: September 5, 2013Date of Patent: September 19, 2017Assignee: NVIDIA CorporationInventors: Jeffrey A. Bolz, Mark J. Kilgard, Henry Packard Moreton, Rui M. Bastos, Eric B. Lum
-
Patent number: 9754561Abstract: One embodiment of the present invention includes a memory management unit (MMU) that is configured to manage sparse mappings. The MMU processes requests to translate virtual addresses to physical addresses based on page table entries (PTEs) that indicate a sparse status. If the MMU determines that the PTE does not include a mapping from a virtual address to a physical address, then the MMU responds to the request based on the sparse status. If the sparse status is active, then the MMU determines the physical address based on whether the type of the request is a write operation and, subsequently, generates an acknowledgement of the request. By contrast, if the sparse status is not active, then the MMU generates a page fault. Advantageously, the disclosed embodiments enable the computer system to manage sparse mappings without incurring the performance degradation associated with both page faults and conventional software-based sparse mapping management.Type: GrantFiled: October 4, 2013Date of Patent: September 5, 2017Assignee: NVIDIA CORPORATIONInventors: Jonathan Dunaisky, Henry Packard Moreton, Jeffrey A. Bolz, Yury Y. Uralsky, James Leroy Deming, Rui M. Bastos, Patrick R. Brown, Amanpreet Grewal, Christian Amsinck, Poornachandra Rao, Jerome F. Duluk, Jr., Andrew J. Tao