Patents by Inventor Rouslan Dimitrov

Rouslan Dimitrov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Architecture and algorithms for data compression

Patent number: 10338820

Abstract: A system architecture conserves memory bandwidth by including compression utility to process data transfers from the cache into external memory. The cache decompresses transfers from external memory and transfers full format data to naive clients that lack decompression capability and directly transfers compressed data to savvy clients that include decompression capability. An improved compression algorithm includes software that computes the difference between the current data word and each of a number of prior data words. Software selects the prior data word with the smallest difference as the nearest match and encodes the bit width of the difference to this data word. Software then encodes the difference between the current stride and the closest previous stride. Software combines the stride, bit width, and difference to yield final encoded data word. Software may encode the stride of one data word as a value relative to the stride of a previous data word.

Type: Grant

Filed: June 7, 2016

Date of Patent: July 2, 2019

Assignee: NVIDIA CORPORATION

Inventors: Rouslan Dimitrov, Jeff Pool, Praveen Krishnamurthy, Chris Amsinck, Karan Mehra, Scott Cutler
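
The nearest-match encoding described in the abstract above can be sketched roughly as follows. This is only an illustration, not the patented implementation: the function names, the window of four prior words, the sign-bit convention for the bit width, and the tuple-based record layout are all assumptions made for the example.

```python
def nearest_match(current, recent):
    """Return (stride, bit_width, diff) for the prior word closest to `current`.

    `stride` counts backwards from the end of `recent` (1 = previous word).
    """
    best_stride, best_diff = None, None
    for stride in range(1, len(recent) + 1):
        diff = current - recent[-stride]
        if best_diff is None or abs(diff) < abs(best_diff):
            best_stride, best_diff = stride, diff
    # Assumed convention: magnitude bits plus one sign bit, 0 for an exact match.
    bit_width = abs(best_diff).bit_length() + 1 if best_diff else 0
    return best_stride, bit_width, best_diff


def encode_stream(words, window=4):
    """Encode each word as (stride delta, bit width, difference)."""
    history, prev_stride, out = [], 0, []
    for w in words:
        if not history:
            out.append(("raw", w))  # the first word has nothing to reference
        else:
            stride, width, diff = nearest_match(w, history[-window:])
            # The stride is stored relative to the previously used stride.
            out.append(("delta", stride - prev_stride, width, diff))
            prev_stride = stride
        history.append(w)
    return out


def decode_stream(encoded):
    """Invert encode_stream."""
    history, prev_stride = [], 0
    for rec in encoded:
        if rec[0] == "raw":
            history.append(rec[1])
        else:
            _, stride_delta, _width, diff = rec
            prev_stride += stride_delta
            history.append(history[-prev_stride] + diff)
    return history
```

A real encoder would pack the three fields into a bit stream; the tuples here only make the stride, bit width, and difference visible.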
SYSTEMS AND METHODS FOR FRAME TIME SMOOTHING BASED ON MODIFIED ANIMATION ADVANCEMENT AND USE OF POST RENDER QUEUES

Publication number: 20190172181

Abstract: Embodiments of the present invention provide end-to-end frame time synchronization designed to improve smoothness for displaying images of 3D applications, such as PC gaming applications. Traditionally, an application that renders 3D graphics functions based on the assumption that the average render time will be used as the animation time for a given frame. When this condition is not met, and the render time for a frame does not match the average render time of prior frames, the frames are not captured or displayed at a consistent rate. This invention enables feedback to be provided to the rendering application for adjusting the animation times used to produce new frames, and a post-render queue is used to store completed frames for mitigating stutter and hitches. Flip control is used to sync the display of a rendered frame with the animation time used to generate the frame, thereby producing a smooth, consistent image.

Type: Application

Filed: December 3, 2018

Publication date: June 6, 2019

Inventors: Thomas Albert Petersen, Ankan Banerjee, Shishir Goyal, Sau Yan Keith Li, Lars Nordskog, Rouslan Dimitrov
DYNAMIC JITTER AND LATENCY-TOLERANT RENDERING

Publication number: 20190164518

Abstract: Systems and techniques for streaming video with dynamic jitter tolerance are described. In one example, a system includes a server executing an application and generating image frames associated with the application at a frame rate, and a client which displays the image frames on a display that has a predetermined refresh rate and which monitors arrival times of the image frames in relation to the predetermined refresh rate. The server is further configured to dynamically change the frame rate based on the monitoring so that the frame rate more closely corresponds to the predetermined refresh rate of the client's display.

Type: Application

Filed: October 22, 2018

Publication date: May 30, 2019

Inventor: Rouslan DIMITROV
METHOD AND APPARATUS FOR RENDERING PERSPECTIVE-CORRECT IMAGES FOR A TILTED MULTI-DISPLAY ENVIRONMENT

Publication number: 20180322683

Abstract: Techniques for rendering images on multiple tilted displays concurrently to mitigate perspective distortion are disclosed herein. According to one described approach, viewports are assigned to a center monitor and two peripheral monitors. Scene data for the viewports is calculated, and geometric primitives are generated for the viewports based on the scene data. Image transformation is performed based on a modified perspective value to modify geometry of the geometric primitives based on tilt angles of the displays, and the geometric primitives are rasterized using the modified geometry.

Type: Application

Filed: July 17, 2017

Publication date: November 8, 2018

Inventors: Rouslan Dimitrov, Yury Uralsky, Lars Nordskog, Dmitry Zhdan
METHOD AND APPARATUS FOR RENDERING PERSPECTIVE-CORRECT IMAGES FOR A TILTED MULTI-DISPLAY ENVIRONMENT

Publication number: 20180322682

Abstract: Techniques for rendering images on multiple tilted displays concurrently to mitigate perspective distortion are disclosed herein. According to one described approach, viewports are assigned to a center monitor and two peripheral monitors. Scene data for the viewports is calculated, and geometric primitives are generated for the viewports based on the scene data. Image transformation is performed based on a modified perspective value to modify geometry of the geometric primitives based on tilt angles of the displays, and the geometric primitives are rasterized using the modified geometry.

Type: Application

Filed: July 17, 2017

Publication date: November 8, 2018

Inventors: Rouslan Dimitrov, Yury Uralsky, Lars Nordskog, Dmitriy Zhdan
ARCHITECTURE AND ALGORITHMS FOR DATA COMPRESSION

Publication number: 20170351429

Abstract: A system architecture conserves memory bandwidth by including compression utility to process data transfers from the cache into external memory. The cache decompresses transfers from external memory and transfers full format data to naive clients that lack decompression capability and directly transfers compressed data to savvy clients that include decompression capability. An improved compression algorithm includes software that computes the difference between the current data word and each of a number of prior data words. Software selects the prior data word with the smallest difference as the nearest match and encodes the bit width of the difference to this data word. Software then encodes the difference between the current stride and the closest previous stride. Software combines the stride, bit width, and difference to yield final encoded data word. Software may encode the stride of one data word as a value relative to the stride of a previous data word.

Type: Application

Filed: June 7, 2016

Publication date: December 7, 2017

Inventors: Rouslan DIMITROV, Jeff POOL, Praveen KRISHNAMURTHY, Chris AMSINCK, Karan MEHRA, Scott CUTLER
Heuristics for improving performance in a tile based architecture

Patent number: 9792122

Abstract: One embodiment of the present invention includes a technique for processing graphics primitives in a tile-based architecture. The technique includes storing, in a buffer, a first plurality of graphics primitives and a first plurality of state bundles received from the world-space pipeline. The technique further includes determining, based on a first condition, that the first plurality of graphics primitives should be replayed from the buffer, and, in response, replaying the first plurality of graphics primitives against a first tile included in a first plurality of tiles. Replaying the first plurality of graphics primitives includes comparing each graphics primitive against the first tile to determine whether the graphics primitive intersects the first tile, determining that one or more graphics primitives intersects the first tile, and transmitting the one or more graphics primitives and one or more associated state bundles to a screen-space pipeline for processing.

Type: Grant

Filed: October 4, 2013

Date of Patent: October 17, 2017

Assignee: NVIDIA CORPORATION

Inventors: Ziyad S. Hakura, Walter R. Steiner, Cynthia Ann Edgeworth Allison, Rouslan Dimitrov, Karim M. Abdalla, Dale L. Kirkland, Emmett M. Kilgariff
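
The replay step in the abstract above (comparing each buffered primitive against a tile and forwarding those that intersect, together with their state bundles) might look like the sketch below. It assumes, purely for illustration, that primitives are represented by axis-aligned bounding rectangles and that a state bundle is an opaque value; `Rect` and `replay` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    # Half-open rectangle [x0, x1) x [y0, y1) in screen space.
    x0: int
    y0: int
    x1: int
    y1: int

    def intersects(self, other):
        return (self.x0 < other.x1 and other.x0 < self.x1 and
                self.y0 < other.y1 and other.y0 < self.y1)


def replay(buffer, tiles):
    """Replay buffered (primitive_bounds, state_bundle) pairs against each tile.

    Returns a dict mapping each tile to the pairs that would be sent on to
    the screen-space pipeline for that tile.
    """
    per_tile = {}
    for tile in tiles:
        per_tile[tile] = [(prim, state) for prim, state in buffer
                          if prim.intersects(tile)]
    return per_tile
```

For example, with two 16x16 tiles side by side, a primitive that straddles the shared edge is forwarded once per tile, while a primitive entirely inside the first tile is forwarded only for that tile.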
Hierarchical tiled caching

Patent number: 9779533

Abstract: One embodiment of the present invention includes a method for processing graphics objects. The method includes receiving a first draw-call and a second draw-call. The method also includes dividing the first draw-call into a first set of sub-draw-calls and the second draw-call into a second set of sub-draw-calls. The method further includes identifying a first screen tile. The method also includes identifying a first group of sub-draw-calls included in the first set of sub-draw-calls that overlap the first screen tile and a second group of sub-draw-calls included in the second set of sub-draw-calls that overlap the second screen tile. The method further includes causing the first group of sub-draw-calls and the second group of sub-draw-calls to be processed together.

Type: Grant

Filed: January 27, 2014

Date of Patent: October 3, 2017

Assignee: NVIDIA Corporation

Inventors: Rouslan Dimitrov, Ziyad S. Hakura
Caching of adaptively sized cache tiles in a unified L2 cache with surface compression

Patent number: 9734548

Abstract: One embodiment of the present invention includes techniques for adaptively sizing cache tiles in a graphics system. A device driver associated with a graphics system sets a cache tile size associated with a cache tile to a first size. The device driver detects a change from a first render target configuration that includes a first set of render targets to a second render target configuration that includes a second set of render targets. The device driver sets the cache tile size to a second size based on the second render target configuration. One advantage of the disclosed approach is that the cache tile size is adaptively sized, resulting in fewer cache tiles for less complex render target configurations. Adaptively sizing cache tiles leads to more efficient processor utilization and reduced power requirements. In addition, a unified L2 cache tile allows dynamic partitioning of cache memory between cache tile data and other data.

Type: Grant

Filed: August 28, 2013

Date of Patent: August 15, 2017

Assignee: NVIDIA Corporation

Inventors: Ziyad S. Hakura, Rouslan Dimitrov, Emmett M. Kilgariff, Andrei Khodakovsky
Adaptive multilevel binning to improve hierarchical caching

Patent number: 9720842

Abstract: A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy.

Type: Grant

Filed: February 20, 2013

Date of Patent: August 1, 2017

Assignee: NVIDIA Corporation

Inventors: Rouslan Dimitrov, Rui Bastos, Ziyad S. Hakura, Eric B. Lum
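
As a loose illustration of choosing a different tile size per cache level, as the abstract above describes, the sketch below picks the largest power-of-two square tile whose pixel footprint fits in a share of each cache. The abstract does not specify the sizing rule: the power-of-two restriction, the `fraction` parameter, and the use of bytes per pixel as the "additional characteristic" are assumptions for the example.

```python
def tile_side(cache_bytes, bytes_per_pixel, fraction=0.5):
    """Largest power-of-two tile side (in pixels) whose footprint fits in
    `fraction` of the cache capacity. Hypothetical sizing rule."""
    budget = cache_bytes * fraction
    side = 1
    while (side * 2) ** 2 * bytes_per_pixel <= budget:
        side *= 2
    return side


# Assumed capacities for two cache levels, for illustration only.
l1_side = tile_side(64 * 1024, bytes_per_pixel=8)
l2_side = tile_side(2 * 1024 * 1024, bytes_per_pixel=8)
```

With these assumed capacities the smaller cache gets 64x64-pixel tiles and the larger one 256x256-pixel tiles, so each binning level works on tiles matched to the cache it feeds.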
Efficient super-sampling with per-pixel shader threads

Patent number: 9495721

Abstract: Techniques are disclosed for dispatching pixel information in a graphics processing pipeline. A fragment processing unit generates a pixel that includes multiple samples based on a first portion of a graphics primitive received by a first thread. The fragment processing unit calculates a first value for the first pixel, where the first value is calculated only once for the pixel. The fragment processing unit calculates a first set of values for the samples, where each value in the first set of values corresponds to a different sample and is calculated only once for the corresponding sample. The fragment processing unit combines the first value with each value in the first set of values to create a second set of values. The fragment processing unit creates one or more dispatch messages to store the second set of values in a set of output registers.

Type: Grant

Filed: December 21, 2012

Date of Patent: November 15, 2016

Assignee: NVIDIA Corporation

Inventors: Jerome F. Duluk, Jr., Rouslan Dimitrov, Eric Lum, Rui Bastos
Rendering using multiple render target sample masks

Patent number: 9396515

Abstract: One embodiment sets forth a method for transforming 3-D images into 2-D rendered images using render target sample masks. A software application creates multiple render targets associated with a surface. For each render target, the software application also creates an associated render target sample mask configured to select one or more samples included in each pixel. Within the graphics pipeline, a pixel shader processes each pixel individually and outputs multiple render target-specific color values. For each render target, a ROP unit uses the associated render target sample mask to select covered samples included in the pixel. Subsequently, the ROP unit uses the render target-specific color value to update the selected samples in the render target, thereby achieving sample-level color granularity.

Type: Grant

Filed: August 16, 2013

Date of Patent: July 19, 2016

Assignee: NVIDIA CORPORATION

Inventors: Eric B. Lum, Jerome F. Duluk, Jr., Yury Y. Uralsky, Rouslan Dimitrov, Rui M. Bastos
Techniques for adaptively generating bounding boxes

Patent number: 9342311

Abstract: One embodiment of the present invention includes a method for generating accumulated bounding boxes for graphics primitives. The method includes generating a first bounding box associated with a first graphics primitive. The method further includes, for each graphics primitive included in a first set of one or more additional graphics primitives, determining that the graphics primitive is within a threshold distance of the first bounding box, and adding the graphics primitive to the first bounding box. The method further includes determining not to add a second graphics primitive to the first bounding box. The method further includes generating a second bounding box associated with the second graphics primitive. Finally, the method includes transmitting the first bounding box to a tiling unit via a crossbar. One advantage of the disclosed embodiments is that multiple bounding boxes are combined to generate an accumulated bounding box that is then transferred across the crossbar.

Type: Grant

Filed: August 14, 2013

Date of Patent: May 17, 2016

Assignee: NVIDIA Corporation

Inventors: Ziyad S. Hakura, Pierre Souillot, Cynthia Allison, Dale L. Kirkland, Rouslan Dimitrov
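
The accumulation rule in the abstract above can be sketched as a single pass over the primitives: a primitive is merged into the current bounding box while it lies within a threshold distance, otherwise the current box is emitted and a new one is started. The box representation as `(x0, y0, x1, y1)` tuples and the distance metric (the larger of the horizontal and vertical gaps) are assumptions for this example; the abstract does not define them.

```python
def box_gap(a, b):
    """Gap between two boxes (x0, y0, x1, y1); 0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def union(a, b):
    """Smallest box containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def accumulate(prim_boxes, threshold):
    """Merge consecutive primitive boxes into accumulated bounding boxes."""
    emitted, current = [], None
    for p in prim_boxes:
        if current is None:
            current = p
        elif box_gap(current, p) <= threshold:
            current = union(current, p)   # close enough: grow the box
        else:
            emitted.append(current)       # too far: send the box onward
            current = p
    if current is not None:
        emitted.append(current)
    return emitted
```

Each emitted box stands in for what would be transmitted to the tiling unit, so nearby primitives cost one transfer rather than one transfer each.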
HIERARCHICAL TILED CACHING

Publication number: 20150213638

Abstract: One embodiment of the present invention includes a method for processing graphics objects. The method includes receiving a first draw-call and a second draw-call. The method also includes dividing the first draw-call into a first set of sub-draw-calls and the second draw-call into a second set of sub-draw-calls. The method further includes identifying a first screen tile. The method also includes identifying a first group of sub-draw-calls included in the first set of sub-draw-calls that overlap the first screen tile and a second group of sub-draw-calls included in the second set of sub-draw-calls that overlap the second screen tile. The method further includes causing the first group of sub-draw-calls and the second group of sub-draw-calls to be processed together.

Type: Application

Filed: January 27, 2014

Publication date: July 30, 2015

Applicant: NVIDIA CORPORATION

Inventors: Rouslan DIMITROV, Ziyad S. HAKURA
RENDERING USING MULTIPLE RENDER TARGET SAMPLE MASKS

Publication number: 20150049110

Abstract: One embodiment sets forth a method for transforming 3-D images into 2-D rendered images using render target sample masks. A software application creates multiple render targets associated with a surface. For each render target, the software application also creates an associated render target sample mask configured to select one or more samples included in each pixel. Within the graphics pipeline, a pixel shader processes each pixel individually and outputs multiple render target-specific color values. For each render target, a ROP unit uses the associated render target sample mask to select covered samples included in the pixel. Subsequently, the ROP unit uses the render target-specific color value to update the selected samples in the render target, thereby achieving sample-level color granularity.

Type: Application

Filed: August 16, 2013

Publication date: February 19, 2015

Applicant: NVIDIA CORPORATION

Inventors: Eric B. LUM, Jerome F. DULUK, JR., Yury Y. URALSKY, Rouslan DIMITROV, Rui M. BASTOS
Horizon split ambient occlusion

Patent number: 8878849

Abstract: The method includes receiving a plurality of graphics primitives for rendering at a GPU of a computer system and rendering graphics primitives into pixel parameters of the pixels of a display, wherein the parameters include pixel depth values and pixel normal values. For each pixel of the display, an ambient occlusion process is performed. The algorithm takes as input an ND-buffer containing pixel depth values and pixel normals. Based on the pixel 3-D position and the pixel normal vector, horizon heights are computed by sampling the ND-buffer and an occlusion term is computed for each pixel based on the horizon heights. Based on the pixel 3-D position and the pixel normal vector, a normal occlusion term is computed by sampling the ND-buffer above the horizon in multiple directions. An ambient occlusion illumination value is computed by combining the horizon occlusion term and the normal occlusion term.

Type: Grant

Filed: December 14, 2007

Date of Patent: November 4, 2014

Assignee: Nvidia Corporation

Inventors: Rouslan Dimitrov, Louis Bavoil, Miguel Sainz
System, method, and computer program product for providing a dynamic display refresh

Patent number: 8866833

Abstract: A system, method, and computer program product are provided for a dynamic display refresh. In use, a state of a display device is identified in which an entirety of an image frame is currently displayed by the display device. In response to the identification of the state, it is determined whether an entirety of a next image frame to be displayed has been rendered to memory. The next image frame is transmitted to the display device for display thereof, when it is determined that the entirety of the next image frame to be displayed has been rendered to the memory. Further, a refresh of the display device is delayed, when it is determined that the entirety of the next image frame to be displayed has not been rendered to the memory.

Type: Grant

Filed: September 11, 2013

Date of Patent: October 21, 2014

Assignee: NVIDIA Corporation

Inventors: Tom Petersen, David Wyatt, Paul van der Kouwe, Emmett M. Kilgariff, Laurence Harrison, Jensen Huang, Tony Tamasi, Gerrit A. Slavenburg, Thomas F. Fox, David Matthew Stears, Robert Jan Schutten, Ross Cunniff, Ajay Kamalvanshi, Robert Osborne, Rouslan Dimitrov
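
The behavior in the abstract above can be illustrated with a small timing model: once the display has finished showing a frame, the next frame is sent if it has been fully rendered, and otherwise the refresh is delayed until it has. The function name and the fixed `scanout_ms` duration are assumptions for the sketch, and the model ignores any limit on how long a real panel can hold an image.

```python
def present_times(render_done_ms, scanout_ms):
    """Return the time at which each frame starts being displayed.

    render_done_ms: time at which each frame is fully rendered to memory.
    scanout_ms: time the display needs to show one complete frame (assumed
    constant here).
    """
    shown = []
    display_free_at = 0.0
    for done in render_done_ms:
        # If the frame is not ready when the display becomes free, the
        # refresh is delayed until rendering completes.
        start = max(display_free_at, done)
        shown.append(start)
        display_free_at = start + scanout_ms
    return shown
```

In this model a late frame pushes the refresh back instead of forcing the previous frame to be shown again at a fixed interval.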
ADAPTIVE MULTILEVEL BINNING TO IMPROVE HIERARCHICAL CACHING

Publication number: 20140237187

Abstract: A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy.

Type: Application

Filed: February 20, 2013

Publication date: August 21, 2014

Applicant: NVIDIA CORPORATION

Inventors: Rouslan DIMITROV, Rui BASTOS, Ziyad S. HAKURA, Eric B. LUM
EFFICIENT SUPER-SAMPLING WITH PER-PIXEL SHADER THREADS

Publication number: 20140176579

Abstract: Techniques are disclosed for dispatching pixel information in a graphics processing pipeline. A fragment processing unit generates a pixel that includes multiple samples based on a first portion of a graphics primitive received by a first thread. The fragment processing unit calculates a first value for the first pixel, where the first value is calculated only once for the pixel. The fragment processing unit calculates a first set of values for the samples, where each value in the first set of values corresponds to a different sample and is calculated only once for the corresponding sample. The fragment processing unit combines the first value with each value in the first set of values to create a second set of values. The fragment processing unit creates one or more dispatch messages to store the second set of values in a set of output registers. One advantage of the disclosed techniques is that pixel shader programs perform per-sample operations with increased efficiency.

Type: Application

Filed: December 21, 2012

Publication date: June 26, 2014

Applicant: NVIDIA CORPORATION

Inventors: Jerome F. Duluk, JR., Rouslan DIMITROV, Eric LUM, Rui BASTOS
SCHEDULING CACHE TRAFFIC IN A TILE-BASED ARCHITECTURE

Publication number: 20140118366

Abstract: A tile-based system for processing graphics data is disclosed. The tile-based system includes a first screen-space pipeline, a cache unit, and a first tiling unit. The first tiling unit is configured to transmit a first set of primitives that overlap a first cache tile and a first prefetch command to the first screen-space pipeline for processing, and transmit a second set of primitives that overlap a second cache tile to the first screen-space pipeline for processing. The first prefetch command is configured to cause the cache unit to fetch data associated with the second cache tile from an external memory unit. The first tiling unit may also be configured to transmit a first flush command to the screen-space pipeline for processing with the first set of primitives. The first flush command is configured to cause the cache unit to flush data associated with the first cache tile.

Type: Application

Filed: October 1, 2013

Publication date: May 1, 2014

Applicant: NVIDIA Corporation

Inventors: Ziyad S. HAKURA, Rouslan DIMITROV
