Patents by Inventor Rouslan Dimitrov

Rouslan Dimitrov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10338820
    Abstract: A system architecture conserves memory bandwidth by including compression utility to process data transfers from the cache into external memory. The cache decompresses transfers from external memory and transfers full format data to naive clients that lack decompression capability and directly transfers compressed data to savvy clients that include decompression capability. An improved compression algorithm includes software that computes the difference between the current data word and each of a number of prior data words. Software selects the prior data word with the smallest difference as the nearest match and encodes the bit width of the difference to this data word. Software then encodes the difference between the current stride and the closest previous stride. Software combines the stride, bit width, and difference to yield final encoded data word. Software may encode the stride of one data word as a value relative to the stride of a previous data word.
    Type: Grant
    Filed: June 7, 2016
    Date of Patent: July 2, 2019
    Assignee: NVIDIA CORPORATION
    Inventors: Rouslan Dimitrov, Jeff Pool, Praveen Krishnamurthy, Chris Amsinck, Karan Mehra, Scott Cutler
  • Publication number: 20190172181
    Abstract: Embodiments of the present invention provide end-to-end frame time synchronization designed to improve smoothness for displaying images of 3D applications, such as PC gaming applications. Traditionally, an application that renders 3D graphics functions based on the assumption that the average render time will be used as the animation time for a given frame. When this condition is not met, and the render time for a frame does not match the average render time of prior frames, the frames are not captured or displayed at a consistent rate. This invention enables feedback to be provided to the rendering application for adjusting the animation times used to produce new frames, and a post-render queue is used to store completed frames for mitigating stutter and hitches. Flip control is used to sync the display of a rendered frame with the animation time used to generate the frame, thereby producing a smooth, consistent image.
    Type: Application
    Filed: December 3, 2018
    Publication date: June 6, 2019
    Inventors: Thomas Albert Petersen, Ankan Banerjee, Shishir Goyal, Sau Yan Keith Li, Lars Nordskog, Rouslan Dimitrov
  • Publication number: 20190164518
    Abstract: Systems and techniques for streaming video with dynamic jitter tolerance are described. In one example, a system includes a server executing an application and generating image frames associated with the application at a frame rate, and a client which displays the image frames on a display that has a predetermined refresh rate and which monitors arrival times of the image frames in relation to the predetermined refresh rate. The server is further configured to dynamically change the frame rate based on the monitoring so that the frame rate more closely corresponds to the predetermined refresh rate of the client's display.
    Type: Application
    Filed: October 22, 2018
    Publication date: May 30, 2019
    Inventor: Rouslan DIMITROV
  • Publication number: 20180322683
    Abstract: Techniques for rendering images on multiple tilted displays concurrently to mitigate perspective distortion are disclosed herein. According to one described approach, viewports are assigned to a center monitor and two peripheral monitors. Scene data for the viewports is calculated, and geometric primitives are generated for the viewports based on the scene data. Image transformation is performed based on a modified perspective value to modify geometry of the geometric primitives based on tilt angles of the displays, and the geometric primitives are rasterized using the modified geometry.
    Type: Application
    Filed: July 17, 2017
    Publication date: November 8, 2018
    Inventors: Rouslan Dimitrov, Yury Uralsky, Lars Nordskog, Dmitry Zhdan
  • Publication number: 20180322682
    Abstract: Techniques for rendering images on multiple tilted displays concurrently to mitigate perspective distortion are disclosed herein. According to one described approach, viewports are assigned to a center monitor and two peripheral monitors. Scene data for the viewports is calculated, and geometric primitives are generated for the viewports based on the scene data. Image transformation is performed based on a modified perspective value to modify geometry of the geometric primitives based on tilt angles of the displays, and the geometric primitives are rasterized using the modified geometry.
    Type: Application
    Filed: July 17, 2017
    Publication date: November 8, 2018
    Inventors: Rouslan Dimitrov, Yury Uralsky, Lars Nordskog, Dmitriy Zhdan
  • Publication number: 20170351429
    Abstract: A system architecture conserves memory bandwidth by including compression utility to process data transfers from the cache into external memory. The cache decompresses transfers from external memory and transfers full format data to naive clients that lack decompression capability and directly transfers compressed data to savvy clients that include decompression capability. An improved compression algorithm includes software that computes the difference between the current data word and each of a number of prior data words. Software selects the prior data word with the smallest difference as the nearest match and encodes the bit width of the difference to this data word. Software then encodes the difference between the current stride and the closest previous stride. Software combines the stride, bit width, and difference to yield final encoded data word. Software may encode the stride of one data word as a value relative to the stride of a previous data word.
    Type: Application
    Filed: June 7, 2016
    Publication date: December 7, 2017
    Inventors: Rouslan DIMITROV, Jeff POOL, Praveen KRISHNAMURTHY, Chris AMSINCK, Karan MEHRA, Scott CUTLER
  • Patent number: 9792122
    Abstract: One embodiment of the present invention includes a technique for processing graphics primitives in a tile-based architecture. The technique includes storing, in a buffer, a first plurality of graphics primitives and a first plurality of state bundles received from the world-space pipeline. The technique further includes determining, based on a first condition, that the first plurality of graphics primitives should be replayed from the buffer, and, in response, replaying the first plurality of graphics primitives against a first tile included in a first plurality of tiles. Replaying the first plurality of graphics primitives includes comparing each graphics primitive against the first tile to determine whether the graphics primitive intersects the first tile, determining that one or more graphics primitives intersects the first tile, and transmitting the one or more graphics primitives and one or more associated state bundles to a screen-space pipeline for processing.
    Type: Grant
    Filed: October 4, 2013
    Date of Patent: October 17, 2017
    Assignee: NVIDIA CORPORATION
    Inventors: Ziyad S. Hakura, Walter R. Steiner, Cynthia Ann Edgeworth Allison, Rouslan Dimitrov, Karim M. Abdalla, Dale L. Kirkland, Emmett M. Kilgariff
  • Patent number: 9779533
    Abstract: One embodiment of the present invention includes a method for processing graphics objects. The method includes receiving a first draw-call and a second draw-call. The method also includes dividing the first draw-call into a first set of sub-draw-calls and the second draw-call into a second set of sub-draw-calls. The method further includes identifying a first screen tile. The method also includes identifying a first group of sub-draw-calls included in the first set of sub-draw-calls that overlap the first screen tile and a second group of sub-draw-calls included in the second set of sub-draw-calls that overlap the second screen tile. The method further includes causing the first group of sub-draw-calls and the second group of sub-draw-calls to be processed together.
    Type: Grant
    Filed: January 27, 2014
    Date of Patent: October 3, 2017
    Assignee: NVIDIA Corporation
    Inventors: Rouslan Dimitrov, Ziyad S. Hakura
  • Patent number: 9734548
    Abstract: One embodiment of the present invention includes techniques for adaptively sizing cache tiles in a graphics system. A device driver associated with a graphics system sets a cache tile size associated with a cache tile to a first size. The detects a change from a first render target configuration that includes a first set of render targets to a second render target configuration that includes a second set of render targets. The device driver sets the cache tile size to a second size based on the second render target configuration. One advantage of the disclosed approach is that the cache tile size is adaptively sized, resulting in fewer cache tiles for less complex render target configurations. Adaptively sizing cache tiles leads to more efficient processor utilization and reduced power requirements. In addition, a unified L2 cache tile allows dynamic partitioning of cache memory between cache tile data and other data.
    Type: Grant
    Filed: August 28, 2013
    Date of Patent: August 15, 2017
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Rouslan Dimitrov, Emmett M. Kilgariff, Andrei Khodakovsky
  • Patent number: 9720842
    Abstract: A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy.
    Type: Grant
    Filed: February 20, 2013
    Date of Patent: August 1, 2017
    Assignee: NVIDIA Corporation
    Inventors: Rouslan Dimitrov, Rui Bastos, Ziyad S. Hakura, Eric B. Lum
  • Patent number: 9495721
    Abstract: Techniques for dispatching pixel information in a graphics processing pipeline. A fragment processing unit generates a pixel that includes multiple samples based on a first portion of a graphics primitive received by a first thread. The fragment processing unit calculates a first value for the first pixel, where the first value is calculated only once for the pixel. The fragment processing unit calculates a first set of values for the samples, where each value in the first set of values corresponds to a different sample and is calculated only once for the corresponding sample. The fragment processing unit combines the first value with each value in the first set of values to create a second set of values. The fragment processing unit creates one or more dispatch messages to store the second set of values in a set of output registers.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: November 15, 2016
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Rouslan Dimitrov, Eric Lum, Rui Bastos
  • Patent number: 9396515
    Abstract: One embodiment sets forth a method for transforming 3-D images into 2-D rendered images using render target sample masks. A software application creates multiple render targets associated with a surface. For each render target, the software application also creates an associated render target sample mask configured to select one or more samples included in each pixel. Within the graphics pipeline, a pixel shader processes each pixel individually and outputs multiple render target-specific color values. For each render target, a ROP unit uses the associated render target sample mask to select covered samples included in the pixel. Subsequently, the ROP unit uses the render target-specific color value to update the selected samples in the render target, thereby achieving sample-level color granularity.
    Type: Grant
    Filed: August 16, 2013
    Date of Patent: July 19, 2016
    Assignee: NVIDIA CORPORATION
    Inventors: Eric B. Lum, Jerome F. Duluk, Jr., Yury Y. Uralsky, Rouslan Dimitrov, Rui M. Bastos
  • Patent number: 9342311
    Abstract: One embodiment of the present invention includes a method for generating accumulated bounding boxes for graphics primitives. The method includes generating a first bounding box associated with a first graphics primitive. The method further includes, for each graphics primitive included in a first set of one or more additional graphics primitives, determining that the graphics primitive is within a threshold distance of the first bounding box, and adding the graphics primitive to the first bounding box. The method further includes determining not to add a second graphics primitive to the first bounding box. The method further includes generating a second bounding box associated with the second graphics primitive. Finally, the method includes transmitting the first bounding box to a tiling unit via a crossbar. One advantage of the disclosed embodiments is that multiple bounding boxes are combined to generate an accumulated bounding box that is then transferred across the crossbar.
    Type: Grant
    Filed: August 14, 2013
    Date of Patent: May 17, 2016
    Assignee: NVIDIA Corporation
    Inventors: Ziyad S. Hakura, Pierre Souillot, Cynthia Allison, Dale L. Kirkland, Rouslan Dimitrov
  • Publication number: 20150213638
    Abstract: One embodiment of the present invention includes a method for processing graphics objects. The method includes receiving a first draw-call and a second draw-call. The method also includes dividing the first draw-call into a first set of sub-draw-calls and the second draw-call into a second set of sub-draw-calls. The method further includes identifying a first screen tile. The method also includes identifying a first group of sub-draw-calls included in the first set of sub-draw-calls that overlap the first screen tile and a second group of sub-draw-calls included in the second set of sub-draw-calls that overlap the second screen tile. The method further includes causing the first group of sub-draw-calls and the second group of sub-draw-calls to be processed together.
    Type: Application
    Filed: January 27, 2014
    Publication date: July 30, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: Rouslan DIMITROV, Ziyad S. HAKURA
  • Publication number: 20150049110
    Abstract: One embodiment sets forth a method for transforming 3-D images into 2-D rendered images using render target sample masks. A software application creates multiple render targets associated with a surface. For each render target, the software application also creates an associated render target sample mask configured to select one or more samples included in each pixel. Within the graphics pipeline, a pixel shader processes each pixel individually and outputs multiple render target-specific color values. For each render target, a ROP unit uses the associated render target sample mask to select covered samples included in the pixel. Subsequently, the ROP unit uses the render target-specific color value to update the selected samples in the render target, thereby achieving sample-level color granularity.
    Type: Application
    Filed: August 16, 2013
    Publication date: February 19, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: Eric B. LUM, Jerome F. DULUK, JR., Yury Y. URALSKY, Rouslan DIMITROV, Rui M. BASTOS
  • Patent number: 8878849
    Abstract: The method includes receiving a plurality of graphics primitives for rendering at a GPU of a computer system and rendering graphics primitives into pixel parameters of the pixels of a display, wherein the parameters include pixel depth values and pixel normal values. For each pixel of the display, an ambient occlusion process is performed. The algorithm takes as input a ND-buffer containing pixel depth values and pixel normals. Based on the pixel 3-D position and the pixel normal vector, horizon heights are computed by sampling the ND-buffer and an occlusion term is computed for each pixel based on the horizon heights. Based on the pixel 3-D position, the pixel normal vector, a normal occlusion term is computed by sampling the ND-buffer above the horizon in multiple directions. An ambient occlusion illumination value is computed by combining the horizon occlusion term and the normal occlusion term.
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: November 4, 2014
    Assignee: Nvidia Corporation
    Inventors: Rouslan Dimitrov, Louis Bavoil, Miguel Sainz
  • Patent number: 8866833
    Abstract: A system, method, and computer program product are provided for a dynamic display refresh. In use, a state of a display device is identified in which an entirety of an image frame is currently displayed by the display device. In response to the identification of the state, it is determined whether an entirety of a next image frame to be displayed has been rendered to memory. The next image frame is transmitted to the display device for display thereof, when it is determined that the entirety of the next image frame to be displayed has been rendered to the memory. Further, a refresh of the display device is delayed, when it is determined that the entirety of the next image frame to be displayed has not been rendered to the memory.
    Type: Grant
    Filed: September 11, 2013
    Date of Patent: October 21, 2014
    Assignee: NVIDIA Corporation
    Inventors: Tom Petersen, David Wyatt, Paul van der Kouwe, Emmett M. Kilgariff, Laurence Harrison, Jensen Huang, Tony Tamasi, Gerrit A. Slavenburg, Thomas F. Fox, David Matthew Stears, Robert Jan Schutten, Ross Cunniff, Ajay Kamalvanshi, Robert Osborne, Rouslan Dimitrov
  • Publication number: 20140237187
    Abstract: A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy.
    Type: Application
    Filed: February 20, 2013
    Publication date: August 21, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Rouslan DIMITROV, Rui BASTOS, Ziyad S. HAKURA, Eric B. LUM
  • Publication number: 20140176579
    Abstract: Techniques are disclosed for dispatching pixel information in a graphics processing pipeline. A fragment processing unit generates a pixel that includes multiple samples based on a first portion of a graphics primitive received by a first thread. The fragment processing unit calculates a first value for the first pixel, where the first value is calculated only once for the pixel. The fragment processing unit calculates a first set of values for the samples, where each value in the first set of values corresponds to a different sample and is calculated only once for the corresponding sample. The fragment processing unit combines the first value with each value in the first set of values to create a second set of values. The fragment processing unit creates one or more dispatch messages to store the second set of values in a set of output registers. One advantage of the disclosed techniques is that pixel shader programs perform per-sample operations with increased efficiency.
    Type: Application
    Filed: December 21, 2012
    Publication date: June 26, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Jerome F. Duluk, JR., Rouslan DIMITROV, Eric LUM, Rui BASTOS
  • Publication number: 20140118366
    Abstract: A tile-based system for processing graphics data. The tile based system includes a first screen-space pipeline, a cache unit, and a first tiling unit. The first tiling unit is configured to transmit a first set of primitives that overlap a first cache tile and a first prefetch command to the first screen-space pipeline for processing, and transmit a second set of primitives that overlap a second cache tile to the first screen-space pipeline for processing. The first prefetch command is configured to cause the cache unit to fetch data associated with the second cache tile from an external memory unit. The first tiling unit may also be configured to transmit a first flush command to the screen-space pipeline for processing with the first set of primitives. The first flush command is configured to cause the cache unit to flush data associated with the first cache tile.
    Type: Application
    Filed: October 1, 2013
    Publication date: May 1, 2014
    Applicant: NVIDIA Corporation
    Inventors: Ziyad S. HAKURA, Rouslan DIMITROV