Patents by Inventor Rui Bastos
Rui Bastos has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10713756Abstract: One aspect of the current disclosure provides a method of upscaling an image. The method includes: rendering an image, wherein the rendering includes generating color samples of the image at a first resolution and depth samples of the image at a second resolution, which is higher than the first resolution; and upscaling the image to an upscaled image at a third resolution, which is higher than the first resolution, using the color samples and the depth samples.Type: GrantFiled: May 1, 2018Date of Patent: July 14, 2020Assignee: Nvidia CorporationInventors: Rouslan Dimitrov, Lei Yang, Chris Amsinck, Walter Donovan, Eric Lum, Rui Bastos
-
Publication number: 20190340730Abstract: One aspect of the current disclosure provides a method of upscaling an image. The method includes: rendering an image, wherein the rendering includes generating color samples of the image at a first resolution and depth samples of the image at a second resolution, which is higher than the first resolution; and upscaling the image to an upscaled image at a third resolution, which is higher than the first resolution, using the color samples and the depth samples.Type: ApplicationFiled: May 1, 2018Publication date: November 7, 2019Inventors: Rouslan Dimitrov, Lei Yang, Chris Amsinck, Walter Donovan, Eric Lum, Rui Bastos
-
Patent number: 10043234Abstract: A system and method for decompressing compressed data (e.g., in a frame buffer) and optionally recompressing the data. The method includes determining a portion of an image to be accessed from a memory and sending a conditional read corresponding to the portion of the image. In response to the conditional read, an indicator operable to indicate that the portion of the image is uncompressed may be received. If the portion of the image is compressed, in response to the conditional read, compressed data corresponding to the portion of the image is received. In response to receiving the compressed data, the compressed data is uncompressed into uncompressed data. The uncompressed data may then be written to the memory corresponding to the portion of the image. The uncompressed data may then be in-place compressed for or during subsequent processing.Type: GrantFiled: December 31, 2012Date of Patent: August 7, 2018Assignee: NVIDIA CorporationInventors: Jonathan Dunaisky, Steven E. Molnar, Christian Amsinck, Rui Bastos, Eric B. Lum, Justin Cobb, Emmett Kilgariff
-
Patent number: 9953455Abstract: Techniques are disclosed for storing post-z coverage data in a render target. A color raster operations (CROP) unit receives a coverage mask associated with a portion of a graphics primitive, where the graphics primitive intersects a pixel that includes a multiple samples, and the portion covers at least one sample. The CROP unit stores the coverage mask in a data field in the render target at a location associated with the pixel. One advantage of the disclosed techniques is that the GPU computes color and other pixel information only for visible fragments as determined by post-z coverage data. The GPU does not compute color and other pixel information for obscured fragments, thereby reducing overall power consumption and improving overall render performance.Type: GrantFiled: March 13, 2013Date of Patent: April 24, 2018Assignee: NVIDIA CorporationInventors: Eric B. Lum, Rui Bastos, Jerome F. Duluk, Jr., Henry Packard Moreton, Yury Y. Uralsky
-
Patent number: 9720842Abstract: A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy.Type: GrantFiled: February 20, 2013Date of Patent: August 1, 2017Assignee: NVIDIA CorporationInventors: Rouslan Dimitrov, Rui Bastos, Ziyad S. Hakura, Eric B. Lum
-
Patent number: 9530189Abstract: A method for compressing framebuffer data is presented. The method includes determining a reduction ratio for framebuffer data in a tile including multiple samples. The reduction ratio determined is independent of the sampling mode, where the sampling mode is the number of samples within each pixel in the tile. The method further includes comparing a first portion of the framebuffer data for each of the multiple samples to determine an equality comparison result and also comparing a second portion of the framebuffer data for each one of the multiple samples to compute per-channel differences for each one of the multiple samples and testing the per-channel differences against a threshold value to determine a threshold comparison result. Finally, the method comprises compressing the framebuffer data for the tile based on the reduction ratio, the equality comparison result and the threshold comparison result to produce output framebuffer data for the tile.Type: GrantFiled: December 27, 2012Date of Patent: December 27, 2016Assignee: NVIDIA CORPORATIONInventors: Jonathan Dunaisky, David Kirk McAllister, Steven E. Molnar, Narayan Kulshrestha, Rui Bastos, Joseph Detmer, William Craig McKnight
-
Patent number: 9495721Abstract: Techniques for dispatching pixel information in a graphics processing pipeline. A fragment processing unit generates a pixel that includes multiple samples based on a first portion of a graphics primitive received by a first thread. The fragment processing unit calculates a first value for the first pixel, where the first value is calculated only once for the pixel. The fragment processing unit calculates a first set of values for the samples, where each value in the first set of values corresponds to a different sample and is calculated only once for the corresponding sample. The fragment processing unit combines the first value with each value in the first set of values to create a second set of values. The fragment processing unit creates one or more dispatch messages to store the second set of values in a set of output registers.Type: GrantFiled: December 21, 2012Date of Patent: November 15, 2016Assignee: NVIDIA CorporationInventors: Jerome F. Duluk, Jr., Rouslan Dimitrov, Eric Lum, Rui Bastos
-
Patent number: 9183609Abstract: A technique for efficiently rendering content reduces each complex blend mode to a series of basic blend operations. The series of basic blend operations are executed within a recirculating pipeline until a final blended value is computed. The recirculating pipeline is positioned within a color raster operations unit of a graphics processing unit for efficient access to image buffer data.Type: GrantFiled: December 20, 2012Date of Patent: November 10, 2015Assignee: NVIDIA CorporationInventors: Rui Bastos, Mark J. Kilgard, William Craig McKnight, Jerome F. Duluk, Jr., Pierre Souillot, Dale L. Kirkland, Christian Amsinck, Joseph Detmer, Christian Rouet, Don Bittel
-
Publication number: 20140267224Abstract: Techniques are disclosed for storing post-z coverage data in a render target. A color raster operations (CROP) unit receives a coverage mask associated with a portion of a graphics primitive, where the graphics primitive intersects a pixel that includes a multiple samples, and the portion covers at least one sample. The CROP unit stores the coverage mask in a data field in the render target at a location associated with the pixel. One advantage of the disclosed techniques is that the GPU computes color and other pixel information only for visible fragments as determined by post-z coverage data. The GPU does not compute color and other pixel information for obscured fragments, thereby reducing overall power consumption and improving overall render performance.Type: ApplicationFiled: March 13, 2013Publication date: September 18, 2014Applicant: NVIDIA CORPORATIONInventors: Eric B. LUM, Rui Bastos, Jerome F. Duluk, JR., Henry Packard Moreton, Yury Y. Uralsky
-
Publication number: 20140237187Abstract: A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy.Type: ApplicationFiled: February 20, 2013Publication date: August 21, 2014Applicant: NVIDIA CORPORATIONInventors: Rouslan DIMITROV, Rui BASTOS, Ziyad S. HAKURA, Eric B. LUM
-
Publication number: 20140184601Abstract: A system and method for decompressing compressed data (e.g., in a frame buffer) and optionally recompressing the data. The method includes determining a portion of an image to be accessed from a memory and sending a conditional read corresponding to the portion of the image. In response to the conditional read, an indicator operable to indicate that the portion of the image is uncompressed may be received. If the portion of the image is compressed, in response to the conditional read, compressed data corresponding to the portion of the image is received. In response to receiving the compressed data, the compressed data is uncompressed into uncompressed data. The uncompressed data may then be written to the memory corresponding to the portion of the image. The uncompressed data may then be in-place compressed for or during subsequent processing.Type: ApplicationFiled: December 31, 2012Publication date: July 3, 2014Applicant: NVIDIA CorporationInventors: Jonathan Dunaisky, Steven E. Molnar, Christian Amsinck, Rui Bastos, Eric B. Lum, Justin Cobb, Emmett Kilgariff
-
Publication number: 20140176568Abstract: A technique for efficiently rendering content reduces each complex blend mode to a series of basic blend operations. The series of basic blend operations are executed within a recirculating pipeline until a final blended value is computed. The recirculating pipeline is positioned within a color raster operations unit of a graphics processing unit for efficient access to image buffer data.Type: ApplicationFiled: December 20, 2012Publication date: June 26, 2014Applicant: NVIDIA CORPORATIONInventors: Rui BASTOS, Mark J. Kilgard, William Craig McKnight, Jerome F. Duluk, Pierre Souillot, Dale L. Kirkland, Christian Amsinck, Joseph Detmer, Christian Rouet, Don Bittel
-
Publication number: 20140176579Abstract: Techniques are disclosed for dispatching pixel information in a graphics processing pipeline. A fragment processing unit generates a pixel that includes multiple samples based on a first portion of a graphics primitive received by a first thread. The fragment processing unit calculates a first value for the first pixel, where the first value is calculated only once for the pixel. The fragment processing unit calculates a first set of values for the samples, where each value in the first set of values corresponds to a different sample and is calculated only once for the corresponding sample. The fragment processing unit combines the first value with each value in the first set of values to create a second set of values. The fragment processing unit creates one or more dispatch messages to store the second set of values in a set of output registers. One advantage of the disclosed techniques is that pixel shader programs perform per-sample operations with increased efficiency.Type: ApplicationFiled: December 21, 2012Publication date: June 26, 2014Applicant: NVIDIA CORPORATIONInventors: Jerome F. Duluk, JR., Rouslan DIMITROV, Eric LUM, Rui BASTOS
-
Publication number: 20130249897Abstract: A method for compressing framebuffer data is presented. The method includes determining a reduction ratio for framebuffer data in a tile including multiple samples. The reduction ratio determined is independent of the sampling mode, where the sampling mode is the number of samples within each pixel in the tile. The method further includes comparing a first portion of the framebuffer data for each of the multiple samples to determine an equality comparison result and also comparing a second portion of the framebuffer data for each one of the multiple samples to compute per-channel differences for each one of the multiple samples and testing the per-channel differences against a threshold value to determine a threshold comparison result. Finally, the method comprises compressing the framebuffer data for the tile based on the reduction ratio, the equality comparison result and the threshold comparison result to produce output framebuffer data for the tile.Type: ApplicationFiled: December 27, 2012Publication date: September 26, 2013Applicant: NVIDIA CORPORATIONInventors: Jonathan Dunaisky, David Kirk McAllister, Steven E. Molnar, Narayan Kulshrestha, Rui Bastos, Joseph Detmer, William Craig McKnight
-
Patent number: 7852341Abstract: A method and system for patching instructions in a 3-D graphics pipeline. Specifically, in one embodiment, instructions to be executed within a scheduling process for a shader pipeline of the 3-D graphics pipeline are patchable. A scheduler includes a decode table, an expansion table, and a resource table that are each patchable. The decode table translates high level instructions to an appropriate microcode sequence. The patchable expansion table expands a high level instruction to a program of microcode if the high level instruction is complex. The resource table assigns the units for executing the microcode. Addresses within each of the tables can be patched to modify existing instructions and create new instructions. That is, contents in each address in the tables that are tagged can be replaced with a patch value of a corresponding register.Type: GrantFiled: October 5, 2004Date of Patent: December 14, 2010Assignee: Nvidia CorporationInventors: Christian Rouet, Rui Bastos, Lordson Yue
-
Publication number: 20080094405Abstract: A scalable shader architecture is disclosed. In accord with that architecture, a shader includes multiple shader pipelines, each of which can perform processing operations on rasterized pixel data. Shader pipelines can be functionally removed as required, thus preventing a defective shader pipeline from causing a chip rejection. The shader includes a shader distributor that processes rasterized pixel data and then selectively distributes the processed rasterized pixel data to the various shader pipelines, beneficially in a manner that balances workloads. A shader collector formats the outputs of the various shader pipelines into proper order to form shaded pixel data. A shader instruction processor (scheduler) programs the individual shader pipelines to perform their intended tasks.Type: ApplicationFiled: December 14, 2007Publication date: April 24, 2008Inventors: Rui BASTOS, Karim Abdalla, Christian Rouet, Michael Toksvig, Johnny Rhoades, Roger Allen, John Tynefield, Emmett Kilgariff, Gary Tarolli, Brian Cabral, Craig Wittenbrink, Sean Treichler
-
Publication number: 20070008336Abstract: A pixel center position that is not covered by a primitive covering a portion of the pixel is displaced to lie within a fragment formed by the intersection of the primitive and the pixel. X,y coordinates of a pixel center are adjusted to displace the pixel center position to lie within the fragment, affecting actual texture map coordinates or barycentric weights. Alternatively, a centroid sub-pixel sample position is determined based on coverage data for the pixel and a multisample mode. The centroid sub-pixel sample position is used to compute pixel or sub-pixel parameters for the fragment.Type: ApplicationFiled: September 14, 2006Publication date: January 11, 2007Inventors: Rui Bastos, Michael Toksvig, Karim Abdalla
-
Publication number: 20060077209Abstract: A pixel center position that is not covered by a primitive covering a portion of the pixel is displaced to lie within a fragment formed by the intersection of the primitive and the pixel. X,y coordinates of a pixel center are adjusted to displace the pixel center position to lie within the fragment, affecting actual texture map coordinates or barycentric weights. Alternatively, a centroid sub-pixel sample position is determined based on coverage data for the pixel and a multisample mode. The centroid sub-pixel sample position is used to compute pixel or sub-pixel parameters for the fragment.Type: ApplicationFiled: October 7, 2004Publication date: April 13, 2006Inventors: Rui Bastos, Michael Toksvig, Karim Abdalla
-
Publication number: 20060055695Abstract: A fragment processor includes a fragment shader distributor, a fragment shader collector, and a plurality of fragment shader pipelines. Each fragment shader pipeline executes a fragment shader program on a segment of fragments. The plurality of fragment shader pipelines operate in parallel, executing the same or different fragment shader programs. The fragment shader distributor receives a stream of fragments from a rasterization unit and dispatches a portion of the stream of fragments to a selected fragment shader pipeline until the capacity of the selected fragment shader pipeline is reached. The fragment shader distributor then selects another fragment shader pipeline. The capacity of each of the fragment shader pipelines is limited by several different resources. As the fragment shader distributor dispatches fragments, it tracks the remaining available resources of the selected fragment shader pipeline.Type: ApplicationFiled: September 13, 2004Publication date: March 16, 2006Applicant: NVIDIA CorporationInventors: Karim Abdalla, Emmett Kilgariff, Rui Bastos
-
Publication number: 20050225554Abstract: A scalable shader architecture is disclosed. In accord with that architecture, a shader includes multiple shader pipelines, each of which can perform processing operations on rasterized pixel data. Shader pipelines can be functionally removed as required, thus preventing a defective shader pipeline from causing a chip rejection. The shader includes a shader distributor that processes rasterized pixel data and then selectively distributes the processed rasterized pixel data to the various shader pipelines, beneficially in a manner that balances workloads. A shader collector formats the outputs of the various shader pipelines into proper order to form shaded pixel data. A shader instruction processor (scheduler) programs the individual shader pipelines to perform their intended tasks.Type: ApplicationFiled: September 10, 2004Publication date: October 13, 2005Inventors: Rui Bastos, Karim Abdalla, Christian Rouet, Michael Toksvig, Johnny Rhoades, Roger Allen, John Tynefield, Emmett Kilgariff, Gary Tarolli, Brian Cabral, Craig Wittenbrink, Sean Treichler