Patents Assigned to NVidia
-
Patent number: 10451676Abstract: A method for testing. An external clock frequency is generated. Test data is supplied over a plurality of SSI connections clocked at the external clock frequency, wherein the test data is designed for testing a logic block. A DSTA module is configured for the logic block that is integrated within a chip to a bandwidth ratio, wherein the bandwidth ratio defines the plurality of SSI connections and a plurality of PSI connections of the chip. The external clock frequency is divided down using the bandwidth ratio to generate an internal clock frequency, wherein the bandwidth ratio defines the external clock frequency and the internal clock frequency. The test data is scanned over the plurality of PSI connections clocked at the internal clock frequency according to the bandwidth ratio, wherein the plurality of PSI connections is configured for inputting the test data to the plurality of scan chains.Type: GrantFiled: October 27, 2016Date of Patent: October 22, 2019Assignee: Nvidia CorporationInventors: Milind Sonawane, Amit Sanghani, Shantanu Sarangi, Jonathon E. Colburn, Bala Tarun Nelapatla, Sailendra Chadalavda, Rajendra Kumar Reddy.S, Mahmut Yilmaz, Pavan Kumar Datla Jagannadha
-
Patent number: 10453168Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.Type: GrantFiled: August 17, 2018Date of Patent: October 22, 2019Assignee: NVIDIA CORPORATIONInventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
-
Patent number: 10444280Abstract: Granular dynamic test systems and methods facilitate efficient and effective timing of test operations. In one embodiment, a chip test system comprises: a first test partition operable to perform test operations based upon a first local test clock signal; a second test partition operable to perform test operations based upon a second local test clock signal; and a centralized controller configured to coordinate testing between the plurality of test partitions, wherein the coordination includes managing communication of test information between the plurality of test partitions and external pins. In one exemplary implementation, a trigger edge of the first local test clock signal is staggered with respect to a trigger edge of the second local test clock signal, wherein the stagger is coordinated to mitigate power consumption by test operations in the first test partition and test operations in the second test partition.Type: GrantFiled: October 27, 2016Date of Patent: October 15, 2019Assignee: NVIDIA CORPORATIONInventors: Dheepakkumaran Jayaraman, Karthikeyan Natarajan, Shantanu Sarangi, Amit Sanghani, Milind Sonawane, Sailendra Chadalavda, Jonathon E. Colburn, Kevin Wilder, Mahmut Yilmaz, Pavan Kumar Datla Jagannadha
-
Patent number: 10445177Abstract: A method for updating a DRAM memory array is disclosed. The method comprises: a) transitioning the DRAM memory array from an idle state to a refresh state in accordance with a command from a memory controller; b) initiating a refresh on the DRAM memory array using DRAM internal control circuitry by activating a row of data into an associated sense amplifier buffer; and c) during the refresh, performing an ERR Correction Code (ECC) scrub operation of selected bits in the activated row of the DRAM memory array.Type: GrantFiled: July 3, 2018Date of Patent: October 15, 2019Assignee: Nvidia CorporationInventors: David Reed, Alok Gupta
-
Patent number: 10445243Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.Type: GrantFiled: October 16, 2013Date of Patent: October 15, 2019Assignee: NVIDIA CORPORATIONInventors: Jerome F. Duluk, Jr., Cameron Buschardt, Sherry Cheung, James Leroy Deming, Samuel H. Duncan, Lucien Dunning, Robert George, Arvind Gopalakrishnan, Mark Hairgrove, Chenghuan Jia, John Mashey
-
Patent number: 10438314Abstract: One embodiment of the present invention sets forth a graphics processing system. The graphics processing system includes a screen-space pipeline and a tiling unit. The screen-space pipeline is configured to perform visibility testing and fragment shading. The tiling unit is configured to determine that a first set of primitives overlaps a first cache tile. The tiling unit is also configured to first transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a z-only mode, and then transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a normal mode. In the z-only mode, at least some fragment shading operations are disabled in the screen-space pipeline. In the normal mode, fragment shading operations are enabled.Type: GrantFiled: April 23, 2018Date of Patent: October 8, 2019Assignee: NVIDIA CORPORATIONInventors: Ziyad S. Hakura, Jerome F. Duluk, Jr.
-
Patent number: 10437593Abstract: A synchronization instruction causes a processor to ensure that specified threads included within a warp concurrently execute a single subsequent instruction. The specified threads include at least a first thread and a second thread. In operation, the first thread arrives at the synchronization instruction. The processor determines that the second thread has not yet arrived at the synchronization instruction and configures the first thread to stop executing instructions. After issuing at least one instruction for the second thread, the processor determines that all the specified threads have arrived at the synchronization instruction. The processor then causes all the specified threads to execute the subsequent instruction. Advantageously, unlike conventional approaches to synchronizing threads, the synchronization instruction enables the processor to reliably and properly execute code that includes complex control flows and/or instructions that presuppose that threads are converged.Type: GrantFiled: April 27, 2017Date of Patent: October 8, 2019Assignee: NVIDIA CORPORATIONInventors: Ajay Sudarshan Tirumala, Olivier Giroux, Peter Nelson, Jack Choquette
-
Patent number: 10438400Abstract: A method, computer readable medium, and system are disclosed for rendering images utilizing a foveated rendering algorithm with post-process filtering to enhance a contrast of the foveated image. The method includes the step of receiving a three-dimensional scene, rendering the 3D scene according to a foveated rendering algorithm to generate a foveated image, and filtering the foveated image using a contrast-enhancing filter to generate a filtered foveated image. The foveated rendering algorithm may incorporate aspects of coarse pixel shading, mipmapped texture maps, linear efficient anti-aliased normal maps, exponential variance shadow maps, and specular anti-aliasing techniques. The foveated rendering algorithm may also be combined with temporal anti-aliasing techniques to further reduce artifacts in the foveated image.Type: GrantFiled: March 8, 2017Date of Patent: October 8, 2019Assignee: NVIDIA CorporationInventors: Anjul Patney, Marco Salvi, Joohwan Kim, Anton S. Kaplanyan, Christopher Ryan Wyman, Nir Benty, David Patrick Luebke, Aaron Eliot Lefohn
-
Patent number: 10436840Abstract: A distributed test circuit includes partitions arranged in series to form a scan path, each partition including a scan multiplexer, a test data register, and a segment insertion bit component. The scan multiplexer of each partition provides inputs to the corresponding test data register of the each partition. Broadcast control logic generates a select signal to the scan multiplexer of each partition to place the test circuit in a broadcast mode when the select signal is asserted, and to switch the test circuit to a daisy mode when select signal is de-asserted. The segment insertion bit is operable to include or bypass each partition from the scan path.Type: GrantFiled: March 26, 2018Date of Patent: October 8, 2019Assignee: NVIDIA Corp.Inventors: Jau Wu, Saurabh Gupta
-
Patent number: 10430915Abstract: One or more copy commands are scheduled for locating one or more pages of data in a local memory of a graphics processing unit (GPU) for more efficient access to the pages of data during rendering. A first processing unit that is coupled to a first GPU receives a notification that an access request count has reached a specified threshold. The first processing unit schedules a copy command to copy the first page of data to a first memory circuit of the first GPU from a second memory circuit of the second GPU. The copy command is included within a GPU command stream.Type: GrantFiled: January 24, 2018Date of Patent: October 1, 2019Assignee: NVIDIA CorporationInventors: Andrei Khodakovsky, Kirill A. Dmitriev, Rouslan L. Dimitrov, Tzyywei Hwang, Wishwesh Anil Gandhi, Lacky Vasant Shah
-
Patent number: 10430356Abstract: Embodiments of the present invention set forth techniques for resolving page faults associated with a copy engine. A copy engine within a parallel processor receives a copy operation that includes a set of copy commands. The copy engine executes a first copy command included in the set of copy commands that results in a page fault. The copy engine stores the set of copy commands to the memory. At least one advantage of the disclosed techniques is that the copy engine can perform copy operations that involve source and destination memory pages that are not pinned, leading to reduced memory demand and greater flexibility.Type: GrantFiled: April 28, 2017Date of Patent: October 1, 2019Assignee: NVIDIA CORPORATIONInventors: M. Wasiur Rashid, Jonathon Evans, Gary Ward, Philip Browning Johnson
-
Patent number: 10430989Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.Type: GrantFiled: November 25, 2015Date of Patent: October 1, 2019Assignee: NVIDIA CORPORATIONInventors: Ziyad Hakura, Cynthia Allison, Dale Kirkland, Jeffrey Bolz, Yury Uralsky, Jonah Alben
-
Patent number: 10432954Abstract: The present disclosure discloses a video encoder, a video encoding system and a video encoding method. The video encoder comprises a logic control module and an encoding module. Wherein, the logic control module is configured to receive a control command sent from an external controller for encoding a specified portion of each frame of image, and send the control command to the encoding module; and the encoding module is configured to receive the control command from the logic control module, and encode the specified portion of each frame of image according to the control command, so as to cooperate with a plurality of other video encoders to complete encoding each frame of image.Type: GrantFiled: August 8, 2016Date of Patent: October 1, 2019Assignee: NVIDIA CORPORATIONInventors: Jianjun Chen, Xi He, Chunfeng Yang, Zejun Hu
-
Patent number: 10423424Abstract: Techniques are disclosed for performing an auxiliary operation via a compute engine associated with a host computing device. The method includes determining that the auxiliary operation is directed to the compute engine, and determining that the auxiliary operation is associated with a first context comprising a first set of state parameters. The method further includes determining a first subset of state parameters related to the auxiliary operation based on the first set of state parameters. The method further includes transmitting the first subset of state parameters to the compute engine, and transmitting the auxiliary operation to the compute engine. One advantage of the disclosed technique is that surface area and power consumption are reduced within the processor by utilizing copy engines that have no context switching capability.Type: GrantFiled: September 28, 2012Date of Patent: September 24, 2019Assignee: NVIDIA CORPORATIONInventors: Lincoln G. Garlick, Philip Browning Johnson, Rafal Zboinski, Jeff Tuckey, Samuel H. Duncan, Peter C. Mills
-
Patent number: 10424069Abstract: A method, computer readable medium, and system are disclosed for estimating optical flow between two images. A first pyramidal set of features is generated for a first image and a partial cost volume for a level of the first pyramidal set of features is computed, by a neural network, using features at the level of the first pyramidal set of features and warped features extracted from a second image, where the partial cost volume is computed across a limited range of pixels that is less than a full resolution of the first image, in pixels, at the level. The neural network processes the features and the partial cost volume to produce a refined optical flow estimate for the first image and the second image.Type: GrantFiled: March 30, 2018Date of Patent: September 24, 2019Assignee: NVIDIA CorporationInventors: Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz
-
Patent number: 10424074Abstract: Methods and apparatuses are disclosed for reporting texture footprint information. A texture footprint identifies the portion of a texture that will be utilized in rendering a pixel in a scene. The disclosed methods and apparatuses advantageously improve system efficiency in decoupled shading systems by first identifying which texels in a given texture map are needed for subsequently rendering a scene. Therefore, the number of texels that are generated and stored may be reduced to include the identified texels. Texels that are not identified need not be rendered and/or stored.Type: GrantFiled: July 3, 2018Date of Patent: September 24, 2019Assignee: NVIDIA CorporationInventors: Yury Uralsky, Henry Packard Moreton, Eric Brian Lum, Jonathan J. Dunaisky, Steven James Heinrich, Stefano Pescador, Shirish Gadre, Michael Alan Fetterman
-
Patent number: 10417989Abstract: Disclosed herein is a GPU for improved multitasking by a user, a GPU computing system including the GPU and a method of manufacturing a GPU system. In one embodiment, the GPU includes: (1) a video overlayer configured to create an operating area over a portion of a video image generated by the graphical processing unit and (2) an overlay interface configured to provide a virtual space input to the video overlayer to operate a virtual machine within the operating area.Type: GrantFiled: January 2, 2014Date of Patent: September 17, 2019Assignee: Nvidia CorporationInventor: Andrew Fear
-
Patent number: 10417813Abstract: A method for generating temporally stable hash values reduces visual artifacts associated with stochastic sampling of data for graphics applications. A given hash value can be generated from a scaled and discretized object-space for a geometric object within a scene. Through appropriate scaling, the hash value can be discretized and remain constant within a threshold distance from a pixel center. As the geometric object moves within the scene, a hash value associated with a given feature of the geometric object remains constant because the hash value is generated using an object-space coordinate anchored to the feature. In one embodiment, alpha testing threshold values are assigned random, but temporally stable hash output values generated using object-space coordinate positions for primitive fragments undergoing alpha testing. Alpha tested fragments are temporally stable, beneficially improving image quality.Type: GrantFiled: November 7, 2017Date of Patent: September 17, 2019Assignee: NVIDIA CorporationInventors: Christopher Ryan Wyman, Morgan McGuire
-
Patent number: 10417817Abstract: A method, computer readable medium, and system are disclosed for supersampling a large-scale and disjoined data set. The data set may include point cloud, voxel, or polygonal mesh data. The data set may be rendered using a distributed, sort-last rendering system that includes a plurality of rendering nodes and one or more compositing nodes. The method includes the steps of receiving graphics data at a plurality of rendering nodes, rendering at least a portion of the graphics data by one or more rendering nodes to produce multi-sample image data, encoding the multi-sample image data using a difference encoding technique, and transmitting the encoded multi-sample image data to a compositing node. The multi-sample image data comprises a plurality of values per pixel of a target image corresponding to a plurality of sample locations defined for each pixel of the target image.Type: GrantFiled: October 30, 2015Date of Patent: September 17, 2019Assignee: NVIDIA CorporationInventors: Henning Christopher Lux, Marc Nienhaus, Joerg Mensmann
-
Patent number: 10417990Abstract: A method of binding graphics resources is provided that includes: (1) identifying graphics resources for binding, (2) generating a bind group for the graphics resources, (3) organizing the bind group into a bind group memory using a bind group layout and (4) providing bind group control for processing of the bind group. A method of organizing graphics resources and a resource organizing unit are also provided.Type: GrantFiled: September 16, 2015Date of Patent: September 17, 2019Assignee: Nvidia CorporationInventor: Jeffrey A. Bolz