Patents Assigned to NVIDIA

Method and system for dynamic standard test access (DSTA) for a logic block reuse

Patent number: 10451676

Abstract: A method for testing. An external clock frequency is generated. Test data is supplied over a plurality of SSI connections clocked at the external clock frequency, wherein the test data is designed for testing a logic block. A DSTA module is configured for the logic block that is integrated within a chip to a bandwidth ratio, wherein the bandwidth ratio defines the plurality of SSI connections and a plurality of PSI connections of the chip. The external clock frequency is divided down using the bandwidth ratio to generate an internal clock frequency, wherein the bandwidth ratio defines the external clock frequency and the internal clock frequency. The test data is scanned over the plurality of PSI connections clocked at the internal clock frequency according to the bandwidth ratio, wherein the plurality of PSI connections is configured for inputting the test data to the plurality of scan chains.
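
The clock relationship in this abstract can be sketched as a small worked example. This is an illustration only, assuming the bandwidth ratio is an integer r such that each internal (PSI) cycle consumes the bits delivered by r external (SSI) cycles; the function names `internal_clock_hz` and `deserialize` are hypothetical and do not come from the patent.

```python
def internal_clock_hz(external_hz: float, bandwidth_ratio: int) -> float:
    """Divide the external clock by the bandwidth ratio (assumed integer >= 1)."""
    if bandwidth_ratio < 1:
        raise ValueError("bandwidth_ratio must be at least 1")
    return external_hz / bandwidth_ratio


def deserialize(ssi_stream, num_ssi, bandwidth_ratio):
    """Group SSI words into wider PSI words.

    ssi_stream: list of tuples, one per external clock cycle, each holding
    num_ssi bits. Returns one tuple per internal clock cycle holding
    num_ssi * bandwidth_ratio bits (the assumed PSI width).
    """
    if len(ssi_stream) % bandwidth_ratio:
        raise ValueError("stream length must be a multiple of the ratio")
    psi_words = []
    for start in range(0, len(ssi_stream), bandwidth_ratio):
        word = []
        for cycle in ssi_stream[start:start + bandwidth_ratio]:
            if len(cycle) != num_ssi:
                raise ValueError("each SSI cycle must carry num_ssi bits")
            word.extend(cycle)
        psi_words.append(tuple(word))
    return psi_words
```

Under this assumption the total bandwidth is conserved: fewer, faster external pins feed more, slower internal pins.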

Type: Grant

Filed: October 27, 2016

Date of Patent: October 22, 2019

Assignee: Nvidia Corporation

Inventors: Milind Sonawane, Amit Sanghani, Shantanu Sarangi, Jonathon E. Colburn, Bala Tarun Nelapatla, Sailendra Chadalavda, Rajendra Kumar Reddy.S, Mahmut Yilmaz, Pavan Kumar Datla Jagannadha

Techniques for maintaining atomicity and ordering for pixel shader operations

Patent number: 10453168

Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
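
The "only once for each XY position" rule can be sketched as follows. This is a minimal illustration, not the patented design: it assumes coverage arrives in API order and that the coalescer closes the current tile whenever a new primitive would cover a position already present in it. The name `coalesce` and the data layout are hypothetical.

```python
def coalesce(coverage_items):
    """Coalesce per-primitive coverage into tiles.

    coverage_items: iterable of (primitive_id, set of (x, y)) in API order.
    Returns a list of tiles; each tile maps (x, y) -> primitive_id and holds
    at most one primitive per position.
    """
    tiles = []
    current = {}
    for prim_id, positions in coverage_items:
        # Assumed policy: a conflict on any position closes the current tile,
        # so later primitives for that position land in a later tile.
        if any(pos in current for pos in positions):
            tiles.append(current)
            current = {}
        for pos in positions:
            current[pos] = prim_id
    if current:
        tiles.append(current)
    return tiles
```

Because tiles are emitted in order and each position appears once per tile, processing the tiles sequentially visits the primitives at any given position in their original API order.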

Type: Grant

Filed: August 17, 2018

Date of Patent: October 22, 2019

Assignee: NVIDIA CORPORATION

Inventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz

Independent test partition clock coordination across multiple test partitions

Patent number: 10444280

Abstract: Granular dynamic test systems and methods facilitate efficient and effective timing of test operations. In one embodiment, a chip test system comprises: a first test partition operable to perform test operations based upon a first local test clock signal; a second test partition operable to perform test operations based upon a second local test clock signal; and a centralized controller configured to coordinate testing between the plurality of test partitions, wherein the coordination includes managing communication of test information between the plurality of test partitions and external pins. In one exemplary implementation, a trigger edge of the first local test clock signal is staggered with respect to a trigger edge of the second local test clock signal, wherein the stagger is coordinated to mitigate power consumption by test operations in the first test partition and test operations in the second test partition.

Type: Grant

Filed: October 27, 2016

Date of Patent: October 15, 2019

Assignee: NVIDIA CORPORATION

Inventors: Dheepakkumaran Jayaraman, Karthikeyan Natarajan, Shantanu Sarangi, Amit Sanghani, Milind Sonawane, Sailendra Chadalavda, Jonathon E. Colburn, Kevin Wilder, Mahmut Yilmaz, Pavan Kumar Datla Jagannadha

Controller-based memory scrub for DRAMs with internal error-correcting code (ECC) bits contemporaneously during auto refresh or by using masked write commands

Patent number: 10445177

Abstract: A method for updating a DRAM memory array is disclosed. The method comprises: a) transitioning the DRAM memory array from an idle state to a refresh state in accordance with a command from a memory controller; b) initiating a refresh on the DRAM memory array using DRAM internal control circuitry by activating a row of data into an associated sense amplifier buffer; and c) during the refresh, performing an Error Correction Code (ECC) scrub operation of selected bits in the activated row of the DRAM memory array.

Type: Grant

Filed: July 3, 2018

Date of Patent: October 15, 2019

Assignee: Nvidia Corporation

Inventors: David Reed, Alok Gupta

Fault buffer for resolving page faults in unified virtual memory system

Patent number: 10445243

Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.

Type: Grant

Filed: October 16, 2013

Date of Patent: October 15, 2019

Assignee: NVIDIA CORPORATION

Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, Sherry Cheung, James Leroy Deming, Samuel H. Duncan, Lucien Dunning, Robert George, Arvind Gopalakrishnan, Mark Hairgrove, Chenghuan Jia, John Mashey

Two-pass cache tile processing for visibility testing in a tile-based architecture

Patent number: 10438314

Abstract: One embodiment of the present invention sets forth a graphics processing system. The graphics processing system includes a screen-space pipeline and a tiling unit. The screen-space pipeline is configured to perform visibility testing and fragment shading. The tiling unit is configured to determine that a first set of primitives overlaps a first cache tile. The tiling unit is also configured to first transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a z-only mode, and then transmit the first set of primitives to the screen-space pipeline with a command configured to cause the screen-space pipeline to process the first set of primitives in a normal mode. In the z-only mode, at least some fragment shading operations are disabled in the screen-space pipeline. In the normal mode, fragment shading operations are enabled.
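
The two transmissions described above resemble a depth pre-pass followed by a shading pass. The sketch below illustrates that idea for a single cache tile under stated assumptions: smaller z is closer, the first pass uses a less-than depth test, and the second pass shades only fragments whose depth equals the stored value. The function `render_two_pass` and the fragment layout are hypothetical.

```python
def render_two_pass(fragments, shade):
    """Process the same fragments twice: z-only, then normal.

    fragments: list of (x, y, z, payload) tuples for one cache tile.
    shade: callable applied to the payload of each visible fragment.
    Returns a dict mapping (x, y) -> shaded result.
    """
    # Pass 1 (z-only): fragment shading is skipped; only depth is resolved.
    depth = {}
    for x, y, z, _payload in fragments:
        if z < depth.get((x, y), float("inf")):
            depth[(x, y)] = z

    # Pass 2 (normal): shading is enabled, but only visible fragments run it.
    color = {}
    for x, y, z, payload in fragments:
        if z == depth[(x, y)]:
            color[(x, y)] = shade(payload)
    return color
```

The benefit in this sketch is that `shade` never runs for occluded fragments, at the cost of traversing the primitives twice.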

Type: Grant

Filed: April 23, 2018

Date of Patent: October 8, 2019

Assignee: NVIDIA CORPORATION

Inventors: Ziyad S. Hakura, Jerome F. Duluk, Jr.

Techniques for comprehensively synchronizing execution threads

Patent number: 10437593

Abstract: A synchronization instruction causes a processor to ensure that specified threads included within a warp concurrently execute a single subsequent instruction. The specified threads include at least a first thread and a second thread. In operation, the first thread arrives at the synchronization instruction. The processor determines that the second thread has not yet arrived at the synchronization instruction and configures the first thread to stop executing instructions. After issuing at least one instruction for the second thread, the processor determines that all the specified threads have arrived at the synchronization instruction. The processor then causes all the specified threads to execute the subsequent instruction. Advantageously, unlike conventional approaches to synchronizing threads, the synchronization instruction enables the processor to reliably and properly execute code that includes complex control flows and/or instructions that presuppose that threads are converged.

Type: Grant

Filed: April 27, 2017

Date of Patent: October 8, 2019

Assignee: NVIDIA CORPORATION

Inventors: Ajay Sudarshan Tirumala, Olivier Giroux, Peter Nelson, Jack Choquette

Perceptually-based foveated rendering using a contrast-enhancing filter

Patent number: 10438400

Abstract: A method, computer readable medium, and system are disclosed for rendering images utilizing a foveated rendering algorithm with post-process filtering to enhance a contrast of the foveated image. The method includes the step of receiving a three-dimensional scene, rendering the 3D scene according to a foveated rendering algorithm to generate a foveated image, and filtering the foveated image using a contrast-enhancing filter to generate a filtered foveated image. The foveated rendering algorithm may incorporate aspects of coarse pixel shading, mipmapped texture maps, linear efficient anti-aliased normal maps, exponential variance shadow maps, and specular anti-aliasing techniques. The foveated rendering algorithm may also be combined with temporal anti-aliasing techniques to further reduce artifacts in the foveated image.

Type: Grant

Filed: March 8, 2017

Date of Patent: October 8, 2019

Assignee: NVIDIA Corporation

Inventors: Anjul Patney, Marco Salvi, Joohwan Kim, Anton S. Kaplanyan, Christopher Ryan Wyman, Nir Benty, David Patrick Luebke, Aaron Eliot Lefohn

Broadcast scan network

Patent number: 10436840

Abstract: A distributed test circuit includes partitions arranged in series to form a scan path, each partition including a scan multiplexer, a test data register, and a segment insertion bit component. The scan multiplexer of each partition provides inputs to the corresponding test data register of that partition. Broadcast control logic generates a select signal to the scan multiplexer of each partition to place the test circuit in a broadcast mode when the select signal is asserted, and to switch the test circuit to a daisy mode when the select signal is de-asserted. The segment insertion bit is operable to include each partition in, or bypass it from, the scan path.

Type: Grant

Filed: March 26, 2018

Date of Patent: October 8, 2019

Assignee: NVIDIA Corp.

Inventors: Jau Wu, Saurabh Gupta

Multi-GPU frame rendering

Patent number: 10430915

Abstract: One or more copy commands are scheduled for locating one or more pages of data in a local memory of a graphics processing unit (GPU) for more efficient access to the pages of data during rendering. A first processing unit that is coupled to a first GPU receives a notification that an access request count has reached a specified threshold. The first processing unit schedules a copy command to copy the first page of data to a first memory circuit of the first GPU from a second memory circuit of the second GPU. The copy command is included within a GPU command stream.
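
The threshold-triggered copy can be sketched as a simple counter. This is an illustration only: the class `PageMigrationScheduler`, its method names, and the tuple format of the command stream are hypothetical, and the sketch assumes one counter per (page, requesting GPU) pair.

```python
class PageMigrationScheduler:
    """Counts remote page accesses and schedules a copy at a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = {}          # (page, requesting GPU) -> access count
        self.command_stream = []  # stand-in for a GPU command stream

    def record_remote_access(self, page, src_gpu, dst_gpu):
        """Record that dst_gpu accessed a page resident in src_gpu's memory."""
        key = (page, dst_gpu)
        self.counts[key] = self.counts.get(key, 0) + 1
        # Schedule exactly once, when the count first reaches the threshold.
        if self.counts[key] == self.threshold:
            self.command_stream.append(("copy", page, src_gpu, dst_gpu))
```

Placing the copy in the same stream as rendering commands, as the abstract describes, lets the copy be ordered relative to the work that uses the page.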

Type: Grant

Filed: January 24, 2018

Date of Patent: October 1, 2019

Assignee: NVIDIA Corporation

Inventors: Andrei Khodakovsky, Kirill A. Dmitriev, Rouslan L. Dimitrov, Tzyywei Hwang, Wishwesh Anil Gandhi, Lacky Vasant Shah

Low overhead copy engine fault and switch mechanism

Patent number: 10430356

Abstract: Embodiments of the present invention set forth techniques for resolving page faults associated with a copy engine. A copy engine within a parallel processor receives a copy operation that includes a set of copy commands. The copy engine executes a first copy command included in the set of copy commands that results in a page fault. The copy engine stores the set of copy commands to the memory. At least one advantage of the disclosed techniques is that the copy engine can perform copy operations that involve source and destination memory pages that are not pinned, leading to reduced memory demand and greater flexibility.

Type: Grant

Filed: April 28, 2017

Date of Patent: October 1, 2019

Assignee: NVIDIA CORPORATION

Inventors: M. Wasiur Rashid, Jonathon Evans, Gary Ward, Philip Browning Johnson

Multi-pass rendering in a screen space pipeline

Patent number: 10430989

Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.
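
The mask-driven replay described above can be sketched as follows. This is a minimal illustration, assuming each mask is a bitfield in which bit p enables the item for pass p; the function `run_passes` and the stream encoding are hypothetical.

```python
def run_passes(stream, num_passes):
    """Replay a buffered, interleaved stream once per pass.

    stream: list of (kind, payload, mask) where kind is "state" or "prim".
      - "state": payload is a dict of state settings, mask is the state mask.
      - "prim":  payload is a primitive, mask is the primitive mask.
    Returns, per pass, the list of (primitive, state snapshot) processed.
    """
    results = []
    for p in range(num_passes):
        bit = 1 << p
        state = {}      # pipeline configuration for this pass
        processed = []
        for kind, payload, mask in stream:
            if not mask & bit:
                continue  # item does not participate in this pass
            if kind == "state":
                state.update(payload)
            elif kind == "prim":
                processed.append((payload, dict(state)))
            else:
                raise ValueError(f"unknown stream item kind: {kind!r}")
        results.append(processed)
    return results
```

Because state and primitives stay interleaved, each primitive sees the state settings that preceded it in the stream and that apply to the current pass.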

Type: Grant

Filed: November 25, 2015

Date of Patent: October 1, 2019

Assignee: NVIDIA CORPORATION

Inventors: Ziyad Hakura, Cynthia Allison, Dale Kirkland, Jeffrey Bolz, Yury Uralsky, Jonah Alben

Video encoder, video encoding system and video encoding method

Patent number: 10432954

Abstract: The present disclosure discloses a video encoder, a video encoding system and a video encoding method. The video encoder comprises a logic control module and an encoding module. The logic control module is configured to receive, from an external controller, a control command for encoding a specified portion of each image frame, and to send the control command to the encoding module. The encoding module is configured to receive the control command from the logic control module and to encode the specified portion of each image frame according to the control command, so as to cooperate with a plurality of other video encoders to complete the encoding of each image frame.

Type: Grant

Filed: August 8, 2016

Date of Patent: October 1, 2019

Assignee: NVIDIA CORPORATION

Inventors: Jianjun Chen, Xi He, Chunfeng Yang, Zejun Hu

Replicated stateless copy engine

Patent number: 10423424

Abstract: Techniques are disclosed for performing an auxiliary operation via a compute engine associated with a host computing device. The method includes determining that the auxiliary operation is directed to the compute engine, and determining that the auxiliary operation is associated with a first context comprising a first set of state parameters. The method further includes determining a first subset of state parameters related to the auxiliary operation based on the first set of state parameters. The method further includes transmitting the first subset of state parameters to the compute engine, and transmitting the auxiliary operation to the compute engine. One advantage of the disclosed technique is that surface area and power consumption are reduced within the processor by utilizing copy engines that have no context switching capability.

Type: Grant

Filed: September 28, 2012

Date of Patent: September 24, 2019

Assignee: NVIDIA CORPORATION

Inventors: Lincoln G. Garlick, Philip Browning Johnson, Rafal Zboinski, Jeff Tuckey, Samuel H. Duncan, Peter C. Mills

System and method for optical flow estimation

Patent number: 10424069

Abstract: A method, computer readable medium, and system are disclosed for estimating optical flow between two images. A first pyramidal set of features is generated for a first image and a partial cost volume for a level of the first pyramidal set of features is computed, by a neural network, using features at the level of the first pyramidal set of features and warped features extracted from a second image, where the partial cost volume is computed across a limited range of pixels that is less than a full resolution of the first image, in pixels, at the level. The neural network processes the features and the partial cost volume to produce a refined optical flow estimate for the first image and the second image.

Type: Grant

Filed: March 30, 2018

Date of Patent: September 24, 2019

Assignee: NVIDIA Corporation

Inventors: Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz

Method and apparatus for obtaining sampled positions of texturing operations

Patent number: 10424074

Abstract: Methods and apparatuses are disclosed for reporting texture footprint information. A texture footprint identifies the portion of a texture that will be utilized in rendering a pixel in a scene. The disclosed methods and apparatuses advantageously improve system efficiency in decoupled shading systems by first identifying which texels in a given texture map are needed for subsequently rendering a scene. Therefore, the number of texels that are generated and stored may be reduced to include the identified texels. Texels that are not identified need not be rendered and/or stored.

Type: Grant

Filed: July 3, 2018

Date of Patent: September 24, 2019

Assignee: NVIDIA Corporation

Inventors: Yury Uralsky, Henry Packard Moreton, Eric Brian Lum, Jonathan J. Dunaisky, Steven James Heinrich, Stefano Pescador, Shirish Gadre, Michael Alan Fetterman

GPU and GPU computing system for providing a virtual machine and a method of manufacturing the same

Patent number: 10417989

Abstract: Disclosed herein is a GPU for improved multitasking by a user, a GPU computing system including the GPU and a method of manufacturing a GPU system. In one embodiment, the GPU includes: (1) a video overlayer configured to create an operating area over a portion of a video image generated by the graphical processing unit and (2) an overlay interface configured to provide a virtual space input to the video overlayer to operate a virtual machine within the operating area.

Type: Grant

Filed: January 2, 2014

Date of Patent: September 17, 2019

Assignee: Nvidia Corporation

Inventor: Andrew Fear

System and method for generating temporally stable hashed values

Patent number: 10417813

Abstract: A method for generating temporally stable hash values reduces visual artifacts associated with stochastic sampling of data for graphics applications. A given hash value can be generated from a scaled and discretized object-space for a geometric object within a scene. Through appropriate scaling, the hash value can be discretized and remain constant within a threshold distance from a pixel center. As the geometric object moves within the scene, a hash value associated with a given feature of the geometric object remains constant because the hash value is generated using an object-space coordinate anchored to the feature. In one embodiment, alpha testing threshold values are assigned random, but temporally stable hash output values generated using object-space coordinate positions for primitive fragments undergoing alpha testing. Alpha tested fragments are temporally stable, beneficially improving image quality.
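
The core idea, hashing a scaled and discretized object-space coordinate so the value follows the surface rather than the screen, can be sketched as below. This is an illustration under assumptions: SHA-256 is only a stand-in for whatever hash function an implementation would use, the `scale` parameter is supplied by the caller (the abstract does not say how it is chosen), and the names `stable_hash` and `alpha_test` are hypothetical.

```python
import hashlib
import math
import struct


def stable_hash(obj_coord, scale):
    """Map a 3D object-space coordinate to a value in [0, 1).

    Coordinates are scaled and floored to an integer cell first, so every
    point inside the same cell yields the same hash value.
    """
    cell = tuple(math.floor(c * scale) for c in obj_coord)
    digest = hashlib.sha256(struct.pack("<3q", *cell)).digest()
    return int.from_bytes(digest[:8], "little") / 2**64


def alpha_test(alpha, obj_coord, scale):
    """Stochastic alpha test using the stable hash as the threshold."""
    return alpha >= stable_hash(obj_coord, scale)
```

Since the input is an object-space position, moving the object or the camera does not change the threshold for a given surface point, which is what keeps the alpha-test result stable from frame to frame in this sketch.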

Type: Grant

Filed: November 7, 2017

Date of Patent: September 17, 2019

Assignee: NVIDIA Corporation

Inventors: Christopher Ryan Wyman, Morgan McGuire

Supersampling for spatially distributed and disjoined large-scale data

Patent number: 10417817

Abstract: A method, computer readable medium, and system are disclosed for supersampling a large-scale and disjoined data set. The data set may include point cloud, voxel, or polygonal mesh data. The data set may be rendered using a distributed, sort-last rendering system that includes a plurality of rendering nodes and one or more compositing nodes. The method includes the steps of receiving graphics data at a plurality of rendering nodes, rendering at least a portion of the graphics data by one or more rendering nodes to produce multi-sample image data, encoding the multi-sample image data using a difference encoding technique, and transmitting the encoded multi-sample image data to a compositing node. The multi-sample image data comprises a plurality of values per pixel of a target image corresponding to a plurality of sample locations defined for each pixel of the target image.
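
The abstract names "a difference encoding technique" without detailing it. One simple possibility, shown purely as an assumption, is to store the first sample of each pixel in full and the remaining samples as differences from it; the helper names `encode_pixel` and `decode_pixel` are hypothetical.

```python
def encode_pixel(samples):
    """Encode one pixel's samples as [first, delta_1, delta_2, ...]."""
    if not samples:
        return []
    base = samples[0]
    return [base] + [s - base for s in samples[1:]]


def decode_pixel(encoded):
    """Invert encode_pixel, recovering the original sample values."""
    if not encoded:
        return []
    base = encoded[0]
    return [base] + [base + d for d in encoded[1:]]
```

When the samples within a pixel are similar, the deltas are small or zero, so a subsequent compression step can shrink the data a rendering node sends to the compositing node.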

Type: Grant

Filed: October 30, 2015

Date of Patent: September 17, 2019

Assignee: NVIDIA Corporation

Inventors: Henning Christopher Lux, Marc Nienhaus, Joerg Mensmann

Efficient binding of resource groups in a graphics application programming interface

Patent number: 10417990

Abstract: A method of binding graphics resources is provided that includes: (1) identifying graphics resources for binding, (2) generating a bind group for the graphics resources, (3) organizing the bind group into a bind group memory using a bind group layout and (4) providing bind group control for processing of the bind group. A method of organizing graphics resources and a resource organizing unit are also provided.

Type: Grant

Filed: September 16, 2015

Date of Patent: September 17, 2019

Assignee: Nvidia Corporation

Inventor: Jeffrey A. Bolz
