Patents Assigned to NVIDIA

Efficient CPU mailbox read access to GPU memory

Patent number: 9727521

Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device, and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.

Type: Grant

Filed: September 14, 2012

Date of Patent: August 8, 2017

Assignee: NVIDIA Corporation

Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur

Method and system for distributed shader optimization

Patent number: 9727339

Abstract: Embodiments of the present invention are operable to communicate a list of important shaders and their current best-known compilations to remote client devices over a communications network. Client devices are allowed to produce modified shader compilations by varying optimizations. If a client device produces a modified compilation that beats an important shader's current best-known compilation, embodiments of the present invention can communicate this new best-known shader compilation back to a host computer system. Furthermore, embodiments of the present invention may periodically broadcast the new best-known shader compilation back to client devices for possible further optimization or for efficient rendering operations using the best-known shader compilation.

Type: Grant

Filed: July 18, 2013

Date of Patent: August 8, 2017

Assignee: Nvidia Corporation

Inventor: Jeremy Zelsnack

System and method for translating program functions for correct handling of local-scope variables and computing system incorporating the same

Patent number: 9727338

Abstract: A system and method of translating functions of a program. In one embodiment, the system includes: (1) a local-scope variable identifier operable to identify local-scope variables employed in the at least some of the functions as being either thread-shared local-scope variables or thread-private local-scope variables and (2) a function translator associated with the local-scope variable identifier and operable to translate the at least some of the functions to cause thread-shared memory to be employed to store the thread-shared local-scope variables and thread-private memory to be employed to store the thread-private local-scope variables.

Type: Grant

Filed: December 21, 2012

Date of Patent: August 8, 2017

Assignee: Nvidia Corporation

Inventors: Yuan Lin, Gautam Chakrabarti, Jaydeep Marathe, Okwan Kwon, Amit Sabne

Techniques for render pass dependencies in an API

Patent number: 9727392

Abstract: Techniques for passing dependencies in an application programming interface (API) include identifying a plurality of passes of execution commands. For each set of passes, wherein one pass is a destination pass and the other pass is a source pass to each other, one or more dependencies, of one or more dependency types, are determined between the execution commands of the destination pass and the source pass. Pass objects are then created for each identified pass, wherein each pass object records the execution commands and dependencies between the corresponding destination and source passes.

Type: Grant

Filed: September 16, 2015

Date of Patent: August 8, 2017

Assignee: NVIDIA CORPORATION

Inventor: Jeffrey Bolz

System, method, and computer program product for discarding pixel samples

Patent number: 9721381

Abstract: A system, method, and computer program product are provided for discarding pixel samples. The method includes the steps of completing shading operations for a pixel set including one or more pixels to generate per-sample shaded attributes according to a shader program executed by a processing pipeline. Discard information for the pixel set is evaluated and one or more per-sample shaded attributes for at least one pixel in the pixel set are discarded based on the evaluated discard information.

Type: Grant

Filed: October 11, 2013

Date of Patent: August 1, 2017

Assignee: NVIDIA Corporation

Inventors: Christian Jean Rouet, Manan Maheshkumar Patel, Shirish Gadre, Daniel Paul Wilde

Adaptive multilevel binning to improve hierarchical caching

Patent number: 9720842

Abstract: A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy.

Type: Grant

Filed: February 20, 2013

Date of Patent: August 1, 2017

Assignee: NVIDIA Corporation

Inventors: Rouslan Dimitrov, Rui Bastos, Ziyad S. Hakura, Eric B. Lum

Fully parallel in-place construction of 3D acceleration structures and bounding volume hierarchies in a graphics processing unit

Patent number: 9721320

Abstract: A non-transitory computer-readable storage medium having computer-executable instructions for causing a computer system to perform a method for constructing bounding volume hierarchies from binary trees is disclosed. The method includes providing a binary tree including a plurality of leaf nodes and a plurality of internal nodes. Each of the plurality of internal nodes is uniquely associated with two child nodes, wherein each child node comprises either an internal node or leaf node. The method also includes determining a plurality of bounding volumes for nodes in the binary tree by traversing the binary tree from the plurality of leaf nodes upwards toward a root node, wherein each parent node is processed once by a later arriving corresponding child node.

Type: Grant

Filed: December 31, 2012

Date of Patent: August 1, 2017

Assignee: NVIDIA CORPORATION

Inventor: Tero Karras
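
The bottom-up pass described in the abstract above can be sketched sequentially. This is a minimal illustration, not the patented implementation: the names `merge` and `fit_bounds`, the array layout, and the plain integer `visits` counter (standing in for a per-node atomic counter on a GPU) are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner) of an axis-aligned box


def merge(a: Box, b: Box) -> Box:
    """Return the smallest axis-aligned box enclosing both a and b."""
    lo = tuple(min(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(max(x, y) for x, y in zip(a[1], b[1]))
    return (lo, hi)


def fit_bounds(
    parent: List[Optional[int]],
    children: List[Optional[Tuple[int, int]]],
    leaf_boxes: Dict[int, Box],
) -> List[Optional[Box]]:
    """Walk from every leaf toward the root (hypothetical helper).

    parent[i] is the parent index of node i (None for the root).
    children[i] is (left, right) for internal nodes and None for leaves.
    """
    boxes: List[Optional[Box]] = [None] * len(parent)
    visits = [0] * len(parent)  # assumption: an atomic counter on real hardware

    for leaf, box in leaf_boxes.items():  # each iteration models one thread
        boxes[leaf] = box
        node = parent[leaf]
        while node is not None:
            visits[node] += 1
            if visits[node] == 1:
                # First child to arrive stops; the sibling's box is not ready.
                break
            # Later-arriving child: both child boxes exist, so process parent.
            left, right = children[node]
            boxes[node] = merge(boxes[left], boxes[right])
            node = parent[node]
    return boxes
```

In this sketch every internal node is written exactly once, by whichever child reaches it second, which is the property the abstract attributes to the later-arriving child.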

System and method for early packet header verification

Patent number: 9720768

Abstract: A receiver, transmitter and method for early packet header verification are provided. In one embodiment, the method includes: (1) receiving a payload flit of a preceding packet and a header flit of a current packet; and (2) using a Cyclic Redundancy Check (CRC) in the header flit to verify the payload flit of the preceding packet and the header flit of the current packet.

Type: Grant

Filed: October 6, 2015

Date of Patent: August 1, 2017

Assignee: Nvidia Corporation

Inventors: Stephen D. Glaser, Eric Tyson, Mark Hummel, Michael Osborn, Jonathan Owen, Marvin Denman, Dennis Ma, Denis Foley
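
The idea in the abstract above, one CRC carried in a header flit that covers both the preceding packet's payload flit and the current header, can be illustrated as follows. This is only a sketch under stated assumptions: the abstract does not specify the polynomial, field widths, or byte layout, so the use of `zlib.crc32`, the 4-byte trailer, and the function names `make_header_flit` and `verify_header_flit` are hypothetical.

```python
import zlib

CRC_BYTES = 4  # assumption: a 32-bit CRC appended to the header fields


def make_header_flit(prev_payload: bytes, header_fields: bytes) -> bytes:
    """Build a header flit whose CRC covers the previous payload flit
    and the current header fields (hypothetical layout)."""
    crc = zlib.crc32(prev_payload + header_fields)
    return header_fields + crc.to_bytes(CRC_BYTES, "big")


def verify_header_flit(prev_payload: bytes, header_flit: bytes) -> bool:
    """Check the previous payload flit and the current header in one step."""
    fields = header_flit[:-CRC_BYTES]
    received_crc = int.from_bytes(header_flit[-CRC_BYTES:], "big")
    return zlib.crc32(prev_payload + fields) == received_crc
```

With this layout a receiver can validate a header as soon as the header flit arrives, without waiting for the rest of the current packet.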

Technique for performing memory access operations via texture hardware

Patent number: 9720858

Abstract: A texture processing pipeline can be configured to service memory access requests that represent texture data access operations or generic data access operations. When the texture processing pipeline receives a memory access request that represents a texture data access operation, the texture processing pipeline may retrieve texture data based on texture coordinates. When the memory access request represents a generic data access operation, the texture pipeline extracts a virtual address from the memory access request and then retrieves data based on the virtual address. The texture processing pipeline is also configured to cache generic data retrieved on behalf of a group of threads and to then invalidate that generic data when the group of threads exits.

Type: Grant

Filed: December 19, 2012

Date of Patent: August 1, 2017

Assignee: NVIDIA CORPORATION

Inventors: Brian Fahs, Eric T. Anderson, Nick Barrow-Williams, Shirish Gadre, Joel James McCormack, Bryon S. Nordquist, Nirmal Raj Saxena, Lacky V. Shah

Method and system for generating an image including optically zoomed and digitally zoomed regions

Patent number: 9723216

Abstract: A method for generating images. The method includes capturing first image data representing a first scene taken optically at a first magnification index, wherein the first image data comprises a first region of an image. The method includes capturing second image data representing a second scene taken optically at a second magnification index that is less than the first magnification index, wherein the second image data comprises a second region of the image. The method includes digitally zooming the second image data in the second region to the first magnification index. The method includes digitally stitching the second image data in the second region to the first image data in the first region.

Type: Grant

Filed: February 13, 2014

Date of Patent: August 1, 2017

Assignee: NVIDIA CORPORATION

Inventor: Rajat Aggarwal

System, method, and computer program product for a stereoscopic image lasso

Patent number: 9721187

Abstract: A system, method, and computer program product for providing a lasso selection tool for a stereoscopic image is disclosed. The method includes the steps of obtaining a lasso region of a stereoscopic image pair based on a path defined by a user using a lasso selection tool. An object in a first image of the stereoscopic image pair is identified, where the object is at least partially included within the lasso region and the object is identified in a second image of the stereoscopic image pair.

Type: Grant

Filed: August 30, 2013

Date of Patent: August 1, 2017

Assignee: NVIDIA Corporation

Inventor: David R. Cook

Open solder mask and or dielectric to increase lid or ring thickness and contact area to improve package coplanarity

Patent number: 9716051

Abstract: A packaging substrate, a packaged semiconductor device, a computing device and methods for forming the same are provided. In one embodiment, a packaging substrate is provided that includes a packaging structure having a chip mounting surface and a bottom surface. The packaging structure has a plurality of conductive paths formed between the chip mounting surface and the bottom surface. The conductive paths are configured to provide electrical connection between an integrated circuit chip disposed on the chip mounting surface and the bottom surface of the packaging structure. The packaging structure has an opening formed in the chip mounting surface proximate a perimeter of the packaging structure. A stiffening microstructure is disposed in the opening and is coupled to the packaging structure.

Type: Grant

Filed: November 2, 2012

Date of Patent: July 25, 2017

Assignee: NVIDIA Corporation

Inventors: Leilei Zhang, Ron Boja, Abraham Yee, Zuhair Bokharey

Execution state analysis for assigning tasks to streaming multiprocessors

Patent number: 9715413

Abstract: One embodiment of the present invention sets forth a technique for selecting a first processor included in a plurality of processors to receive work related to a compute task. The technique involves analyzing state data of each processor in the plurality of processors to identify one or more processors that have already been assigned one compute task and are eligible to receive work related to the one compute task, receiving, from each of the one or more processors identified as eligible, an availability value that indicates the capacity of the processor to receive new work, selecting a first processor to receive work related to the one compute task based on the availability values received from the one or more processors, and issuing, to the first processor via a cooperative thread array (CTA), the work related to the one compute task.

Type: Grant

Filed: January 18, 2012

Date of Patent: July 25, 2017

Assignee: NVIDIA Corporation

Inventors: Karim M. Abdalla, Lacky V. Shah, Jerome F. Duluk, Jr., Timothy John Purcell, Tanmoy Mandal, Gentaro Hirota
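
The selection step in the abstract above can be sketched in a few lines. This is an illustration only: the `Processor` record, its field names, and the rule of picking the highest availability value are assumptions, since the abstract says only that selection is "based on" the availability values.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Processor:
    """Hypothetical snapshot of one processor's state data."""
    name: str
    assigned_tasks: Set[int]      # tasks already assigned to this processor
    availability: int             # reported capacity to receive new work
    accepting_work: bool = True   # assumption: a simple eligibility flag


def select_processor(processors: List[Processor], task_id: int) -> Optional[Processor]:
    """Pick a processor to receive more work for task_id, or None if none qualify."""
    # Step 1: keep only processors already assigned the task and still eligible.
    eligible = [
        p for p in processors
        if task_id in p.assigned_tasks and p.accepting_work
    ]
    if not eligible:
        return None
    # Step 2: choose by availability (assumption: larger value means more capacity).
    return max(eligible, key=lambda p: p.availability)
```

A caller would then issue the work for `task_id` to the returned processor; how that work is packaged is outside the scope of this sketch.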

Managing deferred contexts in a cache tiling architecture

Publication number: 20170206623

Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.

Type: Application

Filed: October 1, 2013

Publication date: July 20, 2017

Applicant: NVIDIA CORPORATION

Inventors: Ziyad S. Hakura, Jeffrey A. Bolz, Amanpreet Grewal, Matthew Johnson, Andrei Khodakovsky
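
The bind-or-skip decision in the abstract above amounts to a subset test on the set of bound render targets. The sketch below is a simplified model: the class name `BindTracker`, the command strings, and the choice to record the draw command after any binds are assumptions introduced for illustration.

```python
from typing import Iterable, List, Set


class BindTracker:
    """Hypothetical driver-side tracker of currently bound render targets."""

    def __init__(self) -> None:
        self.current: Set[str] = set()
        self.commands: List[str] = []  # stand-in for the command stream

    def draw(self, requested: Iterable[str], draw_cmd: str) -> None:
        requested_set = set(requested)
        if not requested_set <= self.current:
            # At least one requested target is not bound: flush, then rebind.
            self.commands.append("flush_tiling_unit")
            self.current |= requested_set
            for target in sorted(requested_set):
                self.commands.append(f"bind_render_target:{target}")
        # Otherwise every requested target is already bound, so no flush is issued.
        self.commands.append(draw_cmd)
```

Repeated draws against an unchanged set of render targets therefore add no flush commands to the stream, which is the saving the abstract describes.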

Techniques for avoiding and remedying DC bias buildup on a flat panel variable refresh rate display

Patent number: 9711099

Abstract: A method for driving a display panel having a variable refresh rate is disclosed. The method comprises receiving a current input frame from an image source. Next, it comprises determining a number of re-scanned frames to insert between the current input frame and a subsequent input frame, wherein the re-scanned frames repeat the input frame, and wherein the number of re-scanned frames depends on the minimum refresh interval (MRI) of the display panel. Further, it comprises calculating respective intervals at which to insert the re-scanned frames between the current input frame and the subsequent input frame. Subsequently, it comprises determining if a charge accumulation in pixels of the display panel has crossed over a predetermined threshold value. Finally, responsive to a determination that the charge accumulation has crossed over a predetermined threshold value, it comprises performing a counter-measure to remediate the charge accumulation.

Type: Grant

Filed: February 26, 2014

Date of Patent: July 18, 2017

Assignee: NVIDIA CORPORATION

Inventors: Rudolf Bloks, Robert Schutten, Tom Verbeure

Mid-primitive graphics execution preemption

Patent number: 9710874

Abstract: One embodiment of the present invention sets forth a technique for mid-primitive execution preemption. When preemption is initiated, no new instructions are issued, in-flight instructions progress to an execution unit boundary, and the execution state is unloaded from the processing pipeline. The execution units within the processing pipeline, including the coarse rasterization unit, complete execution of in-flight instructions and become idle. However, rasterization of a triangle may be preempted at a coarse raster region boundary. The amount of context state to be stored is reduced because the execution units are idle. Preempting at the mid-primitive level during rasterization reduces the time from when preemption is initiated to when another process can execute because the entire triangle is not rasterized.

Type: Grant

Filed: December 27, 2012

Date of Patent: July 18, 2017

Assignee: NVIDIA Corporation

Inventors: Gregory Scott Palmer, Ziyad S. Hakura, Emmett M. Kilgariff, Dale L. Kirkland, Lacky V. Shah

Methods and apparatus for auto-throttling encapsulated compute tasks

Patent number: 9710306

Abstract: Systems and methods for auto-throttling encapsulated compute tasks. A device driver may configure a parallel processor to execute compute tasks in a number of discrete throttled modes. The device driver may also allocate memory to a plurality of different processing units in a non-throttled mode. The device driver may also allocate memory to a subset of the plurality of processing units in each of the throttling modes. Data structures defined for each task include a flag that instructs the processing unit whether the task may be executed in the non-throttled mode or in the throttled mode. A work distribution unit monitors each of the tasks scheduled to run on the plurality of processing units and determines whether the processor should be configured to run in the throttled mode or in the non-throttled mode.

Type: Grant

Filed: April 9, 2012

Date of Patent: July 18, 2017

Assignee: NVIDIA Corporation

Inventors: Jerome F. Duluk, Jr., Jesse David Hall, Philip Alexander Cuadra, Karim M. Abdalla

System and method for allocating memory of differing properties to shared data objects

Patent number: 9710275

Abstract: A system and method for allocating shared memory of differing properties to shared data objects and a hybrid stack data structure. In one embodiment, the system includes: (1) a hybrid stack creator configured to create, in the shared memory, a hybrid stack data structure having a lower portion having a more favorable property and a higher portion having a less favorable property and (2) a data object allocator associated with the hybrid stack creator and configured to allocate storage for a shared data object in the lower portion if the lower portion has a sufficient remaining capacity to contain the shared data object and alternatively allocate storage for the shared data object in the higher portion if the lower portion has an insufficient remaining capacity to contain the shared data object.

Type: Grant

Filed: December 21, 2012

Date of Patent: July 18, 2017

Assignee: Nvidia Corporation

Inventors: Jaydeep Marathe, Yuan Lin, Gautam Chakrabarti, Okwan Kwon, Amit Sabne
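
The allocation rule in the abstract above (use the lower portion when it has room, otherwise fall back to the higher portion) can be sketched as a small allocator. This is a toy model: the class name `HybridStack`, the byte-count capacities, and the `free` method are assumptions introduced for the example.

```python
from typing import List, Tuple


class HybridStack:
    """Hypothetical two-region stack: 'lower' is the more favorable memory."""

    def __init__(self, lower_capacity: int, higher_capacity: int) -> None:
        self.capacity = {"lower": lower_capacity, "higher": higher_capacity}
        self.used = {"lower": 0, "higher": 0}
        self._frames: List[Tuple[str, int]] = []  # (region, size) in push order

    def allocate(self, size: int) -> Tuple[str, int]:
        """Return (region, offset) for a new shared data object."""
        # Prefer the lower portion; fall back to the higher portion if it is full.
        for region in ("lower", "higher"):
            if self.capacity[region] - self.used[region] >= size:
                offset = self.used[region]
                self.used[region] += size
                self._frames.append((region, size))
                return (region, offset)
        raise MemoryError("neither portion can hold the object")

    def free(self) -> None:
        """Release the most recently allocated object (stack discipline)."""
        region, size = self._frames.pop()
        self.used[region] -= size
```

For example, with a 16-byte lower portion, a 12-byte object lands in the lower portion, a following 8-byte object spills to the higher portion, and a later 4-byte object still fits in the lower portion.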

System and method for enhanced multi-sample anti-aliasing

Patent number: 9710894

Abstract: A system and method for enhanced multi-sample anti-aliasing. The method includes determining a sampling pattern corresponding to a pixel and adjusting the sampling pattern based on a visual effect (e.g., post-processing visual effect). The method further includes accessing a first plurality of samples based on the sampling pattern. The first plurality of samples may comprise a second plurality of samples within the pixel and a third plurality of samples outside of the pixel. The method further includes performing anti-aliasing filtering of the pixel based on the first plurality of samples and the sampling pattern.

Type: Grant

Filed: June 4, 2013

Date of Patent: July 18, 2017

Assignee: NVIDIA CORPORATION

Inventor: Timothy Paul Lottes

System and method for image processing

Patent number: 9704212

Abstract: A system and method for image processing are provided. The system comprises a main computing device and a secondary computing device. The main computing device comprises a main graphics card and a main central processing unit, and the secondary computing device comprises a secondary graphics card and a secondary central processing unit. The main computing device is configured to detect the secondary computing device. The main central processing unit is configured to send a request to process raw image data together to the secondary central processing unit and allocate the raw image data to the main graphics card and the secondary graphics card after receiving a response from the secondary central processing unit. The main graphics card and the secondary graphics card are configured to process images based on the allocation of the main central processing unit.

Type: Grant

Filed: February 7, 2014

Date of Patent: July 11, 2017

Assignee: Nvidia Corporation

Inventor: Maojiang (Jacen) Lin
