Patents Assigned to NVIDIA
  • Patent number: 9727521
    Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device, and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: August 8, 2017
    Assignee: NVIDIA Corporation
    Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
  • Patent number: 9727339
    Abstract: Embodiments of the present invention are operable to communicate a list of important shaders and their current best-known compilations to remote client devices over a communications network. Client devices are allowed to produce modified shader compilations by varying optimizations. If a client device produces a modified compilation that beats an important shader's current best-known compilation, embodiments of the present invention can communicate this new best-known shader compilation back to a host computer system. Furthermore, embodiments of the present invention may periodically broadcast the new best-known shader compilation back to client devices for possible further optimization or for efficient rendering operations using the best-known shader compilation.
    Type: Grant
    Filed: July 18, 2013
    Date of Patent: August 8, 2017
    Assignee: Nvidia Corporation
    Inventor: Jeremy Zelsnack
  • Patent number: 9727338
    Abstract: A system and method of translating functions of a program. In one embodiment, the system includes: (1) a local-scope variable identifier operable to identify local-scope variables employed in the at least some of the functions as being either thread-shared local-scope variables or thread-private local-scope variables and (2) a function translator associated with the local-scope variable identifier and operable to translate the at least some of the functions to cause thread-shared memory to be employed to store the thread-shared local-scope variables and thread-private memory to be employed to store the thread-private local-scope variables.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: August 8, 2017
    Assignee: Nvidia Corporation
    Inventors: Yuan Lin, Gautam Chakrabarti, Jaydeep Marathe, Okwan Kwon, Amit Sabne
  • Patent number: 9727392
    Abstract: Techniques for passing dependencies in an application programming interface (API) include identifying a plurality of passes of execution commands. For each set of passes, wherein one pass is a destination pass and the other pass is a source pass to each other, one or more dependencies, of one or more dependency types, are determined between the execution commands of the destination pass and the source pass. Pass objects are then created for each identified pass, wherein each pass object records the execution commands and dependencies between the corresponding destination and source passes.
    Type: Grant
    Filed: September 16, 2015
    Date of Patent: August 8, 2017
    Assignee: NVIDIA CORPORATION
    Inventor: Jeffrey Bolz
  • Patent number: 9721381
    Abstract: A system, method, and computer program product are provided for discarding pixel samples. The method includes the steps of completing shading operations for a pixel set including one or more pixels to generate per-sample shaded attributes according to a shader program executed by a processing pipeline. Discard information for the pixel set is evaluated and one or more per-sample shaded attributes for at least one pixel in the pixel set are discarded based on the evaluated discard information.
    Type: Grant
    Filed: October 11, 2013
    Date of Patent: August 1, 2017
    Assignee: NVIDIA Corporation
    Inventors: Christian Jean Rouet, Manan Maheshkumar Patel, Shirish Gadre, Daniel Paul Wilde
  • Patent number: 9720842
    Abstract: A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy.
    Type: Grant
    Filed: February 20, 2013
    Date of Patent: August 1, 2017
    Assignee: NVIDIA Corporation
    Inventors: Rouslan Dimitrov, Rui Bastos, Ziyad S. Hakura, Eric B. Lum
  • Patent number: 9721320
    Abstract: A non-transitory computer-readable storage medium having computer-executable instructions for causing a computer system to perform a method for constructing bounding volume hierarchies from binary trees is disclosed. The method includes providing a binary tree including a plurality of leaf nodes and a plurality of internal nodes. Each of the plurality of internal nodes is uniquely associated with two child nodes, wherein each child node comprises either an internal node or leaf node. The method also includes determining a plurality of bounding volumes for nodes in the binary tree by traversing the binary tree from the plurality of leaf nodes upwards toward a root node, wherein each parent node is processed once by a later arriving corresponding child node.
    Type: Grant
    Filed: December 31, 2012
    Date of Patent: August 1, 2017
    Assignee: NVIDIA CORPORATION
    Inventor: Tero Karras
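    Illustrative sketch (not part of the patent record): the bottom-up traversal described in this abstract can be approximated sequentially as below. Each leaf walks toward the root; the first child to reach a parent stops, and the later arriving child computes the parent's bounding volume and continues upward. The names `Node`, `union`, and `fit_bounding_boxes`, the 2D box layout, and the per-node visit counter are assumptions made for illustration. A parallel implementation would presumably replace the counter with an atomic increment.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Axis-aligned 2D box: (min_x, min_y, max_x, max_y). Hypothetical layout.
Box = Tuple[float, float, float, float]


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


@dataclass
class Node:
    parent: Optional[int] = None
    left: Optional[int] = None    # child indices; both None for a leaf
    right: Optional[int] = None
    box: Optional[Box] = None     # preset for leaves, computed for internal nodes
    visits: int = 0               # how many children have arrived so far


def fit_bounding_boxes(nodes: List[Node], leaf_ids: List[int]) -> None:
    """Walk upward from every leaf; only the second arrival processes a parent."""
    for leaf in leaf_ids:
        current = nodes[leaf].parent
        while current is not None:
            node = nodes[current]
            node.visits += 1
            if node.visits == 1:
                # First arriving child: the sibling subtree is not finished yet,
                # so stop here and let the later arrival handle this parent.
                break
            # Later arriving child: both child boxes are ready at this point.
            node.box = union(nodes[node.left].box, nodes[node.right].box)
            current = node.parent
```

    In this sketch every internal node is processed exactly once, by whichever child arrives second, so no node is visited before both of its subtrees are complete.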
  • Patent number: 9720768
    Abstract: A receiver, transmitter and method for early packet header verification are provided. In one embodiment, the method includes: (1) receiving a payload flit of a preceding packet and a header flit of a current packet; and (2) using a Cyclic Redundancy Check (CRC) in the header flit to verify the payload flit of the preceding packet and the header flit of the current packet.
    Type: Grant
    Filed: October 6, 2015
    Date of Patent: August 1, 2017
    Assignee: Nvidia Corporation
    Inventors: Stephen D. Glaser, Eric Tyson, Mark Hummel, Michael Osborn, Jonathan Owen, Marvin Denman, Dennis Ma, Denis Foley
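    Illustrative sketch (not part of the patent record): one way to picture the scheme in this abstract is a header flit whose CRC covers both the preceding packet's payload flit and the current header. The flit layout, the use of CRC-32 from Python's `zlib`, and the function names `build_header_flit` and `verify_header_flit` are assumptions for illustration; the abstract does not specify a polynomial or field widths.

```python
import zlib

CRC_BYTES = 4  # assumed CRC width; the abstract does not state one


def build_header_flit(prev_payload: bytes, header_fields: bytes) -> bytes:
    """Append a CRC computed over the preceding payload flit plus this header."""
    crc = zlib.crc32(prev_payload + header_fields)
    return header_fields + crc.to_bytes(CRC_BYTES, "big")


def verify_header_flit(prev_payload: bytes, header_flit: bytes) -> bool:
    """Check the preceding payload flit and the current header with one CRC."""
    header_fields = header_flit[:-CRC_BYTES]
    received = header_flit[-CRC_BYTES:]
    expected = zlib.crc32(prev_payload + header_fields).to_bytes(CRC_BYTES, "big")
    return received == expected
```

    Under these assumptions, the receiver can validate the header as soon as the header flit arrives, without waiting for the current packet's own payload.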
  • Patent number: 9720858
    Abstract: A texture processing pipeline can be configured to service memory access requests that represent texture data access operations or generic data access operations. When the texture processing pipeline receives a memory access request that represents a texture data access operation, the texture processing pipeline may retrieve texture data based on texture coordinates. When the memory access request represents a generic data access operation, the texture pipeline extracts a virtual address from the memory access request and then retrieves data based on the virtual address. The texture processing pipeline is also configured to cache generic data retrieved on behalf of a group of threads and to then invalidate that generic data when the group of threads exits.
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: August 1, 2017
    Assignee: NVIDIA CORPORATION
    Inventors: Brian Fahs, Eric T. Anderson, Nick Barrow-Williams, Shirish Gadre, Joel James McCormack, Bryon S. Nordquist, Nirmal Raj Saxena, Lacky V. Shah
  • Patent number: 9723216
    Abstract: A method for generating images. The method includes capturing first image data representing a first scene taken optically at a first magnification index, wherein the first image data comprises a first region of an image. The method includes capturing second image data representing a second scene taken optically at a second magnification index that is less than the first magnification index, wherein the second image data comprises a second region of the image. The method includes digitally zooming the second image data in the second region to the first magnification index. The method includes digitally stitching the second image data in the second region to the first image data in the first region.
    Type: Grant
    Filed: February 13, 2014
    Date of Patent: August 1, 2017
    Assignee: NVIDIA CORPORATION
    Inventor: Rajat Aggarwal
  • Patent number: 9721187
    Abstract: A system, method, and computer program product for providing a lasso selection tool for a stereoscopic image is disclosed. The method includes the steps of obtaining a lasso region of a stereoscopic image pair based on a path defined by a user using a lasso selection tool. An object in a first image of the stereoscopic image pair is identified, where the object is at least partially included within the lasso region and the object is identified in a second image of the stereoscopic image pair.
    Type: Grant
    Filed: August 30, 2013
    Date of Patent: August 1, 2017
    Assignee: NVIDIA Corporation
    Inventor: David R. Cook
  • Patent number: 9716051
    Abstract: A packaging substrate, a packaged semiconductor device, a computing device and methods for forming the same are provided. In one embodiment, a packaging substrate is provided that includes a packaging structure having a chip mounting surface and a bottom surface. The packaging structure has a plurality of conductive paths formed between the chip mounting surface and the bottom surface. The conductive paths are configured to provide electrical connection between an integrated circuit chip disposed on the chip mounting surface and the bottom surface of the packaging structure. The packaging structure has an opening formed in the chip mounting surface proximate a perimeter of the packaging structure. A stiffening microstructure is disposed in the opening and is coupled to the packaging structure.
    Type: Grant
    Filed: November 2, 2012
    Date of Patent: July 25, 2017
    Assignee: NVIDIA Corporation
    Inventors: Leilei Zhang, Ron Boja, Abraham Yee, Zuhair Bokharey
  • Patent number: 9715413
    Abstract: One embodiment of the present invention sets forth a technique for selecting a first processor included in a plurality of processors to receive work related to a compute task. The technique involves analyzing state data of each processor in the plurality of processors to identify one or more processors that have already been assigned one compute task and are eligible to receive work related to the one compute task, receiving, from each of the one or more processors identified as eligible, an availability value that indicates the capacity of the processor to receive new work, selecting a first processor to receive work related to the one compute task based on the availability values received from the one or more processors, and issuing, to the first processor via a cooperative thread array (CTA), the work related to the one compute task.
    Type: Grant
    Filed: January 18, 2012
    Date of Patent: July 25, 2017
    Assignee: NVIDIA Corporation
    Inventors: Karim M. Abdalla, Lacky V. Shah, Jerome F. Duluk, Jr., Timothy John Purcell, Tanmoy Mandal, Gentaro Hirota
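    Illustrative sketch (not part of the patent record): the selection step in this abstract can be pictured as a filter followed by a choice among the eligible processors. The `ProcessorState` fields, the `select_processor` name, and the rule of picking the highest availability value are assumptions; the abstract only says the choice is based on the availability values.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ProcessorState:
    proc_id: int
    assigned_task: Optional[str]  # compute task already assigned, if any
    accepting_work: bool          # whether it may take more work for that task
    availability: int             # reported capacity to receive new work


def select_processor(states: Iterable[ProcessorState], task: str) -> Optional[int]:
    """Return the id of the eligible processor with the most capacity, if any."""
    eligible = [
        s for s in states
        if s.assigned_task == task and s.accepting_work
    ]
    if not eligible:
        return None
    # Assumed policy: prefer the processor reporting the largest availability.
    return max(eligible, key=lambda s: s.availability).proc_id
```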
  • Publication number: 20170206623
    Abstract: A method for managing bind-render-target commands in a tile-based architecture. The method includes receiving a requested set of bound render targets and a draw command. The method also includes, upon receiving the draw command, determining whether a current set of bound render targets includes each of the render targets identified in the requested set. The method further includes, if the current set does not include each render target identified in the requested set, then issuing a flush-tiling-unit-command to a parallel processing subsystem, modifying the current set to include each render target identified in the requested set, and issuing bind-render-target commands identifying the requested set to the tile-based architecture for processing. The method further includes, if the current set of render targets includes each render target identified in the requested set, then not issuing the flush-tiling-unit-command.
    Type: Application
    Filed: October 1, 2013
    Publication date: July 20, 2017
    Applicant: NVIDIA CORPORATION
    Inventors: Ziyad S. HAKURA, Jeffrey A. BOLZ, Amanpreet GREWAL, Matthew JOHNSON, Andrei KHODAKOVSKY
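    Illustrative sketch (not part of the patent record): the decision logic in this abstract amounts to a subset test performed when a draw command arrives. The class name `RenderTargetBinder`, the command tuples, and the `issued` log are hypothetical stand-ins for whatever the driver actually sends to the hardware.

```python
from typing import Iterable, List, Set, Tuple


class RenderTargetBinder:
    """Tracks the current set of bound render targets and logs issued commands."""

    def __init__(self) -> None:
        self.current: Set[str] = set()
        self.issued: List[Tuple] = []  # hypothetical command log

    def draw(self, requested: Iterable[str], draw_command: str) -> None:
        requested_set = set(requested)
        if not requested_set <= self.current:
            # Some requested target is not bound yet: flush, then rebind.
            self.issued.append(("flush_tiling_unit",))
            self.current |= requested_set
            self.issued.append(("bind_render_targets", tuple(sorted(requested_set))))
        # Otherwise every requested target is already bound, so no flush is issued.
        self.issued.append(("draw", draw_command))
```

    In this sketch, consecutive draws that stay within the already-bound targets avoid a flush entirely.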
  • Patent number: 9711099
    Abstract: A method for driving a display panel having a variable refresh rate is disclosed. The method comprises receiving a current input frame from an image source. Next, it comprises determining a number of re-scanned frames to insert between the current input frame and a subsequent input frame, wherein the re-scanned frames repeat the input frame, and wherein the number of re-scanned frames depends on the minimum refresh interval (MRI) of the display panel. Further, it comprises calculating respective intervals at which to insert the re-scanned frames between the current input frame and the subsequent input frame. Subsequently, it comprises determining if a charge accumulation in pixels of the display panel has crossed over a predetermined threshold value. Finally, responsive to a determination that the charge accumulation has crossed over a predetermined threshold value, it comprises performing a counter-measure to remediate the charge accumulation.
    Type: Grant
    Filed: February 26, 2014
    Date of Patent: July 18, 2017
    Assignee: NVIDIA CORPORATION
    Inventors: Rudolf Bloks, Robert Schutten, Tom Verbeure
  • Patent number: 9710874
    Abstract: One embodiment of the present invention sets forth a technique for mid-primitive execution preemption. When preemption is initiated no new instructions are issued, in-flight instructions progress to an execution unit boundary, and the execution state is unloaded from the processing pipeline. The execution units within the processing pipeline, including the coarse rasterization unit, complete execution of in-flight instructions and become idle. However, rasterization of a triangle may be preempted at a coarse raster region boundary. The amount of context state to be stored is reduced because the execution units are idle. Preempting at the mid-primitive level during rasterization reduces the time from when preemption is initiated to when another process can execute because the entire triangle is not rasterized.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: July 18, 2017
    Assignee: NVIDIA Corporation
    Inventors: Gregory Scott Palmer, Ziyad S. Hakura, Emmett M. Kilgariff, Dale L. Kirkland, Lacky V. Shah
  • Patent number: 9710306
    Abstract: Systems and methods for auto-throttling encapsulated compute tasks. A device driver may configure a parallel processor to execute compute tasks in a number of discrete throttled modes. The device driver may also allocate memory to a plurality of different processing units in a non-throttled mode. The device driver may also allocate memory to a subset of the plurality of processing units in each of the throttling modes. Data structures defined for each task include a flag that instructs the processing unit whether the task may be executed in the non-throttled mode or in the throttled mode. A work distribution unit monitors each of the tasks scheduled to run on the plurality of processing units and determines whether the processor should be configured to run in the throttled mode or in the non-throttled mode.
    Type: Grant
    Filed: April 9, 2012
    Date of Patent: July 18, 2017
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Jesse David Hall, Philip Alexander Cuadra, Karim M. Abdalla
  • Patent number: 9710275
    Abstract: A system and method for allocating shared memory of differing properties to shared data objects and a hybrid stack data structure. In one embodiment, the system includes: (1) a hybrid stack creator configured to create, in the shared memory, a hybrid stack data structure having a lower portion having a more favorable property and a higher portion having a less favorable property and (2) a data object allocator associated with the hybrid stack creator and configured to allocate storage for a shared data object in the lower portion if the lower portion has a sufficient remaining capacity to contain the shared data object and alternatively allocate storage for the shared data object in the higher portion if the lower portion has an insufficient remaining capacity to contain the shared data object.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: July 18, 2017
    Assignee: Nvidia Corporation
    Inventors: Jaydeep Marathe, Yuan Lin, Gautam Chakrabarti, Okwan Kwon, Amit Sabne
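    Illustrative sketch (not part of the patent record): the allocation rule in this abstract can be modeled as a two-region bump allocator that prefers the lower, more favorable portion. The `HybridStack` class, its byte-count capacities, and the returned `(portion, offset)` pair are assumptions for illustration; deallocation and the actual memory properties are omitted.

```python
from typing import Tuple


class HybridStack:
    """Two-portion stack: a favorable lower portion and a less favorable higher one."""

    def __init__(self, lower_capacity: int, higher_capacity: int) -> None:
        self.lower_capacity = lower_capacity
        self.higher_capacity = higher_capacity
        self.lower_used = 0
        self.higher_used = 0

    def allocate(self, size: int) -> Tuple[str, int]:
        """Place the object in the lower portion if it fits, else in the higher one."""
        if size <= self.lower_capacity - self.lower_used:
            offset = self.lower_used
            self.lower_used += size
            return ("lower", offset)
        if size <= self.higher_capacity - self.higher_used:
            offset = self.higher_used
            self.higher_used += size
            return ("higher", offset)
        # Neither portion can hold the object (a case the abstract does not cover).
        raise MemoryError("hybrid stack exhausted")
```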
  • Patent number: 9710894
    Abstract: A system and method for enhanced multi-sample anti-aliasing. The method includes determining a sampling pattern corresponding to a pixel and adjusting the sampling pattern based on a visual effect (e.g., post-processing visual effect). The method further includes accessing a first plurality of samples based on the sampling pattern. The first plurality of samples may comprise a second plurality of samples within the pixel and a third plurality of pixels outside of the pixel. The method further includes performing anti-aliasing filtering of the pixel based on the first plurality of samples and the sampling pattern.
    Type: Grant
    Filed: June 4, 2013
    Date of Patent: July 18, 2017
    Assignee: NVIDIA CORPORATION
    Inventor: Timothy Paul Lottes
  • Patent number: 9704212
    Abstract: A system and method for image processing are provided. The system comprises a main computing device and a secondary computing device. The main computing device comprises a main graphics card and a main central processing unit, and the secondary computing device comprises a secondary graphics card and a secondary central processing unit. The main computing device is configured to detect the secondary computing device. The main central processing unit is configured to send a request to process raw image data together to the secondary central processing unit and allocate the raw image data to the main graphics card and the secondary graphics card after receiving a response from the secondary central processing unit. The main graphics card and the secondary graphics card are configured to process images based on the allocation of the main central processing unit.
    Type: Grant
    Filed: February 7, 2014
    Date of Patent: July 11, 2017
    Assignee: Nvidia Corporation
    Inventor: Maojiang (Jacen) Lin