Patents Assigned to NVIDIA
  • Patent number: 9934145
    Abstract: In one embodiment of the present invention, a cache unit organizes data stored in an attached memory to optimize accesses to compressed data. In operation, the cache unit introduces a layer of indirection between a physical address associated with a memory access request and groups of blocks in the attached memory. The layer of indirection, referred to as virtual tiles, enables the cache unit to selectively store, in a single physical tile, compressed data that would conventionally be stored in separate physical tiles included in a group of blocks. Because the cache unit stores compressed data associated with multiple physical tiles in a single physical tile and, more specifically, in adjacent locations within that tile, the cache unit coalesces the compressed data into contiguous blocks. Subsequently, upon performing a read operation, the cache unit may retrieve the compressed data conventionally associated with separate physical tiles in a single read operation. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: October 28, 2015
    Date of Patent: April 3, 2018
    Assignee: NVIDIA Corporation
    Inventors: Praveen Krishnamurthy, Peter B. Holmquist, Wishwesh Gandhi, Timothy Purcell, Karan Mehra, Lacky Shah
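    Illustrative sketch: a minimal Python model of the virtual-tile indirection described above, in which compressed data from several logical tiles is packed into adjacent slots of one physical tile so it can be fetched with a single read. The names (VirtualTileTable, TILE_BYTES, the assumed 4:1 compression ratio) are illustrative, not details from the patent.

```python
# Hypothetical sketch of virtual-tile indirection: compressed blocks that would
# normally occupy separate physical tiles are coalesced into one backing tile.

TILE_BYTES = 256          # capacity of one physical tile (assumed)
COMP_BLOCK_BYTES = 64     # size of one compressed tile's data (assumed 4:1 ratio)

class VirtualTileTable:
    """Maps a logical tile index to (backing physical tile, byte offset)."""
    def __init__(self):
        self.mapping = {}         # logical tile -> (physical tile, offset)
        self.fill = {}            # physical tile -> bytes used so far
        self.next_physical = 0

    def place_compressed(self, logical_tile):
        # Coalesce into the current physical tile if it still has room.
        if self.fill and self.fill[self.next_physical - 1] + COMP_BLOCK_BYTES <= TILE_BYTES:
            phys = self.next_physical - 1
        else:
            phys = self.next_physical
            self.next_physical += 1
            self.fill[phys] = 0
        offset = self.fill[phys]
        self.fill[phys] += COMP_BLOCK_BYTES
        self.mapping[logical_tile] = (phys, offset)
        return phys, offset

table = VirtualTileTable()
for tile in range(8):                      # eight compressed logical tiles...
    print(tile, table.place_compressed(tile))
# ...land in only two physical tiles, so one read returns several tiles' data.
```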
  • Patent number: 9928642
    Abstract: A system and method uses the capabilities of a geometry shader unit within a multi-threaded graphics processor to implement algorithms with variable input and output.
    Type: Grant
    Filed: January 3, 2017
    Date of Patent: March 27, 2018
    Assignee: NVIDIA CORPORATION
    Inventor: Franck Diard
  • Patent number: 9928033
    Abstract: One embodiment of the present invention performs a parallel prefix scan in a single pass that incorporates variable look-back. A parallel processing unit (PPU) subdivides a list of inputs into sequentially-ordered segments and assigns each segment to a streaming multiprocessor (SM) included in the PPU. Notably, the SMs may operate in parallel. Each SM executes write operations on a segment descriptor that includes the status, aggregate, and inclusive prefix associated with the assigned segment. Further, each SM may execute read operations on segment descriptors associated with other segments. In operation, each SM may perform reduction operations to determine a segment-wide aggregate, may perform look-back operations across multiple preceding segments to determine an exclusive prefix, and may perform a scan seeded with that exclusive prefix to generate output data. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: October 1, 2013
    Date of Patent: March 27, 2018
    Assignee: NVIDIA Corporation
    Inventor: Duane Merrill
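    Illustrative sketch: a sequential Python simulation of the segment-descriptor protocol described above. On the PPU each segment is processed by its own SM in parallel; a plain loop is used here only to keep the illustration of the status/aggregate/inclusive-prefix bookkeeping and the variable look-back short.

```python
# Sequential illustration of single-pass prefix scan with variable look-back.
# Each loop iteration stands in for the SM that owns that segment.

def single_pass_scan(values, seg_size=4):
    segments = [values[i:i + seg_size] for i in range(0, len(values), seg_size)]
    # One descriptor per segment: status is "invalid", "aggregate", or "prefix".
    desc = [{"status": "invalid", "aggregate": 0, "inclusive_prefix": 0}
            for _ in segments]
    output = []
    for seg_id, seg in enumerate(segments):
        # 1. Reduction: publish this segment's segment-wide aggregate.
        desc[seg_id]["aggregate"] = sum(seg)
        desc[seg_id]["status"] = "aggregate"
        # 2. Look-back: walk preceding descriptors until an inclusive prefix is found.
        exclusive_prefix = 0
        look = seg_id - 1
        while look >= 0:
            if desc[look]["status"] == "prefix":
                exclusive_prefix += desc[look]["inclusive_prefix"]
                break
            exclusive_prefix += desc[look]["aggregate"]   # status == "aggregate"
            look -= 1
        # 3. Publish the inclusive prefix so later segments can stop their look-back here.
        desc[seg_id]["inclusive_prefix"] = exclusive_prefix + desc[seg_id]["aggregate"]
        desc[seg_id]["status"] = "prefix"
        # 4. Local scan seeded with the exclusive prefix.
        running = exclusive_prefix
        for v in seg:
            running += v
            output.append(running)        # inclusive scan result
    return output

print(single_pass_scan(list(range(1, 13))))   # [1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78]
```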
  • Patent number: 9928109
    Abstract: One embodiment of the present disclosure sets forth a technique for enforcing cross-stream dependencies in a parallel processing subsystem such as a graphics processing unit. The technique involves queuing waiting events to create cross-stream dependencies and signaling events to indicate completion to the waiting events. A scheduler kernel examines a task status data structure from a corresponding stream and updates dependency counts for tasks and events within the stream. When each task dependency for a waiting event is satisfied, an associated task may execute. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: May 9, 2012
    Date of Patent: March 27, 2018
    Assignee: NVIDIA Corporation
    Inventor: Luke Durant
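    Illustrative sketch: a toy Python model of the dependency counting described above, where a task in one stream waits on an event that a task in another stream signals, and a task runs only once its dependency count reaches zero. The class names and the run loop are assumptions for illustration, not the GPU scheduler's actual structures.

```python
# Illustrative cross-stream dependency model: a task becomes runnable only when
# every event it waits on has been signaled by a task in another stream.

class Event:
    def __init__(self, name):
        self.name = name
        self.waiters = []             # tasks blocked on this event

class Task:
    def __init__(self, name, waits_on=(), signals=()):
        self.name = name
        self.signals = list(signals)
        self.pending = 0              # dependency count, as in the abstract
        for ev in waits_on:
            self.pending += 1
            ev.waiters.append(self)

def run(tasks):
    """Repeatedly execute any task whose dependency count has dropped to zero."""
    done = set()
    while len(done) < len(tasks):
        progressed = False
        for t in tasks:
            if t.name in done or t.pending > 0:
                continue
            print("running", t.name)
            for ev in t.signals:      # signaling the event releases its waiters
                for w in ev.waiters:
                    w.pending -= 1
            done.add(t.name)
            progressed = True
        if not progressed:
            raise RuntimeError("deadlock: unsatisfied cross-stream dependency")

barrier = Event("stream0_done")
run([Task("stream1_taskB", waits_on=[barrier]),   # stream 1 waits on stream 0
     Task("stream0_taskA", signals=[barrier])])   # stream 0 signals the event
```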
  • Patent number: 9928644
    Abstract: A solution is proposed for efficiently determining whether or not a set of elements (such as convex shapes) in a multi-dimensional space mutually intersects. The solution may be applied to elements in any closed subset of real numbers, for any number of spatial dimensions of the multi-dimensional space. The solutions provided herein include iterative processes for calculating the point displacement from the boundaries of the elements (shapes), and devices for implementing the iterative process(es). The processes and devices herein may be extended to abstract (functional) definitions of convex shapes, allowing for simple and economical representations. In one embodiment of the present invention, an object called a “void simplex” may be determined; when found, it allows the process to terminate earlier, thereby avoiding unnecessary computation without excess memory requirements.
    Type: Grant
    Filed: July 1, 2015
    Date of Patent: March 27, 2018
    Assignee: NVIDIA CORPORATION
    Inventor: Bryan Galdrikian
  • Patent number: 9928104
    Abstract: A system, method, and computer program product are provided for accessing a queue. The method includes receiving a first request to reserve a data record entry in a queue, updating a queue state block based on the first request, and returning a response to the first request. A second request is received to commit the data record entry, and the queue state block is updated based on the second request. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: June 19, 2013
    Date of Patent: March 27, 2018
    Assignee: NVIDIA Corporation
    Inventors: William J. Dally, James David Balfour, Ignacio Llamas Ubieto
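    Illustrative sketch: a minimal Python model of the two-phase reserve/commit access described above, in which the first request reserves an entry and updates the queue state block, and the second request commits the record into that entry. The class and field names are assumptions made for the example.

```python
# Sketch of two-phase queue access: request 1 reserves a slot and updates the
# queue state block; request 2 commits the data record into that slot.

class QueueStateBlock:
    def __init__(self, capacity):
        self.capacity = capacity
        self.reserved = 0             # next slot to hand out
        self.committed = 0            # slots whose data is visible to consumers

class Queue:
    def __init__(self, capacity):
        self.state = QueueStateBlock(capacity)
        self.records = [None] * capacity

    def reserve(self):
        """First request: reserve a data record entry and return its slot index."""
        if self.state.reserved >= self.state.capacity:
            return None               # response: queue is full
        slot = self.state.reserved
        self.state.reserved += 1      # update the queue state block
        return slot                   # response to the first request

    def commit(self, slot, record):
        """Second request: commit the data record into the reserved entry."""
        self.records[slot] = record
        # Advance the committed pointer past every contiguously filled slot.
        while (self.state.committed < self.state.reserved
               and self.records[self.state.committed] is not None):
            self.state.committed += 1

q = Queue(capacity=4)
slot = q.reserve()              # first request: reservation
q.commit(slot, "record-0")      # second request: commit
print(slot, q.state.committed)  # 0 1
```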
  • Patent number: 9928639
    Abstract: A system and method for facilitating increased graphics processing without deadlock. Embodiments of the present invention provide storage for execution unit pipeline results (e.g., texture pipeline results). The storage allows increased processing of multiple threads, as a texture unit may be used to store information while corresponding locations of the register file become available for reallocation to other threads. Embodiments further prevent deadlock by limiting the number of requests and ensuring that a set of requests is not issued unless there are resources available to complete each request of the set. Embodiments of the present invention thus provide increased performance without deadlock.
    Type: Grant
    Filed: November 27, 2013
    Date of Patent: March 27, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Michael Toksvig, Erik Lindholm
  • Patent number: 9930082
    Abstract: A system and method for network-driven automatic adaptive rendering impedance are presented. Embodiments of the present invention are operable to dynamically throttle the frame rate of an application using a server-based graphics processor, based on determined communication network conditions between the server-based application and a remote client. Embodiments of the present invention are operable to monitor network conditions between the server and the client using a network monitoring module and to correspondingly adjust the frame rate of the graphics processor used by the application through a throttling signal issued in response to the determined network conditions. By throttling the application in this manner, power resources of the server may be conserved, computational efficiency of the server may be promoted, and user density of the server may be increased. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: November 20, 2012
    Date of Patent: March 27, 2018
    Assignee: NVIDIA CORPORATION
    Inventor: Lawrence Ibarria
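    Illustrative sketch: a rough Python version of the throttling loop implied above, in which a monitor measures round-trip latency to the client and the server lowers or raises the renderer's target frame rate accordingly. The thresholds, step size, and latency source are assumptions, not values from the patent.

```python
# Illustrative frame-rate throttle driven by measured network conditions.

def choose_frame_rate(rtt_ms, current_fps, min_fps=15, max_fps=60):
    """Return a new target frame rate based on a measured round-trip time."""
    if rtt_ms > 120:                       # congested link: render less often
        return max(min_fps, current_fps - 5)
    if rtt_ms < 40:                        # healthy link: ramp back up
        return min(max_fps, current_fps + 5)
    return current_fps                     # otherwise hold the current rate

fps = 60
for rtt in [30, 90, 150, 150, 35]:         # simulated RTT samples from the monitor
    fps = choose_frame_rate(rtt, fps)
    print(f"rtt={rtt}ms -> target {fps} fps")
```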
  • Patent number: 9928034
    Abstract: A method, computer readable medium, and system are disclosed for processing a segmented data set. The method includes the steps of receiving a data structure storing a plurality of values segmented into a plurality of sequences; assigning a plurality of processing elements to process the plurality of values; and processing the plurality of values by the plurality of processing elements according to a merge-based algorithm. Each processing element identifies the portion of values allocated to it based on the merge-based algorithm. In one embodiment, the processing elements are threads executed in parallel by a parallel processing unit. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: December 16, 2015
    Date of Patent: March 27, 2018
    Assignee: NVIDIA Corporation
    Inventor: Duane George Merrill, III
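    Illustrative sketch: one common way to realize merge-based work allocation is merge-path partitioning, where the combined work of visiting every segment boundary and every value is split evenly by binary-searching along diagonals of an implicit merge between the segment end-offsets and the value indices. The Python sketch below demonstrates that partitioning; it is an assumption about the flavor of algorithm intended, not code from the patent.

```python
# Merge-path partitioning sketch: each processing element finds its own slice
# of (segment boundaries + values) so that all slices carry nearly equal work.

def merge_path_search(diagonal, seg_ends, num_values):
    """Return (i, j) with i + j == diagonal: how many segment boundaries and how
    many values lie before this diagonal on the implicit merge path."""
    lo, hi = max(0, diagonal - num_values), min(diagonal, len(seg_ends))
    while lo < hi:
        mid = (lo + hi) // 2
        # Boundary `mid` precedes value `diagonal - mid - 1` iff its offset is smaller.
        if seg_ends[mid] <= diagonal - mid - 1:
            lo = mid + 1
        else:
            hi = mid
    return lo, diagonal - lo

def partition(seg_ends, num_values, num_workers):
    total = len(seg_ends) + num_values        # total merge-path length
    cuts = [merge_path_search(w * total // num_workers, seg_ends, num_values)
            for w in range(num_workers + 1)]
    return list(zip(cuts, cuts[1:]))          # each worker's (start, end) coordinates

# Three very unevenly sized segments, ten values, four processing elements:
seg_ends = [1, 2, 10]                          # exclusive end offset of each segment
for worker, (start, end) in enumerate(partition(seg_ends, 10, 4)):
    print(f"worker {worker}: from {start} to {end}")
```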
  • Patent number: 9922457
    Abstract: A system and method for performing tessellation of three-dimensional surface patches performs some tessellation operations using programmable processing units and other tessellation operations using fixed-function units with limited precision. The (u,v) parameter coordinates for each vertex are computed using fixed-function units to offload the programmable processing engines. The (u,v) computation is a symmetric operation and is based on integer coordinates of the vertex, tessellation level of detail values, and a spacing mode. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: December 2, 2013
    Date of Patent: March 20, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Justin S. Legakis, Emmett M. Kilgariff, Michael C. Shebanow
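    Illustrative sketch: a hedged Python version of mapping a tessellated vertex's integer grid coordinates to (u,v) parameter coordinates given tessellation levels and a spacing mode. The rounding rules follow common graphics-API conventions and are an assumption; they are not necessarily the exact fixed-function behavior in the patent.

```python
# Sketch: derive (u, v) in [0, 1] from integer vertex coordinates, tessellation
# level-of-detail values, and a spacing mode.
import math

def effective_level(level, spacing="equal"):
    if spacing == "equal":                   # round up to an integer level
        return max(1, math.ceil(level))
    if spacing == "fractional_even":         # round up to the next even level
        return max(2, 2 * math.ceil(level / 2))
    if spacing == "fractional_odd":          # round up to the next odd level
        n = math.ceil(level)
        return n if n % 2 == 1 else n + 1
    raise ValueError(spacing)

def uv_for_vertex(i, j, level_u, level_v, spacing="equal"):
    """Map integer grid coordinates (i, j) to parametric (u, v)."""
    nu, nv = effective_level(level_u, spacing), effective_level(level_v, spacing)
    return i / nu, j / nv

# A quad patch tessellated 4x2: corner, interior, and far-corner sample points.
print(uv_for_vertex(0, 0, 4, 2))   # (0.0, 0.0)
print(uv_for_vertex(2, 1, 4, 2))   # (0.5, 0.5)
print(uv_for_vertex(4, 2, 4, 2))   # (1.0, 1.0)
```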
  • Patent number: 9921847
    Abstract: In one embodiment of the present invention, a streaming multiprocessor (SM) uses a tree of nodes to manage threads. Each node specifies a set of active threads and a program counter. Upon encountering a conditional instruction that causes an execution path to diverge, the SM creates child nodes corresponding to each of the divergent execution paths. Based on the conditional instruction, the SM assigns each active thread included in the parent node to at most one child node, and the SM temporarily discontinues executing instructions specified by the parent node. Instead, the SM concurrently executes instructions specified by the child nodes. After all the divergent paths reconverge to the parent path, the SM resumes executing instructions specified by the parent node. Advantageously, the disclosed techniques enable the SM to execute divergent paths in parallel, thereby reducing undesirable program behavior associated with conventional techniques that serialize divergent paths across thread groups. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: January 21, 2014
    Date of Patent: March 20, 2018
    Assignee: NVIDIA Corporation
    Inventor: John Erik Lindholm
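    Illustrative sketch: a compact Python model of the node tree described above. A conditional branch splits a node's active threads between child nodes, the children are then executed, and the parent path resumes after reconvergence. "Executing" a node here only prints which threads it would drive, since the point is the bookkeeping; all names are illustrative.

```python
# Illustrative tree-of-nodes management of divergent threads.
# Each node tracks a set of active thread ids and a program counter.

class Node:
    def __init__(self, threads, pc):
        self.threads = set(threads)    # active threads on this path
        self.pc = pc                   # where these threads resume executing
        self.children = []

def diverge(parent, predicate, taken_pc, not_taken_pc):
    """Split a node's threads across two child nodes at a conditional branch."""
    taken = {t for t in parent.threads if predicate(t)}
    # Each active thread of the parent is assigned to at most one child.
    parent.children = [Node(taken, taken_pc),
                       Node(parent.threads - taken, not_taken_pc)]
    return parent.children

def execute(node, depth=0):
    print("  " * depth + f"pc={node.pc} threads={sorted(node.threads)}")
    for child in node.children:        # on real hardware the children run concurrently
        execute(child, depth + 1)
    # Once every child has finished (paths reconverged), the parent path resumes.

root = Node(range(8), pc=0)                        # a small group of 8 threads
diverge(root, lambda t: t % 2 == 0, taken_pc=10, not_taken_pc=20)
execute(root)
```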
  • Patent number: 9921873
    Abstract: A technique for controlling the distribution of compute task processing in a multi-threaded system encodes each processing task as task metadata (TMD) stored in memory. The TMD includes work distribution parameters specifying how the processing task should be distributed for processing. Scheduling circuitry selects a task for execution when entries of a work queue for the task have been written. The work distribution parameters may define the number of work queue entries needed before a “cooperative thread array” (“CTA”) may be launched to process the work queue entries according to the compute task. The work distribution parameters may define the number of CTAs that are launched to process the same work queue entries. Finally, the work distribution parameters may define a step size that is used to update pointers to the work queue entries. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: January 31, 2012
    Date of Patent: March 20, 2018
    Assignee: NVIDIA Corporation
    Inventors: Lacky V. Shah, Karim M. Abdalla, Sean J. Treichler, Abraham B. de Waal
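    Illustrative sketch: a small Python loop over the three work-distribution parameters listed above, i.e. how many queue entries a launch needs, how many CTAs are launched on the same entries, and the step size by which the queue pointer advances. The field names are invented stand-ins for the TMD fields.

```python
# Illustrative scheduler loop for the work-distribution parameters described above.

class TaskMetadata:
    def __init__(self, entries_per_launch, ctas_per_launch, step_size):
        self.entries_per_launch = entries_per_launch  # entries needed before a CTA launches
        self.ctas_per_launch = ctas_per_launch        # CTAs launched on the same entries
        self.step_size = step_size                    # how far the read pointer advances

def schedule(tmd, entries_written):
    """Launch CTAs as enough work-queue entries become available."""
    read_ptr = 0
    launches = []
    while read_ptr + tmd.entries_per_launch <= entries_written:
        entries = list(range(read_ptr, read_ptr + tmd.entries_per_launch))
        for cta in range(tmd.ctas_per_launch):        # several CTAs may share the entries
            launches.append((cta, entries))
        read_ptr += tmd.step_size                     # the step size updates the pointer
    return launches

tmd = TaskMetadata(entries_per_launch=4, ctas_per_launch=2, step_size=4)
for cta, entries in schedule(tmd, entries_written=8):
    print(f"launch CTA {cta} on entries {entries}")
```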
  • Patent number: 9916674
    Abstract: One embodiment of the present invention sets forth a technique for improving path rendering on computer systems with an available graphics processing unit. The technique involves reducing complex path objects to simpler geometric objects suitable for rendering on a graphics processing unit. The process involves a central processing unit “baking” a set of complex path rendering objects to generate a set of simpler graphics objects. A graphics processing unit then renders the simpler graphics objects. This division of processing load can advantageously yield higher overall rendering performance.
    Type: Grant
    Filed: May 19, 2011
    Date of Patent: March 13, 2018
    Assignee: NVIDIA Corporation
    Inventor: Mark J. Kilgard
  • Patent number: 9916680
    Abstract: Techniques are disclosed for suppressing access to a depth processing unit associated with a graphics processing pipeline. The method includes receiving a graphics primitive from a first pipeline stage associated with the graphics processing pipeline. The method further includes determining that the graphics primitive is visible over one or more graphics primitives previously rendered to a frame buffer, and determining that the depth buffer is in a read-only mode. The method further includes suppressing an operation to transmit the graphics primitive to the depth processing unit. One advantage of the disclosed technique is that power consumption is reduced within the GPU by avoiding unnecessary accesses to the depth processing unit.
    Type: Grant
    Filed: October 12, 2012
    Date of Patent: March 13, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Christian Amsinck, Christian Rouet, Tony Louca
  • Patent number: 9918098
    Abstract: In the claimed approach, a High Efficiency Video Coding (HEVC) codec optimizes the memory resources used during motion vector (MV) prediction. As the codec processes blocks of pixels, known as coding units (CUs), it performs read and write operations on a fixed-size neighbor union buffer representing the MVs associated with processed CUs. In operation, for each CU, the codec determines the indices at which proximally-located “neighbor” MVs are stored within the neighbor union buffer. The codec then uses these neighbor MVs to compute new MVs. Subsequently, the codec deterministically updates the neighbor union buffer, replacing irrelevant MVs with those new MVs that are useful for computing the MVs of unprocessed CUs. By contrast, many conventional codecs not only redundantly store MVs but also retain irrelevant MVs. Consequently, the codec reduces memory usage and memory operations compared to conventional codecs, thereby decreasing power consumption and improving codec efficiency.
    Type: Grant
    Filed: January 23, 2014
    Date of Patent: March 13, 2018
    Assignee: NVIDIA Corporation
    Inventors: Stefan Eckart, Yu Xinyang
  • Patent number: 9910589
    Abstract: A virtual keyboard with dynamically adjusted recognition zones for predicted user-intended characters. When a user interaction is detected on the virtual keyboard, the character whose recognition zone encompasses the detected interaction location is selected as the current input character. Characters likely to be the next input character are predicted based on the current input character. The recognition zones of the predicted next input characters are then enlarged relative to their original sizes. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: October 30, 2014
    Date of Patent: March 6, 2018
    Assignee: Nvidia Corporation
    Inventors: Zhen Jia, Jing Guo, Lina Yu, Yuqi Cui
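    Illustrative sketch: a one-dimensional Python toy of the zone adjustment described above, where the boundary between adjacent keys shifts toward the less-likely key so that the predicted key's recognition zone grows. The prediction table and the 2x weighting are assumptions made for the example.

```python
# Sketch of dynamically adjusted recognition zones on a one-dimensional key row.
# The boundary between adjacent keys shifts so predicted next keys get larger zones.

KEY_PITCH = 10.0
NEXT_CHAR = {"q": {"u"}, "t": {"h", "r"}}          # toy next-character predictions

def zone_boundaries(keys, boosted=frozenset(), weight=2.0):
    centers = {k: i * KEY_PITCH for i, k in enumerate(keys)}
    bounds = []
    for a, b in zip(keys, keys[1:]):
        wa = weight if a in boosted else 1.0
        wb = weight if b in boosted else 1.0
        # Weighted split of the gap: a predicted (heavier) key claims more of it.
        bounds.append(centers[a] + (centers[b] - centers[a]) * wa / (wa + wb))
    return bounds

def resolve(keys, bounds, x):
    """Return the key whose recognition zone contains touch position x."""
    for key, bound in zip(keys, bounds):
        if x < bound:
            return key
    return keys[-1]

keys = list("qwertyuiop")
bounds = zone_boundaries(keys)                      # equally sized zones initially
current = resolve(keys, bounds, 1.0)                # user taps 'q'
bounds = zone_boundaries(keys, boosted=NEXT_CHAR.get(current, frozenset()))
print(current, resolve(keys, bounds, 66.0))         # off-target tap still resolves to 'u'
```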
  • Patent number: 9911470
    Abstract: A memory circuit that presents input data at a data output promptly on receiving a clock pulse includes upstream and downstream memory logic and selection logic. The upstream memory logic is configured to latch the input data on receiving the clock pulse. The downstream memory logic is configured to store the latched input data. The selection logic is configured to expose a logic level dependent on whether the upstream memory logic has latched the input data, the exposed logic level derived from the input data before the input data is latched, and from the latched input data after the input data is latched.
    Type: Grant
    Filed: April 13, 2012
    Date of Patent: March 6, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Venkata Kottapalli, Scott Pitkethly, Christian Klingner, Matthew Gerlach
  • Patent number: 9912322
    Abstract: A clock generation circuit is disclosed that tracks the critical path across process, voltage, and temperature variation. In accordance with a first embodiment of the present invention, an integrated circuit device includes an oscillator electronic circuit on the integrated circuit device configured to produce an oscillating signal and a receiving electronic circuit configured to use the oscillating signal as a system clock. The oscillating signal tracks a frequency-voltage characteristic of the receiving electronic circuit across process, voltage, and temperature variations. The oscillating signal may be independent of any off-chip oscillating reference signal.
    Type: Grant
    Filed: September 12, 2016
    Date of Patent: March 6, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Kalyana Bollapalli, Tezaswi Raja
  • Patent number: 9910865
    Abstract: A method for storing digital images is presented. The method includes capturing an image using a digital camera system. It also comprises capturing metadata associated with the image or a moment of capture of the image. Further, it comprises storing the metadata in at least one field within a file format, wherein the file format defines a structure for the image, and wherein the at least one field is located within an extensible segment of the file format. In one embodiment, the metadata is selected from a group that comprises audio data, GPS data, time data, related image information, heat sensor data, gyroscope data, annotated text, and annotated audio.
    Type: Grant
    Filed: August 5, 2013
    Date of Patent: March 6, 2018
    Assignee: NVIDIA Corporation
    Inventors: Peter Mikolajczyk, Patrick Shehane, Guanghua Gary Zhang
  • Patent number: 9910760
    Abstract: An aspect of the present invention proposes a solution for correctly intercepting, capturing, and replaying tasks (such as functions and methods) in an interception layer operating between an application programming interface (API) and the driver of a processor by using synchronization objects such as fences. According to one or more embodiments of the present invention, the application uses what appears to it to be a single synchronization object to signal (from a processor) and to wait (on a processor), while the interception layer actually maintains two separate synchronization objects. According to one or more embodiments, the solution proposed herein may be implemented as part of a module or tool that works as an interception layer between an application and an API exposed by a device driver of a resource, and allows for an efficient and effective approach to frame debugging and live capture and replay of function bundles. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: September 3, 2015
    Date of Patent: March 6, 2018
    Assignee: Nvidia Corporation
    Inventors: Jeffrey Kiel, Dan Price, Mike Strauss
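    Illustrative sketch: a rough Python model of the idea that the application sees one synchronization object while the interception layer keeps two, so a capture/replay pass can wait on its own copy without perturbing the application's live fence. Class and method names are invented for illustration; this is not the actual tool's API.

```python
# Illustrative interception-layer fence: the application holds one handle, while
# the layer keeps two underlying fences (one for live work, one for replayed work).

class SimpleFence:
    def __init__(self):
        self.value = 0
    def signal(self, value):
        self.value = max(self.value, value)
    def reached(self, value):
        return self.value >= value

class InterceptedFence:
    """What the application believes is a single synchronization object."""
    def __init__(self):
        self.live = SimpleFence()      # advanced by the application's real submissions
        self.replay = SimpleFence()    # advanced again when the layer replays a capture

    def signal(self, value, replaying=False):
        (self.replay if replaying else self.live).signal(value)

    def wait(self, value, replaying=False):
        fence = self.replay if replaying else self.live
        return fence.reached(value)

f = InterceptedFence()
f.signal(1)                                    # the application's work signals the fence
print(f.wait(1), f.wait(1, replaying=True))    # True False: replay has its own fence
f.signal(1, replaying=True)                    # the layer replays the captured signal
print(f.wait(1, replaying=True))               # True
```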