Patents Assigned to NVidia
  • Patent number: 9697641
    Abstract: One embodiment of the present invention sets forth a technique for converting alpha values into pixel coverage masks. Geometric coverage is sampled at a number of “real” sample positions within each pixel. Color and depth values are computed for each of these real samples. Fragment alpha values are used to determine an alpha coverage mask for the real samples and additional “virtual” samples, in which the number of bits set in the mask bits is proportional to the alpha value. An alpha-to-coverage mode uses the virtual samples to increase the number of transparency levels for each pixel compared with using only real samples. The alpha-to-coverage mode may be used in conjunction with virtual coverage anti-aliasing to provide higher-quality transparency for rendering anti-aliased images.
    Type: Grant
    Filed: October 14, 2010
    Date of Patent: July 4, 2017
    Assignee: NVIDIA CORPORATION
    Inventors: Steven E. Molnar, Emmett M. Kilgariff, Walter E. Donovan, Christian Amsinck, Robert Ohannessian
  • Patent number: 9697006
    Abstract: A texture processing pipeline can be configured to service memory access requests that represent texture data access operations or generic data access operations. When the texture processing pipeline receives a memory access request that represents a texture data access operation, the texture processing pipeline may retrieve texture data based on texture coordinates. When the memory access request represents a generic data access operation, the texture pipeline extracts a virtual address from the memory access request and then retrieves data based on the virtual address. The texture processing pipeline is also configured to cache generic data retrieved on behalf of a group of threads and to then invalidate that generic data when the group of threads exits.
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: July 4, 2017
    Assignee: NVIDIA Corporation
    Inventors: Brian Fahs, Eric T. Anderson, Nick Barrow-Williams, Shirish Gadre, Joel James McCormack, Bryon S. Nordquist, Nirmal Raj Saxena, Lacky V. Shah
  • Patent number: 9697044
    Abstract: An application programming interface (API) provides various software constructs that allow a developer to assemble a processing pipeline having arbitrary structure and complexity. Once assembled, the processing pipeline is configured to include a set of interconnected pipestages. Those pipestages are associated with one or more different CTAs that may execute in parallel with one another on a parallel processing unit. The developer specifies the configuration of the pipestages, including the configuration of the different CTAs across all pipestages, as well as the different processing operations performed by each different CTA.
    Type: Grant
    Filed: May 21, 2013
    Date of Patent: July 4, 2017
    Assignee: NVIDIA Corporation
    Inventor: Ignacio Llamas
  • Publication number: 20170185526
    Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.
    Type: Application
    Filed: October 16, 2013
    Publication date: June 29, 2017
    Applicant: NVIDIA CORPORATION
    Inventors: Jerome F. DULUK, JR., Cameron BUSCHARDT, Sherry CHEUNG, James Leroy DEMING, Samuel H. DUNCAN, Lucien DUNNING, Robert GEORGE, Arvind GOPALAKRISHNAN, Mark HAIRGROVE, Chenghuan JIA, John MASHEY
  • Patent number: 9690715
    Abstract: One embodiment of the present invention includes a hash selector that facilitates performing effective hashing operations. In operation, the hash selector creates a transformation matrix that reflects specific optimization criteria. For each hash value, the hash selector generates a potential hash value and then computes the rank of a submatrix included in the transformation matrix. Based on this rank in conjunction with the optimization criteria, the hash selector either re-generates the potential hash value or accepts the potential hash value. Advantageously, the optimization criteria may be tailored to create desired correlations between input patterns and the results of performing hashing operations based on the transformation matrix.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: June 27, 2017
    Assignee: NVIDIA Corporation
    Inventor: James M. Van Dyke
  • Patent number: 9690736
    Abstract: A microprocessor within a processing unit is configured to manage to operation of a finite state machine (FSM) that, in turn, manages the operation of a data connector. The FSM may be a hardwired chip component that adheres to a communication protocol associated with the data connector. The microprocessor is configured to execute a software application in order to (i) apply configuration changes to the processing unit during state transitions initiated by the FSM and (ii) cause the FSM to initiate specific state transitions.
    Type: Grant
    Filed: July 10, 2012
    Date of Patent: June 27, 2017
    Assignee: NVIDIA Corporation
    Inventors: Dennis Ma, Samuel Vincent
  • Patent number: 9685207
    Abstract: A synchronous sequential latch array generated by an automated system for generating master-slave latch structures is disclosed. A master-slave latch structure includes N/2 rows of master-slave latch pairs, an N/2-to-1 multiplexer and control logic. N is equal to the number of latches that are included in the latch array.
    Type: Grant
    Filed: December 4, 2012
    Date of Patent: June 20, 2017
    Assignee: Nvidia Corporation
    Inventor: Robert A. Alfieri
  • Patent number: 9684581
    Abstract: One embodiment of the present invention includes a dependency extractor and a dependency investigator that, together, facilitate performance analysis of computer systems. In operation, the dependency extractor instruments a software application to generate run-time execution data for each work task. This execution data includes per-task performance data and dependency data reflecting linkages between tasks. After the instrumented software application finishes executing, the dependency investigator evaluates the captured execution data and identifies the critical path of tasks that establishes the overall run-time of the software application. Advantageously, since the execution data includes both task-level performance data and dependencies between tasks, the dependency investigator enables the developer to effectively optimize software and hardware in computer systems that are capable of concurrently executing tasks.
    Type: Grant
    Filed: May 21, 2014
    Date of Patent: June 20, 2017
    Assignee: NVIDIA Corporation
    Inventors: Andrew Robert Kerr, Matthew Grant Bolitho, Igor Sevastiyanov, Scott Ricketts, Michael Andersch
  • Patent number: 9684998
    Abstract: One embodiment includes determining a first z-range for a first portion of a coarse raster tile, where the first portion includes a plurality of pixels having a first set of pixel locations, retrieving from a memory a corresponding z-range related to a second set of pixel locations associated with the coarse raster tile, where the first set of pixel locations comprises a subset of the second set of pixel locations, and comparing the first z-range to the corresponding z-range to determine whether the plurality of pixels is occluded. If the plurality of pixels determined to be occluded, then the plurality of pixels is culled. If the plurality of pixels is determined to not be occluded, then the plurality of pixels is transmitted to a fine raster unit for further processing. The coarse raster tile comprises a plurality of portions, including the first portion, and those portions are processed serially.
    Type: Grant
    Filed: July 22, 2013
    Date of Patent: June 20, 2017
    Assignee: NVIDIA CORPORATION
    Inventors: Eric B. Lum, Justin Cobb, Barry N. Rodgers
  • Patent number: 9678775
    Abstract: Computer code written to execute on a multi-threaded computing environment is transformed into code designed to execute on a single-threaded computing environment and simulate concurrent executing threads. Optimization techniques during the transformation process are utilized to identify local variables for scalar expansion. A first set of local variables is defined that includes those local variables in the code identified as “Downward exposed Defined” (DD). A second set of local variables is defined that includes those local variables in the code identified as “Upward exposed Use” (UU). The intersection of the first set and the second set identifies local variables for scalar expansion.
    Type: Grant
    Filed: February 26, 2009
    Date of Patent: June 13, 2017
    Assignee: NVIDIA Corporation
    Inventors: Vinod Grover, John A. Stratton
  • Patent number: 9679530
    Abstract: One embodiment of the present invention sets forth a method for compressing via a pixel shader color information associated with a line of pixels. An intermediary representation of an uncompressed stream of color information is first generated that indicates, for each pixel, whether a previous adjacent pixel shares color information with the pixel. A set of cascading buffers is then generated based on intermediary representation, where each cascading buffer represents a number of unique color codes across different groups of pixels. Finally, a compressed output stream that specifies the unique color codes as well as the number of pixels that share each unique color code is generated based on the set of cascading buffers.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: June 13, 2017
    Assignee: NVIDIA Corporation
    Inventor: Franck Diard
  • Patent number: 9678529
    Abstract: One aspect of the disclosure provides a computer system. In one embodiment, the computer system comprises a clock generator, at least one processor, and a clock frequency controller. The clock generator is configured to provide a clock signal at a clock frequency. The at least one processor is configured to receive the clock signal and to operate at a speed dependent on the clock frequency. The clock frequency controller is configured to receive efficiency information indicating a current efficiency of the at least one processor. The clock frequency controller is further configured to receive a request from the processor for a target number of processor instructions to be handled in a particular time period. The clock frequency controller is further configured to output a frequency control signal to the clock generator for controlling the clock frequency in dependence thereon.
    Type: Grant
    Filed: September 2, 2014
    Date of Patent: June 13, 2017
    Assignee: Nvidia Corporation
    Inventors: Marcin Hlond, Peter Cumming
  • Patent number: 9678897
    Abstract: A streaming multiprocessor in a parallel processing subsystem processes atomic operations for multiple threads in a multi-threaded architecture. The streaming multiprocessor receives a request from a thread in a thread group to acquire access to a memory location in a lock-protected shared memory, and determines whether a address lock in a plurality of address locks is asserted, where the address lock is associated the memory location. If the address lock is asserted, then the streaming multiprocessor refuses the request. Otherwise, the streaming multiprocessor asserts the address lock, asserts a thread group lock in a plurality of thread group locks, where the thread group lock is associated with the thread group, and grants the request. One advantage of the disclosed techniques is that acquired locks are released when a thread is preempted. As a result, a preempted thread that has previously acquired a lock does not retain the lock indefinitely.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: June 13, 2017
    Assignee: NVIDIA Corporation
    Inventors: Nicholas Wang, Shirish Gadre, Robert Ohannessian, Lacky V. Shah, Matthew Brockmeyer, Stewart Glenn Carlton
  • Patent number: 9672653
    Abstract: A stereo viewpoint graphics processing subsystem and a method of sharing geometry data between stereo images in screen-space processing. One embodiment of the stereo viewpoint graphics processing subsystem configured to render a scene includes: (1) stereo frame buffers configured to contain respective pixel-wise rendered scene data for stereo images, and (2) a sharing decision circuit operable to determine when to share geometric data between the stereo frame buffers for carrying out screen-space effect processes to render the scene in the stereo images.
    Type: Grant
    Filed: December 14, 2012
    Date of Patent: June 6, 2017
    Assignee: Nvidia Corporation
    Inventors: G. Evan Hart, Cem Cebenoyan, Louis Bavoil
  • Patent number: 9672008
    Abstract: A system, method, and computer program product are provided for a pausible bisynchronous FIFO. Data is written synchronously with a first clock signal of a first clock domain to an entry of a dual-port memory array and an increment signal is generated in the first clock domain. The increment signal is determined to transition near an edge of a second dock signal, where the second clock signal is a pausible clock signal. A next edge of the second clock signal of the second clock domain is delayed and the increment signal to the second clock domain and is transmitted.
    Type: Grant
    Filed: November 20, 2015
    Date of Patent: June 6, 2017
    Assignee: NVIDIA Corporation
    Inventors: Benjamin Andrew Keller, Matthew Rudolph Fojtik, Brucek Kurdo Khailany
  • Patent number: 9671877
    Abstract: A passive stylus with a deformable tip is described herein. In one embodiment, a thin annular body configured to be hand-held with a chisel shaped tip disposed at the first end of the body is provided. The chisel shaped tip includes a deformable material such that the chisel shaped tip is operable to interface with a touch a sensitive surface with a detectable surface area when a first pressure is exerted on the body and translated to the chisel shaped tip. The chisel shaped tip is operable to interface with the touch sensitive surface with a second detectable surface area, this one different from the first detectable surface area, when a second pressure is exerted on the body and translated to the chisel shaped tip. The stylus may include a second tip on the back end for providing an erase function.
    Type: Grant
    Filed: January 27, 2014
    Date of Patent: June 6, 2017
    Assignee: NVIDIA CORPORATION
    Inventors: Berhanu Zerayohannes, Siarhei Murauyou, Tommy Lee, Glenn Wernig, Nelson Au, Arman Toorians, Jen-Hsun Huang
  • Patent number: 9667068
    Abstract: A system, method, and computer program product are provided for merging two or more supply rails into a merged supply rail. The method comprises receiving two or more current measurement signals associated with two or more supply rails, selecting one supply rail from the two or more supply rails, based on the current measurement signals, and enabling the selected supply rail to source current into a merged supply rail.
    Type: Grant
    Filed: December 18, 2013
    Date of Patent: May 30, 2017
    Assignee: NVIDIA Corporation
    Inventors: Samuel Richard Duell, Gabriele Gorla, Yaoshun Jia, Qi Lin, Andrew Bell
  • Patent number: 9665958
    Abstract: A system, method, and computer program product are provided for redistributing multi-sample processing workloads between threads. A workload for a plurality of multi-sample pixels is received and each thread in a parallel thread group is associated with a corresponding multi-sample pixel of the plurality of pixels. The workload is redistributed between the threads in the parallel thread group based on a characteristic of the workload and the workload is processed by the parallel thread group. In one embodiment, the characteristic is rasterized coverage information for the plurality of multi-sample pixels.
    Type: Grant
    Filed: August 26, 2013
    Date of Patent: May 30, 2017
    Assignee: NVIDIA Corporation
    Inventors: Jeffrey Alan Bolz, Patrick R. Brown, Tyson Bergland, Alexander Lev Minkin
  • Patent number: 9665969
    Abstract: One embodiment of the present invention discloses a method for processing video data within a video data processing path of a processing unit. The video data processing path includes three stages. In the first stage, source operands are extracted from a local register file and are ordered to map efficiently onto the downstream data path. In the second stage, arithmetic operations are performed on the source operands based on video processing instructions to generate intermediate results. In the third stage, additional operations are performed on the intermediate results based on the video processing instructions. In some embodiment, the intermediate results are combined with additional operands retrieved from the local register file.
    Type: Grant
    Filed: May 24, 2010
    Date of Patent: May 30, 2017
    Assignee: NVIDIA Corporation
    Inventors: Shirish Gadre, Robert Jan Schutten, David Conrad Tannenbaum
  • Patent number: 9667230
    Abstract: A method for operating a latch and a latch circuit are disclosed. The latch circuit comprises a storage sub-circuit, a propagation sub-circuit, and a shared clock-enabled transistor. The storage sub-circuit is configured to capture a level of an input signal when a clock signal transitions from first level to a second level and hold the captured level to generate an output signal while the clock signal is at the second level. The propagation sub-circuit is configured to enable a path through a blocking transistor to the shared clock-enabled supply node to propagate the captured level of the input signal to the storage sub-circuit. The shared clock-enabled transistor is configured to couple the shared clock-enabled supply node to a power supply while the clock signal is at the first level and decouple the shared clock-enabled supply node from the power supply while the clock signal is at the second level.
    Type: Grant
    Filed: March 23, 2016
    Date of Patent: May 30, 2017
    Assignee: NVIDIA Corporation
    Inventors: Matthew Rudolph Fojtik, Ilyas Elkin, Yanqing Zhang