Patents Assigned to NVIDIA

Alpha-to-coverage using virtual samples

Patent number: 9697641

Abstract: One embodiment of the present invention sets forth a technique for converting alpha values into pixel coverage masks. Geometric coverage is sampled at a number of “real” sample positions within each pixel. Color and depth values are computed for each of these real samples. Fragment alpha values are used to determine an alpha coverage mask for the real samples and additional “virtual” samples, in which the number of bits set in the mask is proportional to the alpha value. An alpha-to-coverage mode uses the virtual samples to increase the number of transparency levels for each pixel compared with using only real samples. The alpha-to-coverage mode may be used in conjunction with virtual coverage anti-aliasing to provide higher-quality transparency for rendering anti-aliased images.

Type: Grant

Filed: October 14, 2010

Date of Patent: July 4, 2017

Assignee: NVIDIA CORPORATION

Inventors: Steven E. Molnar, Emmett M. Kilgariff, Walter E. Donovan, Christian Amsinck, Robert Ohannessian
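
The abstract above says the number of set mask bits is proportional to alpha, and that virtual samples add transparency levels. A minimal sketch of that arithmetic follows. The function names, the round-to-nearest rule, and the choice to set the lowest bits first are assumptions made for illustration; the abstract does not say which bits are set or how they are distributed.

```python
def alpha_to_coverage(alpha: float, real_samples: int, virtual_samples: int) -> int:
    """Return a bit mask over real + virtual samples (hypothetical helper).

    The count of set bits is proportional to alpha. Setting the lowest bits
    first is a simplification chosen here, not something the abstract states.
    """
    total = real_samples + virtual_samples
    alpha = min(max(alpha, 0.0), 1.0)      # clamp to [0, 1]
    bits_set = int(alpha * total + 0.5)    # round to the nearest sample count
    return (1 << bits_set) - 1


def transparency_levels(samples: int) -> int:
    """A mask with n samples can have 0..n bits set, giving n + 1 levels."""
    return samples + 1
```

With 4 real samples alone there are 5 distinguishable levels; adding 12 virtual samples gives 17, which is the increase in transparency levels the abstract refers to.
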
Technique for performing memory access operations via texture hardware

Patent number: 9697006

Abstract: A texture processing pipeline can be configured to service memory access requests that represent texture data access operations or generic data access operations. When the texture processing pipeline receives a memory access request that represents a texture data access operation, the texture processing pipeline may retrieve texture data based on texture coordinates. When the memory access request represents a generic data access operation, the texture pipeline extracts a virtual address from the memory access request and then retrieves data based on the virtual address. The texture processing pipeline is also configured to cache generic data retrieved on behalf of a group of threads and to then invalidate that generic data when the group of threads exits.

Type: Grant

Filed: December 19, 2012

Date of Patent: July 4, 2017

Assignee: NVIDIA Corporation

Inventors: Brian Fahs, Eric T. Anderson, Nick Barrow-Williams, Shirish Gadre, Joel James McCormack, Bryon S. Nordquist, Nirmal Raj Saxena, Lacky V. Shah
Application programming interface to enable the construction of pipeline parallel programs

Patent number: 9697044

Abstract: An application programming interface (API) provides various software constructs that allow a developer to assemble a processing pipeline having arbitrary structure and complexity. Once assembled, the processing pipeline is configured to include a set of interconnected pipestages. Those pipestages are associated with one or more different CTAs that may execute in parallel with one another on a parallel processing unit. The developer specifies the configuration of the pipestages, including the configuration of the different CTAs across all pipestages, as well as the different processing operations performed by each different CTA.

Type: Grant

Filed: May 21, 2013

Date of Patent: July 4, 2017

Assignee: NVIDIA Corporation

Inventor: Ignacio Llamas
MIGRATION SCHEME FOR UNIFIED VIRTUAL MEMORY SYSTEM

Publication number: 20170185526

Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.

Type: Application

Filed: October 16, 2013

Publication date: June 29, 2017

Applicant: NVIDIA CORPORATION

Inventors: Jerome F. DULUK, JR., Cameron BUSCHARDT, Sherry CHEUNG, James Leroy DEMING, Samuel H. DUNCAN, Lucien DUNNING, Robert GEORGE, Arvind GOPALAKRISHNAN, Mark HAIRGROVE, Chenghuan JIA, John MASHEY
Selecting hash values based on matrix rank

Patent number: 9690715

Abstract: One embodiment of the present invention includes a hash selector that facilitates performing effective hashing operations. In operation, the hash selector creates a transformation matrix that reflects specific optimization criteria. For each hash value, the hash selector generates a potential hash value and then computes the rank of a submatrix included in the transformation matrix. Based on this rank in conjunction with the optimization criteria, the hash selector either re-generates the potential hash value or accepts the potential hash value. Advantageously, the optimization criteria may be tailored to create desired correlations between input patterns and the results of performing hashing operations based on the transformation matrix.

Type: Grant

Filed: September 3, 2014

Date of Patent: June 27, 2017

Assignee: NVIDIA Corporation

Inventor: James M. Van Dyke
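
The accept-or-regenerate loop in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the matrix is treated as rows of bits over GF(2), the optimization criterion is assumed to be "every accepted row must raise the rank by one", and `gf2_rank` and `select_hash_rows` are hypothetical names.

```python
import random


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bit masks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            # Clear that bit from every remaining row.
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def select_hash_rows(width, count, seed=0):
    """Draw random candidate rows; re-generate any that fails the rank test.

    Assumed criterion: the submatrix of accepted rows stays full rank.
    """
    if count > width:
        raise ValueError("cannot have more independent rows than columns")
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < count:
        candidate = rng.getrandbits(width)
        if gf2_rank(chosen + [candidate]) == len(chosen) + 1:
            chosen.append(candidate)  # accept the potential hash value
        # otherwise loop again, which re-generates the candidate
    return chosen
```
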
Managing state transitions of a data connector using a finite state machine

Patent number: 9690736

Abstract: A microprocessor within a processing unit is configured to manage the operation of a finite state machine (FSM) that, in turn, manages the operation of a data connector. The FSM may be a hardwired chip component that adheres to a communication protocol associated with the data connector. The microprocessor is configured to execute a software application in order to (i) apply configuration changes to the processing unit during state transitions initiated by the FSM and (ii) cause the FSM to initiate specific state transitions.

Type: Grant

Filed: July 10, 2012

Date of Patent: June 27, 2017

Assignee: NVIDIA Corporation

Inventors: Dennis Ma, Samuel Vincent
Sequential access memory with master-slave latch pairs and method of operating

Patent number: 9685207

Abstract: A synchronous sequential latch array generated by an automated system for generating master-slave latch structures is disclosed. A master-slave latch structure includes N/2 rows of master-slave latch pairs, an N/2-to-1 multiplexer and control logic. N is equal to the number of latches that are included in the latch array.

Type: Grant

Filed: December 4, 2012

Date of Patent: June 20, 2017

Assignee: Nvidia Corporation

Inventor: Robert A. Alfieri
Determining overall performance characteristics of a concurrent software application

Patent number: 9684581

Abstract: One embodiment of the present invention includes a dependency extractor and a dependency investigator that, together, facilitate performance analysis of computer systems. In operation, the dependency extractor instruments a software application to generate run-time execution data for each work task. This execution data includes per-task performance data and dependency data reflecting linkages between tasks. After the instrumented software application finishes executing, the dependency investigator evaluates the captured execution data and identifies the critical path of tasks that establishes the overall run-time of the software application. Advantageously, since the execution data includes both task-level performance data and dependencies between tasks, the dependency investigator enables the developer to effectively optimize software and hardware in computer systems that are capable of concurrently executing tasks.

Type: Grant

Filed: May 21, 2014

Date of Patent: June 20, 2017

Assignee: NVIDIA Corporation

Inventors: Andrew Robert Kerr, Matthew Grant Bolitho, Igor Sevastiyanov, Scott Ricketts, Michael Andersch
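
To illustrate how per-task performance data and dependency data together yield a critical path, as the abstract above describes, here is a minimal sketch. It assumes the captured execution data has already been reduced to a duration per task and a set of predecessor tasks; `critical_path` and its input layout are hypothetical, and the longest-path computation is a standard technique swapped in for illustration.

```python
from graphlib import TopologicalSorter


def critical_path(durations, deps):
    """Return (path, total_time) for the longest dependency chain.

    durations: task -> run time
    deps:      task -> set of tasks that must finish first
    """
    graph = {task: deps.get(task, ()) for task in durations}
    finish = {}      # earliest finish time of each task
    best_pred = {}   # predecessor that determines each task's start time
    for task in TopologicalSorter(graph).static_order():
        pred = max(graph[task], key=lambda p: finish[p], default=None)
        start = finish[pred] if pred is not None else 0
        finish[task] = start + durations[task]
        best_pred[task] = pred
    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = best_pred[node]
    return path[::-1], total
```

For tasks `load` (2), `a` (3), `b` (5), and `merge` (1), where `a` and `b` depend on `load` and `merge` depends on both, the chain through `b` sets the overall run time, so speeding up `a` alone would not help.
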
Pixel serialization to improve conservative depth estimation

Patent number: 9684998

Abstract: One embodiment includes determining a first z-range for a first portion of a coarse raster tile, where the first portion includes a plurality of pixels having a first set of pixel locations, retrieving from a memory a corresponding z-range related to a second set of pixel locations associated with the coarse raster tile, where the first set of pixel locations comprises a subset of the second set of pixel locations, and comparing the first z-range to the corresponding z-range to determine whether the plurality of pixels is occluded. If the plurality of pixels is determined to be occluded, then the plurality of pixels is culled. If the plurality of pixels is determined to not be occluded, then the plurality of pixels is transmitted to a fine raster unit for further processing. The coarse raster tile comprises a plurality of portions, including the first portion, and those portions are processed serially.

Type: Grant

Filed: July 22, 2013

Date of Patent: June 20, 2017

Assignee: NVIDIA CORPORATION

Inventors: Eric B. Lum, Justin Cobb, Barry N. Rodgers
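
The compare-and-cull step in the abstract above can be sketched roughly as follows. The depth convention (smaller z is nearer the viewer), the `(near, far)` tuple layout, and the function names are assumptions made for illustration; the abstract does not specify them.

```python
def is_occluded(portion_z, stored_z):
    """Conservative test: the portion is hidden only if its nearest depth
    lies behind the farthest depth already stored for the covering region.

    Both arguments are (near, far) tuples; smaller z is assumed nearer.
    """
    portion_near, _ = portion_z
    _, stored_far = stored_z
    return portion_near > stored_far


def process_tile(portions, stored_z):
    """Process the portions of a coarse raster tile one after another.

    portions: list of (name, (near, far)) pairs
    Returns the names of the portions that would go on to the fine raster unit.
    """
    to_fine_raster = []
    for name, z_range in portions:        # serial, one portion at a time
        if not is_occluded(z_range, stored_z):
            to_fine_raster.append(name)   # not occluded: keep
        # occluded portions are culled, so nothing is appended
    return to_fine_raster
```
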
Allocating memory for local variables of a multi-threaded program for execution in a single-threaded environment

Patent number: 9678775

Abstract: Computer code written to execute on a multi-threaded computing environment is transformed into code designed to execute on a single-threaded computing environment and simulate concurrent executing threads. Optimization techniques during the transformation process are utilized to identify local variables for scalar expansion. A first set of local variables is defined that includes those local variables in the code identified as “Downward exposed Defined” (DD). A second set of local variables is defined that includes those local variables in the code identified as “Upward exposed Use” (UU). The intersection of the first set and the second set identifies local variables for scalar expansion.

Type: Grant

Filed: February 26, 2009

Date of Patent: June 13, 2017

Assignee: NVIDIA Corporation

Inventors: Vinod Grover, John A. Stratton
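
The set intersection in the abstract above, and the effect of scalar expansion, can be sketched as follows. The framing is an assumption: threads are simulated by running each code region as a loop over thread IDs, with a synchronization point between the two regions. The variable names and both functions are hypothetical.

```python
def scalar_expansion_candidates(downward_exposed_defs, upward_exposed_uses):
    """Locals in both sets carry a value across regions and need expansion."""
    return set(downward_exposed_defs) & set(upward_exposed_uses)


def run_serialized(num_threads):
    """Single-threaded simulation of num_threads threads running two regions."""
    x = [0] * num_threads      # expanded: one slot per simulated thread
    out = [0] * num_threads
    for tid in range(num_threads):     # region before the synchronization point
        tmp = tid * 2                  # used only inside this region: stays scalar
        x[tid] = tmp + 1               # defined here, still live at region exit
    for tid in range(num_threads):     # region after the synchronization point
        out[tid] = x[tid] * 10         # used here before any redefinition
    return out
```

Here `x` is in both sets, so it becomes a per-thread array, while `tmp` is in neither and remains a plain scalar. Without the expansion, every simulated thread in the second loop would read the value written by the last thread of the first loop.
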
Compressing graphics data rendered on a primary computer for transmission to a remote computer

Patent number: 9679530

Abstract: One embodiment of the present invention sets forth a method for compressing via a pixel shader color information associated with a line of pixels. An intermediary representation of an uncompressed stream of color information is first generated that indicates, for each pixel, whether a previous adjacent pixel shares color information with the pixel. A set of cascading buffers is then generated based on the intermediary representation, where each cascading buffer represents a number of unique color codes across different groups of pixels. Finally, a compressed output stream that specifies the unique color codes as well as the number of pixels that share each unique color code is generated based on the set of cascading buffers.

Type: Grant

Filed: April 30, 2012

Date of Patent: June 13, 2017

Assignee: NVIDIA Corporation

Inventor: Franck Diard
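
A sequential sketch of the first and last steps in the abstract above follows: the per-pixel "same as previous" flags and the final stream of color codes with counts. It deliberately omits the cascading buffers, which the abstract uses to build the output in a pixel shader, and replaces them with a plain loop. `compress_line` is a hypothetical name.

```python
def compress_line(pixels):
    """Return (flags, runs) for one line of pixel color codes.

    flags[i] is True when pixel i shares its color with pixel i - 1.
    runs is a list of (color, count) pairs in left-to-right order.
    """
    # Intermediary representation: does the previous adjacent pixel match?
    flags = [i > 0 and pixels[i] == pixels[i - 1] for i in range(len(pixels))]

    # Compressed output: each unique run's color code and its pixel count.
    runs = []
    for color, same_as_previous in zip(pixels, flags):
        if same_as_previous:
            runs[-1][1] += 1
        else:
            runs.append([color, 1])
    return flags, [(color, count) for color, count in runs]
```
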
Efficiency-based clock frequency adjustment

Patent number: 9678529

Abstract: One aspect of the disclosure provides a computer system. In one embodiment, the computer system comprises a clock generator, at least one processor, and a clock frequency controller. The clock generator is configured to provide a clock signal at a clock frequency. The at least one processor is configured to receive the clock signal and to operate at a speed dependent on the clock frequency. The clock frequency controller is configured to receive efficiency information indicating a current efficiency of the at least one processor. The clock frequency controller is further configured to receive a request from the processor for a target number of processor instructions to be handled in a particular time period. The clock frequency controller is further configured to output a frequency control signal to the clock generator for controlling the clock frequency in dependence thereon.

Type: Grant

Filed: September 2, 2014

Date of Patent: June 13, 2017

Assignee: Nvidia Corporation

Inventors: Marcin Hlond, Peter Cumming
Approach for context switching of lock-bit protected memory

Patent number: 9678897

Abstract: A streaming multiprocessor in a parallel processing subsystem processes atomic operations for multiple threads in a multi-threaded architecture. The streaming multiprocessor receives a request from a thread in a thread group to acquire access to a memory location in a lock-protected shared memory, and determines whether an address lock in a plurality of address locks is asserted, where the address lock is associated with the memory location. If the address lock is asserted, then the streaming multiprocessor refuses the request. Otherwise, the streaming multiprocessor asserts the address lock, asserts a thread group lock in a plurality of thread group locks, where the thread group lock is associated with the thread group, and grants the request. One advantage of the disclosed techniques is that acquired locks are released when a thread is preempted. As a result, a preempted thread that has previously acquired a lock does not retain the lock indefinitely.

Type: Grant

Filed: December 27, 2012

Date of Patent: June 13, 2017

Assignee: NVIDIA Corporation

Inventors: Nicholas Wang, Shirish Gadre, Robert Ohannessian, Lacky V. Shah, Matthew Brockmeyer, Stewart Glenn Carlton
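
The acquire, refuse, and release-on-preemption behavior in the abstract above can be modeled in software as follows. This is a toy model under stated assumptions: the `LockTable` class is hypothetical, and recording the owning thread group inside each address lock is a bookkeeping choice made here so that preemption can find the locks to release.

```python
class LockTable:
    """Toy model of address locks plus per-thread-group locks."""

    def __init__(self):
        self.address_locks = {}    # address -> thread group holding the lock
        self.group_locks = set()   # thread groups that hold at least one lock

    def acquire(self, group, address):
        """Grant the request unless the address lock is already asserted."""
        if address in self.address_locks:
            return False                       # refuse the request
        self.address_locks[address] = group    # assert the address lock
        self.group_locks.add(group)            # assert the thread group lock
        return True

    def preempt(self, group):
        """Release every lock held by a preempted group; return the addresses."""
        if group not in self.group_locks:
            return []
        released = sorted(a for a, g in self.address_locks.items() if g == group)
        for address in released:
            del self.address_locks[address]
        self.group_locks.discard(group)
        return released
```

The thread group lock lets the preemption path skip groups that hold nothing, and releasing on preemption means another group can acquire the same address afterwards instead of waiting indefinitely.
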
Stereo viewpoint graphics processing subsystem and method of sharing geometry data between stereo images in screen-spaced processing

Patent number: 9672653

Abstract: A stereo viewpoint graphics processing subsystem and a method of sharing geometry data between stereo images in screen-space processing. One embodiment of the stereo viewpoint graphics processing subsystem configured to render a scene includes: (1) stereo frame buffers configured to contain respective pixel-wise rendered scene data for stereo images, and (2) a sharing decision circuit operable to determine when to share geometric data between the stereo frame buffers for carrying out screen-space effect processes to render the scene in the stereo images.

Type: Grant

Filed: December 14, 2012

Date of Patent: June 6, 2017

Assignee: Nvidia Corporation

Inventors: G. Evan Hart, Cem Cebenoyan, Louis Bavoil
Pausible bisynchronous FIFO

Patent number: 9672008

Abstract: A system, method, and computer program product are provided for a pausible bisynchronous FIFO. Data is written synchronously with a first clock signal of a first clock domain to an entry of a dual-port memory array and an increment signal is generated in the first clock domain. The increment signal is determined to transition near an edge of a second clock signal, where the second clock signal is a pausible clock signal. A next edge of the second clock signal of the second clock domain is delayed and the increment signal is transmitted to the second clock domain.

Type: Grant

Filed: November 20, 2015

Date of Patent: June 6, 2017

Assignee: NVIDIA Corporation

Inventors: Benjamin Andrew Keller, Matthew Rudolph Fojtik, Brucek Kurdo Khailany
Stylus tool with deformable tip

Patent number: 9671877

Abstract: A passive stylus with a deformable tip is described herein. In one embodiment, a thin annular body configured to be hand-held with a chisel shaped tip disposed at the first end of the body is provided. The chisel shaped tip includes a deformable material such that the chisel shaped tip is operable to interface with a touch sensitive surface with a first detectable surface area when a first pressure is exerted on the body and translated to the chisel shaped tip. The chisel shaped tip is operable to interface with the touch sensitive surface with a second detectable surface area, different from the first detectable surface area, when a second pressure is exerted on the body and translated to the chisel shaped tip. The stylus may include a second tip on the back end for providing an erase function.

Type: Grant

Filed: January 27, 2014

Date of Patent: June 6, 2017

Assignee: NVIDIA CORPORATION

Inventors: Berhanu Zerayohannes, Siarhei Murauyou, Tommy Lee, Glenn Wernig, Nelson Au, Arman Toorians, Jen-Hsun Huang
System, method, and computer program product for a switch mode current balancing rail merge circuit

Patent number: 9667068

Abstract: A system, method, and computer program product are provided for merging two or more supply rails into a merged supply rail. The method comprises receiving two or more current measurement signals associated with two or more supply rails, selecting one supply rail from the two or more supply rails, based on the current measurement signals, and enabling the selected supply rail to source current into a merged supply rail.

Type: Grant

Filed: December 18, 2013

Date of Patent: May 30, 2017

Assignee: NVIDIA Corporation

Inventors: Samuel Richard Duell, Gabriele Gorla, Yaoshun Jia, Qi Lin, Andrew Bell
System, method, and computer program product for redistributing a multi-sample processing workload between threads

Patent number: 9665958

Abstract: A system, method, and computer program product are provided for redistributing multi-sample processing workloads between threads. A workload for a plurality of multi-sample pixels is received and each thread in a parallel thread group is associated with a corresponding multi-sample pixel of the plurality of pixels. The workload is redistributed between the threads in the parallel thread group based on a characteristic of the workload and the workload is processed by the parallel thread group. In one embodiment, the characteristic is rasterized coverage information for the plurality of multi-sample pixels.

Type: Grant

Filed: August 26, 2013

Date of Patent: May 30, 2017

Assignee: NVIDIA Corporation

Inventors: Jeffrey Alan Bolz, Patrick R. Brown, Tyson Bergland, Alexander Lev Minkin
Data path and instruction set for packed pixel operations for video processing

Patent number: 9665969

Abstract: One embodiment of the present invention discloses a method for processing video data within a video data processing path of a processing unit. The video data processing path includes three stages. In the first stage, source operands are extracted from a local register file and are ordered to map efficiently onto the downstream data path. In the second stage, arithmetic operations are performed on the source operands based on video processing instructions to generate intermediate results. In the third stage, additional operations are performed on the intermediate results based on the video processing instructions. In some embodiments, the intermediate results are combined with additional operands retrieved from the local register file.

Type: Grant

Filed: May 24, 2010

Date of Patent: May 30, 2017

Assignee: NVIDIA Corporation

Inventors: Shirish Gadre, Robert Jan Schutten, David Conrad Tannenbaum
Latch and flip-flop circuits with shared clock-enabled supply nodes

Patent number: 9667230

Abstract: A method for operating a latch and a latch circuit are disclosed. The latch circuit comprises a storage sub-circuit, a propagation sub-circuit, and a shared clock-enabled transistor. The storage sub-circuit is configured to capture a level of an input signal when a clock signal transitions from a first level to a second level and hold the captured level to generate an output signal while the clock signal is at the second level. The propagation sub-circuit is configured to enable a path through a blocking transistor to the shared clock-enabled supply node to propagate the captured level of the input signal to the storage sub-circuit. The shared clock-enabled transistor is configured to couple the shared clock-enabled supply node to a power supply while the clock signal is at the first level and decouple the shared clock-enabled supply node from the power supply while the clock signal is at the second level.

Type: Grant

Filed: March 23, 2016

Date of Patent: May 30, 2017

Assignee: NVIDIA Corporation

Inventors: Matthew Rudolph Fojtik, Ilyas Elkin, Yanqing Zhang
