Abstract: A method for performing a ray-box intersection test includes forming a span extending between a first plane-ray intersection point and a second plane-ray intersection point, and increasing the span by relocating to a new position at least one of the first and second plane-ray intersection points. A box intersection span is constructed using the increased span, and the box intersection span, which corresponds to a node in a hierarchical acceleration structure, is tested for intersection with the ray.
Type:
Grant
Filed:
May 17, 2010
Date of Patent:
October 22, 2013
Assignee:
NVIDIA Corporation
Inventors:
Timo Aila, Samuli Laine, John Erik Lindholm
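The ray-box intersection abstract above can be illustrated with a conventional slab test in which each per-axis span is widened before the spans are combined. This is only a minimal sketch, not the patented method: the `pad` parameter, the function names, and the choice to widen both ends by the same amount are assumptions made for illustration, and the ray direction is assumed to have no zero components.

```python
import math

def box_intersection_span(origin, inv_dir, box_min, box_max, pad=0.0):
    """Return (t_enter, t_exit) for a ray against an axis-aligned box.

    pad is a hypothetical widening amount: each per-axis span is enlarged
    by moving its near point earlier and its far point later along the ray.
    """
    t_enter, t_exit = 0.0, math.inf
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        # Plane-ray intersection points for the two planes of this slab.
        t0 = (lo - o) * inv
        t1 = (hi - o) * inv
        near, far = min(t0, t1), max(t0, t1)
        # Increase the span by relocating both intersection points.
        near -= pad
        far += pad
        # The box span is the overlap of the per-axis spans.
        t_enter = max(t_enter, near)
        t_exit = min(t_exit, far)
    return t_enter, t_exit

def ray_hits_box(origin, direction, box_min, box_max, pad=0.0):
    # Assumes every direction component is nonzero.
    inv_dir = [1.0 / d for d in direction]
    t_enter, t_exit = box_intersection_span(origin, inv_dir, box_min, box_max, pad)
    return t_enter <= t_exit
```

Widening makes the test conservative: a ray that narrowly misses the exact box can still be reported as a hit, which is usually acceptable for an inner node of an acceleration structure because the children are tested afterwards.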
Abstract: One embodiment of the invention sets forth a mechanism for compiling a vertex shader program into two portions, a culling portion and a shading portion. The culling portion of the compiled vertex shader program specifies vertex attributes and instructions of the vertex shader program needed to determine whether early vertex culling operations should be performed on a batch of vertices associated with one or more primitives of a graphics scene. The shading portion of the compiled vertex shader program specifies the remaining vertex attributes and instructions of the vertex shader program for performing vertex lighting and performing other operations on the vertices in the batch of vertices. When the compiled vertex shader program is executed by graphics processing hardware, the shading portion of the compiled vertex shader is executed only when early vertex culling operations are not performed on the batch of vertices.
Type:
Grant
Filed:
July 17, 2009
Date of Patent:
October 22, 2013
Assignee:
NVIDIA Corporation
Inventors:
Ziyad S. Hakura, John Erik Lindholm, Emmett M. Kilgariff, Robert Ohannessian, Scott R. Whitman, James C. Bowman, Patrick R. Brown, Ross A. Cunniff
Abstract: In one embodiment, a micro-processing system includes a hardware structure disposed on a processor core. The hardware structure includes a plurality of entries, each of which is associated with a portion of code and a translation of that code that can be executed to achieve substantially equivalent functionality. The hardware structure includes a redirection array that enables, when referenced, execution to be redirected from a portion of code to its counterpart translation. The entries enabling such redirection are maintained within or evicted from the hardware structure based on usage information for the entries.
Abstract: A system, method, and computer program product are provided for performing path tracing. In use, one or more matte objects are identified in a scene. Additionally, one or more synthetic objects are identified in the scene. Further, path tracing is performed within the scene, where the path tracing accounts for interactions between one or more of the matte objects and one or more of the synthetic objects.
Type:
Application
Filed:
April 17, 2012
Publication date:
October 17, 2013
Applicant:
NVIDIA CORPORATION
Inventors:
Daniel Lévesque, Carsten Alexander Wächter
Abstract: The presentation of stereoscopic display content for viewing with passive glasses and full resolution is provided. In use, (a) a frame of stereoscopic display content intended for viewing by one eye of a user is scanned, using a display layer of a display device; (b) the scanned frame is polarized utilizing a polarizing layer of the display device, according to a polarization associated with a lens of stereoscopic glasses worn over the same one eye of the user; (c) a backlight is activated to illuminate the polarized frame, in response to an entirety of the polarized frame being scanned; (d) the display device is held for a predetermined period of time in response to activation of the backlight, and then the backlight is de-activated; and (a)-(d) are then repeated for the other eye of the user, with another frame of stereoscopic display content intended for viewing by the other eye.
Abstract: A system and method force a display device to receive the output produced by a graphics processing unit that is configured as the video graphics array (VGA) boot device for display of critical system screens. A hybrid computer system that includes multiple graphics processors configures a display multiplexor to select image data from one of the multiple graphics processing units for output to the display device. When a critical system event occurs and the graphics processing unit that is selected is not configured as the VGA boot device, system basic input/output system (BIOS) interfaces are used to configure the multiplexor to select the one graphics processing unit that is configured as the VGA boot device to output the critical system screen to the display device.
Abstract: A method (500) for estimating at least one offset in a subcarrier that is subject to distortion in a multicarrier communication system. The method comprises receiving a plurality of subcarriers wherein the plurality of subcarriers contain the subcarrier that is subject to the distortion; and generating a plurality of first channel estimates for a respective plurality of received subcarriers that are not subject to the distortion. The method further comprises processing a number of the plurality of first channel estimates for the respective plurality of received subcarriers that are not subject to the distortion to generate a second channel estimate for the subcarrier that is subject to the distortion; and estimating an offset associated with the subcarrier that is subject to the distortion.
Abstract: One embodiment of the present invention sets forth a technique for detecting duplicate vertex indices in parallel and batching indices defining multiple primitives for parallel primitive processing. A lookback cache breaks the dependent loop for the miss processing. Because each index is compared to all previous indices (duplicate or not), each index is not dependent on whether the previous indices have hit or missed. This allows the comparison operation that detects the duplicate vertex indices to be fully pipelined. The duplicate vertex indices are removed to reduce the number of indices that define the primitives in the batch. Multiple, independent rasterizer units operate concurrently on the different batches of graphics primitives to render multiple primitives per system clock.
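The key point in the abstract above is that every index is compared against all earlier indices in the batch, so no comparison depends on the hit-or-miss outcome of a previous one. A minimal sequential sketch of that comparison and the resulting compaction follows; the function name and the returned `(unique, remapped)` pair are assumptions for illustration, and real hardware would perform the comparisons in parallel.

```python
def dedupe_batch(indices):
    """Remove duplicate vertex indices from a batch.

    Returns (unique, remapped): the distinct indices in first-seen order,
    and the original batch rewritten as positions into that list.
    """
    # For each position, find the earliest position holding the same index.
    # Each comparison looks only at the raw indices, never at earlier results,
    # so all positions could be evaluated independently.
    first_seen = []
    for i, idx in enumerate(indices):
        match = next((j for j in range(i) if indices[j] == idx), i)
        first_seen.append(match)

    # A position is a miss (new vertex) when it is its own first occurrence.
    unique = [idx for i, idx in enumerate(indices) if first_seen[i] == i]
    slot = {idx: k for k, idx in enumerate(unique)}
    remapped = [slot[idx] for idx in indices]
    return unique, remapped
```

For example, the two triangles `[7, 3, 9, 9, 3, 4]` share an edge, so only four vertices need to be fetched and shaded.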
Abstract: One embodiment of the present invention sets forth a clamping circuit that is used to maintain a bit line of a storage cell in a memory array at a nearly constant clamp voltage. During read operations the bit line is pulled high or low from the clamp voltage by the storage cell and a change in current on the bit line is converted by the clamping circuit to produce an amplified voltage that may be sampled to read a value stored in the storage cell. The clamping circuit maintains the nearly constant clamp voltage on the bit line. Clamping the bit line to the nearly constant clamp voltage reduces the occurrence of read disturb faults. Additionally, the clamping circuit functions with a variety of storage cells and does not require that the bit lines be precharged prior to each read operation.
Abstract: One embodiment of the present invention sets forth a technique for consistently evaluating geometric patches with shared boundaries using barycentric coordinates. A barycentric parameter is generated and represented using a fixed-point fraction. The barycentric parameter is then used to generate a fixed-point barycentric coordinate. The fixed-point barycentric coordinate is then converted to a floating-point representation for evaluating the geometric patches. Computing shared boundary splits using fixed-point fractions eliminates inconsistencies in associated barycentric coordinates due to round-off errors. Evaluating geometric patch equations using consistent barycentric coordinates facilitates precise, consistent computation of vertices along shared boundaries.
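The three steps named in the abstract above (fixed-point parameter, fixed-point coordinate, conversion to floating point) can be sketched as follows. The 16-bit fraction width, the `step`/`num_steps` parameterization of a boundary split, and the function names are assumptions for illustration only, not details taken from the patent.

```python
FRAC_BITS = 16          # hypothetical fraction width
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0

def fixed_point_parameter(step, num_steps):
    """Barycentric parameter step/num_steps as a fixed-point fraction."""
    return (step * ONE) // num_steps

def barycentric_coordinates(step, num_steps):
    """Return (u, v) as floats derived from one fixed-point parameter."""
    u_fixed = fixed_point_parameter(step, num_steps)
    # The second coordinate is computed in integer arithmetic, so the
    # pair always sums to exactly ONE before conversion.
    v_fixed = ONE - u_fixed
    # Division by a power of two is exact for these small integers.
    return u_fixed / ONE, v_fixed / ONE
```

Because both coordinates derive from the same integer and the conversion is exact, `u + v` equals `1.0` with no round-off, regardless of which patch evaluates the shared boundary point.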
Abstract: Systems and methods for encoding a data word using an 8b/9b encoding scheme that eliminates two-aggressor crosstalk are disclosed. The 8b/9b encoding scheme enables a data word that can be subdivided into portions of eight bits or less to be encoded using code words that have one more bit than the corresponding portion of the data word. None of the valid code words includes three consecutive bits at logic-high (i.e., ‘1’), and the code words represent transition vectors for consecutive symbols transmitted over a high-speed parallel bus. An encoder and corresponding decoder are disclosed for implementing the 8b/9b encoding scheme. In one embodiment, the encoder/decoder implements a modified Fibonacci sequence algorithm. In another embodiment, the encoder/decoder implements a look-up table. In some embodiments, data words may be less than eight bits wide.
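The look-up-table embodiment mentioned in the abstract above can be sketched by enumerating the 9-bit words that contain no three consecutive 1 bits. There are 274 such words, enough to cover all 256 byte values. The assignment of bytes to code words below (ascending numeric order) is an arbitrary assumption for illustration and is not the mapping defined by the patent; the sketch also ignores any constraints across code word boundaries.

```python
def has_three_consecutive_ones(word):
    """True if any three adjacent bits of word are all 1."""
    return (word & (word >> 1) & (word >> 2)) != 0

# All 9-bit words that satisfy the constraint.
VALID_CODE_WORDS = [w for w in range(1 << 9) if not has_three_consecutive_ones(w)]

# Hypothetical table: byte value n maps to the n-th valid code word.
ENCODE = VALID_CODE_WORDS[:256]
DECODE = {code: data for data, code in enumerate(ENCODE)}

def encode(byte):
    """Map an 8-bit value to a 9-bit code word."""
    return ENCODE[byte]

def decode(code):
    """Map a 9-bit code word back to its 8-bit value."""
    return DECODE[code]
```

Every byte round-trips through `decode(encode(byte))`, and no emitted code word contains the bit pattern `111`.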
Abstract: One embodiment of the present invention sets forth a technique for efficiently creating and accessing an A-Buffer that supports multi-sample compression techniques. The A-Buffer is organized in stacks of uniformly-sized tiles, wherein the tile size is selected to facilitate compression techniques. Each stack represents the samples included in a group of pixels. Each tile within a stack represents the set of sample data at a specific per-sample rendering order index that are associated with the group of pixels represented by the stack. Advantageously, each tile includes tile compression bits that enable the tile to maintain data using existing compression formats. As the A-Buffer is created, a corresponding stack compression buffer is also created. For each stack, the stack compression buffer includes a bit that indicates whether all of the tiles in the stack are similarly compressed and, consequently, whether the GPU may operate on the stack at an efficient per pixel granularity.
Abstract: A system includes a processor having an instruction register for storing an instruction having a predefined opcode, a predicate register for storing a predicate condition to select an output register for a result of the instruction, a first output register, and a second output register. The processor further includes processor circuitry operable to execute the instruction to produce a result, and processor circuitry operable to store the result of the instruction in the first output register if the predicate condition to select the output is true, and to store the result in the second output register if the predicate condition to select the output is false. A single instruction is used to produce the result, and to store the result of the instruction.
Abstract: One embodiment of the present invention sets forth a technique for using a multi-bank register file that reduces the size of or eliminates a switch and/or staging registers that are used to gather input operands for instructions. Each function unit input may be directly connected to one bank of the multi-bank register file with neither a switch nor a staging register. A compiler or register allocation unit ensures that the register file accesses for each instruction are conflict-free (no instruction can access the same bank more than once in the same cycle). The compiler or register allocation unit may also ensure that the register file accesses for each instruction are also aligned (each input of a function unit can only come from the bank connected to that input).
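The two properties a compiler or register allocation unit must guarantee in the abstract above, conflict-free and aligned accesses, can be expressed as simple checks. This is a minimal sketch: the bank count, the modulo mapping from register number to bank, and the function names are assumptions for illustration and are not specified by the abstract.

```python
NUM_BANKS = 4  # hypothetical number of register file banks

def bank_of(register):
    """Hypothetical mapping: registers are interleaved across banks."""
    return register % NUM_BANKS

def is_conflict_free(operands):
    """No two source operands of one instruction read the same bank."""
    banks = [bank_of(r) for r in operands]
    return len(set(banks)) == len(banks)

def is_aligned(operands):
    """Operand i comes from bank i, the bank wired to function unit input i."""
    return all(bank_of(r) == i for i, r in enumerate(operands))
```

With these definitions, operands `[0, 5, 10]` are both conflict-free and aligned, `[5, 0, 10]` are conflict-free but not aligned, and `[1, 5, 2]` conflict because registers 1 and 5 share a bank. Alignment implies conflict-freedom, which is why an aligned allocation lets each function unit input connect directly to a single bank.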
Abstract: A method of mitigating interference between carrier frequency bands of a carrier aggregation scheme. The method comprises: at a wireless device, receiving a first signal on a first carrier frequency band of the carrier aggregation scheme; mixing a second signal onto a second carrier frequency band of the carrier aggregation scheme and transmitting the second signal from the wireless device; executing code on a processing apparatus of the device to generate a reconstructed interference signal, by mixing an instance of the second signal with a frequency location of an interfering harmonic from the second carrier frequency band falling in the first carrier frequency band; and removing the reconstructed interference signal from the first signal.
Abstract: A system, method, and computer program product are provided for preparing a substrate post. In use, a first solder mask is applied to a substrate. Additionally, a post is affixed to each of one or more pads of the substrate. Further, a second solder mask is applied to the substrate.
Type:
Application
Filed:
April 3, 2012
Publication date:
October 3, 2013
Applicant:
NVIDIA CORPORATION
Inventors:
Leilei Zhang, Abraham F. Yee, Shantanu Kalchuri, Zuhair Bokharey
Abstract: A system and method are provided for performing the retransmission of data in a network. Included is an offload engine in communication with system memory and a network. The offload engine serves for managing the retransmission of data transmitted in the network.
Type:
Grant
Filed:
December 19, 2003
Date of Patent:
October 1, 2013
Assignee:
NVIDIA Corporation
Inventors:
John Shigeto Minami, Michael Ward Johnson, Andrew Currid, Mrudula Kanuri
Abstract: Methods, apparatuses, and systems are presented for performing asynchronous communications involving using an asynchronous interface to send signals between a source device and a plurality of client devices, the source device and the plurality of client devices being part of a processing unit capable of performing graphics operations, the source device being coupled to the plurality of client devices using the asynchronous interface, wherein the asynchronous interface includes at least one request signal, at least one address signal, at least one acknowledge signal, and at least one data signal, and wherein the asynchronous interface operates in accordance with at least one programmable timing characteristic associated with the source device.
Type:
Grant
Filed:
August 10, 2006
Date of Patent:
October 1, 2013
Assignee:
NVIDIA Corporation
Inventors:
Lincoln G. Garlick, Richard A. Silkebakken, Prakash G. Apte, Paolo E. Sabella, Samuel H. Duncan, Dennis K. Ma, Sean J. Treichler
Abstract: A computer-implemented graphics system has a mode of operation in which primitive coverage information is generated by a rasterizer for real sample locations and virtual sample locations for use in anti-aliasing. An individual pixel includes a single real sample location and at least one virtual sample location. If the coverage information cannot be changed by a pixel shader, then the rasterizer can write the coverage information to a framebuffer. If, however, the coverage information can be changed by the shader, then the rasterizer sends the coverage information to the shader.
Type:
Grant
Filed:
December 20, 2006
Date of Patent:
October 1, 2013
Assignee:
NVIDIA Corporation
Inventors:
Edward A. Hutchins, Christopher D. S. Donham, Gary C. King, Michael J. M. Toksvig
Abstract: The present invention is a flexible input/output translation system and method that conserves chip pin resources while permitting dynamic changes to processor support operations. An input/output translator of the present invention includes a consolidated indication port, translation logic, a plurality of translated indication ports, and an initialization port. The consolidated indication port receives a consolidated indication signal (e.g., indicating a desired voltage level) from a general purpose input/output port of a processor. The translation logic translates the consolidated indication signal into a plurality of translated indication signals. The plurality of translated indication ports communicate the plurality of translated indication signals. The initialization port receives an initialization signal.