Abstract: A system, method, and computer program product are provided for computing values for pixels in an image plane. In use, a low discrepancy sequence associated with an image plane is identified. Additionally, a function with the set of pixels of the image plane as a domain is determined. Further, a value is computed for each pixel in the image plane, utilizing the low discrepancy sequence and the function with the set of pixels of the image plane as a domain.
Type:
Application
Filed:
April 16, 2013
Publication date:
May 29, 2014
Applicant:
NVIDIA Corporation
Inventors:
Matthias Raab, Carsten Alexander Wächter, Alexander Keller
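As an illustration of the idea in the abstract above, the sketch below enumerates a single two-dimensional Halton sequence (a common low discrepancy sequence) over the whole image plane and maps each sample to a pixel. The Halton construction, the scaling map from samples to pixels, and the placeholder integrand are assumptions chosen for illustration, not the claimed method.

```c
/* Minimal sketch: one global low discrepancy sequence is enumerated, and a simple
 * scaling of the first two dimensions plays the role of the "function with the set
 * of pixels of the image plane as a domain", assigning each sample to a pixel. */
#include <stdio.h>

#define WIDTH  4
#define HEIGHT 4

/* Radical inverse in base 2 (van der Corput sequence). */
static double radical_inverse_base2(unsigned int i)
{
    double result = 0.0, f = 0.5;
    while (i) {
        result += f * (i & 1u);
        i >>= 1;
        f *= 0.5;
    }
    return result;
}

/* Radical inverse in base 3; together with base 2 this gives a 2-D Halton sequence. */
static double radical_inverse_base3(unsigned int i)
{
    double result = 0.0, f = 1.0 / 3.0;
    while (i) {
        result += f * (i % 3u);
        i /= 3u;
        f /= 3.0;
    }
    return result;
}

int main(void)
{
    double sum[HEIGHT][WIDTH] = {{0}};
    int    count[HEIGHT][WIDTH] = {{0}};
    const unsigned int num_samples = 1u << 16;

    for (unsigned int i = 0; i < num_samples; ++i) {
        double u = radical_inverse_base2(i);   /* sample position in [0,1)^2 */
        double v = radical_inverse_base3(i);
        int px = (int)(u * WIDTH);             /* map the sample to a pixel */
        int py = (int)(v * HEIGHT);
        double value = u * v;                  /* placeholder integrand */
        sum[py][px] += value;
        count[py][px] += 1;
    }
    for (int y = 0; y < HEIGHT; ++y)
        for (int x = 0; x < WIDTH; ++x)
            printf("pixel (%d,%d): %f\n", x, y, sum[y][x] / count[y][x]);
    return 0;
}
```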
Abstract: A system, method, and computer program product are provided for debugging graphics programs via a system with a single graphics processing unit. The method includes the steps of storing an initial state of an application programming interface (API) context in a memory, intercepting a stream of API commands associated with a frame, transmitting the stream of API commands to a software layer that implements the API to render the frame, and, in response to a breakpoint, storing a graphics processing unit context in the memory. The initial state of the API context corresponds to the start of the frame, and the stream of API commands is generated by a graphics application.
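The sketch below illustrates the flow just described: save the API context at the start of a frame, forward each intercepted command to the layer that implements the API, and snapshot the GPU context when a breakpoint is reached. None of these types or stub functions belong to a real graphics API; they are placeholders invented for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct { unsigned char blob[256]; } ApiContext;   /* placeholder state */
typedef struct { int opcode; int args[4]; } ApiCommand;    /* placeholder command */

/* Stubs standing in for an imaginary runtime. */
static void capture_api_context(ApiContext *out)  { memset(out, 0, sizeof *out); }
static void capture_gpu_context(ApiContext *out)  { memset(out, 0xAB, sizeof *out); }
static void forward_to_api_implementation(const ApiCommand *cmd) { (void)cmd; }
static bool is_breakpoint(const ApiCommand *cmd)  { return cmd->opcode == 0; }

typedef struct {
    ApiContext initial_state;   /* API context at the start of the frame */
    ApiContext gpu_state;       /* GPU context captured at the breakpoint */
    bool       hit_breakpoint;
} FrameCapture;

static void replay_frame(const ApiCommand *stream, size_t count, FrameCapture *capture)
{
    capture_api_context(&capture->initial_state);      /* save the initial API state */
    capture->hit_breakpoint = false;

    for (size_t i = 0; i < count; ++i) {
        forward_to_api_implementation(&stream[i]);      /* intercept and forward */
        if (!capture->hit_breakpoint && is_breakpoint(&stream[i])) {
            capture_gpu_context(&capture->gpu_state);   /* snapshot the GPU context */
            capture->hit_breakpoint = true;
        }
    }
}

int main(void)
{
    ApiCommand frame[] = { { 1, {0} }, { 0, {0} } };    /* opcode 0 acts as the breakpoint */
    FrameCapture capture;
    replay_frame(frame, 2, &capture);
    return capture.hit_breakpoint ? 0 : 1;
}
```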
Abstract: Attributes of access requests can be used to distinguish one set of access requests from another set of access requests. A prefetcher can determine a pattern for each set of access requests and then prefetch cache lines accordingly. In an embodiment in which there are multiple caches, the prefetcher can determine a destination for prefetched cache lines associated with a respective set of access requests. For example, the prefetcher can prefetch one set of cache lines into one cache, and another set of cache lines into another cache. Also, the prefetcher can determine a prefetch distance for each set of access requests. For example, the prefetch distances for the sets of access requests can be different.
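The sketch below shows one hypothetical way to keep separate prefetch state per set of access requests, keyed by an attribute of the request (here, a stream id). Pattern detection is reduced to simple stride detection, and the destination is just an enum naming which cache should receive the prefetched lines; both are illustrative simplifications rather than the claimed design.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_STREAMS 4
#define LINE_SIZE   64u

typedef enum { DEST_L1, DEST_L2 } CacheDest;

typedef struct {
    uint64_t  last_addr;     /* last demand address seen for this set */
    int64_t   stride;        /* detected stride between consecutive requests */
    unsigned  distance;      /* how many lines ahead to prefetch for this set */
    CacheDest dest;          /* which cache receives the prefetched lines */
} StreamState;

static StreamState streams[MAX_STREAMS];

/* Stand-in for issuing a prefetch to the memory hierarchy. */
static void issue_prefetch(uint64_t addr, CacheDest dest)
{
    printf("prefetch line 0x%llx into %s\n",
           (unsigned long long)(addr & ~(uint64_t)(LINE_SIZE - 1)),
           dest == DEST_L1 ? "L1" : "L2");
}

/* Called on every demand access; the stream id is the attribute that
 * distinguishes one set of requests from another. */
static void on_access(unsigned stream_id, uint64_t addr)
{
    StreamState *s = &streams[stream_id % MAX_STREAMS];
    int64_t new_stride = (int64_t)(addr - s->last_addr);

    if (s->last_addr != 0 && new_stride == s->stride && s->stride != 0) {
        /* Pattern confirmed: prefetch 'distance' lines ahead along the stride. */
        for (unsigned d = 1; d <= s->distance; ++d)
            issue_prefetch(addr + (uint64_t)(s->stride * (int64_t)d), s->dest);
    }
    s->stride = new_stride;
    s->last_addr = addr;
}

int main(void)
{
    streams[0] = (StreamState){ 0, 0, 2, DEST_L1 };   /* short distance, into L1 */
    streams[1] = (StreamState){ 0, 0, 8, DEST_L2 };   /* longer distance, into L2 */
    for (uint64_t i = 0; i < 4; ++i) {
        on_access(0, 0x1000 + i * 64);    /* one set of access requests */
        on_access(1, 0x8000 + i * 128);   /* another set with its own pattern */
    }
    return 0;
}
```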
Abstract: A system, method, and computer program product are provided for transposing a matrix. In use, a matrix is identified. Additionally, the matrix is transposed utilizing row-wise operations and column-wise operations, where the row-wise operations and the column-wise operations are performed independently.
Type:
Application
Filed:
October 24, 2013
Publication date:
May 29, 2014
Applicant:
NVIDIA Corporation
Inventors:
Bryan Christopher Catanzaro, Manjunath Kudlur
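As an illustration of the transposition idea in the abstract above, the sketch below transposes a square matrix using only row-wise and column-wise operations, each touching a single row or column and therefore independent of the others. The particular three-step rotation scheme is a standard decomposition chosen for illustration; whether it matches the claimed method is not established by the abstract.

```c
#include <stdio.h>

#define N 3

static void print(const int m[N][N])
{
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) printf("%2d ", m[i][j]);
        printf("\n");
    }
    printf("\n");
}

int main(void)
{
    int a[N][N] = { {0, 1, 2}, {3, 4, 5}, {6, 7, 8} };
    int t[N][N], u[N][N], b[N][N];

    /* Step 1 (row-wise): rotate row i left by i positions. */
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            t[i][j] = a[i][(i + j) % N];

    /* Step 2 (column-wise): rotate column j down by j positions. */
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            u[i][j] = t[((i - j) % N + N) % N][j];

    /* Step 3 (row-wise): within row i, element j is taken from column (i - j) mod N. */
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            b[i][j] = u[i][((i - j) % N + N) % N];

    print(a);   /* original matrix */
    print(b);   /* its transpose */
    return 0;
}
```

Because each row operation reads and writes only its own row, and each column operation only its own column, the three passes can be executed with full independence across rows and columns, which is the property the abstract emphasizes.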
Abstract: A static random-access memory (SRAM) module includes a column select (RSEL) driver coupled to an input/output (I/O) circuit by an RSEL line. The I/O circuit is configured to read bit line signals from a bit cell within the SRAM module. During a read operation, the RSEL driver pulls the RSEL line to zero in order to cause p-type metal-oxide-semiconductors (PMOSs) within the I/O circuit to sample the bit line signals output by the bit cell. In response, an aggressor driver drives the RSEL line to a negative voltage, thereby reducing the resistance of the PMOSs within the I/O circuit.
Abstract: Cache hit information is used to manage (e.g., cap) the prefetch distance for a cache. In an embodiment in which there is a first cache and a second cache, where the second cache (e.g., a level two cache) has greater latency than the first cache (e.g., a level one cache), a prefetcher prefetches cache lines to the second cache and is configured to receive feedback from that cache. The feedback indicates whether an access request issued in response to a cache miss in the first cache results in a cache hit in the second cache. The prefetch distance for the second cache is determined according to the feedback.
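The sketch below shows one way hit/miss feedback from the second-level cache could steer the prefetch distance. The policy shown (grow the distance on L2 misses, shrink it toward a cap when L1 misses keep hitting in L2) and all constants are illustrative assumptions, not the claimed policy.

```c
#include <stdio.h>

#define MAX_DISTANCE 16u
#define MIN_DISTANCE 1u

static unsigned prefetch_distance = 4u;

/* Called for each access request that missed in the L1 and was sent to the L2. */
static void on_l2_feedback(int hit_in_l2)
{
    if (hit_in_l2) {
        /* Prefetched lines are arriving in time; no need to run further ahead. */
        if (prefetch_distance > MIN_DISTANCE)
            prefetch_distance--;
    } else {
        /* Demand requests are outrunning the prefetcher; run further ahead. */
        if (prefetch_distance < MAX_DISTANCE)
            prefetch_distance++;
    }
}

int main(void)
{
    int feedback[] = { 0, 0, 0, 1, 1, 0, 1, 1, 1 };  /* 1 = hit in L2 */
    for (unsigned i = 0; i < sizeof feedback / sizeof feedback[0]; ++i) {
        on_l2_feedback(feedback[i]);
        printf("after feedback %u: distance = %u\n", i, prefetch_distance);
    }
    return 0;
}
```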
Abstract: A method, computer program product, and system are provided for multi-input bitwise logical operations. The method includes the steps of receiving a multi-input bitwise logical operation instruction that specifies two or more input operands and a function operand, where a first input operand of the two or more input operands comprises a number of bits and each bit has a corresponding bit in each of the additional input operands. The function operand is written to a lookup table. Then, the lookup table is accessed for each set of corresponding input operand bits in the two or more input operands to generate an output for the multi-input bitwise logical operation instruction.
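A minimal software sketch of the lookup-table idea follows: the function operand encodes an 8-entry truth table, and for each bit position the three corresponding input bits form the index used to read the table. The bit ordering of the index (the first operand as the most significant bit) is an arbitrary choice for this illustration.

```c
#include <stdio.h>
#include <stdint.h>

static uint32_t bitwise_lop3(uint32_t a, uint32_t b, uint32_t c, uint8_t lut)
{
    uint32_t result = 0;
    for (int bit = 0; bit < 32; ++bit) {
        /* The three corresponding input bits form a 3-bit index into the table. */
        unsigned idx = (((a >> bit) & 1u) << 2) |
                       (((b >> bit) & 1u) << 1) |
                       ((c >> bit) & 1u);
        result |= (uint32_t)((lut >> idx) & 1u) << bit;
    }
    return result;
}

int main(void)
{
    uint32_t a = 0xF0F0F0F0u, b = 0xCCCCCCCCu, c = 0xAAAAAAAAu;
    /* With the index ordering above, 0x80 encodes a & b & c and 0x96 encodes a ^ b ^ c. */
    printf("a & b & c = 0x%08X\n", (unsigned)bitwise_lop3(a, b, c, 0x80));
    printf("a ^ b ^ c = 0x%08X\n", (unsigned)bitwise_lop3(a, b, c, 0x96));
    return 0;
}
```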
Abstract: A system, process, and computer program product are provided for sampling a hierarchical depth map. An approach for sampling the hierarchical depth map includes the steps of generating a hierarchical depth map and reading a value associated with a sample pixel from a target level of the hierarchical depth map based on a difference between the sample pixel and a target pixel. The hierarchical depth map includes at least two levels.
Type:
Application
Filed:
November 26, 2012
Publication date:
May 29, 2014
Applicant:
NVIDIA CORPORATION
Inventors:
Morgan McGuire, David Patrick Luebke, Michael Thomas Mara
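The sketch below illustrates a hierarchical depth map with several levels: each texel of level k+1 stores the maximum depth of a 2x2 block of level k, and a sample is read from the level chosen by the offset between the sample pixel and the target pixel. The max reduction and the log2-based level selection are assumptions made for this illustration.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define SIZE   8          /* level 0 is SIZE x SIZE */
#define LEVELS 4          /* 8x8, 4x4, 2x2, 1x1 */

static float depth[LEVELS][SIZE][SIZE];

static float maxf(float x, float y) { return x > y ? x : y; }

/* Build levels 1..LEVELS-1 from level 0 by 2x2 max reduction. */
static void build_hierarchy(void)
{
    for (int k = 1; k < LEVELS; ++k) {
        int dim = SIZE >> k;
        for (int y = 0; y < dim; ++y)
            for (int x = 0; x < dim; ++x)
                depth[k][y][x] = maxf(
                    maxf(depth[k - 1][2 * y][2 * x],     depth[k - 1][2 * y][2 * x + 1]),
                    maxf(depth[k - 1][2 * y + 1][2 * x], depth[k - 1][2 * y + 1][2 * x + 1]));
    }
}

/* Read the depth for a sample pixel from the level chosen by its distance
 * to the target pixel. */
static float sample_hierarchy(int sx, int sy, int tx, int ty)
{
    int dx = abs(sx - tx), dy = abs(sy - ty);
    int d = dx > dy ? dx : dy;
    int level = d <= 1 ? 0 : (int)ceil(log2((double)d));
    if (level > LEVELS - 1) level = LEVELS - 1;
    return depth[level][sy >> level][sx >> level];
}

int main(void)
{
    for (int y = 0; y < SIZE; ++y)
        for (int x = 0; x < SIZE; ++x)
            depth[0][y][x] = (float)(x + y) / (2 * SIZE);   /* synthetic depth ramp */
    build_hierarchy();
    printf("near sample: %f\n", sample_hierarchy(3, 3, 4, 3));   /* reads level 0 */
    printf("far sample:  %f\n", sample_hierarchy(7, 7, 0, 0));   /* reads a coarse level */
    return 0;
}
```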
Abstract: Circuits, methods, and systems are provided that reduce or eliminate data transfers between a system memory and a graphics processor under certain conditions. After inactivity by a user of an electronic device is detected, the color fidelity of pixels being displayed is reduced. Color fidelity can be reduced by compressing pixel values, and the compression may be non-lossless; for example, pixel data bits may be truncated. The degree of compression can be progressively increased for longer durations of inactivity, and this progression may be limited by a threshold. Inactivity may be detected by a lack of input from devices such as a keyboard, pen, mouse, or other input device. Once activity is resumed, uncompressed pixel data, or pixel data that is compressed in a lossless manner, is displayed.
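The sketch below illustrates the progressive-truncation idea: the number of low-order bits dropped from each color channel grows with idle time, up to a fixed cap. The specific schedule (one extra bit per 30 seconds of inactivity, capped at 4 bits) is an invented example, not a value taken from the abstract.

```c
#include <stdio.h>

#define MAX_TRUNCATED_BITS 4u

/* Number of low-order bits to drop for a given idle duration. */
static unsigned truncation_bits(double idle_seconds)
{
    unsigned bits = (unsigned)(idle_seconds / 30.0);
    return bits > MAX_TRUNCATED_BITS ? MAX_TRUNCATED_BITS : bits;
}

/* Truncate one 8-bit color channel; with bits == 0 the value is unchanged. */
static unsigned char truncate_channel(unsigned char value, unsigned bits)
{
    return (unsigned char)(value & (0xFFu << bits));
}

int main(void)
{
    unsigned char channel = 0xB7;
    for (double idle = 0.0; idle <= 150.0; idle += 30.0)
        printf("idle %3.0fs -> %u bits dropped -> 0x%02X\n",
               idle, truncation_bits(idle),
               (unsigned)truncate_channel(channel, truncation_bits(idle)));
    return 0;
}
```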
Abstract: A pixel processing system and method permit complicated three-dimensional images to be rendered with shallow graphics pipelines having reduced gate counts, and also facilitate power conservation. Pixel packet information, including pixel surface attribute values, is retrieved in a single unified data fetch stage. At the data fetch pipestage, a determination may be made as to whether the pixel packet information contributes to an image display presentation (e.g., a depth comparison of Z values is performed to determine if the pixel is occluded). A pixel packet status indicator (e.g., a kill bit) is set in the sideband portion of a pixel packet, and the pixel packet is forwarded for processing in accordance with the pixel packet status indicator.
Abstract: A pixel processing system and method permit complicated three-dimensional images to be rendered with shallow graphics pipelines having reduced gate counts, and facilitate power conservation, by utilizing a single unified data fetch stage (e.g., a unified data fetch module) that retrieves a variety of different pixel surface attribute values for different attribute types (e.g., depth, color, and/or texture values) in a single stage. Different types of pixel surface attribute data (e.g., depth, color, texture) associated with multiple graphics processing functions (e.g., color blending, texture mapping, etc.) are retrieved in the single unified data fetch graphics pipeline stage. The pixel packet rows including the pixel surface attribute values are forwarded to other graphics pipeline stages for single-thread processing (e.g., to a universal arithmetic logic unit capable of performing multiple graphics functions on the pixel surface attribute values).
Abstract: A method for using a programmable DMA engine to implement memory transfers and video processing for a video processor. A DMA control program is configured for controlling DMA memory transfers between a frame buffer memory and a video processor. The DMA control program is stored in the DMA engine. A DMA request can be received from the video processor. The DMA control program is executable to implement the DMA request for the video processor. The DMA engine is operable to execute low-level commands for accessing the frame buffer memory to implement a high-level command.
Type:
Grant
Filed:
November 4, 2005
Date of Patent:
May 27, 2014
Assignee:
NVIDIA Corporation
Inventors:
Stephen D. Lew, Shirish Gadre, Ashish Karandikar, Franciscus W. Sijstermans
Abstract: A method of displaying graphics data is described. The method involves accessing the graphics data in a memory subsystem associated with a first graphics subsystem. The graphics data is transmitted to a second graphics subsystem, where it is displayed on a monitor coupled to the second graphics subsystem.
Type:
Grant
Filed:
August 4, 2008
Date of Patent:
May 27, 2014
Assignee:
Nvidia Corporation
Inventors:
Stephen Lew, Bruce R. Intihar, Abraham B. de Waal, David G. Reed, Tony Tamasi, David Wyatt, Franck R. Diard, Brad Simeral
Abstract: Detailed herein are approaches to enabling conditional execution of instructions in a graphics pipeline. In one embodiment, a method of conditional execution controller operation is detailed. The method involves configuring the conditional execution controller to evaluate a conditional test. A pixel data packet is received into the conditional execution controller and evaluated with reference to the conditional test. A conditional execution flag associated with the pixel data packet is set to indicate whether a conditional operation should be performed on the pixel data packet.
Type:
Grant
Filed:
August 15, 2007
Date of Patent:
May 27, 2014
Assignee:
NVIDIA Corporation
Inventors:
Justin Michael Mahan, Edward A. Hutchins
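The sketch below illustrates the conditional execution flag described above: a controller evaluates its configured test against a pixel data packet and records the outcome in a flag, and a later stage performs its operation only when the flag allows it. The packet layout and the depth-comparison test are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    float depth;
    float color[4];
    bool  conditional_exec;   /* flag consulted by downstream stages */
} PixelPacket;

typedef struct {
    float reference_depth;    /* the configured conditional test */
} ConditionalController;

/* Evaluate the test and record the outcome in the packet's flag. */
static void evaluate_condition(const ConditionalController *ctrl, PixelPacket *p)
{
    p->conditional_exec = (p->depth < ctrl->reference_depth);
}

/* A downstream stage that only operates on packets whose flag is set. */
static void conditional_shade(PixelPacket *p)
{
    if (!p->conditional_exec)
        return;                       /* conditional operation skipped */
    for (int i = 0; i < 4; ++i)
        p->color[i] *= 0.5f;          /* placeholder shading operation */
}

int main(void)
{
    ConditionalController ctrl = { 0.5f };
    PixelPacket near_pixel = { 0.25f, {1, 1, 1, 1}, false };
    PixelPacket far_pixel  = { 0.75f, {1, 1, 1, 1}, false };

    evaluate_condition(&ctrl, &near_pixel);
    evaluate_condition(&ctrl, &far_pixel);
    conditional_shade(&near_pixel);   /* flag set: operation performed */
    conditional_shade(&far_pixel);    /* flag clear: operation skipped */
    printf("near red = %.2f, far red = %.2f\n", near_pixel.color[0], far_pixel.color[0]);
    return 0;
}
```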
Abstract: A flicker band automated detection system and method are presented. In one embodiment, an incidental motion mitigation exposure setting method includes receiving image input information; performing a motion mitigating flicker band automatic detection process; and implementing exposure settings based upon results of the motion mitigating flicker band automatic detection process. The auto flicker band detection process includes performing a motion mitigating process on an illumination intensity indication. Content impacts on the motion mitigated illumination intensity indication are minimized. The motion mitigated illumination intensity indication is binarized. A correlation between the motion mitigated illumination intensity and a reference illumination intensity frequency is established.
Type:
Grant
Filed:
February 9, 2007
Date of Patent:
May 27, 2014
Assignee:
NVIDIA Corporation
Inventors:
Shang-Hung Lin, Hu He, Ignatius B. Tjandrasuwita
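The sketch below follows the flow described in the abstract above in simplified form: derive a per-row illumination intensity profile, suppress scene content by subtracting a local average (a stand-in for the motion mitigating step), binarize the residual, and correlate it with a reference pattern at the expected flicker band period. The image size, band period, and synthetic frame are illustrative assumptions.

```c
#include <math.h>
#include <stdio.h>

#define PI          3.14159265358979323846
#define ROWS        64
#define COLS        64
#define BAND_PERIOD 16     /* rows per flicker band in this synthetic example */

int main(void)
{
    static float image[ROWS][COLS];
    float profile[ROWS], residual[ROWS];
    int   binary[ROWS], reference[ROWS];

    /* Synthetic frame: a brightness ramp (content) plus sinusoidal flicker bands. */
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            image[r][c] = 0.5f + 0.002f * r +
                          0.1f * (float)sin(2.0 * PI * r / BAND_PERIOD);

    /* Per-row illumination intensity indication. */
    for (int r = 0; r < ROWS; ++r) {
        float sum = 0.0f;
        for (int c = 0; c < COLS; ++c) sum += image[r][c];
        profile[r] = sum / COLS;
    }

    /* Minimize content impact: subtract a sliding average over one band period. */
    for (int r = 0; r < ROWS; ++r) {
        float sum = 0.0f;
        int   n = 0;
        for (int k = r - BAND_PERIOD / 2; k <= r + BAND_PERIOD / 2; ++k)
            if (k >= 0 && k < ROWS) { sum += profile[k]; ++n; }
        residual[r] = profile[r] - sum / n;
    }

    /* Binarize the residual and build a reference pattern at the band frequency. */
    for (int r = 0; r < ROWS; ++r) {
        binary[r]    = residual[r] > 0.0f ? 1 : -1;
        reference[r] = sin(2.0 * PI * r / BAND_PERIOD) > 0.0 ? 1 : -1;
    }

    /* Correlate; a value near 1 indicates flicker bands at the reference frequency. */
    int match = 0;
    for (int r = 0; r < ROWS; ++r) match += binary[r] * reference[r];
    printf("correlation = %.2f\n", (double)match / ROWS);
    return 0;
}
```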
Abstract: Cyclic redundancy check (CRC) values are efficiently calculated using an improved linear feedback shift register (LFSR) circuit. CRC value generation is separated into two sub-calculations, which are then combined to form a final CRC value. A programmable XOR engine performs logic functions via a table lookup rather than via a random logic circuit. LCRC and ECRC calculations are performed using a single shared LFSR circuit. Multiple links share the same CRC value generator. One advantage of the present invention is that CRC values are generated using smaller and fewer LFSR circuits relative to conventional circuit designs. As a result, a CRC value generator utilizing the disclosed techniques consumes less surface area of an integrated circuit and consumes less power, resulting in cooler operation.
Type:
Grant
Filed:
July 19, 2012
Date of Patent:
May 27, 2014
Assignee:
NVIDIA Corporation
Inventors:
Eric Lyell Hill, Richard L. Schober, Jr., Hungse Cha
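As a software analogue of computing a CRC by table lookup rather than bit-serial LFSR logic, the sketch below builds a 256-entry table from the polynomial once, then folds each input byte into the running remainder with one lookup and one XOR. The reflected CRC-32 polynomial 0xEDB88320 is the familiar Ethernet choice, used here only as an example configuration; it is not necessarily the LCRC/ECRC configuration referenced in the abstract.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint32_t crc_table[256];

/* Build the lookup table once from the generator polynomial. */
static void build_crc_table(void)
{
    for (uint32_t i = 0; i < 256; ++i) {
        uint32_t r = i;
        for (int bit = 0; bit < 8; ++bit)
            r = (r & 1u) ? (r >> 1) ^ 0xEDB88320u : r >> 1;
        crc_table[i] = r;
    }
}

/* Table-driven CRC-32: one table lookup and one XOR per input byte. */
static uint32_t crc32(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    while (len--)
        crc = (crc >> 8) ^ crc_table[(crc ^ *p++) & 0xFFu];
    return crc ^ 0xFFFFFFFFu;
}

int main(void)
{
    const char msg[] = "123456789";
    build_crc_table();
    /* The CRC-32 check value of "123456789" is the well-known 0xCBF43926. */
    printf("crc32 = 0x%08X\n", (unsigned)crc32(msg, strlen(msg)));
    return 0;
}
```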
Abstract: Described are data structures, and methodology for forming same, for network protocol processing. A method for creating data structures for firewalling and network address translating is described. A method for creating data structures for physical layer addressing is described. A method for security protocol support using a data structure is described. A method for creating at least one data structure sized responsive to whether a firewall is activated is described. A data structure for routing packets is described. A method of forming hashing table chains is described. Additionally, a method and apparatus for tracking packet states are described. More particularly, Transmission Control Protocol ("TCP") tracking of states for packets is described. In an embodiment, a division between software states and hardware states is made as a packet is processed by both software and hardware. Additionally, a method and apparatus for network protocol processing are described.
Type:
Grant
Filed:
December 3, 2007
Date of Patent:
May 27, 2014
Assignee:
NVIDIA Corporation
Inventors:
Thomas A. Maufer, Paul J. Gyugyi, Sameer Nanda, Paul J. Sidenblad
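The sketch below shows a chained hash table for tracking per-connection packet state, in the spirit of the data structures described above. The 5-tuple key, the hash function, and the tiny TCP state enum are illustrative simplifications, not the claimed structures.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define BUCKETS 256

typedef enum { ST_NEW, ST_SYN_SEEN, ST_ESTABLISHED, ST_CLOSED } TcpState;

typedef struct Connection {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  protocol;
    TcpState state;
    struct Connection *next;    /* hash chain link */
} Connection;

static Connection *buckets[BUCKETS];

static unsigned hash_tuple(uint32_t sip, uint32_t dip,
                           uint16_t sp, uint16_t dp, uint8_t proto)
{
    uint32_t h = sip ^ dip ^ ((uint32_t)sp << 16) ^ dp ^ proto;
    h ^= h >> 16;
    return h % BUCKETS;
}

/* Find an existing connection or add a new one to the head of its chain. */
static Connection *lookup_or_insert(uint32_t sip, uint32_t dip,
                                    uint16_t sp, uint16_t dp, uint8_t proto)
{
    unsigned b = hash_tuple(sip, dip, sp, dp, proto);
    for (Connection *c = buckets[b]; c; c = c->next)
        if (c->src_ip == sip && c->dst_ip == dip &&
            c->src_port == sp && c->dst_port == dp && c->protocol == proto)
            return c;

    Connection *c = calloc(1, sizeof *c);
    c->src_ip = sip; c->dst_ip = dip;
    c->src_port = sp; c->dst_port = dp;
    c->protocol = proto;
    c->state = ST_NEW;
    c->next = buckets[b];
    buckets[b] = c;
    return c;
}

int main(void)
{
    Connection *c = lookup_or_insert(0x0A000001, 0x0A000002, 12345, 80, 6);
    c->state = ST_SYN_SEEN;                                  /* SYN observed */
    c = lookup_or_insert(0x0A000001, 0x0A000002, 12345, 80, 6);
    printf("state after lookup: %d (expected %d)\n", c->state, ST_SYN_SEEN);
    return 0;
}
```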
Abstract: A memory controller and a dynamic random access memory (DRAM) interface are disclosed. The memory controller implements signals for the DRAM interface. The DRAM interface includes a differential clock signal, an uncalibrated parallel command bus, and a high-speed, serial address bus. The command bus may be used to initiate communication with the memory device upon power-up and to initiate calibration of the address bus.
Abstract: An aspect of the present invention stores files of a source directory in a target directory. In an embodiment, a unique identifier is generated for each of the files, and a new location and a new name are generated for the file. The new location represents the specific sub-directory of the target directory at which the file is stored. The file is stored at the new location with the new name. Such storing in a new location with a new name can be advantageously used to address various issues in corresponding environments. In one environment, the target directory is stored in an embedded system with limited resources, and the source directory contains several files with substantially overlapping names (which can require substantial resources when searching for a specific file). The unique identifiers are generated according to the media transfer protocol (MTP), which generates an object identifier for each of the files/directories, etc.
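The sketch below illustrates the renaming scheme just described: each source file is assigned a unique identifier, the identifier selects a sub-directory of the target, and the file is stored there under a name derived from the identifier. The counter-based identifier, the modulo bucketing, and the path format are illustrative choices; an MTP stack would supply its own object identifiers.

```c
#include <stdio.h>
#include <stdint.h>

#define SUBDIRS 16u

static uint32_t next_identifier = 1;

/* Compute the target path for one source file. */
static void target_path_for(const char *source_name, char *out, size_t out_size)
{
    uint32_t id = next_identifier++;              /* unique per file */
    unsigned subdir = id % SUBDIRS;               /* new location: bucket sub-directory */
    (void)source_name;                            /* original name no longer used for lookup */
    snprintf(out, out_size, "/target/%02u/%08X.dat", subdir, (unsigned)id);
}

int main(void)
{
    const char *sources[] = { "IMG_0001.JPG", "IMG_0001 (copy).JPG", "notes.txt" };
    char path[64];
    for (unsigned i = 0; i < 3; ++i) {
        target_path_for(sources[i], path, sizeof path);
        printf("%-22s -> %s\n", sources[i], path);
    }
    return 0;
}
```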
Abstract: A method for implementing command acceleration. The method includes receiving a first set of instructions from a first processor, wherein the first set of instructions are formatted in accordance with a microarchitecture of the first processor. The first set of instructions are translated into a second set of instructions, wherein the second set of instructions are formatted in accordance with a microarchitecture of a second processor. The second set of instructions are then transmitted to the second processor for execution by the second processor.
Type:
Grant
Filed:
November 4, 2005
Date of Patent:
May 27, 2014
Assignee:
NVIDIA Corporation
Inventors:
Ashish Karandikar, Shirish Gadre, Amir H. Salek
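The sketch below illustrates the translation step described in the abstract above: instructions formatted for a first processor's microarchitecture are rewritten into a second processor's format before being handed off for execution. Both instruction formats and the opcode mapping are invented for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* First processor's instruction format: packed 32-bit words. */
typedef struct { uint32_t word; } CpuInstr;          /* [31:24] = opcode, [23:0] = immediate */

/* Second processor's instruction format: explicit fields. */
typedef struct { uint8_t opcode; uint32_t operand; } AccelInstr;

/* Opcode mapping from the first microarchitecture to the second. */
static uint8_t translate_opcode(uint8_t cpu_opcode)
{
    switch (cpu_opcode) {
    case 0x01: return 0x10;   /* load    */
    case 0x02: return 0x11;   /* store   */
    case 0x03: return 0x20;   /* compute */
    default:   return 0xFF;   /* unsupported */
    }
}

/* Translate a batch of first-processor instructions into second-processor form. */
static void translate(const CpuInstr *in, AccelInstr *out, int count)
{
    for (int i = 0; i < count; ++i) {
        out[i].opcode  = translate_opcode((uint8_t)(in[i].word >> 24));
        out[i].operand = in[i].word & 0x00FFFFFFu;
    }
}

int main(void)
{
    CpuInstr   prog[2] = { { 0x01000040u }, { 0x03000002u } };
    AccelInstr accel[2];
    translate(prog, accel, 2);
    for (int i = 0; i < 2; ++i)
        printf("accel instr %d: opcode=0x%02X operand=0x%06X\n",
               i, (unsigned)accel[i].opcode, (unsigned)accel[i].operand);
    return 0;
}
```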