Patents Assigned to NVIDIA

System and method for multi-color DILU preconditioner

Patent number: 9798698

Abstract: A system and method for preconditioning or smoothing (e.g., multi-color DILU preconditioning) for iterative solving of a system of equations. The method includes accessing a matrix comprising a plurality of coefficients of a system of equations and accessing coloring information corresponding to the matrix. The method further includes determining a diagonal matrix based on the matrix and the coloring information corresponding to the matrix. The determining of the diagonal matrix may be determined in parallel on a per color basis. The method may further include determining an updated solution to the system of equations where the updated solution is determined in parallel on a per color basis using the diagonal matrix.

Type: Grant

Filed: August 13, 2012

Date of Patent: October 24, 2017

Assignee: Nvidia Corporation

Inventors: Patrice Castonguay, Robert Strzodka
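
The abstract above describes two per-color parallel phases: building a diagonal matrix, then updating the solution with it. The sketch below illustrates one common textbook formulation of DILU under a coloring; it is not taken from the patent text. It assumes a small dense matrix stored as a list of lists and a valid coloring (no two coupled unknowns share a color). The names `dilu_diagonal` and `dilu_smooth` are hypothetical. Rows of one color depend only on rows of lower colors, which is what would allow each color to be processed in parallel.

```python
def dilu_diagonal(A, colors):
    """Assumed DILU rule: E[i] = A[i][i] - sum over j with a lower color
    of A[i][j] * A[j][i] / E[j]."""
    n = len(A)
    E = [0.0] * n
    for c in sorted(set(colors)):
        # Rows sharing color c are mutually independent (parallelisable).
        for i in range(n):
            if colors[i] == c:
                s = sum(A[i][j] * A[j][i] / E[j]
                        for j in range(n) if colors[j] < c)
                E[i] = A[i][i] - s
    return E


def dilu_smooth(A, colors, E, b, x):
    """One step x <- x + M^-1 (b - A x) with M = (E + L) E^-1 (E + U),
    where L/U hold the entries coupling to lower/higher colors."""
    n = len(A)
    palette = sorted(set(colors))
    r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    y = [0.0] * n
    for c in palette:  # forward sweep: (E + L) y = r
        for i in range(n):
            if colors[i] == c:
                s = sum(A[i][j] * y[j] for j in range(n) if colors[j] < c)
                y[i] = (r[i] - s) / E[i]
    z = [0.0] * n
    for c in reversed(palette):  # backward sweep: (E + U) z = E y
        for i in range(n):
            if colors[i] == c:
                s = sum(A[i][j] * z[j] for j in range(n) if colors[j] > c)
                z[i] = y[i] - s / E[i]
    return [x[i] + z[i] for i in range(n)]
```

For a tridiagonal matrix a red-black coloring such as `[0, 1, 0, 1, ...]` is valid, and repeated calls to `dilu_smooth` drive `x` toward the solution of `A x = b` when `A` is strongly diagonally dominant.
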

Reordering buffer for memory access locality

Patent number: 9798544

Abstract: Systems and methods for scheduling instructions for execution on a multi-core processor reorder the execution of different threads to ensure that instructions specified as having localized memory access behavior are executed over one or more sequential clock cycles to benefit from memory access locality. At compile time, code sequences including memory access instructions that may be localized are delineated into separate batches. A scheduling unit ensures that multiple parallel threads are processed over one or more sequential scheduling cycles to execute the batched instructions. The scheduling unit waits to schedule execution of instructions that are not included in the particular batch until execution of the batched instructions is done so that memory access locality is maintained for the particular batch. In between the separate batches, instructions that are not included in a batch are scheduled so that threads executing non-batched instructions are also processed and not starved.

Type: Grant

Filed: December 10, 2012

Date of Patent: October 24, 2017

Assignee: NVIDIA CORPORATION

Inventors: Olivier Giroux, Jack Hilaire Choquette, Xiaogang Qiu, Robert J. Stoll

System and method for retrieving values of captured local variables for lambda functions in Java

Patent number: 9798569

Abstract: A system for and method of retrieving values of captured local variables for a lambda function in Java. In one embodiment, the system includes: (1) a Java virtual machine and (2) a captured variable retriever that interacts with the Java virtual machine and is configured to retrieve a signature of the lambda function from a classfile of a Java class containing the lambda function, compare the signature with a declaration of the lambda function to identify arguments corresponding to the captured local variables, modify the lambda function, and cause the Java virtual machine to execute the modified lambda function.

Type: Grant

Filed: February 15, 2016

Date of Patent: October 24, 2017

Assignee: Nvidia Corporation

Inventors: Michael Lai, Vinod Grover, Sean Lee, Jaydeep Marathe

Methods and apparatus for scheduling instructions using pre-decode data

Patent number: 9798548

Abstract: Systems and methods for scheduling instructions using pre-decode data corresponding to each instruction. In one embodiment, a multi-core processor includes a scheduling unit in each core for selecting instructions from two or more threads each scheduling cycle for execution on that particular core. As threads are scheduled for execution on the core, instructions from the threads are fetched into a buffer without being decoded. The pre-decode data is determined by a compiler and is extracted by the scheduling unit during runtime and used to control selection of threads for execution. The pre-decode data may specify a number of scheduling cycles to wait before scheduling the instruction. The pre-decode data may also specify a scheduling priority for the instruction. Once the scheduling unit selects an instruction to issue for execution, a decode unit fully decodes the instruction.

Type: Grant

Filed: December 21, 2011

Date of Patent: October 24, 2017

Assignee: NVIDIA Corporation

Inventors: Jack Hilaire Choquette, Robert J. Stoll, Olivier Giroux

Fast mapping table register file allocation algorithm for SIMT processors

Patent number: 9798543

Abstract: One embodiment of the present invention sets forth a technique for allocating register file entries included in a register file to a thread group. A request to allocate a number of register file entries to the thread group is received. A required number of mapping table entries included in a register file mapping table (RFMT) is determined based on the request, where each mapping table entry included in the RFMT is associated with a different plurality of register file entries included in the register file. The RFMT is parsed to locate an available mapping table entry in the RFMT for each of the required mapping table entries. For each available mapping table entry, a register file pointer is associated with an address that corresponds to a first register file entry in the plurality of register file entries associated with the available mapping table entry.

Type: Grant

Filed: September 3, 2010

Date of Patent: October 24, 2017

Assignee: NVIDIA Corporation

Inventors: Michael Fiyak, Ming Y. Siu
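
The allocation steps in the abstract above (size the request in mapping table entries, scan the table for free entries, point each one at the first register of its block) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the table is modelled as a list of booleans, each mapping entry is assumed to cover a fixed-size contiguous block of register file entries, and the names `allocate_registers` and `entries_per_mapping` are hypothetical.

```python
import math


def allocate_registers(rfmt_free, requested_entries, entries_per_mapping):
    """Reserve mapping-table entries for a thread group.

    rfmt_free[i] is True while mapping-table entry i is available.
    Returns a list of (mapping_index, first_register_address) pairs,
    or None if too few mapping entries are free.
    """
    # Required number of mapping-table entries for this request.
    needed = math.ceil(requested_entries / entries_per_mapping)
    # Parse the table for available entries, stopping once enough are found.
    found = [i for i, free in enumerate(rfmt_free) if free][:needed]
    if len(found) < needed:
        return None  # leave the table untouched on failure
    allocation = []
    for i in found:
        rfmt_free[i] = False
        # Assumption: entry i maps a contiguous block starting at this address.
        allocation.append((i, i * entries_per_mapping))
    return allocation
```

Because every mapping entry is resolved independently, the allocated blocks need not be adjacent in the register file.
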

Current-parking switching regulator downstream controller

Patent number: 9800158

Abstract: A system and method are provided for regulating a voltage level at a load. A current source generates a current and a voltage control mechanism provides a portion of the current to regulate the voltage level at the load. When the voltage level at the load is greater than a maximum voltage level, the current source is decoupled from the load and the current source is coupled to a current sink to reduce the voltage level at the load. An electric power conversion comprises the current source and the voltage control mechanism. A downstream controller is configured to control the voltage control mechanism to decouple the current source from the load and couple the current source to a current sink to reduce the voltage level at the load when the voltage level at the load is greater than a maximum voltage level.

Type: Grant

Filed: January 30, 2013

Date of Patent: October 24, 2017

Assignee: NVIDIA Corporation

Inventor: William J. Dally

Migrating pages of different sizes between heterogeneous processors

Patent number: 9798487

Abstract: One embodiment of the present invention sets forth a computer-implemented method for migrating a memory page from a first memory to a second memory. The method includes determining a first page size supported by the first memory. The method also includes determining a second page size supported by the second memory. The method further includes determining a use history of the memory page based on an entry in a page state directory associated with the memory page. The method also includes migrating the memory page between the first memory and the second memory based on the first page size, the second page size, and the use history.

Type: Grant

Filed: August 22, 2016

Date of Patent: October 24, 2017

Assignee: NVIDIA Corporation

Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Lucien Dunning, Brian Fahs, Mark Hairgrove, Chenghuan Jia, John Mashey, James M. Van Dyke

Video decoder architecture for processing out-of-order macro-blocks of a video stream

Patent number: 9794593

Abstract: A video decoder architecture for processing out-of-order macro-blocks of a video stream. A microcode engine receives compressed data representing macro-blocks of a frame of a video stream, wherein at least one macro-block is received out-of-order. The microcode engine is for buffering the compressed data and for ordering the macro-blocks of the frame in raster scan order. A digital video decoder receives the macro-blocks in raster scan order and is for decoding the macro-blocks.

Type: Grant

Filed: December 9, 2005

Date of Patent: October 17, 2017

Assignee: Nvidia Corporation

Inventors: Iole Moccagatta, Ignatius B. Tjandrasuwita, Harikrishna M. Reddy

Heuristics for improving performance in a tile based architecture

Patent number: 9792122

Abstract: One embodiment of the present invention includes a technique for processing graphics primitives in a tile-based architecture. The technique includes storing, in a buffer, a first plurality of graphics primitives and a first plurality of state bundles received from the world-space pipeline. The technique further includes determining, based on a first condition, that the first plurality of graphics primitives should be replayed from the buffer, and, in response, replaying the first plurality of graphics primitives against a first tile included in a first plurality of tiles. Replaying the first plurality of graphics primitives includes comparing each graphics primitive against the first tile to determine whether the graphics primitive intersects the first tile, determining that one or more graphics primitives intersects the first tile, and transmitting the one or more graphics primitives and one or more associated state bundles to a screen-space pipeline for processing.

Type: Grant

Filed: October 4, 2013

Date of Patent: October 17, 2017

Assignee: NVIDIA CORPORATION

Inventors: Ziyad S. Hakura, Walter R. Steiner, Cynthia Ann Edgeworth Allison, Rouslan Dimitrov, Karim M. Abdalla, Dale L. Kirkland, Emmett M. Kilgariff

Microcontroller for memory management unit

Patent number: 9792220

Abstract: One embodiment of the present invention includes a microcontroller coupled to a memory management unit (MMU). The MMU is coupled to a page table included in a physical memory, and the microcontroller is configured to perform one or more virtual memory operations associated with the physical memory and the page table. In operation, the microcontroller receives a page fault generated by the MMU in response to an invalid memory access via a virtual memory address. To remedy such a page fault, the microcontroller performs actions to map the virtual memory address to an appropriate location in the physical memory. By contrast, in prior-art systems, a fault handler would typically remedy the page fault. Advantageously, because the microcontroller executes these tasks locally with respect to the MMU and the physical memory, latency associated with remedying page faults may be decreased. Consequently, overall system performance may be increased.

Type: Grant

Filed: August 27, 2013

Date of Patent: October 17, 2017

Assignee: NVIDIA Corporation

Inventors: Cameron Buschardt, Jerome F. Duluk, Jr., John Mashey, Mark Hairgrove, James Leroy Deming, Brian Fahs

Dynamic frame repetition in a variable refresh rate system

Patent number: 9786255

Abstract: A method, computer program product, and system for adjusting a dynamic refresh frequency of a display device are disclosed. The method includes the steps of obtaining a current frame duration associated with a first image, computing, based on the current frame duration, a repetition value for a second image, and repeating presentation of the second image on a display device based on the repetition value. The logic for implementing the method may be included in a graphics processing unit or within the display device itself.

Type: Grant

Filed: May 22, 2015

Date of Patent: October 10, 2017

Assignee: NVIDIA Corporation

Inventors: Tom Verbeure, Robert Jan Schutten, Gerrit A. Slavenburg, Thomas F. Fox
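
The abstract above does not state how the repetition value is computed, so the sketch below is only one plausible reading. It assumes the panel must be refreshed at least once every `max_interval_ms` and no more often than once every `min_interval_ms`, and that the duration of the previous frame predicts how long the next image will stay on screen. The function name and parameters are hypothetical.

```python
import math


def repetition_value(frame_duration_ms, min_interval_ms, max_interval_ms):
    """Number of extra times to re-present an image so that the gap
    between refreshes stays within the panel's supported range."""
    # Fewest repeats that keep each refresh gap at or below the maximum.
    repeats = max(0, math.ceil(frame_duration_ms / max_interval_ms) - 1)
    # Back off if the resulting gap would be shorter than the panel allows.
    while repeats > 0 and frame_duration_ms / (repeats + 1) < min_interval_ms:
        repeats -= 1
    return repeats
```

With a 25 ms maximum interval, a 20 ms frame needs no repetition, while a 60 ms frame is shown three times in total (two repeats) at 20 ms spacing.
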

Influence clock data recovery settling point by applying decision feedback equalization to a crossing sample

Patent number: 9787509

Abstract: An apparatus including a receiver coupled to receive an input signal from a communication link and operable to employ decision feedback equalization to the input signal of the communication link and generate an edge sample signal. The apparatus also includes a timing recovery module coupled to the receiver and operable to receive the edge sample signal and use the edge sample signal to generate a data sampling phase signal, wherein the edge sample signal influences a settling point of the data sampling phase signal.

Type: Grant

Filed: January 5, 2016

Date of Patent: October 10, 2017

Assignee: Nvidia Corporation

Inventors: Lizhi Zhong, Vishnu Balan, Arif Al Amin, Sanjeev Maheshwari

Fourier transform for a signal to be transmitted on a random access channel

Patent number: 9788347

Abstract: Provided is a recursive method and apparatus for processing a signal for determining a plurality of frequency components of the signal, the signal being a chirp-like polyphase sequence. In one embodiment, the method includes: (1) determining a first frequency component of the plurality of frequency components, (2) determining a component factor by accessing a factor table, and (3) determining a second frequency component using the determined first frequency component and the determined component factor.

Type: Grant

Filed: June 3, 2015

Date of Patent: October 10, 2017

Assignee: Nvidia Corporation

Inventors: Tarik Tabet, Godfrey Costa, Nallepilli Ramesh
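
The recursion in the abstract above (first component, table lookup, next component) can be illustrated with a Zadoff-Chu sequence, a well-known chirp-like polyphase family. This is a sketch of a standard mathematical property of such sequences, assumed here for illustration and not taken from the patent: for odd length N and root u coprime to N, each DFT bin equals the previous bin times an N-th root of unity. The function names are hypothetical, and `pow(u, -1, n_zc)` needs Python 3.8 or later.

```python
import cmath


def zadoff_chu(u, n_zc):
    """x[n] = exp(-j*pi*u*n*(n+1)/N) for odd N, with the phase reduced mod N."""
    return [cmath.exp(-2j * cmath.pi * ((u * n * (n + 1) // 2) % n_zc) / n_zc)
            for n in range(n_zc)]


def recursive_dft(u, n_zc):
    """All N DFT bins from one direct sum plus N-1 table-driven multiplies."""
    x = zadoff_chu(u, n_zc)
    # Factor table: the N-th roots of unity exp(+2j*pi*j/N).
    table = [cmath.exp(2j * cmath.pi * j / n_zc) for j in range(n_zc)]
    u_inv = pow(u, -1, n_zc)  # modular inverse of the root index
    bins = [sum(x)]           # first frequency component, computed directly
    m = 0                     # m = u_inv * k mod N for the current bin k
    for _ in range(n_zc - 1):
        # Difference of triangular numbers T(m + u_inv) - T(m), always an integer.
        t = ((m + u_inv) * (m + u_inv + 1) - m * (m + 1)) // 2
        bins.append(bins[-1] * table[(u * t) % n_zc])
        m = (m + u_inv) % n_zc
    return bins
```

Only the first bin costs O(N); every further bin is a single complex multiplication by a looked-up factor.
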

Technique for simulating the dynamics of hair

Patent number: 9785729

Abstract: A simulation engine is configured to generate a physical simulation of a chain of particles by implementing a physics-based algorithm. The simulation engine is configured to generate a predicted position for each particle and to then adjust the predicted position of each particle based on a set of constraints associated with the physics-based algorithm. The simulation engine may then generate a predicted velocity for a given particle based on the adjusted, predicted position of that particle and based on the adjusted, predicted position of an adjacent particle.

Type: Grant

Filed: December 14, 2012

Date of Patent: October 10, 2017

Assignee: NVIDIA Corporation

Inventors: Matthias Muller-Fischer, Nuttapong Chentanez, Tae-Yong Kim
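
The predict, adjust, then derive-velocity sequence in the abstract above can be sketched for a 2D particle chain. The sketch assumes a position-based scheme with a fixed root particle and a single distance constraint per segment; the velocity correction that uses the neighbouring particle's adjustment follows a published "follow the leader" style rule and is an assumption, not a statement of the patented method. The name `step_chain` and the `damping` parameter are hypothetical.

```python
import math


def step_chain(pos, vel, rest_len, dt, gravity=(0.0, -9.8), damping=0.9):
    """Advance a chain of 2D particles by one time step; particle 0 is pinned."""
    n = len(pos)
    # 1. Predicted positions from current velocity and gravity.
    pred = [pos[0]] + [
        (pos[i][0] + dt * vel[i][0] + dt * dt * gravity[0],
         pos[i][1] + dt * vel[i][1] + dt * dt * gravity[1])
        for i in range(1, n)
    ]
    # 2. Adjust each prediction so it sits rest_len from its (adjusted) parent.
    adj = [pred[0]]
    corr = [(0.0, 0.0)]  # adjusted minus predicted, per particle
    for i in range(1, n):
        dx = pred[i][0] - adj[i - 1][0]
        dy = pred[i][1] - adj[i - 1][1]
        length = math.hypot(dx, dy) or 1.0  # guard against division by zero
        p = (adj[i - 1][0] + dx / length * rest_len,
             adj[i - 1][1] + dy / length * rest_len)
        adj.append(p)
        corr.append((p[0] - pred[i][0], p[1] - pred[i][1]))
    # 3. Velocity from the particle's own adjusted position, corrected by
    #    the adjustment applied to the next particle along the chain.
    new_vel = [(0.0, 0.0)]
    for i in range(1, n):
        nxt = corr[i + 1] if i + 1 < n else (0.0, 0.0)
        new_vel.append(((adj[i][0] - pos[i][0] - damping * nxt[0]) / dt,
                        (adj[i][1] - pos[i][1] - damping * nxt[1]) / dt))
    return adj, new_vel
```

Calling `step_chain` repeatedly on a horizontal chain lets it swing downward while every segment keeps its rest length.
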

Fault buffer for tracking page faults in unified virtual memory system

Publication number: 20170286198

Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.

Type: Application

Filed: October 16, 2013

Publication date: October 5, 2017

Applicant: NVIDIA CORPORATION

Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, Sherry Cheung, James Leroy Deming, Samuel H. Duncan, Lucien Dunning, Robert George, Arvind Gopalakrishnan, Mark Hairgrove, Chenghuan Jia, John Mashey

Hierarchical tiled caching

Patent number: 9779533

Abstract: One embodiment of the present invention includes a method for processing graphics objects. The method includes receiving a first draw-call and a second draw-call. The method also includes dividing the first draw-call into a first set of sub-draw-calls and the second draw-call into a second set of sub-draw-calls. The method further includes identifying a first screen tile. The method also includes identifying a first group of sub-draw-calls included in the first set of sub-draw-calls that overlap the first screen tile and a second group of sub-draw-calls included in the second set of sub-draw-calls that overlap the second screen tile. The method further includes causing the first group of sub-draw-calls and the second group of sub-draw-calls to be processed together.

Type: Grant

Filed: January 27, 2014

Date of Patent: October 3, 2017

Assignee: NVIDIA Corporation

Inventors: Rouslan Dimitrov, Ziyad S. Hakura

System, method, and computer program product for combining low motion blur and variable refresh rate in a display

Patent number: 9773460

Abstract: A system, method, and computer program product are provided for combining low motion blur and variable refresh rate in a display. In one embodiment, a hold-type display is operated in a first mode of operation where the hold-type display is dynamically refreshed such that the hold-type display handles updates to image frames at unpredictable times and where for each of the image frames a backlight of the hold-type display is activated for an entire duration of display of the image frame. Additionally, it is determined that at least one predefined condition has been met. Further, in response to the determination, the hold-type display is operated in a second mode of operation where the hold-type display is statically refreshed such that the hold-type display handles updates to image frames at regular intervals and where for each of the image frames the backlight of the hold-type display is flashed.

Type: Grant

Filed: October 18, 2013

Date of Patent: September 26, 2017

Assignee: NVIDIA Corporation

Inventors: Tom Verbeure, Gerrit A. Slavenburg, Thomas F. Fox, Robert Jan Schutten, Luis Mariano Lucas, Marcel Dominicus Janssens

Rendering cover geometry without internal edges

Patent number: 9773341

Abstract: One embodiment of the present invention includes techniques for rasterizing geometries. First, a processing unit defines a bounding primitive that covers the geometry and does not include any internal edges. If the bounding primitive intersects any enabled clip plane, then the processing unit generates fragments to fill a current viewport. Alternatively, the processing unit generates fragments to fill the bounding primitive. Because the rasterized region includes no internal edges, conflation artifacts caused when the number of coverage samples per pixel exceeds the number of color samples per pixel may be reduced. In prior-art techniques, reducing such conflation artifacts typically involves increasing the number of color samples per pixel to equal the number of coverage samples per pixel.

Type: Grant

Filed: August 20, 2013

Date of Patent: September 26, 2017

Assignee: NVIDIA Corporation

Inventors: Jeffrey A. Bolz, Mark J. Kilgard

Techniques for determining instruction dependencies

Patent number: 9772827

Abstract: One embodiment sets forth a method for efficiently determining memory resource dependencies between instructions included in a software application. For each instruction, a dependency analyzer uses overlapping search techniques to identify one or more overlaps between the memory elements included in the current instruction and the memory elements included in previous instructions. The dependency analyzer then maps objects included in the instructions to a set of partition elements wherein each partition element represents a set of memory elements that are functionally equivalent for dependency analysis. Subsequently, the dependency analyzer uses the set of partition elements to determine memory dependencies between the instructions at the memory element level.

Type: Grant

Filed: April 22, 2013

Date of Patent: September 26, 2017

Assignee: NVIDIA Corporation

Inventor: Julius Vanderspek

Graphics processor clock scaling based on idle time

Patent number: 9773344

Abstract: A method for graphics processor clock scaling comprises the following steps. A percentage of idle-time is calculated, based upon an elapsed idle-time and an elapsed active time. A graphics processor clock rate is reduced if the percentage of idle time is higher than a high limit threshold. The graphics processor clock rate is increased if the percentage of idle time is lower than a low limit threshold.

Type: Grant

Filed: December 12, 2012

Date of Patent: September 26, 2017

Assignee: Nvidia Corporation

Inventors: Ilan Aelion, Terje Bergstrom, Matthew R. Longnecker
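
The three steps in the clock-scaling abstract above (compute the idle percentage, lower the clock above a high limit, raise it below a low limit) map directly onto a small function. The sketch below is illustrative only: the threshold values, the step size, the clock bounds, and the function names are all assumptions, since the abstract gives none of them.

```python
def idle_percentage(idle_time, active_time):
    """Share of the sampling window spent idle, in percent."""
    total = idle_time + active_time
    if total <= 0:
        return 0.0
    return 100.0 * idle_time / total


def next_clock_mhz(current_mhz, idle_time, active_time,
                   high_limit=70.0, low_limit=30.0,
                   step_mhz=50, min_mhz=200, max_mhz=1000):
    """Pick the next clock rate from elapsed idle and active time."""
    pct = idle_percentage(idle_time, active_time)
    if pct > high_limit:
        # Mostly idle: reduce the clock, but not below the floor.
        return max(min_mhz, current_mhz - step_mhz)
    if pct < low_limit:
        # Mostly busy: increase the clock, but not above the ceiling.
        return min(max_mhz, current_mhz + step_mhz)
    return current_mhz  # between the two limits: leave the clock alone
```

Keeping a gap between the two limits acts as hysteresis, so a workload hovering near one threshold does not make the clock oscillate.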
