Patents Assigned to NVIDIA
-
Patent number: 9424383
Abstract: An integrated circuit (IC) is designed that includes one variant having a plurality of modular circuits communicatively coupled together and a second variant having a sub-set of the plurality of modular circuits. The modular circuits are then laid out on a wafer for fabricating each of the variants of the IC. The layout includes routing communicative couplings between the sub-set of the modular circuits of the second variant to the other modular circuits of the first variant in one or more metallization layers to be fabricated last. Fabricating the IC is then started, up to but not including the one or more metallization layers to be fabricated last. One or more of the plurality of variants of the IC is selected based upon a demand predicted during fabrication. Fabrication then continues with the last metallization layers of the IC according to the selected layout.
Type: Grant
Filed: January 25, 2014
Date of Patent: August 23, 2016
Assignee: NVIDIA CORPORATION
Inventor: Brian Kelleher
-
Patent number: 9424038
Abstract: A compiler-controlled technique for scheduling threads to execute different regions of a program. A compiler analyzes program code to determine a control flow graph for the program code. The control flow graph contains regions and directed edges between regions. The regions have associated execution priorities. The directed edges indicate the direction of program control flow. Each region has a thread frontier which contains one or more regions. The compiler inserts one or more update predicate mask variable instructions at the end of a region. The compiler also inserts one or more conditional branch instructions at the end of the region. The conditional branch instructions are arranged in order of execution priority of the regions in the thread frontier of the region, to enforce execution priority of the regions at runtime.
Type: Grant
Filed: December 10, 2012
Date of Patent: August 23, 2016
Assignee: NVIDIA Corporation
Inventors: Gregory Diamos, Mojtaba Mehrara
-
Patent number: 9424227
Abstract: Non-contiguous or tiled payload data are efficiently transferred between peers over a fabric. Specifically, a client transfers a byte enable message to a peer device via a mailbox mechanism, where the byte enable message specifies which bytes of the payload data being transferred via the data packet are to be written to the frame buffer on the peer device and which bytes are not to be written. The client transfers the non-contiguous or tiled payload data to the peer device. Upon receiving the payload data, the peer device writes bytes from the payload data into the target frame buffer for only those bytes enabled via the byte enable message. One advantage of the present invention is that non-contiguous or tiled data are transferred over a fabric with improved efficiency.
Type: Grant
Filed: July 3, 2012
Date of Patent: August 23, 2016
Assignee: NVIDIA CORPORATION
Inventors: Samuel H. Duncan, Dennis K. Ma, Wei-Je Huang, Gary Ward
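Illustrative sketch (not taken from the patent): the receive-side behavior described in the abstract of 9424227 can be modeled as a masked write. The function name `apply_byte_enables`, the list-of-booleans mask encoding, and the 8-byte frame buffer are assumptions made for illustration; the abstract does not specify the message format or the mailbox mechanism.

```python
def apply_byte_enables(frame_buffer, offset, payload, byte_enable):
    """Write payload into frame_buffer at offset, skipping disabled bytes.

    byte_enable is a hypothetical encoding: one boolean per payload byte.
    """
    if len(payload) != len(byte_enable):
        raise ValueError("payload and byte_enable must have the same length")
    for i, (value, enabled) in enumerate(zip(payload, byte_enable)):
        if enabled:
            frame_buffer[offset + i] = value
    return frame_buffer


# The peer first receives the byte enable message, then the payload.
frame_buffer = bytearray(8)                 # stand-in for the target frame buffer
byte_enable = [True, False, False, True]    # only the first and last bytes are written
payload = bytes([0x11, 0x22, 0x33, 0x44])
apply_byte_enables(frame_buffer, 2, payload, byte_enable)
```

Bytes at the disabled positions keep their previous contents, which is what lets a single contiguous transfer carry non-contiguous data.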
-
Patent number: 9425772
Abstract: The described systems and methods can facilitate examination of device parameters including analysis of relatively dominant characteristic impacts on delays. In one embodiment, at least some coupling components (e.g., metal layer wires, traces, lines, etc.) have a relatively dominant impact on delays and the delay is in part a function of both capacitance and resistance of the coupling component. In one embodiment, a system comprises a plurality of dominant characteristic oscillating rings, wherein each respective one of the plurality of dominant characteristic oscillating rings includes a respective dominant characteristic. Additional analysis can be performed correlating the dominant characteristic delay impact results with device fabrication and operation.
Type: Grant
Filed: June 20, 2012
Date of Patent: August 23, 2016
Assignee: NVIDIA CORPORATION
Inventors: Wojciech Jakub Poppe, Ilyas Elkin, Puneet Gupta
-
Patent number: 9424201
Abstract: One embodiment of the present invention sets forth a computer-implemented method for migrating a memory page from a first memory to a second memory. The method includes determining a first page size supported by the first memory. The method also includes determining a second page size supported by the second memory. The method further includes determining a use history of the memory page based on an entry in a page state directory associated with the memory page. The method also includes migrating the memory page between the first memory and the second memory based on the first page size, the second page size, and the use history.
Type: Grant
Filed: December 19, 2013
Date of Patent: August 23, 2016
Assignee: NVIDIA Corporation
Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Lucien Dunning, Brian Fahs, Mark Hairgrove, Chenghuan Jia, John Mashey, James M. Van Dyke
-
Patent number: 9424138
Abstract: Various embodiments relating to saving and recovering a hardware architecture state are provided. In one embodiment, during a first mode of operation, entries in a first portion of a random-access memory (RAM) are manipulated. A current version of less than all of the entries of the first portion is saved to a checkpointed version in response to a checkpoint event that triggers operation in a second mode of operation. During the second mode of operation, entries in a second portion of the RAM are manipulated. The checkpointed version of less than all of the entries of the first portion is recovered as the current version in response to a restore event that triggers resumption of operation in the first mode.
Type: Grant
Filed: June 14, 2013
Date of Patent: August 23, 2016
Assignee: NVIDIA CORPORATION
Inventors: Madhu Swarna, Jinghua Jiang
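Illustrative sketch (not taken from the patent): the partial checkpoint and restore described in the abstract of 9424138 can be modeled in software as saving and recovering a chosen subset of entries. The class name `CheckpointedRam`, the choice of which entries are checkpointed, and the overwrite of entry 0 in the usage lines are all hypothetical.

```python
class CheckpointedRam:
    """Toy RAM in which only a subset of entries is checkpointed."""

    def __init__(self, size, checkpointed_indices):
        self.entries = [0] * size
        # Hypothetical policy: the caller names the entries worth saving.
        self.checkpointed_indices = list(checkpointed_indices)
        self._saved = {}

    def checkpoint(self):
        """Save the current version of the selected entries only."""
        self._saved = {i: self.entries[i] for i in self.checkpointed_indices}

    def restore(self):
        """Recover the checkpointed version as the current version."""
        for index, value in self._saved.items():
            self.entries[index] = value


# Entries 0-3 play the role of the first portion, 4-7 the second portion.
ram = CheckpointedRam(size=8, checkpointed_indices=[0, 1])
ram.entries[0], ram.entries[1], ram.entries[2] = 10, 20, 30   # first mode
ram.checkpoint()                                              # checkpoint event
ram.entries[4] = 99                                           # second mode
ram.entries[0] = -1    # hypothetical clobber, to make the restore visible
ram.restore()                                                 # restore event
```

After the restore, entries 0 and 1 hold their checkpointed values again, while entry 2 (never checkpointed) and entry 4 (second portion) are left as they were.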
-
Patent number: 9425171
Abstract: One embodiment of the present invention sets forth a technique for packaging an integrated circuit die. The technique includes bonding a first surface of the integrated circuit die to a first substrate via a first plurality of solder bump structures and bonding a second substrate to a second surface of the integrated circuit die. The technique further includes bonding the first substrate to a third substrate via a second plurality of solder bump structures and, after bonding the first substrate to the third substrate, removing the second substrate from the second surface of the integrated circuit die. The technique further includes disposing a heat sink on the second surface of the integrated circuit die.
Type: Grant
Filed: June 25, 2015
Date of Patent: August 23, 2016
Assignee: NVIDIA Corporation
Inventors: Joseph Minacapelli, Teckgyu (Terry) Kang
-
Patent number: 9418437
Abstract: One embodiment of the present invention includes techniques for rasterizing primitives that include edges shared between paths. For each edge, a rasterizer unit selects and applies a sample rule from multiple sample rules. If the edge is shared, then the selected sample rule causes each group of coverage samples associated with a single color sample to be considered as either fully inside or fully outside the edge. Consequently, conflation artifacts caused when the number of coverage samples per pixel exceeds the number of color samples per pixel may be reduced. In prior-art techniques, reducing such conflation artifacts typically involves increasing the number of color samples per pixel to equal the number of coverage samples per pixel. Advantageously, the disclosed techniques enable rendering using algorithms that reduce the ratio of color to coverage samples, thereby decreasing memory consumption and memory bandwidth use, without causing conflation artifacts associated with shared edges.
Type: Grant
Filed: September 16, 2013
Date of Patent: August 16, 2016
Assignee: NVIDIA CORPORATION
Inventors: Mark J. Kilgard, Jeffrey A. Bolz
-
Patent number: 9417875
Abstract: One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction, where in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) values supplied by each thread. When a thread executes the barrier aggregation instruction the thread contributes to a scan or reduction result, and waits to execute any more instructions until after all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction and a scan result is communicated to each thread as the barrier aggregation instruction is executed by the thread.
Type: Grant
Filed: September 12, 2013
Date of Patent: August 16, 2016
Assignee: NVIDIA CORPORATION
Inventors: Brian Fahs, Ming Y. Siu, Brett W. Coon, John R. Nickolls, Lars Nyland
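Illustrative sketch (not taken from the patent): the semantics described in the abstract of 9417875 can be imitated on a CPU with Python's `threading.Barrier`. The class `AggregatingBarrier`, the use of addition as the aggregation operator, and the exclusive form of the scan are assumptions; the patent describes a hardware instruction, not a library class.

```python
import threading


class AggregatingBarrier:
    """Single-use barrier that also sums one value per participating thread."""

    def __init__(self, num_threads):
        self._barrier = threading.Barrier(num_threads)
        self._lock = threading.Lock()
        self._running_total = 0

    def arrive_and_aggregate(self, value):
        # The scan result is fixed at arrival time: the sum of the values
        # contributed by threads that arrived earlier (exclusive scan).
        with self._lock:
            scan_result = self._running_total
            self._running_total += value
        # No thread proceeds until every thread has contributed.
        self._barrier.wait()
        # After the barrier the total is final, so it is the reduction result.
        return scan_result, self._running_total


values = [1, 2, 3, 4]
barrier = AggregatingBarrier(len(values))
results = [None] * len(values)


def worker(thread_id):
    results[thread_id] = barrier.arrive_and_aggregate(values[thread_id])


threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(values))]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every thread sees the same reduction result, while the scan result each thread receives depends on the order in which the threads happened to arrive.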
-
Patent number: 9418400
Abstract: Systems and methods for rendering depth-of-field visual effect on images with high computing efficiency and performance. A diffusion blurring process and a Fast Fourier Transform (FFT)-based convolution are combined to achieve high-fidelity depth-of-field visual effect with Bokeh spots in real-time applications. The brightest regions in the background of an original image are enhanced with Bokeh effect by virtue of FFT convolution with a convolution kernel. A diffusion solver can be used to blur the background of the original image. By blending the Bokeh spots with the image with gradually blurred background, a resultant image can present an enhanced depth-of-field visual effect. The FFT-based convolution can be computed with multi-threaded parallelism.
Type: Grant
Filed: June 18, 2013
Date of Patent: August 16, 2016
Assignee: NVIDIA CORPORATION
Inventors: Nikolay Sakharnykh, Holger Gruen
-
Patent number: 9418616
Abstract: A graphics processing unit includes a set of geometry processing units each configured to process graphics primitives in parallel with one another. A given geometry processing unit generates one or more graphics primitives or geometry objects and buffers the associated vertex data locally. The geometry processing unit also buffers different sets of indices to those vertices, where each such set represents a different graphics primitive or geometry object. The geometry processing units may then stream the buffered vertices and indices to global buffers in parallel with one another. A stream output synchronization unit coordinates the parallel streaming of vertices and indices by providing each geometry processing unit with a different base address within a global vertex buffer where vertices may be written. The stream output synchronization unit also provides each geometry processing unit with a different base address within a global index buffer where indices may be written.
Type: Grant
Filed: December 20, 2012
Date of Patent: August 16, 2016
Assignee: NVIDIA CORPORATION
Inventors: Jerome F. Duluk, Jr., Ziyad S. Hakura, Henry Packard Moreton
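Illustrative sketch (not taken from the patent): one way to hand each producer a distinct base address in a shared buffer is an exclusive prefix sum over the amounts each producer intends to write. The abstract of 9418616 does not say how the stream output synchronization unit computes the addresses, so the function `assign_base_addresses` and the counts below are assumptions.

```python
def assign_base_addresses(counts, start=0):
    """Return (bases, next_free) where bases[i] is the first slot for unit i.

    Each unit gets a contiguous, non-overlapping range sized by counts[i].
    """
    bases = []
    next_free = start
    for count in counts:
        bases.append(next_free)
        next_free += count
    return bases, next_free


# Hypothetical per-unit output sizes for three geometry processing units.
vertex_counts = [3, 4, 2]
index_counts = [6, 9, 3]

vertex_bases, vertex_end = assign_base_addresses(vertex_counts)
index_bases, index_end = assign_base_addresses(index_counts)
```

Because the ranges never overlap, the units can then write their vertices and indices to the global buffers in parallel without further coordination.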
-
Patent number: 9418730
Abstract: Handshaking sense amplifier. In accordance with a first embodiment, an electronic circuit includes a sense amplifier configured to differentially sense contents of a memory cell. The circuit also includes a self-timing circuit configured to detect a completion of evaluation by the sense amplifier; and to initiate a subsequent memory operation responsive to the completion. A completion of evaluation may not be aligned with a clock edge.
Type: Grant
Filed: June 4, 2013
Date of Patent: August 16, 2016
Assignee: NVIDIA CORPORATION
Inventors: Andreas J. Gotterba, Jesse S. Wang
-
Patent number: 9420657
Abstract: Embodiments of the present invention provide a flat panel electronic device and a current control system thereof. The current control system comprises: a detecting module for detecting a current in a main circuit of the flat panel electronic device; and a control mechanism for reducing a brightness level of a backlight unit of the flat panel electronic device when the current in the main circuit is larger than or equal to a threshold so that the current in the main circuit is reduced to be less than the threshold. The current control system provided by the present invention reduces the current in the main circuit by reducing the brightness level of the backlight unit. Therefore, it is able to reduce the processing time by more than 10 ms compared with the prior art, and the fast and effective response is an important factor for extending the life of the battery.
Type: Grant
Filed: February 25, 2013
Date of Patent: August 16, 2016
Assignee: NVIDIA CORPORATION
Inventors: Jun Hua, Shuang Xu
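Illustrative sketch (not taken from the patent): the control rule in the abstract of 9420657 amounts to lowering backlight brightness while the measured current is at or above a threshold. The function `limit_current`, the step size, and the linear current model in the usage lines are assumptions; the patent describes a hardware detecting module and control mechanism.

```python
def limit_current(read_current, brightness, threshold, step=1, min_brightness=0):
    """Lower brightness until read_current(brightness) drops below threshold.

    read_current is a hypothetical callback standing in for the detecting
    module; it returns the main-circuit current at a given brightness level.
    """
    while read_current(brightness) >= threshold and brightness > min_brightness:
        brightness = max(min_brightness, brightness - step)
    return brightness


def modeled_current_ma(brightness):
    # Made-up model: 200 mA baseline plus 3 mA per brightness level.
    return 200 + 3 * brightness


new_brightness = limit_current(modeled_current_ma, brightness=100, threshold=450)
```

With this model the loop stops at the first brightness level whose current is strictly below the threshold, matching the "larger than or equal to" condition in the abstract.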
-
Patent number: 9418714
Abstract: One embodiment provides, in a sense amplifier for an electronic memory array in which a selected memory cell drives a developing voltage differential according to a logic state of the memory cell, a method to store the logic state. The method includes poising source voltages of first and second transistors at levels offset, respectively, by threshold voltages of the first and second transistors. The method also includes applying the voltage differential between a gate of the first transistor and a gate of the second transistor, the first and second transistors configured to oppose each other in a cross-coupled inverter stage of the sense amplifier.
Type: Grant
Filed: July 12, 2013
Date of Patent: August 16, 2016
Assignee: NVIDIA CORPORATION
Inventors: Mahmut E. Sinangil, John W. Poulton
-
Patent number: 9417881
Abstract: One embodiment of the present invention sets forth a technique for dynamically allocating memory using one or more lock-free pop-only FIFOs. One or more lock-free FIFOs are populated with FIFO nodes, where each FIFO node represents a memory allocation of a predetermined size. Each particular lock-free FIFO includes memory allocations of a single size. Different lock-free FIFOs may include memory allocations for different sizes to service allocation requests for different size memory allocations. A lock-free mechanism is used to pop FIFO nodes from the FIFO. The use of the lock-free FIFO allows multiple consumers to simultaneously attempt to pop the head FIFO node without first obtaining a lock to ensure exclusive access of the FIFO.
Type: Grant
Filed: January 30, 2012
Date of Patent: August 16, 2016
Assignee: NVIDIA CORPORATION
Inventors: Stephen Jones, Xiaohuang Huang
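Illustrative sketch (not taken from the patent): a pop-only FIFO of pre-sized allocations can be drained with a compare-and-swap retry loop on the head index. Python has no user-level atomic compare-and-swap, so the `AtomicIndex` class below only emulates one (its internal lock stands in for the atomicity of a single hardware instruction); this shows the shape of the algorithm in the abstract of 9417881 and is not a truly lock-free implementation. All class and method names are hypothetical.

```python
import threading


class AtomicIndex:
    """Emulated atomic integer; the guard models one atomic CAS instruction."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


class PopOnlyFifo:
    """FIFO populated once with nodes that all represent one allocation size."""

    def __init__(self, block_size, count):
        self.block_size = block_size
        self._nodes = list(range(count))   # node = id of a pre-sized block
        self._head = AtomicIndex(0)

    def pop(self):
        while True:
            head = self._head.load()
            if head >= len(self._nodes):
                return None                # FIFO exhausted
            # Only one consumer wins the swap; losers reload and retry.
            if self._head.compare_and_swap(head, head + 1):
                return self._nodes[head]


fifo = PopOnlyFifo(block_size=64, count=3)
popped = [fifo.pop() for _ in range(4)]
```

A full allocator along these lines would keep one such FIFO per allocation size and route each request to the FIFO whose block size fits it.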
-
Patent number: 9419638
Abstract: A subsystem configured to implement an analog to digital converter that includes a high speed comparator with an embedded reference voltage level that functions as a calibrated threshold. A calibration element applies power to a reference voltage system. The calibration element then selects a differential analog voltage and applies the differential analog voltage to the inputs of the comparator. A digitally coded signal then configures an array of switches that connect complements of integrated resistors to each input of the comparator so that the switching point of the comparator occurs coincident with the applied differential analog reference voltage, nulling out the effect of the applied differential analog voltage and comparator errors. The calibration element then removes power from the reference voltage system. As a result, the comparator is configured with an embedded threshold that equals the differential analog reference voltage.
Type: Grant
Filed: June 1, 2015
Date of Patent: August 16, 2016
Assignee: NVIDIA CORPORATION
Inventors: Balaji Narendran Chellappa, Paul Aymeric Fontaine
-
Patent number: 9411668
Abstract: A subsystem is configured to apply an offset voltage to a test, or canary, SRAM write driver circuit to create a condition that induces failure of the write operation. The offset voltage is incrementally increased until failure of the test write operation occurs in the canary SRAM circuit. The subsystem then calculates a probability of failure for the actual, non-test SRAM write operation, which is performed by an equivalent driver circuit with zero offset. The subsystem then compares the result to a benchmark acceptable probability figure. If the calculated probability of failure is greater than the benchmark acceptable probability figure, corrective action is initiated. In this manner, actual failures of SRAM write operations are anticipated, and corrective action reduces their occurrence and their impact on system performance.
Type: Grant
Filed: January 14, 2014
Date of Patent: August 9, 2016
Assignee: NVIDIA Corporation
Inventors: Arijit Banerjee, Mahmut Ersin Sinangil, John W. Poulton
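Illustrative sketch (not taken from the patent): the flow in the abstract of 9411668 is a sweep of the canary offset until the test write fails, followed by a probability estimate and a comparison against a benchmark. The abstract does not say how the probability is calculated; the Gaussian tail model below, the parameter `sigma_mv`, the benchmark value, and the canary behavior are all assumptions chosen only to make the steps concrete.

```python
import math


def find_failure_offset(write_succeeds, step_mv, max_mv):
    """Raise the offset in steps until the canary write fails; None if it never does."""
    offset = 0.0
    while offset <= max_mv:
        if not write_succeeds(offset):
            return offset
        offset += step_mv
    return None


def failure_probability(margin_mv, sigma_mv):
    """Assumed model: the real circuit's random offset is N(0, sigma_mv^2),
    and a write fails when that offset reaches the measured margin."""
    return 0.5 * math.erfc(margin_mv / (sigma_mv * math.sqrt(2)))


def canary_write_succeeds(offset_mv):
    # Hypothetical canary that starts failing at 120 mV of applied offset.
    return offset_mv < 120


margin = find_failure_offset(canary_write_succeeds, step_mv=10.0, max_mv=300.0)
probability = failure_probability(margin, sigma_mv=30.0)
benchmark = 1e-9                      # hypothetical acceptable failure probability
needs_corrective_action = probability > benchmark
```

In this made-up scenario the margin is four standard deviations, which is well above the benchmark probability, so corrective action would be initiated.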
-
Patent number: 9413518
Abstract: Systems and methods for stabilizing clock data recovery (CDR) by filtering the abrupt phase shift associated with data pattern transition in the input signal. The CDR circuit includes a data pattern detector coupled to a data pattern filter. The data pattern detector is capable of detecting the data patterns of the input signal. Accordingly, the data pattern filter can selectively generate a filter indication indicating to freeze or suppress the CDR phase caused by data pattern transition. The filter indication can be incorporated to a phase error signal, a gain function, and/or the control voltage driving the VCO.
Type: Grant
Filed: August 12, 2013
Date of Patent: August 9, 2016
Assignee: NVIDIA CORPORATION
Inventors: Yu Chang, Huabo Chen, Hakki Ozguc, Michael Hopgood
-
Patent number: 9412042
Abstract: A number of images of a scene are captured and stored. The images are captured over a range of values for an attribute (e.g., a camera setting). One of the images is displayed. A location of interest in the displayed image is identified. Regions that correspond to the location of interest are identified in each of the images. Those regions are evaluated to identify which of the regions is rated highest with respect to the attribute relative to the other regions. The image that includes the highest-rated region is then displayed.
Type: Grant
Filed: April 25, 2013
Date of Patent: August 9, 2016
Assignee: NVIDIA CORPORATION
Inventors: Kari Pulli, Orazio Gallo, David Jacobs
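Illustrative sketch (not taken from the patent): the selection step in the abstract of 9412042 can be modeled as scoring the same region in every captured image and picking the image with the best score. The abstract does not name a rating metric; pixel variance as a stand-in for sharpness, the grayscale list-of-lists image format, and the function names are assumptions.

```python
from statistics import pvariance


def region_pixels(image, top, left, height, width):
    """Flatten the pixels of a rectangular region of a 2D grayscale image."""
    return [image[r][c]
            for r in range(top, top + height)
            for c in range(left, left + width)]


def best_image_index(images, region, score=pvariance):
    """Return the index of the image whose region scores highest."""
    top, left, height, width = region
    scores = [score(region_pixels(img, top, left, height, width)) for img in images]
    return max(range(len(images)), key=scores.__getitem__)


# Three hypothetical 4x4 captures of one scene at different focus settings.
soft = [[90, 110, 90, 110] for _ in range(4)]
sharp = [[0, 255, 0, 255] for _ in range(4)]
blurry = [[100] * 4 for _ in range(4)]

location_of_interest = (1, 1, 2, 2)    # top, left, height, width
chosen = best_image_index([soft, sharp, blurry], location_of_interest)
```

The chosen index identifies the capture to display; swapping in a different `score` function would adapt the same loop to another attribute, such as exposure.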
-
Patent number: 9411715
Abstract: A system, method, and computer program product for optimizing thread stack memory allocation is disclosed. The method includes the steps of receiving source code for a program, translating the source code into an intermediate representation, analyzing the intermediate representation to identify at least two objects that could use a first allocated memory space in a thread stack memory, and modifying the intermediate representation by replacing references to a first object of the at least two objects with a reference to a second object of the at least two objects.
Type: Grant
Filed: December 12, 2012
Date of Patent: August 9, 2016
Assignee: NVIDIA Corporation
Inventors: Adriana Maria Susnea, Vinod Grover, Sean Youngsung Lee
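Illustrative sketch (not taken from the patent): one common way to find objects that could share a stack slot is to compare their live ranges and reuse a slot once its previous occupant is dead. The abstract of 9411715 does not describe the analysis itself, so the greedy interval approach, the function `share_stack_slots`, the equal-size assumption, and the live ranges below are all hypothetical.

```python
def share_stack_slots(live_ranges):
    """Map each object to the object whose stack space it can reuse.

    live_ranges maps an object name to (first_use, last_use) positions in the
    intermediate representation. All objects are assumed to be the same size.
    """
    replacement = {}
    slots = []   # list of (representative object, position where slot frees up)
    for name, (start, end) in sorted(live_ranges.items(), key=lambda kv: kv[1][0]):
        for i, (representative, busy_until) in enumerate(slots):
            if busy_until < start:
                # Lifetimes do not overlap: references to `name` can be
                # replaced with references to `representative`.
                replacement[name] = representative
                slots[i] = (representative, end)
                break
        else:
            replacement[name] = name          # needs its own slot
            slots.append((name, end))
    return replacement


# Hypothetical live ranges for three stack objects.
mapping = share_stack_slots({"a": (0, 4), "b": (2, 6), "c": (5, 9)})
```

Here `c` becomes live only after `a` is dead, so references to `c` can be rewritten to use `a`, and the thread needs two stack slots instead of three.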