Abstract: An error reporting system utilizes a parity checker to receive data results from execution of an original instruction and a parity bit for the data. A decoder receives an error correcting code (ECC) for data resulting from execution of a shadow instruction of the original instruction, and data error correction is initiated on the original instruction result on condition of a mismatch between the parity bit and the original instruction result, and the decoder asserting a correctable error in the original instruction result.
Type:
Grant
Filed:
March 6, 2020
Date of Patent:
August 9, 2022
Assignee:
NVIDIA Corp.
Inventors:
Michael Sullivan, Siva Hari, Brian Zimmer, Timothy Tsai, Stephen W. Keckler
Abstract: A circuit includes a set of multiple bit generating cells. One or more adjustable current sources is coupled to introduce perturbations into outputs of the bit generating cells. Based on the perturbations, the outputs of a subset less than all of the bit generating cells are selected, and applied as a control.
Type:
Grant
Filed:
February 24, 2021
Date of Patent:
August 9, 2022
Assignee:
NVIDIA Corp.
Inventors:
Sudhir Shrikantha Kudva, Nikola Nedovic, Carl Thomas Gray
Abstract: Techniques to characterize driving scenarios for autonomous vehicles characterize a path in a driving scenario according to metrics such as narrowness and effort. The scenarios may be characterized using a tree-based or tensor-based approach.
Type:
Grant
Filed:
June 10, 2020
Date of Patent:
July 19, 2022
Assignee:
NVIDIA Corp.
Inventors:
Siva Kumar Sastry Hari, Iuri Frosio, Zahra Ghodsi, Anima Anandkumar, Timothy Tsai, Stephen W. Keckler
Abstract: A communication method between a source device and a target device utilizes speculative connection setup between the source device and the target device, target-device-side packet ordering, and fine-grained ordering to remove packet dependencies.
Abstract: This disclosure relates to current flattening circuits for an electrical load. The current flattening circuits incorporate randomize various parameters to add noise onto the supply current. This added noise may act to reduce the signal to noise ratio in the supply current, increasing the difficulty of identifying a computational artifact signal from power rail noise.
Type:
Application
Filed:
January 20, 2022
Publication date:
May 12, 2022
Applicant:
NVIDIA Corp.
Inventors:
Sudhir Shrikantha Kudva, Nikola Nedovic, Sanquan Song
Abstract: An adder circuit that includes an operand input and a second operand input to an XNOR cell. The XNOR cell is configured to provide the operand input and the second operand input to both a NAND gate and a first OAI cell. A second OAI cell transforms the output of the XNOR cell into a carry out signal.
Abstract: A communication method between a source device and a target device utilizes speculative connection setup between the source device and the target device, target-device-side packet ordering, and fine-grained ordering to remove packet dependencies.
Abstract: This disclosure relates to current flattening circuits for an electrical load. The current flattening circuits incorporate randomize various parameters to add noise onto the supply current. This added noise may act to reduce the signal to noise ratio in the supply current, increasing the difficulty of identifying a computational artifact signal from power rail noise.
Type:
Grant
Filed:
April 23, 2020
Date of Patent:
March 22, 2022
Assignee:
NVIDIA Corp.
Inventors:
Sudhir Shrikantha Kudva, Nikola Nedovic, Sanquan Song
Abstract: A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture includes multiple chips, each with a central processing element, a global memory buffer, and a plurality of additional processing elements. Each additional processing element includes a weight buffer, an activation buffer, and vector multiply-accumulate units to combine, in parallel, the weight values and the activation values using stationary data flows.
Type:
Application
Filed:
November 19, 2021
Publication date:
March 10, 2022
Applicant:
NVIDIA Corp.
Inventors:
Yakun Shao, Rangharajan Venkatesan, Miaorong Wang, Daniel Smith, William James Dally, Joel Emer, Stephen W. Keckler, Brucek Khailany
Abstract: A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture includes multiple chips, each with a central processing element, a global memory buffer, and a plurality of additional processing elements. Each additional processing element includes a weight buffer, an activation buffer, and vector multiply-accumulate units to combine, in parallel, the weight values and the activation values using stationary data flows.
Type:
Grant
Filed:
November 4, 2019
Date of Patent:
March 8, 2022
Assignee:
NVIDIA Corp.
Inventors:
Yakun Shao, Rangharajan Venkatesan, Miaorong Wang, Daniel Smith, William James Dally, Joel Emer, Stephen W. Keckler, Brucek Khailany
Abstract: Solutions improving efficiency of Softmax computation applied for efficient deep learning inference in transformers and other neural networks. The solutions utilize a reduced-precision implementation of various operations in Softmax, replacing ex with 2x to reduce instruction overhead associated with computing ex, and replacing floating point max computation with integer max computation. Further described is a scalable implementation that decomposes Softmax into UnNormalized Softmax and Normalization operations.
Type:
Application
Filed:
December 4, 2020
Publication date:
March 3, 2022
Applicant:
NVIDIA Corp.
Inventors:
Jacob Robert Stevens, Rangharajan Venkatesan, Steve Haihang Dai, Brucek Khailany
Abstract: Warp sharding techniques to switch execution between divergent shards on instructions that trigger a long stall, thereby interleaving execution between diverged threads within a warp instead of across warps. The technique may be applied to mitigate pipeline stalls in applications with low warp occupancy and high divergence. Warp data cache locality may also be improved by concentrating memory accesses within a warp rather than spreading them across warps.
Type:
Application
Filed:
February 24, 2021
Publication date:
January 27, 2022
Applicant:
NVIDIA Corp.
Inventors:
Sana Damani, Mark Stephenson, Ram Rangan, Daniel Robert Johnson, Rishkul Kulkarni
Abstract: The computational scaling challenges of holographic displays are mitigated by techniques for generating holograms that introduce foveation into a wave front recording planes approach to hologram generation. Spatial hashing is applied to organize the points or polygons of a display object into keys and values.
Type:
Application
Filed:
July 23, 2020
Publication date:
January 27, 2022
Applicant:
NVIDIA Corp.
Inventors:
Jui-Hsien Wang, Ward Lopes, Rachel Anastasia Brown, Peter Shirley
Abstract: A genetic algorithm is utilized to generate routing candidates to which a reinforcement learning model is applied to correct the design rule constraint violations incrementally. A design rule checker provides feedback on the violations to the reinforcement learning model and the model learns how to fix the violations. A layout device placer based upon a simulated annealing method may also be utilized.
Abstract: A system for data and clock recovery includes a timing error detector, a phase detector, and a phase increment injector. The phase increment injector may be used to determine an increment to affect an output of the phase detector or a clocking element. A sign of the increment is determined from a sign or direction of an accumulated version of a clock and data recovery gradient value.
Abstract: Techniques to generate driving scenarios for autonomous vehicles characterize a path in a driving scenario according to metrics such as narrowness and effort. Nodes of the path are assigned a time for action to avoid collision from the node. The generated scenarios may be simulated in a computer.
Type:
Application
Filed:
June 10, 2020
Publication date:
December 16, 2021
Applicant:
NVIDIA Corp.
Inventors:
Siva Kumar Sastry Hari, Iuri Frosio, Zahra Ghodsi, Anima Anandkumar, Timothy Tsai, Stephen W. Keckler, Alejandro Troccoli
Abstract: Techniques to characterize driving scenarios for autonomous vehicles characterize a path in a driving scenario according to metrics such as narrowness and effort. The scenarios may be characterized using a tree-based or tensor-based approach.
Type:
Application
Filed:
June 10, 2020
Publication date:
December 16, 2021
Applicant:
NVIDIA Corp.
Inventors:
Siva Kumar Sastry Hari, Iuri Frosio, Zahra Ghodsi, Anima Anandkumar, Timothy Tsai, Stephen W. Keckler
Abstract: This disclosure relates to a receiver that includes a clock and data recovery loop and a phase offset loop. The clock and data recovery loop may be controlled by a sum of gradients for a plurality of data interleaves. The phase offset loop may be controlled by an accumulated differential gradient for each of the data interleaves.
Abstract: An adder circuit provides a first operand input and a second operand input to an XNOR cell. The XNOR cell transforms these inputs to a propagate signal that is applied to an OAT cell to produce a carry out signal. A third OAT cell transforms a third operand input and the propagate signal into a sum output signal.
Abstract: A method dynamically selects one of a first sampling order and a second sampling order for a ray trace of pixels in a tile where the selection is based on a motion vector for the tile. The sampling order may be a bowtie pattern or an hourglass pattern.
Type:
Application
Filed:
July 6, 2021
Publication date:
November 4, 2021
Applicant:
NVIDIA Corp.
Inventors:
Johan Pontus Andersson, Jim Nilsson, Tomas Guy Akenine-Möller