Abstract: A circuit includes a set of multiple bit generating cells. One or more adjustable characterization circuits are coupled to inputs to the bit generating cells to affect the outputs of the bit generating cells. Based on the effect of the characterization circuit(s) on the outputs of the bit generating cells, a subset less than all of the bit generating cells is selected.
Type:
Application
Filed:
September 21, 2021
Publication date:
August 25, 2022
Applicant:
NVIDIA Corp.
Inventors:
Sudhir Shrikantha Kudva, Nikola Nedovic, Carl Thomas Gray, Stephen G Tell
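The cell selection described in the bit generating cell abstract above might be sketched as follows. This is only an illustration under stated assumptions, not the claimed circuit: each cell is modeled by a single hypothetical mismatch value, the characterization circuit is modeled as a bias added to the cell input, and "stable under every bias setting" is an assumed selection criterion.

```python
def read_cell(mismatch, bias):
    """Hypothetical cell model: output 1 when mismatch plus applied bias is positive."""
    return 1 if mismatch + bias > 0 else 0


def select_stable_cells(mismatches, biases):
    """Return indices of cells whose output is identical for every bias setting."""
    selected = []
    for index, mismatch in enumerate(mismatches):
        outputs = {read_cell(mismatch, bias) for bias in biases}
        if len(outputs) == 1:  # the output never flipped during the sweep
            selected.append(index)
    return selected


# Four modeled cells; the two with small mismatch flip under bias and are dropped.
cells = [1.0, 0.1, -0.2, -2.0]
stable = select_stable_cells(cells, biases=[-0.5, 0.0, 0.5])
```

In this model the sweep keeps only cells with a large margin, which is one plausible reason to select a subset smaller than the full set.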
Abstract: An error reporting system utilizes a parity checker to receive data results from execution of an original instruction and a parity bit for the data. A decoder receives an error correcting code (ECC) for data resulting from execution of a shadow instruction of the original instruction, and data error correction is initiated on the original instruction result on condition of a mismatch between the parity bit and the original instruction result, and the decoder asserting a correctable error in the original instruction result.
Type:
Application
Filed:
May 5, 2022
Publication date:
August 25, 2022
Applicant:
NVIDIA Corp.
Inventors:
Michael Sullivan, Siva Kumar Sastry Hari, Brian Matthew Zimmer, Timothy Tsai, Stephen W. Keckler
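The condition under which correction is initiated in the error reporting abstract above can be written as a small predicate. This is a sketch under assumptions: even parity over the result word is assumed, and the ECC decoder is reduced to a boolean flag (`decoder_says_correctable`, a hypothetical name), since the abstract does not specify the code used.

```python
def parity(word):
    """Even-parity bit of a non-negative integer: 1 if the count of set bits is odd."""
    return bin(word).count("1") & 1


def should_correct(original_result, parity_bit, decoder_says_correctable):
    """Initiate correction only when both conditions from the abstract hold."""
    parity_mismatch = parity(original_result) != parity_bit
    return parity_mismatch and decoder_says_correctable


# A result with three set bits stored alongside parity bit 0 is a mismatch.
needs_fix = should_correct(0b1011, parity_bit=0, decoder_says_correctable=True)
```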
Abstract: In various examples, a test system is provided for executing built-in-self-test (BIST) according to JTAG and IEEE 1500 on chips deployed in-field. Hardware and software selectively connect onto the IEEE 1500 serial interface for running BIST while the chip is being used in deployment—such as in an autonomous vehicle. In addition to providing a mechanism to connect onto the serial interface, the hardware and software may reduce memory requirements and runtime associated with running the test sequences, thereby making BIST possible in deployment. Furthermore, some embodiments include components configured to store functional states of clocks, power, and input/output prior to running BIST, which permits restoration of the functional states after the BIST.
Abstract: First symbols are generated on a plurality of data channels by applying PAM-N encoding on a first subset of bits of a data burst, the first symbols generated without maximum transitions; second symbols are generated on at least one optionally-activated additional data channel, the second symbols generated by applying the PAM-N encoding on a second subset of bits of the data burst, the second symbols generated without maximum transitions; and third symbols are generated on a channel for communicating error correction bits for the first bits and second bits, the third symbols generated by applying hybrid PAM-N encoding on the error correction bits and a third subset of bits of the data burst, the hybrid PAM-N encoding comprising an interleaving of symbols with N voltage levels and symbols with fewer than N voltage levels.
Type:
Application
Filed:
February 11, 2022
Publication date:
August 18, 2022
Applicant:
NVIDIA Corp.
Inventors:
Sunil Sudhakaran, Gautam Bhatia, Robert Bloemer
Abstract: An end-to-end low-precision training system based on a multi-base logarithmic number system and a multiplicative weight update algorithm. The multi-base logarithmic number system is applied to update weights of the neural network, with different bases of the multi-base logarithmic number system utilized between calculation of weight updates, calculation of feed-forward signals, and calculation of feedback signals. The LNS expresses a high dynamic range and computational energy efficiency, making it advantageous for on-board training in energy-constrained edge devices.
Type:
Application
Filed:
June 11, 2021
Publication date:
August 18, 2022
Applicant:
NVIDIA Corp.
Inventors:
Jiawei Zhao, Steve Haihang Dai, Rangharajan Venkatesan, Ming-Yu Liu, William James Dally, Anima Anandkumar
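To make the logarithmic number system in the training abstract above concrete, the sketch below stores a value as a sign and an integer exponent, with the base set by a hypothetical parameter `gamma` (base = 2 ** (1 / gamma)). The update rule shown, w <- w * 2 ** (-lr * grad * sign(w)), is one plausible form of a multiplicative update and is an assumption; the abstract does not give the exact rule or the bases used.

```python
import math


def lns_encode(x, gamma):
    """Return (sign, exponent) such that |x| is approximately 2 ** (exponent / gamma)."""
    if x == 0:
        raise ValueError("zero has no logarithm; a real format needs a special code")
    sign = 1 if x > 0 else -1
    return sign, round(gamma * math.log2(abs(x)))


def lns_decode(sign, exponent, gamma):
    """Convert a (sign, exponent) pair back to a float."""
    return sign * 2.0 ** (exponent / gamma)


def multiplicative_update(sign, exponent, grad, lr, gamma):
    """Assumed rule w <- w * 2 ** (-lr * grad * sign(w)), as an integer add on the exponent."""
    step = round(gamma * lr * grad * sign)
    return sign, exponent - step


sign, exponent = lns_encode(8.0, gamma=4)
sign, exponent = multiplicative_update(sign, exponent, grad=1.0, lr=0.25, gamma=4)
```

A larger `gamma` gives a finer base and therefore finer steps, which is one way different bases could serve different parts of training.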
Abstract: Approaches in accordance with various embodiments can reduce scheduling delays due to concurrent processing requests, as may involve VSyncs in multi-streaming systems. The software synchronization signals can be staggered relative to each other by offsetting an initial synchronization signal. These software synchronization signals can be readjusted over time such that each synchronization signal maintains the same relative offset, as may be with respect to other applications or containers.
Type:
Grant
Filed:
November 5, 2020
Date of Patent:
August 16, 2022
Assignee:
NVIDIA CORPORATION
Inventors:
Bimal Poddar, Donghan Ryu, Michael Gold, Samuel Reed Koser, Xiao Bo Zhao Zhang
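The staggering of software synchronization signals in the abstract above could be sketched like this. The even spacing of offsets and the helper names are assumptions made for illustration; the abstract only states that signals are offset and later readjusted to keep the same relative offset.

```python
import math


def staggered_offsets(num_streams, period_ms):
    """Spread the streams evenly across one synchronization period (assumed policy)."""
    return [i * period_ms / num_streams for i in range(num_streams)]


def next_sync_time(now_ms, period_ms, offset_ms):
    """Earliest tick at or after now_ms on the grid offset_ms + k * period_ms."""
    k = max(math.ceil((now_ms - offset_ms) / period_ms), 0)
    return offset_ms + k * period_ms


offsets = staggered_offsets(4, 16.0)
upcoming = [next_sync_time(21.0, 16.0, offset) for offset in offsets]
```

Because each tick is recomputed from the fixed grid rather than from the previous tick, timing drift does not accumulate and the relative offsets between streams are preserved.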
Abstract: A method, computer readable medium, and system are disclosed for monitoring a pipeline to detect anomalies such as unusual latency associated with a particular stage. Each stage of the pipeline is configured to update metadata associated with content being processed by inserting a time stamp into the metadata when processing of the content is completed by the stage. The server device can collect the metadata from the last stage of the pipeline and analyze the metadata in order to generate metrics for the pipeline, including a residual latency and/or a gain for each stage of the pipeline. In an embodiment, the content is a frame of video to be displayed on a client device after being rendered by a server device, such as through a streaming service (e.g., a video game streaming service). The server device can adjust the pipeline based on the metrics to improve performance.
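The time stamp mechanism in the pipeline monitoring abstract above can be illustrated with a short sketch. The function names and the per-stage budget rule are hypothetical; the abstract's residual latency and gain metrics are not defined in the text, so only plain per-stage latency is computed here.

```python
def stamp(metadata, stage, now_ms):
    """Called by a stage when it finishes processing the content."""
    metadata.append((stage, now_ms))


def stage_latencies(metadata, start_ms):
    """Time spent in each stage, taken from consecutive completion time stamps."""
    latencies = {}
    previous = start_ms
    for stage, finished in metadata:
        latencies[stage] = finished - previous
        previous = finished
    return latencies


def slow_stages(latencies, budget_ms):
    """Stages whose latency exceeds an assumed per-stage budget."""
    return [s for s, t in latencies.items() if t > budget_ms.get(s, float("inf"))]


frame_metadata = []
stamp(frame_metadata, "render", 5)
stamp(frame_metadata, "encode", 9)
stamp(frame_metadata, "transmit", 30)
latencies = stage_latencies(frame_metadata, start_ms=0)
flagged = slow_stages(latencies, {"render": 8, "encode": 8, "transmit": 15})
```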
Abstract: Learning to estimate a 3D body pose, and likewise the pose of any type of object, from a single 2D image is of great interest for many practical graphics applications and generally relies on neural networks that have been trained with sample data which annotates (labels) each sample 2D image with a known 3D pose. Requiring this labeled training data however has various drawbacks, including for example that traditionally used training data sets lack diversity and therefore limit the extent to which neural networks are able to estimate 3D pose. Expanding these training data sets is also difficult since it requires manually provided annotations for 2D images, which is time consuming and prone to errors. The present disclosure overcomes these and other limitations of existing techniques by providing a model that is trained from unlabeled multi-view data for use in 3D pose estimation.
Abstract: In various embodiments, an encoded sequence (e.g., a compressed sequence for uncompressed data) that includes variable-length codes is decoded in an iterative fashion to generate a decoded sequence of symbols. During each iteration, a group of threads decodes in parallel the codes in the encoded sequence to generate symbols. The group of threads then computes offsets based on the sizes of the symbols. Subsequently, the group of threads generates in parallel a contiguous portion of the decoded sequence based on the symbols, an output address, and the offsets.
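One iteration of the decoding scheme in the abstract above can be emulated sequentially as three data-parallel steps. This is a sketch under assumptions: the codes handed to the "threads" are already delimited, the code table is hypothetical, and the offsets are computed with an exclusive prefix sum, which is a common choice but is not confirmed by the abstract.

```python
from itertools import accumulate


def decode_iteration(codes, table, output, output_address):
    """Decode one batch of codes into output; return the next output address."""
    # Step 1: each thread decodes its own code into a variable-size symbol.
    symbols = [table[code] for code in codes]
    # Step 2: an exclusive prefix sum over symbol sizes gives each write offset.
    sizes = [len(symbol) for symbol in symbols]
    offsets = [0] + list(accumulate(sizes))[:-1]
    # Step 3: each thread writes its symbol independently; together the writes
    # form one contiguous portion of the decoded sequence.
    for symbol, offset in zip(symbols, offsets):
        start = output_address + offset
        output[start:start + len(symbol)] = symbol
    return output_address + sum(sizes)


table = {"0": b"a", "10": b"bc", "11": b"def"}  # hypothetical code table
decoded = bytearray(6)
end = decode_iteration(["10", "0", "11"], table, decoded, 0)
```

Since no write overlaps another, the loop in step 3 could run in any order, which is what makes the step parallelizable.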
Abstract: One or more images (e.g., images taken from one or more cameras) may be received, where each of the one or more images may depict a two-dimensional (2D) view of a three-dimensional (3D) scene. Additionally, the one or more images may be utilized to determine a 3D representation of the scene. This representation may help an entity navigate an environment represented by the 3D scene.
Type:
Grant
Filed:
February 22, 2021
Date of Patent:
August 16, 2022
Assignee:
NVIDIA CORPORATION
Inventors:
Yunzhi Lin, Jonathan Tremblay, Stephen Walter Tyree, Stanley Thomas Birchfield
Abstract: A circuit includes a set of multiple bit generating cells. One or more adjustable current sources are coupled to introduce perturbations into outputs of the bit generating cells. Based on the perturbations, the outputs of a subset less than all of the bit generating cells are selected, and applied as a control.
Type:
Grant
Filed:
February 24, 2021
Date of Patent:
August 9, 2022
Assignee:
NVIDIA Corp.
Inventors:
Sudhir Shrikantha Kudva, Nikola Nedovic, Carl Thomas Gray
Abstract: An error reporting system utilizes a parity checker to receive data results from execution of an original instruction and a parity bit for the data. A decoder receives an error correcting code (ECC) for data resulting from execution of a shadow instruction of the original instruction, and data error correction is initiated on the original instruction result on condition of a mismatch between the parity bit and the original instruction result, and the decoder asserting a correctable error in the original instruction result.
Type:
Grant
Filed:
March 6, 2020
Date of Patent:
August 9, 2022
Assignee:
NVIDIA Corp.
Inventors:
Michael Sullivan, Siva Hari, Brian Zimmer, Timothy Tsai, Stephen W. Keckler
Abstract: Manufacturers perform tests on chips before the chips are shipped to customers. However, defects can occur on a chip after manufacturer testing, when the chip is in use in a system or device. The defects can occur due to aging or the environment in which the chip is employed and can be critical, especially when the chips are used in systems such as autonomous vehicles. To verify the structural integrity of the IC during the lifetime of the product, an in-system test (IST) is disclosed. The IST enables self-testing mechanisms for an IC in working systems. The IST mechanisms provide structural testing of the ICs when in a functional system and at a manufacturer's level of testing. Unlike ATE tests, which run in a separate environment, the IST provides the ability to go from a functional world view to a test mode.
Abstract: The disclosure relates to the transfer of per-pixel transparency information using video codecs that do not provide an alpha channel (alternatively referred to as “transparency-agnostic video codecs”). For example, alpha information of visual elements may be transcoded into the supported channels of a video stream to generate additional samples of a supported color space, which are representative of the alpha information. After being encoded by a “transparency-agnostic video codec” and transmitted, the received alpha information may then be extracted from the supported channels of the video stream to render the received visuals with corresponding per-pixel transparency.
Type:
Grant
Filed:
November 27, 2019
Date of Patent:
August 2, 2022
Assignee:
NVIDIA Corporation
Inventors:
Johannes Zimmermann, Andrija Bosnjakovic, Ashley Reid
Abstract: In various examples, metadata may be generated corresponding to compressed data streams that are compressed according to serial compression algorithms—such as arithmetic encoding, entropy encoding, etc.—in order to allow for parallel decompression of the compressed data. As a result, modification to the compressed data stream itself may not be required, and bandwidth and storage requirements of the system may be minimally impacted. In addition, by parallelizing the decompression, the system may benefit from faster decompression times while also reducing or entirely removing the adoption cycle for systems using the metadata for parallel decompression.
Abstract: The disclosure provides a system to render a virtual reality (VR) scene, and a method and computer program product to determine a follower pose in a VR simulator during a simulation step. In one example, the method includes: (1) computing one or more current candidate poses utilizing input parameters, wherein each of the current candidate poses is a temporal projection of a follower pose along a respective sweep direction towards a leader pose, and wherein an obstruction is located between the follower pose and the leader pose, (2) selecting a target pose from the one or more current candidate poses, (3) refining the target pose utilizing physics-based constraints and the input parameters, wherein the physics-based constraints use a surface of the obstruction, and (4) rendering a new follower pose based on the refined target pose.
Abstract: Approaches in accordance with various embodiments provide for the processing of sparse matrices for mathematical and programmatic operations. In particular, various embodiments enforce sparsity constraints for performing sparse matrix multiply-add instruction (MMA) operations. Deep neural networks can exhibit significant sparsity in the data used in operations, both in the activations and weights. The computational load can be reduced by excluding zero-valued data elements. A sparsity constraint is applied across all submatrices of a sparse matrix, providing fine-grained structured sparsity that is evenly distributed across the matrix. The matrix may then be compressed since a minimum number of elements of the matrix are known to have zero value. Matrix operations are then performed using these matrices.
Type:
Grant
Filed:
April 2, 2019
Date of Patent:
July 19, 2022
Assignee:
NVIDIA Corporation
Inventors:
Jeff Pool, Ganesh Venkatesh, Jorge Albericio Latorre, Jack Choquette, Ronny Krashinsky, John Tran, Feng Xie, Ming Y. Siu, Manan Patel
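The structured sparsity described in the sparse matrix abstract above can be sketched for a single row. The 2-of-4 constraint (keep the two largest-magnitude values in every group of four) is an assumption chosen for illustration; the abstract only says a sparsity constraint is applied across all submatrices. The row length is assumed to be a multiple of the group size.

```python
def compress_row(row, group_size=4, keep=2):
    """Keep the `keep` largest-magnitude values per group; return (values, indices)."""
    values, indices = [], []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        # Rank positions by magnitude, then restore positional order for storage.
        ranked = sorted(range(len(group)), key=lambda i: abs(group[i]), reverse=True)
        kept = sorted(ranked[:keep])
        values.extend(group[i] for i in kept)
        indices.extend(kept)
    return values, indices


def sparse_dot(values, indices, dense, group_size=4, keep=2):
    """Dot product of a compressed row with a dense vector, skipping pruned zeros."""
    total = 0
    for n, (value, index) in enumerate(zip(values, indices)):
        group = n // keep  # which group this stored value belongs to
        total += value * dense[group * group_size + index]
    return total


values, indices = compress_row([1, -5, 0, 3, 2, 2, 9, 0])
result = sparse_dot(values, indices, [1] * 8)
```

Because every group is known to hold exactly `keep` stored values, the compressed row has a fixed size, and the multiply loop touches half as many elements as the dense version.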
Abstract: A method for network communication includes receiving from a first network a data packet having a header specifying a first source address in the first network and a destination address in a second network and looking up the first source address in a network address translation (NAT) table. Upon finding, in response to looking up the first source address, that the first source address is not listed in the NAT table, an entry is added to the NAT table specifying a corresponding second source address in the second network. One or more additional first source addresses that are not listed in the NAT table are predictively selected, and one or more further entries are added to the NAT table specifying one or more second source addresses in the second network corresponding to the one or more additional first source addresses.
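The predictive NAT table update in the abstract above might look like the sketch below. The `NatTable` class, the address pool, and the "next sequential addresses" predictor are all hypothetical; the abstract does not say how the additional source addresses are predicted. The sketch assumes the pool is not empty when a new source address arrives.

```python
import ipaddress


def next_addresses(source, count=2):
    """Hypothetical predictor: guess the next `count` sequential source addresses."""
    base = ipaddress.ip_address(source)
    return [str(base + i) for i in range(1, count + 1)]


class NatTable:
    def __init__(self, pool):
        self.pool = list(pool)  # unused addresses in the second network
        self.entries = {}       # first source address -> second source address

    def _allocate(self, source):
        self.entries[source] = self.pool.pop(0)

    def translate(self, source, predict):
        """Look up the source; on a miss, add it plus any predicted sources."""
        if source not in self.entries:
            self._allocate(source)
            for guess in predict(source):
                if guess not in self.entries and self.pool:
                    self._allocate(guess)
        return self.entries[source]


table = NatTable(["203.0.113.1", "203.0.113.2", "203.0.113.3", "203.0.113.4"])
mapped = table.translate("10.0.0.5", next_addresses)
```

A later packet from 10.0.0.6 then finds its entry already present, so the lookup succeeds without a table update on the packet path.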
Abstract: In various examples, radio frequency conducted power of a device—such as a human interface device (HID)—may be adjusted to account for various operating conditions. For example, when a user holds the device in their hand, the radiated power level of the device may be reduced to a level that reduces transmission performance of the device. To account for this, one or more detection mechanisms may be used to determine whether the device is held in hand and, when determined to be held in hand, the radio frequency conducted power of the device may be increased—while complying with regulatory requirements governing wireless transmissions—to increase the performance of the device. When not held in hand, the radio frequency conducted power may be less, thereby resulting in consistent performance of the device under varying operating conditions.