Abstract: In various examples, applications may be executed on remote computing devices to composite and broadcast gameplay with video and audio data. Systems and methods are disclosed that distribute, between and among various computing devices, processing of tasks including rendering of gameplay, composition of various types of data, and broadcasting of composited data. The tasks may be executed on computing devices that are remote to a client device, such as a virtual machine, GPU, server, and/or other computing device in the cloud, all connected through a network. Customized composited content may be generated within the system, without added latency or dropped frames, by distributing tasks such as compositing and rendering of gameplay to computing devices that have high performance capability and are specialized for handling memory- and time-intensive tasks.
Type:
Grant
Filed:
April 15, 2021
Date of Patent:
June 11, 2024
Assignee:
NVIDIA Corporation
Inventors:
James van Welzen, Amit Parikh, Jonathan White, Travis Muhlestein
Abstract: Systems and methods for operating a datacenter are disclosed. In at least one embodiment, a power delivery system includes one or more fuel cells to provide a source of electrical power for a datacenter, where waste heat produced by a fuel cell is to be captured and provided to an absorption chiller to produce a cooled liquid for use in a cooling system for this datacenter.
Abstract: Machine learning systems that implement neural networks typically operate in an inference mode or a training mode. In the training mode, inference operations are performed to help guide the training process. Inference mode operation typically involves forward propagation and intensive access to certain sparse matrices, encoded as a set of vectors. Back propagation and intensive access to transposed versions of the same sparse matrices provide training refinements. Generating a transposed version of a sparse matrix can consume significant additional memory and computation resources. In one embodiment, two additional encoding vectors are generated, providing efficient operations on sparse matrices and also on transposed representations of the same sparse matrices. In a neural network the efficient operations can reduce the amount of memory needed for backpropagation and reduce power consumption.
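The abstract describes the idea only at a high level. As a hedged illustration of the general approach (not the patented implementation), a sparse matrix in a CSR-style encoding can gain two extra vectors — here hypothetically named `perm` (the nonzeros listed in column-major order) and `col_ptr` (delimiting each column's run inside `perm`) — so the transpose can be traversed without materializing a second copy of the matrix:

```python
import numpy as np

def csr_encode(A):
    """Standard CSR-style encoding: values, row/column indices, row pointers."""
    rows, cols = np.nonzero(A)
    row_ptr = np.searchsorted(rows, np.arange(A.shape[0] + 1))
    return A[rows, cols], rows, cols, row_ptr

def transpose_vectors(col_idx, n_cols):
    """Two additional vectors enabling transposed traversal of the same data:
    perm lists the CSR nonzeros in column-major order, and col_ptr marks
    where each column's run begins inside perm."""
    perm = np.argsort(col_idx, kind="stable")
    col_ptr = np.searchsorted(col_idx[perm], np.arange(n_cols + 1))
    return perm, col_ptr

def spmv_transposed(vals, rows, perm, col_ptr, x):
    """Compute y = A.T @ x by walking the original nonzeros in column order."""
    y = np.zeros(len(col_ptr) - 1)
    for j in range(len(y)):
        for k in perm[col_ptr[j]:col_ptr[j + 1]]:
            y[j] += vals[k] * x[rows[k]]
    return y

A = np.array([[0., 2., 0.],
              [3., 0., 4.]])
vals, rows, cols, row_ptr = csr_encode(A)
perm, col_ptr = transpose_vectors(cols, A.shape[1])
y = spmv_transposed(vals, rows, perm, col_ptr, np.array([1., 2.]))
```

The values array is never duplicated or reordered; only the two small index vectors are added, which is the memory saving the abstract alludes to.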
Abstract: Systems and methods for machine learning based seatbelt position detection and classification. A number of fiducial markers are placed on a vehicle seatbelt. A camera or other sensor is placed within the vehicle, to capture images or other data relating positions of the fiducial markers when the seatbelt is in use. One or more models such as machine learning models may then determine the spatial positions of the fiducial markers from the captured image information, and determine the worn state of the seatbelt. In particular, the system may determine whether the seatbelt is being worn in one or more improper states, such as not being worn or being worn in an unsafe or dangerous manner, and if so, the system may alert the vehicle to take corrective action. In this manner, the system provides constant and real-time monitoring of seatbelts to improve seatbelt usage and safety.
Abstract: Level-shifting circuits utilizing storage cells to shift signals low-to-high or high-to-low include control drivers with moving supply voltages. The moving supply voltages may power positive or negative supply terminals of the control drivers. The control drivers drive gates of common-source configured devices coupled to storage nodes of the storage cell.
Type:
Grant
Filed:
September 14, 2022
Date of Patent:
June 11, 2024
Assignee:
NVIDIA CORP.
Inventors:
Walker Joseph Turner, John Poulton, Sanquan Song
Abstract: In various examples, systems and methods are disclosed relating to generating dialogue responses from structured data for conversational artificial intelligence (AI) systems and applications. Systems and methods are disclosed for training or updating a machine learning model—such as a deep neural network—for deployment using structured data from dialogues of multiple domains. The systems and methods can generate responses to users to provide a more natural user experience, such as by generating alternative outputs that vary in syntax with respect to how the outputs incorporate data used to respond to user utterances, while still accurately providing information to satisfy requests from users.
Abstract: Apparatuses, systems, and techniques to generate images. In at least one embodiment, one or more machine learning models generate an output image based, at least in part, on calculating attention scores using time embeddings.
Type:
Application
Filed:
July 17, 2023
Publication date:
June 6, 2024
Applicant:
NVIDIA Corporation
Inventors:
Ali Hatamizadeh, Jiaming Song, Jan Kautz, Arash Vahdat
Abstract: Implicit Memory Tagging (IMT) mechanisms utilize alias-free memory tags that enable hardware-assisted memory tagging without incurring storage overhead beyond that of conventional tagging mechanisms, while providing enhanced data integrity and memory security. The IMT mechanisms extend error correcting codes (ECCs) to test memory tags in addition to their traditional role of detecting and correcting data errors, and enable a finer granularity of memory tagging than many conventional approaches.
Type:
Application
Filed:
October 11, 2023
Publication date:
June 6, 2024
Applicant:
NVIDIA Corp.
Inventors:
Michael B. Sullivan, Mohamed Tarek Bnziad Mohamed Hassan, Aamer Jaleel
Abstract: Aspects of this technical solution can identify, based at least on a representation of a quantum computing circuit, a first node of a topology of a computing platform configured to simulate at least a portion of the quantum computing circuit; compute a first metric indicating a first latency, the first latency based at least on a portion of the topology including the first node; select a second node of the topology having a second metric indicating a second latency less than the first latency, the second latency based at least on a portion of the topology including the second node; and simulate the quantum computing circuit on the computing platform using the second node.
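The selection step can be sketched concretely. In this hypothetical version (the node names, metric definition, and topology are illustrative assumptions, not taken from the patent), a node's metric is its total shortest-path latency to the rest of the topology, and a candidate node is replaced by any node with a lower metric:

```python
import heapq

def total_latency(adj, src):
    """Dijkstra: sum of shortest-path latencies from src to every reachable node."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return sum(dist.values())

def select_lower_latency(adj, first):
    """Keep the first candidate unless some node has a strictly lower metric."""
    best, best_m = first, total_latency(adj, first)
    for node in adj:
        m = total_latency(adj, node)
        if m < best_m:
            best, best_m = node, m
    return best

# Hypothetical three-node topology with per-link latencies
topology = {
    "node_a": [("node_b", 1.0), ("node_c", 5.0)],
    "node_b": [("node_a", 1.0), ("node_c", 1.0)],
    "node_c": [("node_a", 5.0), ("node_b", 1.0)],
}
```

Here `node_b` sits on the cheapest paths to both neighbors, so it displaces an initially identified `node_a`.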
Abstract: In various examples, past location information corresponding to actors in an environment and map information may be applied to a deep neural network (DNN)—such as a recurrent neural network (RNN)—trained to compute information corresponding to future trajectories of the actors. The output of the DNN may include, for each future time slice the DNN is trained to predict, a confidence map representing a confidence for each pixel that an actor is present and a vector field representing locations of actors in confidence maps for prior time slices. The vector fields may thus be used to track an object through confidence maps for each future time slice to generate a predicted future trajectory for each actor. The predicted future trajectories, in addition to tracked past trajectories, may be used to generate full trajectories for the actors that may aid an ego-vehicle in navigating the environment.
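The tracking mechanism described — following vector-field links between confidence maps of successive time slices — can be sketched on synthetic data. This is an illustrative reconstruction under assumed conventions (each vector stores the offset back to the actor's location in the prior slice), not the DNN's actual output format:

```python
import numpy as np

def track_trajectory(conf_maps, vec_fields):
    """Start at the peak of the last confidence map and follow each
    vector-field link back to the actor's location in the prior slice."""
    T, H, W = conf_maps.shape
    y, x = np.unravel_index(np.argmax(conf_maps[-1]), (H, W))
    traj = [(int(y), int(x))]
    for t in range(T - 1, 0, -1):
        dy, dx = vec_fields[t, y, x]       # offset to location at slice t-1
        y, x = int(y + dy), int(x + dx)
        traj.append((y, x))
    return traj[::-1]                      # earliest slice first

# Synthetic example: one actor moving (2,2) -> (3,4) -> (5,5)
positions = [(2, 2), (3, 4), (5, 5)]
conf = np.zeros((3, 8, 8))
vec = np.zeros((3, 8, 8, 2))
for t, (y, x) in enumerate(positions):
    conf[t, y, x] = 1.0                    # confidence peak at the actor
    if t > 0:
        py, px = positions[t - 1]
        vec[t, y, x] = (py - y, px - x)    # link back to the prior slice
traj = track_trajectory(conf, vec)
```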
Abstract: The technology disclosed herein involves using a machine learning model (e.g., CNN) to expand lower dynamic-range image content (e.g., SDR images) into higher dynamic-range image content (e.g., HDR images). The machine learning model can take as input the lower dynamic-range image and can output multiple expansion maps that are used to make the expanded image appear more natural. The expansion maps may be used by image operators to smooth color banding and to dim overexposed regions or user interface elements in the expanded image. The expanded content (e.g., HDR image content) may then be provided to one or more devices for display or storage.
Type:
Grant
Filed:
March 2, 2022
Date of Patent:
June 4, 2024
Assignee:
Nvidia Corporation
Inventors:
Shaveen Kumar, Anjul Patney, Eric Xu, Anton Moor
Abstract: Apparatuses, systems, and techniques for handling faults by a direct memory access (DMA) engine. When a DMA engine detects an error associated with an encryption or decryption operation, the DMA engine reports the error to the CPU, which may be executing untrusted software directing a DMA operation, and to a secure processor. The DMA engine waits for clearance from the secure processor before responding to further directions from the potentially untrusted software.
Abstract: A combined on-package and off-package memory system uses a custom base-layer within which are fabricated one or more dedicated interfaces to off-package memories. An on-package processor and on-package memories are also directly coupled to the custom base-layer. The custom base-layer includes memory management logic between the processor and memories (both off and on package) to steer requests. The memories are exposed as a combined memory space having greater bandwidth and capacity compared with either the off-package memories or the on-package memories alone. The memory management logic services requests while maintaining quality of service (QoS) to satisfy bandwidth requirements for each allocation. An allocation may include any combination of the on and/or off package memories. The memory management logic also manages data migration between the on and off package memories.
Type:
Grant
Filed:
August 23, 2023
Date of Patent:
June 4, 2024
Assignee:
NVIDIA Corporation
Inventors:
Niladrish Chatterjee, James Michael O'Connor, Donghyuk Lee, Gaurav Uttreja, Wishwesh Anil Gandhi
Abstract: Apparatuses, systems, and techniques that use a graphics processing unit (GPU) to compute cyclic redundancy checks (CRCs). For example, in at least one embodiment, an input data sequence is distributed among GPU threads for parallel calculation of an overall CRC value for the input data sequence according to various novel techniques described herein.
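The algebra that makes per-thread CRC computation possible can be shown in a CPU-side sketch (not the patented GPU method). Assuming a non-reflected CRC-32 with zero initial value and no final XOR, the CRC is linear over GF(2), so each chunk can be processed independently with zero-extension to the end of the message and the partial CRCs XORed together:

```python
def crc32_raw(data: bytes, poly: int = 0x04C11DB7) -> int:
    """Bitwise CRC-32 as a pure polynomial remainder (zero init, no final
    XOR), which makes the CRC linear over GF(2)."""
    reg = 0
    for byte in data:
        reg ^= byte << 24
        for _ in range(8):
            reg = ((reg << 1) ^ poly if reg & 0x80000000 else reg << 1) & 0xFFFFFFFF
    return reg

def crc32_parallel(data: bytes, n_chunks: int = 4) -> int:
    """Each 'thread' handles one chunk, zero-extended to the message end;
    by linearity the overall CRC is the XOR of the partial CRCs."""
    size = -(-len(data) // n_chunks)       # ceiling division
    result = 0
    for i in range(0, len(data), size):
        chunk = data[i:i + size]
        suffix = len(data) - i - len(chunk)
        result ^= crc32_raw(chunk + b"\x00" * suffix)
    return result

msg = b"parallel CRC on GPU threads"
```

A production version would replace the zero-extension with a precomputed "multiply by x^(8k) mod poly" table so no thread touches padding bytes; the XOR-combine step is unchanged.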
Abstract: Disclosed are apparatuses, systems, and techniques to perform and facilitate secure ladder computational operations whose iterative execution depends on secret values associated with input data. Disclosed embodiments balance execution across iterations so that the work performed is the same for different secret values, significantly reducing the vulnerability of ladder computations to adversarial side-channel attacks.
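The classic example of such a balanced ladder is the Montgomery ladder for modular exponentiation: every iteration performs exactly one multiply and one square whichever way the secret bit falls. The sketch below shows the balancing idea only; it is not the disclosed embodiment, and a real side-channel-hardened implementation would also replace the Python branch with a branchless conditional swap:

```python
def ladder_pow(base: int, exponent: int, modulus: int) -> int:
    """Montgomery ladder: each iteration does one multiply and one square
    regardless of the secret exponent bit, balancing the per-iteration work.
    Invariant: r1 == r0 * base (mod modulus)."""
    r0, r1 = 1, base % modulus
    for i in reversed(range(exponent.bit_length())):
        if (exponent >> i) & 1:
            r0 = (r0 * r1) % modulus
            r1 = (r1 * r1) % modulus
        else:
            r1 = (r0 * r1) % modulus
            r0 = (r0 * r0) % modulus
    return r0
```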
Abstract: Apparatuses, systems, and techniques to estimate one or more wireless channels between one or more user devices and a base station. In at least one embodiment, one or more circuits use one or more groups of two or more reflected wireless reference signals to estimate the one or more wireless channels based, at least in part, on one or more bandlimited functions.
Abstract: A method, computer program product, apparatus, and system are provided. Some embodiments may include transmitting a request to make one or more writes associated with an identification tag. The request may include the identification tag, the one or more writes, a first instruction to make the one or more writes to one of a plurality of persistence levels of a memory, and a second instruction to respond with at least one first indication that the one or more writes associated with the identification tag have been written to at least one of the one of the plurality of persistence levels of the memory. Some embodiments may include receiving the at least one first indication that the one or more writes associated with the identification tag have been written to at least one of the one of the plurality of persistence levels of the memory.
Abstract: A neural network includes at least a first network layer that includes a first set of filters and a second network layer that includes a second set of filters. Notably, a filter was removed from the first network layer. A bias associated with a different filter included in the second set of filters compensates for a different bias associated with the filter that was removed from the first network layer.
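The compensation described can be made concrete: one standard way (an illustrative sketch with random data and hypothetical shapes, not necessarily the patented procedure) is to fold the removed filter's mean activation, weighted by the second layer's weights for that filter, into the second layer's bias, which preserves the mean output over the data exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))           # sample inputs
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)   # first layer
W2, b2 = rng.normal(size=(3, 5)), rng.normal(size=3)   # second layer

A = np.maximum(X @ W1.T + b1, 0)        # first-layer ReLU activations
k = 2                                   # filter to remove from the first layer
m = A[:, k].mean()                      # its mean activation over the data

keep = [i for i in range(5) if i != k]
W1p, b1p = W1[keep], b1[keep]           # first layer without filter k
W2p = W2[:, keep]                       # drop the matching input column
b2p = b2 + W2[:, k] * m                 # bias compensates the removed filter

Y  = A @ W2.T + b2                      # original second-layer output
Ap = np.maximum(X @ W1p.T + b1p, 0)
Yp = Ap @ W2p.T + b2p                   # pruned + compensated output
```

Because the second layer is linear in its inputs, the mean of `Yp` over the dataset matches the mean of `Y` exactly; individual samples differ by the removed filter's deviation from its mean.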
Abstract: In examples, the number of rays used to sample lighting conditions of a light source in a virtual environment with respect to particular locations in the virtual environment may be adapted to scene conditions. An additional ray(s) may be used for locations that tend to be associated with visual artifacts in rendered images. A determination may be made on whether to cast an additional ray(s) to a light source for a location and/or a quantity of rays to cast. To make the determination variables such as visibilities and/or hit distances of ray-traced samples of the light source may be analyzed for related locations in the virtual environment, such as those in a region around the location (e.g., within an N-by-N kernel centered at the location). Factors may include variability in visibilities and/or hit distances, differences between visibilities and/or hit distances relative to the location, and magnitudes of hit distances.
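The adaptation rule can be sketched in a simplified form. This hypothetical version (the weighting constant and map sizes are assumptions, not from the patent) looks only at the variability of visibility samples in an N-by-N kernel around each location and allocates extra shadow rays where that variability is high:

```python
import numpy as np

def extra_ray_counts(visibility, base_rays=1, max_extra=3, kernel=3):
    """Per-pixel ray budget: base_rays everywhere, plus up to max_extra
    rays where visibility varies inside the surrounding kernel (a proxy
    for penumbra regions prone to visual artifacts)."""
    H, W = visibility.shape
    pad = kernel // 2
    v = np.pad(visibility, pad, mode="edge")
    counts = np.full((H, W), base_rays)
    for y in range(H):
        for x in range(W):
            win = v[y:y + kernel, x:x + kernel]
            # std() is 0 in fully lit/shadowed regions, larger at boundaries
            counts[y, x] += min(max_extra, int(win.std() * 2 * max_extra + 0.5))
    return counts

# Half-lit test image: left half shadowed (0), right half visible (1)
vis = np.zeros((4, 4))
vis[:, 2:] = 1.0
counts = extra_ray_counts(vis)
```

Interior pixels with uniform visibility keep the single base ray; pixels straddling the shadow boundary receive additional rays, matching the abstract's goal of spending rays where artifacts tend to appear.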
Abstract: Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, an integrated power and coolant distribution unit (PCDU) is provided to determine a change in a power state or a coolant state of at least one server and to enable a coolant response from an overhead cooling unit (OCU) to dissipate heat from secondary coolant of a secondary cooling loop.