Abstract: Apparatuses, systems, and techniques to optimize processor resources at a user-defined level. In at least one embodiment, the priority of one or more tasks is adjusted to prevent one or more other dependent tasks from entering an idle state due to a lack of resources to consume.
Type:
Grant
Filed:
December 20, 2019
Date of Patent:
April 9, 2024
Assignee:
NVIDIA Corporation
Inventors:
Jonathon Evans, Lacky Shah, Phil Johnson, Jonah Alben, Brian Pharris, Greg Palmer, Brian Fahs
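The priority-adjustment idea in the abstract above can be sketched in simplified form. All names and the data layout here are illustrative, not taken from the patent:

```python
# Minimal sketch: when a dependent (consumer) task is about to idle
# because its producer has no pending output, boost the producer's
# priority so the consumer keeps receiving work.
def rebalance(tasks, deps):
    """tasks: name -> {'priority': int, 'pending_output': int}
    deps: consumer name -> producer name it depends on."""
    for consumer, producer in deps.items():
        # A consumer with no pending input from its producer would go idle.
        if tasks[producer]['pending_output'] == 0:
            tasks[producer]['priority'] += 1  # boost the starved producer
    return tasks
```

A scheduler would call this each scheduling quantum; only producers whose consumers would otherwise stall get promoted.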
Abstract: A computing system includes a volatile memory, a cache coupled with the volatile memory, and a processing device coupled with the cache and at least one of a storage device or a network port. The processing device is to: generate a plurality of virtual addresses that are sequentially numbered for data that is to be at least one of processed or transferred in response to an input/output (I/O) request; allocate, for the data, a continuous range of physical addresses of the volatile memory; generate a set of hash-based values based on mappings between the plurality of virtual addresses and respective physical addresses of the continuous range of physical addresses; identify a unique cache line of the cache that corresponds to each respective hash-based value of the set of hash-based values; and cause the data to be directly stored in the unique cache lines of the cache.
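The hash-based mapping step can be illustrated with a small sketch. The hash function, cache geometry, and page size below are assumptions for illustration only; the patent does not specify them:

```python
import hashlib

CACHE_LINES = 1024          # assumed cache geometry
PAGE = 4096                 # assumed page size

def place_in_cache(base_va, base_pa, num_pages):
    """Hash each virtual->physical address mapping of a contiguous
    allocation to a cache-line index (illustrative only)."""
    lines = []
    for i in range(num_pages):
        va, pa = base_va + i * PAGE, base_pa + i * PAGE
        digest = hashlib.sha256(f"{va:x}:{pa:x}".encode()).digest()
        lines.append(int.from_bytes(digest[:4], "big") % CACHE_LINES)
    return lines
```

Each sequentially numbered virtual page thus resolves deterministically to one line index, which a device could target directly when storing I/O data.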
Abstract: In various examples, systems and methods for reducing write requirements in a system on chip (SoC) are described herein. For instance, a total number of iterations may be determined for processing data, such as data representing an array. In some circumstances, a set of iterations may include a first number of iterations that is less than a second number of iterations. As such, and during execution of the set of iterations, a predicate flag corresponding to an excess iteration of the set of iterations may be generated, where the excess iteration is one of a number of excess iterations corresponding to the difference between the first number of iterations and the second number of iterations. Based on the predicate flag, one or more first values corresponding to the excess iteration may be prevented from being written to memory.
Type:
Grant
Filed:
August 2, 2021
Date of Patent:
April 9, 2024
Assignee:
NVIDIA Corporation
Inventors:
Ching-Yu Hung, Ravi P Singh, Jagadeesh Sankaran, Yen-Te Shih, Ahmad Itani
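The predicate-flag mechanism above amounts to tail predication of a loop. A minimal scalar sketch (the placeholder compute and argument names are invented for illustration):

```python
def predicated_store(values, out, total_iters, valid_iters):
    """Run total_iters iterations, but write a result only when the
    excess-iteration predicate flag is clear; writes from the tail
    iterations beyond valid_iters are suppressed."""
    for i in range(total_iters):
        excess = i >= valid_iters       # predicate flag for excess iterations
        result = values[i] * 2          # placeholder compute (still executed)
        if not excess:
            out[i] = result             # store only for valid iterations
    return out
```

In hardware the compute lanes still run for the excess iterations; only the memory writes are masked, which is what reduces write traffic.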
Abstract: Disclosed are apparatuses, systems, and techniques to perform and facilitate fast and efficient modular computational operations, such as modular division and modular inversion, using shared platforms, including hardware accelerator engines.
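Modular inversion, one of the operations named above, is conventionally computed with the extended Euclidean algorithm. The patent covers accelerator implementations, not this software reference version:

```python
def mod_inverse(a, m):
    """Return x such that (a * x) % m == 1, via the extended
    Euclidean algorithm. Raises if a is not invertible modulo m."""
    old_r, r = a % m, m
    old_s, s = 1, 0
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r      # gcd remainders
        old_s, s = s, old_s - q * s      # Bezout coefficient for a
    if old_r != 1:
        raise ValueError("a is not invertible modulo m")
    return old_s % m
```

Modular division b/a (mod m) then reduces to `(b * mod_inverse(a, m)) % m`.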
Abstract: Systems and techniques for performing multicast-reduction operations. In at least one embodiment, a network device receives first network data associated with a multicast operation to be collectively performed by at least a plurality of endpoints. The network device reserves resources to process second network data to be received from the endpoints, and sends the first network data to a plurality of additional network devices. The network device receives the second network data, and processes the second network data using the reserved resources.
Type:
Grant
Filed:
March 30, 2022
Date of Patent:
April 9, 2024
Assignee:
NVIDIA Corporation
Inventors:
Glenn Dearth, Mark Hummel, Nan Jiang, Gregory Thorson
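The flow in the abstract above (forward the multicast, reserve resources, then reduce the replies) can be modeled with a toy switch node. Class and method names are illustrative, not NVIDIA's:

```python
class SwitchNode:
    """Toy model of in-network multicast-reduction: forward first
    network data downstream, reserve state for the replies, then
    reduce the second network data with a combining operator."""
    def __init__(self, children):
        self.children = children           # downstream endpoints (callables)
        self.reserved = None

    def multicast_reduce(self, data, op):
        self.reserved = []                 # reserve resources for replies
        for child in self.children:
            self.reserved.append(child(data))   # multicast and collect
        result = self.reserved[0]
        for value in self.reserved[1:]:
            result = op(result, value)     # reduce using reserved state
        return result
```

A real device would reserve the reduction state before forwarding, so replies arriving in any order can be combined without further allocation.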
Abstract: Systems and methods for cooling a computer environment are disclosed. In at least one embodiment, one or more neural networks can be used to adjust one or more flow control valves, of a liquid cooling system for a data center, to control a variation in liquid flow rate across the data center.
Abstract: In various examples, systems and methods are described that generate scene flow in 3D space through simplifying the 3D LiDAR data to “2.5D” optical flow space (e.g., x, y, and depth flow). For example, LiDAR range images may be used to generate 2.5D representations of depth flow information between frames of LiDAR data, and two or more range images may be compared to generate depth flow information, and messages may be passed—e.g., using a belief propagation algorithm—to update pixel values in the 2.5D representation. The resulting images may then be used to generate 2.5D motion vectors, and the 2.5D motion vectors may be converted back to 3D space to generate a 3D scene flow representation of an environment around an autonomous machine.
Type:
Grant
Filed:
August 2, 2021
Date of Patent:
April 9, 2024
Assignee:
NVIDIA Corporation
Inventors:
David Wehr, Ibrahim Eden, Joachim Pehserl
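The final step above, lifting 2.5D motion vectors back to 3D, can be sketched as follows. For simplicity this uses a pinhole camera model with invented intrinsics; the pipeline described in the abstract operates on LiDAR range-image geometry instead:

```python
def flow_2p5d_to_3d(u, v, depth, du, dv, ddepth,
                    fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    """Lift a 2.5D flow vector (pixel motion du, dv plus depth change
    ddepth) at pixel (u, v) back to a 3D scene-flow vector."""
    def unproject(u, v, z):
        # Back-project a pixel at depth z into 3D camera space.
        return ((u - cx) * z / fx, (v - cy) * z / fy, z)
    p0 = unproject(u, v, depth)
    p1 = unproject(u + du, v + dv, depth + ddepth)
    return tuple(b - a for a, b in zip(p0, p1))
```

At the principal point a pure depth change maps to motion straight along the optical axis, as expected.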
Abstract: High dynamic range (HDR) support is provided for legacy application programs, such as games that are configured to display standard dynamic range (SDR) frames. HDR frames may be synthesized without modifying the legacy application program. The buffer creation process of the legacy application program is intercepted and modified before creation of the SDR format buffer so that the buffer is configured to use an HDR format. A location of an intermediate buffer storing HDR rendered data is determined by intercepting and analyzing graphics driver calls in a command stream produced by the legacy application program. The HDR rendered data is combined with user interface content extracted from the SDR frames. Additionally, any post processing effects used by the legacy application program to produce the SDR frames may be predicted and applied to the HDR rendered data to synthesize the HDR frames for display on a modern HDR display device.
Abstract: In various examples, systems and methods are disclosed relating to differentially private generative machine learning models. Systems and methods are disclosed for configuring generative models using privacy criteria, such as differential privacy criteria. The systems and methods can generate outputs representing content using machine learning models, such as diffusion models, that are determined in ways that satisfy differential privacy criteria. The machine learning models can be determined by diffusing the same training data to multiple noise levels.
Type:
Application
Filed:
February 3, 2023
Publication date:
April 4, 2024
Applicant:
NVIDIA Corporation
Inventors:
Karsten Julian Kreis, Tim Dockhorn, Tianshi Cao, Arash Vahdat
Abstract: Apparatuses, systems, and techniques to render computer graphics. In at least one embodiment, a first one or more lights are selected from among lights in a virtual scene to be rendered as a frame of graphics, and a second one or more lights are selected from among lights used to render one or more pixels in at least one of a prior frame or the current frame. A pixel of the current frame is rendered using the first and second one or more lights, and a light is selected for reuse in rendering a subsequent frame from among the first and second one or more lights.
Abstract: The disclosure provides a framework or system for learning visual representation using a large set of image/text pairs. The disclosure provides, for example, a method of visual representation learning, a joint representation learning system, and an artificial intelligence (AI) system that employs one or more of the trained models from the method or system. The AI system can be used, for example, in autonomous or semi-autonomous vehicles. In one example, the method of visual representation learning includes: (1) receiving a set of image embeddings from an image representation model and a set of text embeddings from a text representation model, and (2) training, employing mutual information, a critic function by learning relationships between the set of image embeddings and the set of text embeddings.
Type:
Grant
Filed:
August 21, 2020
Date of Patent:
April 2, 2024
Assignee:
NVIDIA Corporation
Inventors:
Arash Vahdat, Tanmay Gupta, Xiaodong Yang, Jan Kautz
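Training a critic on image/text embedding pairs with mutual information is commonly done with an InfoNCE-style objective; the sketch below uses a plain dot-product critic, which may differ from the patent's exact critic function:

```python
import math

def infonce_loss(image_embs, text_embs):
    """InfoNCE-style lower bound on mutual information between paired
    image/text embeddings (lists of equal-length float vectors, where
    image_embs[i] is paired with text_embs[i])."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    n = len(image_embs)
    loss = 0.0
    for i in range(n):
        scores = [dot(image_embs[i], t) for t in text_embs]
        log_den = math.log(sum(math.exp(s) for s in scores))
        loss += log_den - scores[i]   # -log softmax of the matched pair
    return loss / n
```

Minimizing this loss drives matched image/text pairs to score higher than mismatched ones, which is the relationship-learning step described above.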
Abstract: Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, a modular unit is swappable or hot-swappable and has a heat exchanger, a variable speed fan, and at least one flow controller to pass fluid through microchannels of a cold plate, so that the fluid extracts heat from at least one computing device and so that fluid passing through the heat exchanger enables dissipation of heat by forced air from the variable speed fan.
Abstract: In various examples, two or more cameras in an automotive surround view system generate two or more input images to be stitched, or combined, into a single stitched image. In an embodiment, to improve the quality of a stitched image, a feedback module calculates two or more scores representing errors between the stitched image and one or more input images. If a computed score indicates structural errors in the stitched image, the feedback module calculates and applies one or more geometric transforms to apply to the one or more input images. If a computed score indicates color errors in the stitched image, the feedback module calculates and applies one or more photometric transforms to apply to the one or more input images.
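The feedback scoring above can be caricatured with a tiny example: a uniform intensity offset between source and stitched pixels suggests a photometric (color) correction, while residual structure after removing that offset suggests a geometric one. Thresholds and the score definitions here are invented for illustration:

```python
def stitch_feedback(stitched, source, structural_tol=10.0, color_tol=10.0):
    """Compare a stitched image against a source image (flat grayscale
    lists of equal length) and suggest which transform family to apply."""
    diffs = [s - t for s, t in zip(source, stitched)]
    bias = sum(diffs) / len(diffs)                  # uniform offset -> color error
    residual = max(abs(d - bias) for d in diffs)    # structure left after de-biasing
    return {'apply_photometric': abs(bias) > color_tol,
            'apply_geometric': residual > structural_tol}
```

A production system would use proper structural and photometric similarity metrics per overlap region rather than these toy scores.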
Abstract: In various embodiments of the present disclosure, playstyle patterns of players are learned and used to generate virtual representations (“bots”) of users. Systems and methods are disclosed that use game session data (e.g., metadata) from a plurality of game sessions of a game to learn playstyle patterns of users, based on user inputs of the user in view of variables presented within the game sessions. The game session data is applied to one or more machine learning models to learn playstyle patterns of the user for the game, and associated with a user profile of the user. Profile data representative of the user profile is then used to control or instantiate bots of the users, or of categories of users, according to the learned playstyle patterns.
Type:
Grant
Filed:
February 25, 2021
Date of Patent:
April 2, 2024
Assignee:
NVIDIA Corporation
Inventors:
Andrew Fear, Brian Burke, Pillulta Venkata Naga Hanumath Prasad, Abhishek Lalwani
Abstract: A system includes a hardware circuitry having a device coupled with one or more external memory devices. The device is to detect an input/output (I/O) request associated with an external memory device of the one or more external memory devices. The device is to record a first timestamp in response to detecting the I/O request transmitted to the external memory device. The device is further to detect an indication from the external memory device of a completion of the I/O request associated with the external memory device and record a second timestamp in response to detecting the indication. The device is also to determine a latency associated with the I/O request based on the first timestamp and the second timestamp.
Type:
Grant
Filed:
April 6, 2022
Date of Patent:
April 2, 2024
Assignee:
NVIDIA Corporation
Inventors:
Shridhar Rasal, Oren Duer, Aviv Kfir, Liron Mula
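The timestamp pairing described above reduces to: stamp on request, stamp on completion, subtract. A software sketch with invented names (the patent describes a hardware device, not this code):

```python
import time

class LatencyTracker:
    """Pair a request timestamp with its completion timestamp and
    report the latency of each I/O request, keyed by request id."""
    def __init__(self):
        self.pending = {}

    def on_request(self, req_id):
        self.pending[req_id] = time.monotonic()     # first timestamp

    def on_completion(self, req_id):
        # Second timestamp minus first timestamp is the I/O latency.
        return time.monotonic() - self.pending.pop(req_id)
```

Using a monotonic clock matters here: wall-clock adjustments between the two stamps would otherwise corrupt the measured latency.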
Abstract: A cheat detection methodology is disclosed that relates to identifying cheaters making super-human movements in interactive programs, for example, users trying to outcompete their opponents in video games using large aim assists and aim bots that perform actions that are not humanly feasible. The disclosed methodology substantially reduces or even eliminates the benefit that various cheating solutions offer. In one aspect, the disclosure provides a method of monitoring cheating in interactive programs. In one example, the method includes: (1) obtaining motion data corresponding to a user input device interacting with an interactive program, (2) segmenting data submovements from the motion data, (3) determining one or more attributes of the data submovements, and (4) detecting, based on the one or more attributes, data submovements that deviate from human submovements.
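Steps (2)-(4) above can be sketched on a 1-D speed trace: split the trace into submovements at local minima, compute per-segment attributes, and flag segments outside human ranges. The segmentation rule and thresholds below are made up for illustration, not the disclosed detector's values:

```python
def segment_submovements(velocities, dt=0.001):
    """Split a 1-D speed trace into submovements at local minima and
    return each segment's peak speed and duration (dt = sample period)."""
    cuts = [0]
    for i in range(1, len(velocities) - 1):
        if velocities[i] < velocities[i - 1] and velocities[i] <= velocities[i + 1]:
            cuts.append(i)                      # local minimum starts a segment
    cuts.append(len(velocities))
    return [{'peak': max(velocities[a:b]), 'duration': (b - a) * dt}
            for a, b in zip(cuts, cuts[1:]) if b > a]

def looks_superhuman(segments, peak_limit=20000.0, min_duration=0.01):
    """Flag segments whose attributes fall outside (invented) human ranges."""
    return any(s['peak'] > peak_limit or s['duration'] < min_duration
               for s in segments)
```

A real detector would use richer attributes (jerk, symmetry, inter-submovement timing) and learned rather than fixed thresholds.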
Abstract: Techniques are described for detecting an electromagnetic (“EM”) fault injection attack directed toward circuitry in a target digital system. In various embodiments, a first node may be coupled to first driving circuitry, and a second node may be coupled to second driving circuitry. The driving circuitry is implemented in a manner such that a logic state on the second node has greater sensitivity to an EM pulse than has a logic state on the first node. Comparison circuitry may be coupled to the first and to the second nodes to assert an attack detection output responsive to sensing a logic state on the second node that is unexpected relative to a logic state on the first node.
Abstract: In various examples, sensor data may be received that represents a field of view of a sensor of a vehicle located in a physical environment. The sensor data may be applied to a machine learning model that computes both a set of boundary points that correspond to a boundary dividing drivable free-space from non-drivable space in the physical environment and class labels for boundary points of the set of boundary points that correspond to the boundary. Locations within the physical environment may be determined from the set of boundary points represented by the sensor data, and the vehicle may be controlled through the physical environment within the drivable free-space using the locations and the class labels.
Abstract: In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
Type:
Grant
Filed:
January 6, 2023
Date of Patent:
March 26, 2024
Assignee:
NVIDIA Corporation
Inventors:
Ching-Yu Hung, Ravi P. Singh, Jagadeesh Sankaran, Yen-Te Shih, Ahmad Itani
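One of the simpler features listed above, the min/max collector, can be illustrated in software: track the running minimum and maximum of every value flowing through a datapath, so no separate reduction pass is needed afterward. This is an illustrative model, not the VPU's actual wiring:

```python
class MinMaxCollector:
    """Observe each value as it passes through and keep running
    minimum (lo) and maximum (hi) without a separate reduction pass."""
    def __init__(self):
        self.lo = float('inf')
        self.hi = float('-inf')

    def observe(self, value):
        if value < self.lo:
            self.lo = value
        if value > self.hi:
            self.hi = value
        return value        # pass the value through unchanged
```

In a vector processor the same idea runs per lane in parallel with the main computation, with a final cross-lane combine of the lo/hi registers.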
Abstract: In various examples, compute resources may be allocated for highlight generation in cloud gaming systems. Systems and methods are disclosed that distribute, between and among various devices, processing including user interface generation and overlay, analysis of game streams for actionable events, generation of highlights, storage of highlights, and sharing of highlights. The distribution of processing or compute resources within the cloud gaming system may be dependent on system information of various devices and/or networks. Recordings, snapshots, and/or other highlights may be generated within the cloud gaming system using the determined distribution of compute resources.