Patents Assigned to NVIDIA
-
Patent number: 11082347
Abstract: Multiple processors are often used in computing systems to solve very large, complex problems, such as those encountered in artificial intelligence. Such processors typically exchange data among each other via an interconnect fabric (such as, e.g., a group of network connections and switches) in solving such complex problems. The amount of data injected into the interconnect fabric by the processors can at times overwhelm the interconnect fabric preventing some of the processors from communicating with each other. To address this problem, techniques are disclosed to enable, for example, processors that are connected to an interconnect fabric to coordinate and control the amount of data injected so that the interconnect fabric does not get overwhelmed.
Type: Grant
Filed: February 15, 2019
Date of Patent: August 3, 2021
Assignee: Nvidia Corporation
Inventors: Glenn Dearth, Nan Jiang, John Wortman, Alex Ishii, Mark Hummel, Rich Reeves
-
Patent number: 11079434
Abstract: In various examples, a test system is provided for executing built-in-self-test (BIST) on integrated circuits deployed in the field. The integrated circuits may include a first device and a second device, the first device having direct access to external memory, which stores test data, and the second device having indirect access to the external memory by way of the first device. In addition to providing a mechanism to permit the first device and the second device to run test concurrently, the hardware and software may reduce memory requirements and runtime associated with running the test sequences, thereby making real-time BIST possible in deployment. Furthermore, some embodiments permit a single external memory image to cater to different SKU configurations.
Type: Grant
Filed: October 10, 2019
Date of Patent: August 3, 2021
Assignee: NVIDIA Corporation
Inventors: Anitha Kalva, Jue Wu
-
Patent number: 11082720
Abstract: A method, computer readable medium, and system are disclosed for identifying residual video data. This data describes data that is lost during a compression of original video data. For example, the original video data may be compressed and then decompressed, and this result may be compared to the original video data to determine the residual video data. This residual video data is transformed into a smaller format by means of encoding, binarizing, and compressing, and is sent to a destination. At the destination, the residual video data is transformed back into its original format and is used during the decompression of the compressed original video data to improve a quality of the decompressed original video data.
Type: Grant
Filed: November 14, 2018
Date of Patent: August 3, 2021
Assignee: NVIDIA CORPORATION
Inventors: Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
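The round trip described in this abstract (compress, decompress, compare with the original) can be illustrated with a minimal sketch. It assumes a uniform quantizer as a stand-in for the lossy video codec. The names `lossy_compress`, `lossy_decompress`, `compute_residual`, `apply_residual`, and the step size are hypothetical and are not taken from the patent. The encoding, binarizing, and compressing of the residual itself is omitted.

```python
# Hypothetical stand-in for a lossy codec: uniform quantization of pixel values.
STEP = 8  # assumed quantization step, chosen only for illustration


def lossy_compress(pixels, step=STEP):
    """Quantize each pixel to an integer index (information is lost here)."""
    return [p // step for p in pixels]


def lossy_decompress(indices, step=STEP):
    """Reconstruct approximate pixel values from the quantization indices."""
    return [i * step for i in indices]


def compute_residual(original, step=STEP):
    """Residual = original minus the compress-then-decompress round trip."""
    decoded = lossy_decompress(lossy_compress(original, step), step)
    return [o - d for o, d in zip(original, decoded)]


def apply_residual(decoded, residual):
    """At the destination, add the residual back to the decoded data."""
    return [d + r for d, r in zip(decoded, residual)]


original = [17, 64, 130, 255]
compressed = lossy_compress(original)
residual = compute_residual(original)
restored = apply_residual(lossy_decompress(compressed), residual)
```

In this toy setting the residual is carried losslessly, so `restored` equals `original`. A real system would also compress the residual, trading some of that quality back for size.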
-
Patent number: 11080051
Abstract: A technique for block data transfer is disclosed that reduces data transfer and memory access overheads and significantly reduces multiprocessor activity and energy consumption. Threads executing on a multiprocessor needing data stored in global memory can request and store the needed data in on-chip shared memory, which can be accessed by the threads multiple times. The data can be loaded from global memory and stored in shared memory using an instruction which directs the data into the shared memory without storing the data in registers and/or cache memory of the multiprocessor during the data transfer.
Type: Grant
Filed: December 12, 2019
Date of Patent: August 3, 2021
Assignee: NVIDIA Corporation
Inventors: Andrew Kerr, Jack Choquette, Xiaogang Qiu, Omkar Paranjape, Poornachandra Rao, Shirish Gadre, Steven J. Heinrich, Manan Patel, Olivier Giroux, Alan Kaatz
-
Patent number: 11079764
Abstract: In various examples, a current claimed set of points representative of a volume in an environment occupied by a vehicle at a time may be determined. A vehicle-occupied trajectory and at least one object-occupied trajectory may be generated at the time. An intersection between the vehicle-occupied trajectory and an object-occupied trajectory may be determined based at least in part on comparing the vehicle-occupied trajectory to the object-occupied trajectory. Based on the intersection, the vehicle may then execute the first safety procedure or an alternative procedure that, when implemented by the vehicle when the object implements the second safety procedure, is determined to have a lesser likelihood of incurring a collision between the vehicle and the object than the first safety procedure.
Type: Grant
Filed: February 1, 2019
Date of Patent: August 3, 2021
Assignee: NVIDIA Corporation
Inventors: David Nister, Hon-Leung Lee, Julia Ng, Yizhou Wang
-
Patent number: 11080111
Abstract: Apparatuses, systems, and techniques to execute programs in a single hardware context on a graphics processing unit (GPU). In at least one embodiment, resource management patches expressed in library or executable code are applied to one or more kernels to ensure execution in a shared context on a GPU.
Type: Grant
Filed: February 24, 2020
Date of Patent: August 3, 2021
Assignee: NVIDIA Corporation
Inventors: Kyrylo Perelygin, Cory Perry, Ze Long
-
Patent number: 11074717
Abstract: An object detection neural network receives an input image including an object and generates belief maps for vertices of a bounding volume that encloses the object. The belief maps are used, along with three-dimensional (3D) coordinates defining the bounding volume, to compute the pose of the object in 3D space during post-processing. When multiple objects are present in the image, the object detection neural network may also generate vector fields for the vertices. A vector field comprises vectors pointing from the vertex to a centroid of the object enclosed by the bounding volume defined by the vertex. The object detection neural network may be trained using images of computer-generated objects rendered in 3D scenes (e.g., photorealistic synthetic data). Automatically labelled training datasets may be easily constructed using the photorealistic synthetic data. The object detection neural network may be trained for object detection using only the photorealistic synthetic data.
Type: Grant
Filed: May 7, 2019
Date of Patent: July 27, 2021
Assignee: NVIDIA Corporation
Inventors: Jonathan Tremblay, Thang Hong To, Stanley Thomas Birchfield
-
Parallel pipelines for computing backlight illumination fields in high dynamic range display devices
Patent number: 11074871
Abstract: A display controller generates a backlight illumination field (BLIF) based on a coarse point-spread function (PSF) and a correction PSF. The display controller samples the coarse PSF to accumulate light contributions from a larger neighborhood of LEDs around a given LCD pixel. The display controller samples the correction PSF to generate correction factors for a smaller neighborhood of LEDs around the given LCD pixel. The display controller interpolates samples drawn from the coarse PSF and samples drawn from the correction PSF and then combines the interpolated samples to generate a full resolution BLIF.
Type: Grant
Filed: March 24, 2020
Date of Patent: July 27, 2021
Assignee: NVIDIA Corporation
Inventor: Jens Roever
-
Patent number: 11069023
Abstract: A technique selectively avoids memory fetches for partially uniform textures in real time graphics shader programs and instead uses program paths specialized for one or more frequently occurring values. One aspect avoids memory lookups and dependent computations for partially uniform textures through use of pre-constructed coarse-grained representations called value locality maps or dirty tilemaps (DTMs). The decision to use a specialized fast path or not is made dynamically by consulting such coarse-grained dirty tilemap representations. Thread-sharing value reuse can be implemented with or instead of the DTM mechanism.
Type: Grant
Filed: May 24, 2019
Date of Patent: July 20, 2021
Assignee: NVIDIA Corporation
Inventor: Ram Rangan
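A minimal sketch of the coarse-grained lookup idea follows. It assumes a texture stored as a 2D list, a single frequently occurring value, and square tiles. `build_dirty_tilemap`, `sample`, and the `stats` counters are hypothetical names for illustration and do not reflect the patent's actual data layout or shader code.

```python
def build_dirty_tilemap(texture, tile, common_value):
    """Return a coarse grid: True where a tile holds any texel != common_value."""
    rows, cols = len(texture), len(texture[0])
    tilemap = []
    for ty in range(0, rows, tile):
        row_flags = []
        for tx in range(0, cols, tile):
            dirty = any(
                texture[y][x] != common_value
                for y in range(ty, min(ty + tile, rows))
                for x in range(tx, min(tx + tile, cols))
            )
            row_flags.append(dirty)
        tilemap.append(row_flags)
    return tilemap


def sample(texture, tilemap, tile, common_value, x, y, stats):
    """Fetch a texel, skipping the texture read when its tile is clean."""
    if not tilemap[y // tile][x // tile]:
        stats["fast"] += 1   # specialized path: the value is known in advance
        return common_value
    stats["fetch"] += 1      # general path: read the texture
    return texture[y][x]


texture = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 5, 0],
    [0, 0, 0, 7],
]
tilemap = build_dirty_tilemap(texture, tile=2, common_value=0)
stats = {"fast": 0, "fetch": 0}
values = [sample(texture, tilemap, 2, 0, x, y, stats)
          for y in range(4) for x in range(4)]
```

Only the one dirty tile triggers texture reads here. The other twelve lookups take the fast path and still return the correct value.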
-
Patent number: 11067806
Abstract: An augmented reality display system includes a first beam path for a foveal inset image on a holographic optical element, a second beam path for a peripheral display image on the holographic optical element, and pupil position tracking logic that generates control signals to set a position of the foveal inset as perceived through the holographic optical element, to determine the peripheral display image, and to control a moveable stage.
Type: Grant
Filed: May 31, 2019
Date of Patent: July 20, 2021
Assignee: NVIDIA Corp.
Inventors: Jonghyun Kim, Youngmo Jeong, Michael Stengel, Morgan McGuire, David Luebke
-
Simulating a cable driven system by simulating the effect of cable portions on objects of the system
Patent number: 11068626
Abstract: A cable driving a large system such as cable driven machines, cable cars or tendons in a human or robot is typically modeled as a large number of small segments that are connected via joints. The two main difficulties with this approach are satisfying the inextensibility constraint and handling the typically large mass ratio between the small segments and the larger objects they connect. This disclosure introduces a more effective approach to solving these problems. The introduced approach simulates the effect of a cable instead of the cable itself using a new type of distance constraint called ‘cable joint’ that changes both its attachment points and its rest length dynamically. The introduced approach models a cable connecting a series of objects as a sequence of cable joints, reducing the complexity of the simulation from the order of the number of segments in the cable to the number of connected objects.
Type: Grant
Filed: October 4, 2018
Date of Patent: July 20, 2021
Assignee: Nvidia Corporation
Inventors: Matthias Mueller-Fischer, Stefan Jeschke, Miles Macklin, Nuttapong Chentanez
-
Patent number: 11070205
Abstract: When a signal glitches, logic receiving the signal may change in response, thereby charging and/or discharging nodes within the logic and dissipating power. Providing a glitch-free signal may reduce the number of times the nodes are charged and/or discharged, thereby reducing the power dissipation. A technique for eliminating glitches in a signal is to insert a storage element that samples the signal after it is done changing to produce a glitch-free output signal. The storage element is enabled by a “ready” signal having a delay that matches the delay of circuitry generating the signal. The technique prevents the output signal from changing until the final value of the signal is achieved. The output signal changes only once, typically reducing the number of times nodes in the logic receiving the signal are charged and/or discharged so that power dissipation is also reduced.
Type: Grant
Filed: July 2, 2020
Date of Patent: July 20, 2021
Assignee: NVIDIA Corporation
Inventor: William James Dally
-
Patent number: 11069129
Abstract: In various examples, shader bindings may be recorded in a shader binding table that includes shader records. Geometry of a 3D scene may be instantiated using object instances, and each may be associated with a respective set of the shader records using a location identifier of the set of shader records in memory. The set of shader records may represent shader bindings for an object instance under various predefined conditions. One or more of these predefined conditions may be implicit in the way the shader records are arranged in memory (e.g., indexed by ray type, by sub-geometry, etc.). For example, a section selector value (e.g., a section index) may be computed to locate and select a shader record based at least in part on a result of a ray tracing query (e.g., what sub-geometry was hit, what ray type was traced, etc.).
Type: Grant
Filed: April 5, 2019
Date of Patent: July 20, 2021
Assignee: NVIDIA Corporation
Inventors: Martin Stich, Ignacio Llamas, Steven Parker
-
Patent number: 11069095
Abstract: A sample mask is used to control which samples are used in a filtering operation such as bilinear filtering. A conventional filtering operation reads a set of samples based on a single coordinate and combines the samples to produce a filtered sample value. Such filtering operations are performed conventionally using fixed function units designed specifically to perform such filtering operations. However, for some applications, excluding one or more of the samples in producing a filtered sample value is desirable. In other applications, combining the samples using different weighting factors is also desirable. Techniques are disclosed herein for extending the capabilities of existing filtering units, for example, to exclude one or more samples in the filtering operation and for specifying different weighting rules for combining the samples.
Type: Grant
Filed: June 2, 2020
Date of Patent: July 20, 2021
Assignee: NVIDIA Corporation
Inventor: Evgenii Makarov
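To make the masked filtering idea concrete, here is a small software sketch of bilinear filtering over a 2x2 footprint with a per-sample mask. The function name `masked_bilinear` is hypothetical, and renormalizing the remaining weights is an assumption made for illustration. The abstract does not state which weighting rule the hardware applies.

```python
def masked_bilinear(samples, fx, fy, mask=(True, True, True, True)):
    """Bilinearly filter a 2x2 footprint, skipping samples whose mask bit is False.

    samples: (top_left, top_right, bottom_left, bottom_right)
    fx, fy:  fractional position inside the footprint, each in [0, 1]
    """
    weights = (
        (1 - fx) * (1 - fy),
        fx * (1 - fy),
        (1 - fx) * fy,
        fx * fy,
    )
    total = sum(w for w, m in zip(weights, mask) if m)
    if total == 0:
        raise ValueError("all contributing samples are masked out")
    # Renormalize so the remaining weights still sum to 1 (an assumed rule).
    return sum(s * w for s, w, m in zip(samples, weights, mask) if m) / total


full = masked_bilinear((10.0, 20.0, 30.0, 40.0), 0.5, 0.5)
partial = masked_bilinear((10.0, 20.0, 30.0, 40.0), 0.5, 0.5,
                          mask=(True, True, True, False))
```

With all four samples enabled the result is the ordinary bilinear average. Masking out the bottom-right sample yields the average of the remaining three.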
-
Patent number: 11068781
Abstract: A method, computer readable medium, and system are disclosed for implementing a temporal ensembling model for training a deep neural network. The method for training the deep neural network includes the steps of receiving a set of training data for a deep neural network and training the deep neural network utilizing the set of training data by: analyzing the plurality of input vectors by the deep neural network to generate a plurality of prediction vectors, and, for each prediction vector in the plurality of prediction vectors corresponding to the particular input vector, computing a loss term associated with the particular input vector by combining a supervised component and an unsupervised component according to a weighting function and updating the target prediction vector associated with the particular input vector.
Type: Grant
Filed: September 29, 2017
Date of Patent: July 20, 2021
Assignee: NVIDIA Corporation
Inventors: Samuli Matias Laine, Timo Oskari Aila
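The loss combination and target update mentioned in this abstract can be sketched for a single input vector as follows. The Gaussian ramp-up schedule, the cross-entropy and squared-error components, the moving-average constant `alpha`, and all function names are assumptions chosen for illustration. The abstract itself only states that a supervised and an unsupervised component are combined by a weighting function and that the target prediction vector is updated.

```python
import math


def ramp_up_weight(epoch, ramp_epochs=80, w_max=1.0):
    """Assumed weighting function: Gaussian ramp-up of the unsupervised weight."""
    if epoch >= ramp_epochs:
        return w_max
    phase = 1.0 - epoch / ramp_epochs
    return w_max * math.exp(-5.0 * phase * phase)


def combined_loss(prediction, target, label, weight):
    """Supervised cross-entropy (if labeled) plus weighted squared error to the target."""
    supervised = -math.log(prediction[label]) if label is not None else 0.0
    unsupervised = sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(prediction)
    return supervised + weight * unsupervised


def update_target(ensemble, prediction, epoch, alpha=0.6):
    """Accumulate predictions with a moving average, then correct its startup bias."""
    ensemble = [alpha * e + (1 - alpha) * p for e, p in zip(ensemble, prediction)]
    correction = 1 - alpha ** (epoch + 1)
    target = [e / correction for e in ensemble]
    return ensemble, target


prediction = [0.7, 0.2, 0.1]   # network output for one labeled input
ensemble = [0.0, 0.0, 0.0]     # running accumulation of past predictions
target = [0.0, 0.0, 0.0]       # target prediction vector before any update

loss = combined_loss(prediction, target, label=0, weight=ramp_up_weight(0))
ensemble, target = update_target(ensemble, prediction, epoch=0)
```

After the first update the bias-corrected target equals the first prediction, so the unsupervised term only starts to matter once predictions change between epochs.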
-
Patent number: 11061571
Abstract: In various embodiments, a memory interface unit organizes data within a memory tile to facilitate efficient memory accesses. In an embodiment, a memory tile represents a portion of memory that holds multiple chunks of data, where each chunk is stored either in a non-compressed or in a smaller compressed data format. In an embodiment, the tile is organized to pack multiple compressed chunks together so that multiple compressed chunks can be retrieved from memory with a single read access. In another embodiment, the tile is organized to store redundant copies of compressed chunks so that a compressed chunk can be quickly decompressed within a tile without having to relocate other compressed chunks in the tile. Additional embodiments are further disclosed for allowing efficient accesses to both compressed and non-compressed data.
Type: Grant
Filed: March 19, 2020
Date of Patent: July 13, 2021
Assignee: NVIDIA Corporation
Inventors: Praveen Krishnamurthy, Wishwesh Anil Gandhi
-
Patent number: 11063629
Abstract: Various embodiments include techniques for detecting a poor-quality cable associated with a wired communications channel that is causing noise that interferes with a wireless communications channel. The techniques are directed towards a test that a processor performs when the cable is installed in a user system. A wireless test application, executing on one or more processors of the system, determines a noise floor for the wireless communications channel when the wired communications channel is disabled. The wireless test application determines the noise power for the wireless communications channel when the wired communications channel is enabled, thereby causing interference in the wireless communications channel. The wireless test application compares the noise power to the noise floor in order to determine whether the cable is a high-quality cable or a low-quality cable.
Type: Grant
Filed: October 14, 2020
Date of Patent: July 13, 2021
Assignee: NVIDIA Corporation
Inventors: Srirama Bhupatiraju, Tom Winton
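The comparison step in this abstract might look roughly like the sketch below. It assumes noise readings are available as power samples in milliwatts and that a fixed decibel threshold separates the two verdicts. The 3 dB threshold and the names `mean_power_dbm` and `classify_cable` are hypothetical, since the abstract does not specify how the comparison is made.

```python
import math


def mean_power_dbm(samples_mw):
    """Average a list of noise power readings (milliwatts) and convert to dBm."""
    mean_mw = sum(samples_mw) / len(samples_mw)
    return 10.0 * math.log10(mean_mw)


def classify_cable(floor_samples_mw, active_samples_mw, max_rise_db=3.0):
    """Compare noise with the wired link enabled against the noise floor."""
    noise_floor = mean_power_dbm(floor_samples_mw)    # wired channel disabled
    noise_power = mean_power_dbm(active_samples_mw)   # wired channel enabled
    rise_db = noise_power - noise_floor
    verdict = "low-quality" if rise_db > max_rise_db else "high-quality"
    return verdict, rise_db


verdict, rise = classify_cable([1e-9, 1e-9], [1e-8, 1e-8])
```

In this example the noise rises by about 10 dB once the wired channel is enabled, which exceeds the assumed threshold and flags the cable as low quality.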
-
Patent number: 11064203
Abstract: Real-time, hardware-implementable Structured Similarity (SSIM)-based rate distortion optimization (RDO) techniques for video transmission are described. The disclosed techniques provide efficient application of SSIM as a distortion metric in selecting prediction modes for encoding video for transmission. A prediction mode, at a high level, specifies which previously encoded group of pixels can be utilized to encode a subsequent block of pixels in a video frame. A less compute intensive distortion metric is first used to select a subset of candidate prediction modes. Then a more compute intensive SSIM-based selection is made on the subset. By utilizing the disclosed techniques during video encoding, tradeoffs between distortion and transmission rate can be made that are more relevant to human perception.
Type: Grant
Filed: March 12, 2018
Date of Patent: July 13, 2021
Assignee: NVIDIA CORPORATION
Inventors: Megamus Zhang, Jant Chen, Steven Feng, Shining Yi
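The two-stage selection described in this abstract can be sketched as follows. The sketch assumes sum of absolute differences (SAD) as the cheaper first-stage metric and a single-window SSIM over the whole block as the second stage. The rate term of rate distortion optimization is omitted, and the function names, mode names, and `keep` parameter are hypothetical.

```python
C1 = (0.01 * 255) ** 2  # standard SSIM stabilizing constants for 8-bit data
C2 = (0.03 * 255) ** 2


def sad(block, prediction):
    """Cheap distortion metric: sum of absolute differences."""
    return sum(abs(b - p) for b, p in zip(block, prediction))


def ssim(block, prediction):
    """Single-window SSIM using population statistics over the whole block."""
    n = len(block)
    mu_x = sum(block) / n
    mu_y = sum(prediction) / n
    var_x = sum((b - mu_x) ** 2 for b in block) / n
    var_y = sum((p - mu_y) ** 2 for p in prediction) / n
    cov = sum((b - mu_x) * (p - mu_y) for b, p in zip(block, prediction)) / n
    return ((2 * mu_x * mu_y + C1) * (2 * cov + C2)) / (
        (mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2)
    )


def select_mode(block, predictions, keep=2):
    """Stage 1: shortlist by SAD. Stage 2: pick the shortlisted mode with highest SSIM."""
    shortlist = sorted(predictions, key=lambda mode: sad(block, predictions[mode]))[:keep]
    return max(shortlist, key=lambda mode: ssim(block, predictions[mode]))


block = [10, 20, 30, 40]
predictions = {
    "dc": [25, 25, 25, 25],
    "horizontal": [10, 20, 30, 40],
    "vertical": [40, 30, 20, 10],
}
best = select_mode(block, predictions)
```

The more expensive SSIM computation runs only on the `keep` candidates that survive the SAD pass, which is the cost saving the abstract points to.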
-
Patent number: 11061741
Abstract: Techniques are disclosed for reducing the latency associated with performing data reductions in a multithreaded processor. In response to a single instruction associated with a set of threads executing in the multithreaded processor, a warp reduction unit acquires register values stored in source registers, where each register value is associated with a different thread included in the set of threads. The warp reduction unit performs operation(s) on the register values to compute an aggregate value. The warp reduction unit stores the aggregate value in a destination register that is accessible to at least one of the threads in the set of threads. Because the data reduction is performed via a single instruction using hardware specialized for data reductions, the number of cycles required to perform the data reduction is decreased relative to prior-art techniques that are performed via multiple instructions using hardware that is not specialized for data reductions.
Type: Grant
Filed: July 16, 2019
Date of Patent: July 13, 2021
Assignee: NVIDIA Corporation
Inventors: Peter Nelson, Olivier Giroux, Ajay Sudarshan Tirumala
-
Patent number: 11062471
Abstract: Stereo matching generates a disparity map indicating pixel offsets between matched points in a stereo image pair. A neural network may be used to generate disparity maps in real time by matching image features in stereo images using only 2D convolutions. The proposed method is faster than 3D convolution-based methods, with only a slight accuracy loss and higher generalization capability. A 3D efficient cost aggregation volume is generated by combining cost maps for each disparity level. Different disparity levels correspond to different amounts of shift between pixels in the left and right image pair. In general, each disparity level is inversely proportional to a different distance from the viewpoint.
Type: Grant
Filed: May 6, 2020
Date of Patent: July 13, 2021
Assignee: NVIDIA Corporation
Inventors: Yiran Zhong, Wonmin Byeon, Charles Loop, Stanley Thomas Birchfield
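The inverse relationship between disparity and distance noted at the end of this abstract follows from the standard rectified-stereo relation depth = focal length × baseline / disparity. The short sketch below illustrates it. The focal length and baseline values are made-up example numbers, and `disparity_to_depth` is a hypothetical helper, not part of the patented network.

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Convert a disparity (pixels) to depth (meters) for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to yield a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Assumed camera parameters, for illustration only.
depths = [disparity_to_depth(d, focal_length_px=700.0, baseline_m=0.12)
          for d in (8, 16, 32)]
```

Each doubling of the disparity halves the computed depth, so large disparity levels correspond to points close to the cameras.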