Abstract: Display data used in display frame generation are compressed for efficient storage in a local memory within a graphics processing unit. The compression technique is difference encoding; before it is applied, display data in RGB format are converted into YCbCr format. Since the component values of adjacent pixels in YCbCr format typically vary less than the component values of the same adjacent pixels in RGB format, converting the display data to YCbCr format before performing difference encoding improves the compression efficiency.
Type:
Grant
Filed:
December 13, 2006
Date of Patent:
June 19, 2018
Assignee:
NVIDIA CORPORATION
Inventors:
Sreenivas Krishnan, Koen Bennebroek, Karthik Bhat, Stefano A. Pescador, David G. Reed, Brad W. Simeral, Edward M. Veeser
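The idea in the abstract above can be illustrated with a minimal sketch. It assumes the full-range BT.601 (JPEG-style) conversion coefficients and integer rounding, neither of which the abstract specifies. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (assumed: full-range BT.601, rounded)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return tuple(round(v) for v in (y, cb, cr))


def delta_encode(values):
    """Difference encoding: keep the first value, then each value minus its predecessor."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out


# A gray ramp: every RGB channel changes between neighbours,
# but after conversion only Y changes while Cb and Cr stay constant.
row = [(100, 100, 100), (110, 110, 110), (120, 120, 120)]
ycbcr = [rgb_to_ycbcr(*p) for p in row]
cb_deltas = delta_encode([p[1] for p in ycbcr])  # [128, 0, 0]
```

In this example two of the three YCbCr channels produce all-zero differences, which is the kind of redundancy a difference encoder can store compactly.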
Abstract: A transmitter for a serial communications link, a serial communications link and an electronic system are disclosed herein. In one embodiment, the transmitter includes: (1) a communications interface connected to a transmission medium and (2) a safe mode circuit coupled to the communications interface and configured to send data over the transmission medium in a safe mode.
Type:
Grant
Filed:
November 5, 2015
Date of Patent:
June 19, 2018
Assignee:
Nvidia Corporation
Inventors:
Dennis Ma, Marvin Denman, Eric Tyson, Stephen D. Glaser
Abstract: Embodiments related to selecting a runahead poison policy from a plurality of runahead poison policies during microprocessor operation are provided. The example method includes causing the microprocessor to enter runahead upon detection of a runahead event and implementing a first runahead poison policy selected from a plurality of runahead poison policies operative to manage runahead poison injection during runahead. The example method also includes, during microprocessor operation, selecting a second runahead poison policy operative to manage runahead poison injection differently from the first runahead poison policy.
Type:
Grant
Filed:
October 26, 2012
Date of Patent:
June 19, 2018
Assignee:
NVIDIA CORPORATION
Inventors:
Magnus Ekman, James van Zoeren, Paul Serris
Abstract: A first thread is placed into a blocked state by causing the thread to perform a blocking pop operation on a hardware-accelerated, single-entry queue. When a synchronization event completes, a second thread may release the first thread from the blocked state by pushing a data value onto the hardware-accelerated, single-entry queue. The push operation satisfies the blocking pop operation, and the first thread is released.
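The blocking-pop pattern described above can be sketched in software. The abstract describes a hardware-accelerated queue; this example only assumes a software stand-in, Python's `queue.Queue` with `maxsize=1`. The names `run_demo`, `waiter`, and `signaller` are hypothetical.

```python
import queue
import threading


def run_demo():
    """Block one thread on a single-entry queue and release it from another."""
    mailbox = queue.Queue(maxsize=1)  # software stand-in for the single-entry queue
    received = []

    def waiter():
        # Blocking pop: the thread sleeps here until a value is pushed.
        received.append(mailbox.get())

    def signaller():
        # Push a data value once the (simulated) synchronization event completes.
        mailbox.put(42)

    first = threading.Thread(target=waiter)
    second = threading.Thread(target=signaller)
    first.start()
    second.start()
    first.join()
    second.join()
    return received
```

Calling `run_demo()` returns the list of values the first thread received after being released.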
Abstract: A system and method for procedurally synthesizing a training dataset for training a machine-learning model. In one embodiment, the system includes: (1) a training designer configured to describe variations in content of training images to be included in the training dataset and (2) an image definer coupled to the training designer, configured to generate training image definitions in accordance with the variations and transmit the training image definitions to a 3D graphics engine for rendering into corresponding training images, and further to a ground truth generator for generating associated ground truth corresponding to the training images, the training images and the associated ground truth comprising the training dataset.
Abstract: A transmitter is configured to scale up a low bandwidth delivered by a first processing element to match a higher bandwidth associated with an interconnect. A receiver is configured to scale down the high bandwidth delivered by the interconnect to match the lower bandwidth associated with a second processing element. The first processing element and the second processing element may thus communicate with one another across the interconnect via the transmitter and the receiver, respectively, despite the bandwidth mismatch between those processing elements and the interconnect.
Type:
Grant
Filed:
September 19, 2013
Date of Patent:
June 12, 2018
Assignee:
NVIDIA Corporation
Inventors:
Marvin A. Denman, Dennis K. Ma, Stephen David Glaser
Abstract: Detecting a tool used on a touch screen. In accordance with a method embodiment of the present invention, a cell value is accessed for each cell of a touch sensing device. The cell value indicates a force applied to the cell. A touch area sample count is determined as a count of how many of the cells have a cell value above a noise floor. A touch area weight is determined as a sum of all cell values for the cells having a cell value above the noise floor. An object touching the touch sensing device is identified based on the touch area sample count and the touch area weight. The object's touch indication may be rejected if the object is not identified. The identity of the object may be reported to a software application.
Type:
Grant
Filed:
May 5, 2016
Date of Patent:
June 12, 2018
Assignee:
Nvidia Corporation
Inventors:
Ilkka Varje, Kirill Artamonov, Aaron Bartholomew
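The two quantities named in the abstract above, touch area sample count and touch area weight, can be computed as in the following minimal sketch. The matching step is an assumption: the abstract does not say how the object is identified from these values, so the range-based `profiles` table and the function names are hypothetical.

```python
def touch_area_stats(cells, noise_floor):
    """Return (sample count, weight) over cells whose value exceeds the noise floor."""
    active = [v for row in cells for v in row if v > noise_floor]
    return len(active), sum(active)


def identify_object(count, weight, profiles):
    """Return the first profile whose count and weight ranges both match, else None."""
    for name, ((c_lo, c_hi), (w_lo, w_hi)) in profiles.items():
        if c_lo <= count <= c_hi and w_lo <= weight <= w_hi:
            return name
    return None  # unidentified: the caller may reject the touch


# Hypothetical force readings and object profiles, for illustration only.
cells = [[0, 1, 0],
         [2, 9, 3],
         [0, 4, 0]]
profiles = {
    "stylus": ((1, 4), (5, 30)),
    "finger": ((5, 20), (31, 500)),
}
count, weight = touch_area_stats(cells, noise_floor=1)
tool = identify_object(count, weight, profiles)
```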
Abstract: A system for multi-client control of a common avatar is provided herein. The system includes, for example, a cloud game engine and a cooperative play engine associated with the cloud game engine and configured to multicast a video stream from the cloud game engine to multiple players, combine separate response streams from the multiple players into a joint response stream based on avatar functions contained therein and provide the joint response stream to the cloud game engine.
Type:
Grant
Filed:
February 15, 2016
Date of Patent:
June 5, 2018
Assignee:
Nvidia Corporation
Inventors:
Jen-Hsun Huang, Spencer Huang, Madison Huang, David Cook
Abstract: A system and method are provided for improving video encoding using content information. A three-dimensional (3D) modeling system produces an encoded video stream. The system includes a content engine, a renderer, and a video encoder. The renderer receives 3D model information from the content engine and produces corresponding two-dimensional (2D) images. The video encoder receives the 2D images and produces a corresponding encoded video stream. The video encoder receives content information from the content engine, transforms the content information into encoder control information, and controls the video encoder using the encoder control information.
Type:
Grant
Filed:
October 1, 2012
Date of Patent:
May 29, 2018
Assignee:
Nvidia Corporation
Inventors:
Hassane S. Azar, Stefan Eckart, Dawid Pajak, Bryan Dudash, Swagat Mohapatra
Abstract: Raw video data is captured, processed, and then stored within a set of buffers. An encoder engine is configured to encode the video data for storage. A feedback controller dynamically adjusts the clock frequency of the encoder engine based on the number of buffers currently occupied by the video data. The feedback controller is tuned so that the clock frequency of the encoder engine will be increased when the number of buffers occupied by video data increases, and the clock frequency of the encoder engine will be decreased when the number of buffers occupied by the video data decreases.
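One simple way to obtain the behaviour described above is a proportional mapping from buffer occupancy to clock frequency. This is a sketch under stated assumptions: the abstract does not specify the control law, and the frequency limits and the function name `encoder_clock_mhz` are hypothetical.

```python
def encoder_clock_mhz(occupied, total, f_min=200.0, f_max=800.0):
    """Map buffer occupancy to an encoder clock frequency (assumed proportional law).

    More occupied buffers give a higher clock; fewer give a lower clock.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    fill = min(max(occupied / total, 0.0), 1.0)  # clamp occupancy ratio to [0, 1]
    return f_min + (f_max - f_min) * fill


# Frequency rises monotonically as more of the 8 buffers fill up.
schedule = [encoder_clock_mhz(n, 8) for n in range(9)]
```

A real controller would likely add hysteresis or filtering to avoid oscillating between frequencies; that is omitted here for brevity.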
Abstract: Presented systems and methods can facilitate efficient voltage sensing and regulation. In one embodiment, a presented multiple point voltage sensing system includes multiple point voltage sensing. Multi-point sensing is the scheme where voltage feedback from the silicon to the voltage regulator is an average from multiple points on the die. In one embodiment, multi-point sensing is done by placing multiple sense points across the partition/silicon and merging the sense traces from each sense point with balanced routing. In one embodiment, a presented multiple point voltage sensing system includes Virtual VDD Sensing with guaranteed non-floating feedback. In one exemplary implementation, Virtual VDD Sensing with guaranteed non-floating feedback allows more accurate sensing when a component is power gated off by removing the sensing results associated with the component.
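The averaging described above is done in hardware by merging sense traces; the following is only a numerical illustration of the resulting feedback value. It assumes each sense point reports a voltage and a flag for whether its component is power gated; the function name and data layout are hypothetical.

```python
def feedback_voltage(sense_points):
    """Average the voltages of sense points whose component is powered.

    sense_points: list of (voltage, powered) pairs. Readings from power-gated
    components are dropped so they do not skew the average.
    """
    live = [v for v, powered in sense_points if powered]
    if not live:
        raise ValueError("no powered sense points; feedback would be floating")
    return sum(live) / len(live)


# Hypothetical readings: the third component is power gated and reads 0 V.
points = [(1.0, True), (0.5, True), (0.0, False)]
v_feedback = feedback_voltage(points)
```

Without the exclusion step, the gated reading would drag the average down and cause the regulator to overcompensate.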
Abstract: A method, system, and computer program product for controlling a sample mask from a fragment shader are disclosed. The method includes the steps of generating a fragment for each pixel that is covered, at least in part, by a primitive and determining coverage information for each fragment corresponding to the primitive. Then, for each fragment, the method includes the steps of generating a sample mask by a fragment shader, replacing the coverage information for the fragment with the sample mask, and writing, based on the sample mask, a result generated by the fragment shader to a memory. The method may be implemented on a parallel processing unit configured to implement, at least in part, a graphics processing pipeline.
Type:
Grant
Filed:
July 27, 2015
Date of Patent:
May 22, 2018
Assignee:
NVIDIA Corporation
Inventors:
Jeffrey Alan Bolz, Eric B. Lum, Rui Manuel Bastos
Abstract: A terminal for communication with a communication network and a method of configuring a subscriber identity device are disclosed. In one embodiment, the terminal includes computer storage configured to store a subscriber identity application, a processing unit operable to provide access to the communication network by executing an instance of the subscriber identity application, and a toolkit file assigning a modem of the terminal to handle at least one communication procedure for effecting said communication with the communication network, wherein the transferred terminal profile information assigns a host processor of the terminal to handle at least one communication procedure for effecting said communication with the communication network.
Abstract: A software development environment (SDE) and a method of compiling integrated source code. One embodiment of the SDE includes: (1) a parser configured to partition an integrated source code into a host code partition and a device code partition, the host code partition including a reference to a device variable, (2) a translator configured to: (2a) embed device machine code, compiled based on the device code partition, into a modified host code, (2b) define a pointer in the modified host code configured to be initialized, upon execution of the integrated source code, to a memory address allocated to the device variable, and (2c) replace the reference with a dereference to the pointer, and (3) a host compiler configured to employ a host library to compile the modified host code.
Type:
Grant
Filed:
November 20, 2013
Date of Patent:
May 15, 2018
Assignee:
Nvidia Corporation
Inventors:
Stephen Jones, Mark Hairgrove, Jaydeep Marathe, Vivek Kini, Bastiaan Aarts
Abstract: In one embodiment of the present invention, a graphics processing unit (GPU) is configured to detect an object in an image using a random forest classifier that includes multiple, identically structured decision trees. Notably, the application of each of the decision trees is independent of the application of the other decision trees. In operation, the GPU partitions the image into subsets of pixels, and associates an execution thread with each of the pixels in the subset of pixels. The GPU then causes each of the execution threads to apply the random forest classifier to the associated pixel, thereby determining a likelihood that the pixel corresponds to the object. Advantageously, such a distributed approach to object detection more fully leverages the parallel architecture of the parallel processing unit (PPU) than conventional approaches. In particular, the PPU performs object detection more efficiently using the random forest classifier than using a cascaded classifier.
Type:
Grant
Filed:
September 17, 2013
Date of Patent:
May 15, 2018
Assignee:
NVIDIA Corporation
Inventors:
Mateusz Jerzy Baranowski, Shalini Gupta, Elif Albuz
Abstract: A method, computer readable medium, and system are disclosed for decoupling data pre-fetch from demand loads. The method includes the steps of receiving, by a processor, a set of instructions that includes a load instruction; and executing, by the processor, the load instruction to perform a load operation. The load operation loads data from a cache unit into a register file. The load instruction includes a no-update operator that prevents the cache unit from updating the cache state information in response to the load operation. The result is that the eviction policy for the cache unit responds to the order of pre-fetch memory access requests rather than the demand load operations.
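The effect of a no-update load can be sketched with a small software model. This assumes a least-recently-used eviction policy, which the abstract does not name, and models only cache hits. The class `LRUCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache model: lines are ordered from least to most recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def prefetch(self, addr, value):
        """Bring a line in (or refresh it), marking it most recently used."""
        self.lines[addr] = value
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line

    def load(self, addr, update=True):
        """Demand load of a resident line; update=False models the no-update operator."""
        value = self.lines[addr]
        if update:
            self.lines.move_to_end(addr)
        return value


cache = LRUCache(capacity=2)
cache.prefetch(0, "a")
cache.prefetch(1, "b")
cache.load(0, update=False)   # demand load leaves the recency order untouched
cache.prefetch(2, "c")        # evicts line 0, the oldest prefetch
resident = list(cache.lines)  # [1, 2]
```

With `update=True` the same sequence would evict line 1 instead, because the demand load would have promoted line 0.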
Abstract: A system and method for constructing binary radix trees in parallel, which are used as building blocks for constructing secondary trees. A non-transitory computer-readable storage medium having computer-executable instructions for causing a computer system to perform a method is disclosed. The method includes determining a plurality of primitives comprising a total number of primitive nodes that are indexed, wherein the plurality of primitives correspond to leaf nodes of a hierarchical tree. The method includes sorting the plurality of primitives. The method includes building the hierarchical tree in a manner requiring at most a linear amount of temporary storage with respect to the total number of primitive nodes. The method includes building an internal node of the hierarchical tree in parallel with one or more of its ancestor nodes.
Abstract: A shut-off circuit interrupts the flow of power to the system circuit of a portable device when liquids are detected within the portable device. Liquid sensors are placed proximate to the ports of the portable device. The ports may admit the flow of liquids, so the liquid sensors may detect the passage of liquids into the portable device. If the liquid sensors detect liquids entering the portable device, a shut-off circuit interrupts the flow of power from the battery to the system circuit.
Abstract: One embodiment of the present invention sets forth a technique for error-checking a compute task. The technique involves receiving a pointer to a compute task, storing the pointer in a scheduling queue, determining that the compute task should be executed, retrieving the pointer from the scheduling queue, determining via an error-check procedure that the compute task is eligible for execution, and executing the compute task.
Type:
Grant
Filed:
December 9, 2011
Date of Patent:
May 8, 2018
Assignee:
NVIDIA Corporation
Inventors:
Jerome F. Duluk, Jr., Timothy John Purcell, Jesse David Hall, Philip Alexander Cuadra
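The sequence of steps in the abstract above (queue a pointer, retrieve it, error-check, execute) can be sketched as follows. Python object references stand in for pointers, and the specific checks in `passes_error_check` are assumptions, since the abstract does not say what the error-check procedure tests. All names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ComputeTask:
    kernel: Optional[Callable[[], int]]  # work to run; None models a corrupt task
    grid_size: int                       # hypothetical launch parameter


def passes_error_check(task):
    """Assumed checks: the task has a callable kernel and a positive grid size."""
    return callable(task.kernel) and task.grid_size > 0


def run_scheduled(scheduling_queue):
    """Retrieve each queued task reference and execute only those that pass the check."""
    results = []
    while scheduling_queue:
        task = scheduling_queue.popleft()
        if passes_error_check(task):
            results.append(task.kernel())
    return results


# Only the first task is eligible; the other two fail the error check.
pending = deque([
    ComputeTask(kernel=lambda: 7, grid_size=4),
    ComputeTask(kernel=None, grid_size=4),
    ComputeTask(kernel=lambda: 9, grid_size=0),
])
executed = run_scheduled(pending)
```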
Abstract: A cellular communication system is described for supporting broadcast transmission in at least one of a plurality of communication cells. The cellular communication system comprises at least one base station (210) capable of broadcasting content to at least one wireless communication unit (226) via at least one relay node (RN) (224), wherein the at least one base station (210) is arranged to supplement the broadcast transmission with at least one augmented unicast transmission associated with the broadcast content.