Abstract: One embodiment of the present invention sets forth a technique for addressing data in a hierarchical graphics processing unit cluster. A hierarchical address is constructed based on the location of a storage circuit where a target unit of data resides. The hierarchical address comprises a level field indicating a hierarchical level for the unit of data and a node identifier that indicates which GPU within the GPU cluster currently stores the unit of data. The hierarchical address may further comprise one or more identifiers that indicate which storage circuit in a particular hierarchical level currently stores the unit of data. The hierarchical address is constructed and interpreted based on the level field. The technique advantageously enables programs executing within the GPU cluster to efficiently access data residing in other GPUs using the hierarchical address.
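A minimal sketch of how an address carrying a level field and a node identifier might be packed into a single word. The field widths, the `offset` field, and the names `HierAddress`, `pack`, and `unpack` are assumptions made for illustration; the abstract does not specify a layout, and it describes the interpretation as depending on the level field, whereas this sketch uses one fixed layout for simplicity.

```python
from typing import NamedTuple

# Hypothetical field widths; the abstract does not define any bit layout.
LEVEL_BITS = 2
NODE_BITS = 6
OFFSET_BITS = 24


class HierAddress(NamedTuple):
    level: int   # hierarchical level of the storage circuit holding the data
    node: int    # which GPU in the cluster currently stores the data
    offset: int  # assumed: location within the storage at that level


def pack(addr: HierAddress) -> int:
    """Pack the fields into one integer: [level | node | offset]."""
    for value, bits in ((addr.level, LEVEL_BITS),
                        (addr.node, NODE_BITS),
                        (addr.offset, OFFSET_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field out of range")
    return ((addr.level << (NODE_BITS + OFFSET_BITS))
            | (addr.node << OFFSET_BITS)
            | addr.offset)


def unpack(word: int) -> HierAddress:
    """Recover the fields from a packed address word."""
    offset = word & ((1 << OFFSET_BITS) - 1)
    node = (word >> OFFSET_BITS) & ((1 << NODE_BITS) - 1)
    level = (word >> (NODE_BITS + OFFSET_BITS)) & ((1 << LEVEL_BITS) - 1)
    return HierAddress(level, node, offset)
```

A level-dependent layout, as the abstract suggests, could be modeled by decoding `level` first and then selecting the remaining field widths from a per-level table.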
Abstract: The invention discloses a semiconductor memory device and a method for word line decoding and routing. The invention relates generally to the field of semiconductor memory and addresses the problem that improving the quality of word line signals results in routing congestion. Embodiments of the invention provide the following solution: the memory array of the semiconductor memory device is divided into a plurality of smaller memory arrays, a first decoded row address is routed on a first metal layer, a second decoded row address is routed on a second metal layer below the first metal layer, and the output word line after decoding drives the plurality of smaller memory arrays. Embodiments of the invention are suitable for various semiconductor memory designs, including on-chip cache, translation look-aside buffer, content addressable memory, ROM, EEPROM, and SRAM.
Type:
Grant
Filed:
August 9, 2012
Date of Patent:
March 17, 2015
Assignee:
NVIDIA Corporation
Inventors:
Yongchang Huang, Jing Guo, Hua Chen, Jiping Ma
Abstract: A client computing device transmits commands and/or data to a software application executing on a server computing device. The server computing device includes one or more graphics processing units (GPUs) that render frames of graphic data associated with the software application. For each frame, the one or more GPUs copy the frame to memory. A server engine also executing on the server computing device divides the frame into subframes, compresses each subframe, and transmits compressed subframes to the client computing device. The client computing device decompresses and reassembles the frame for display to an end-user of the client computing device.
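The divide, compress, and reassemble steps can be sketched roughly as follows. This is an illustration only: it treats a frame as a flat byte string, uses fixed-size subframes, and uses `zlib` as a stand-in compressor, none of which is stated in the abstract. The function names are hypothetical.

```python
import zlib


def split_and_compress(frame: bytes, subframe_size: int) -> list[bytes]:
    """Server side: divide a frame into subframes and compress each one."""
    if subframe_size <= 0:
        raise ValueError("subframe_size must be positive")
    return [
        zlib.compress(frame[i:i + subframe_size])
        for i in range(0, len(frame), subframe_size)
    ]


def reassemble(subframes: list[bytes]) -> bytes:
    """Client side: decompress each subframe and rebuild the frame in order."""
    return b"".join(zlib.decompress(s) for s in subframes)
```

Compressing subframes independently means each one can be transmitted and decompressed without waiting for the rest of the frame, at the cost of a somewhat lower compression ratio than compressing the frame as a whole.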
Abstract: A method includes continuously capturing, through an application executing on a data processing device, images of a desktop of the data processing device as a background process as part of a testing session on the data processing device in an active mode thereof. The method also includes encoding, through a processor of the data processing device, the captured images of the desktop as a video sequence, and providing a capability to a user of the data processing device and/or another data processing device to detect a fault event related to the testing session based on access to the encoded video sequence.
Abstract: One embodiment of the present invention sets forth a technique for translating application programs written using a parallel programming model for execution on a multi-core graphics processing unit (GPU) so that they can be executed by a general purpose central processing unit (CPU). Portions of the application program that rely on specific features of the multi-core GPU are converted by a translator for execution by a general purpose CPU. The application program is partitioned into regions of synchronization independent instructions. The instructions are classified as convergent or divergent, and divergent memory references that are shared between regions are replicated. Thread loops are inserted to ensure correct sharing of memory between various threads during execution by the general purpose CPU.
Type:
Grant
Filed:
March 31, 2009
Date of Patent:
March 17, 2015
Assignee:
Nvidia Corporation
Inventors:
Vinod Grover, Bastiaan Joannes Matheus Aarts, Michael Murphy
Abstract: One embodiment of the present invention sets forth a technique for enabling the insertion of generated tasks into a scheduling pipeline of a multiple processor system. The technique allows a compute task that is being executed to dynamically generate a dynamic task and notify a scheduling unit of the multiple processor system without intervention by a CPU. A reflected notification signal is generated in response to a write request when data for the dynamic task is written to a queue. Additional reflected notification signals are generated for other events that occur during execution of a compute task, e.g., to invalidate cache entries storing data for the compute task and to enable scheduling of another compute task.
Type:
Grant
Filed:
December 16, 2011
Date of Patent:
March 17, 2015
Assignee:
Nvidia Corporation
Inventors:
Timothy John Purcell, Lacky V. Shah, Jerome F. Duluk, Jr., Sean J. Treichler, Karim M. Abdalla, Philip Alexander Cuadra, Brian Pharris
Abstract: A method includes implementing, through a processor communicatively coupled to a memory and/or a hardware block, a Bilateral Filter (BF) including a spatial filter component and a range filter component, and implementing the spatial filter component with a low-complexity function to allow for focus on the range filter component. The method also includes determining, through the processor, filter tap value(s) related to the range filter component as a function of radiometric distance between a pixel of a video frame and/or an image and other pixels thereof based on a pre-computed corpus of data related to execution of an application in accordance with a filtering requirement of the pixel by the application. Further, the method includes constraining, through the processor, the filter tap value(s) to a form i×base based on the BF implementation, where i is an integer and base is a floating point base.
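The constraint of a tap value to the form i×base can be illustrated with a short sketch. Rounding to the nearest multiple is an assumption made here; the abstract does not say how i is chosen, and the function name `constrain_tap` is hypothetical.

```python
def constrain_tap(tap: float, base: float) -> tuple[int, float]:
    """Return (i, i * base), the multiple of `base` nearest to `tap`.

    Assumption: nearest-multiple rounding; the source only states that
    the tap takes the form i x base with integer i.
    """
    if base <= 0:
        raise ValueError("base must be positive")
    i = round(tap / base)
    return i, i * base
```

With taps in this form, a filter can accumulate the integer factors and apply `base` once at the end, which is one plausible motivation for such a constraint.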
Abstract: A method, system, and computer-program product are provided for automatically performing stability testing on device firmware. The method includes the steps of copying a binary file corresponding to a version of a firmware to one or more nodes that each include a testbench, causing the one or more nodes to perform tests utilizing the version of the firmware, and determining whether a new build of the firmware is available. If the new build is available, then the steps include copying a second binary file corresponding to the new build to the one or more nodes and causing the one or more nodes to perform the tests utilizing the new build. However, if the new build is not available, then the steps include causing the one or more nodes to perform one or more further iterations of the tests utilizing the version of the firmware.
Abstract: A system, method, and computer program product are provided for using compression with programmable sample locations, where the compression is a function of the programmable sample locations. The method includes the steps of storing a first value specifying a programmed sample location within a pixel in a first sample pattern table that is associated with a first display surface and storing, in a memory, geometric surface parameters corresponding to a first attribute at the programmed sample location within a first pixel of the first display surface. A second value specifying the programmed sample location within the pixel in a second sample pattern table that is associated with a second display surface is also stored and the first attribute is reconstructed based on the geometric surface parameters and the first value.
Type:
Application
Filed:
September 11, 2013
Publication date:
March 12, 2015
Applicant:
NVIDIA Corporation
Inventors:
Eric B. Lum, Jeffrey Alan Bolz, Rui Manuel Bastos, Andrei Khodakovsky, Christian Johannes Amsinck, Bengt-Olaf Schneider
Abstract: Embodiments are disclosed relating to methods of ordering transactions across a bus of a computing device. One embodiment of a method includes determining a current target memory channel for an incoming transaction request, and passing the incoming transaction request downstream if the current target memory channel matches an outstanding target memory channel indicated by a direction bit of a counter or the counter equals zero. The method further includes holding the incoming transaction request if the counter is greater than zero and the current target memory channel does not match the outstanding target memory channel.
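The pass/hold rule described above can be sketched as a small gate. The pass and hold conditions follow the abstract; the bookkeeping (incrementing the counter and recording the channel when a request passes, decrementing on completion) is an assumption, as are the class and method names.

```python
from dataclasses import dataclass


@dataclass
class OrderingGate:
    counter: int = 0    # number of outstanding transactions
    direction: int = 0  # direction bit: channel of the outstanding transactions

    def try_pass(self, channel: int) -> bool:
        """Return True if the request may pass downstream, False to hold it."""
        if self.counter == 0 or channel == self.direction:
            # Assumed bookkeeping: a passed request becomes outstanding.
            self.direction = channel
            self.counter += 1
            return True
        # Counter is greater than zero and the channel does not match: hold.
        return False

    def complete(self) -> None:
        """Assumed: called when an outstanding transaction finishes."""
        if self.counter == 0:
            raise RuntimeError("no outstanding transaction")
        self.counter -= 1
```

In this sketch, requests to the same channel flow freely, while a request to the other channel waits until all outstanding transactions have drained.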
Abstract: The disclosure is directed to systems and methods for switching a data downloading session from a first mobile device to a second mobile device. The data downloading session first commences on the first mobile device. During the session, a download transfer condition is identified. In response, the first mobile device sends the second mobile device download resumption information, which is specifically adapted to enable the second mobile device to continue the data downloading session without having to restart an interrupted downloading operation.
Abstract: A communication interface and method for efficient robust header compression (RoHC). One embodiment of the communication interface includes: (1) a data flow associated with a context ID (CID) and a data flow status indicator, and having packets, and (2) a robust header compression (RoHC) compressor configured to employ the CID to compress headers of the packets and to mark the CID as reusable by another data flow if the data flow status indicator indicates the data flow is terminated.
Type:
Application
Filed:
September 12, 2013
Publication date:
March 12, 2015
Applicant:
Nvidia Corporation
Inventors:
Bruno De Smet, Fabien Besson, Alexander May-Weymann
Abstract: A mobile communication device, a method of establishing a mobile telephone voice call and an apparatus are provided herein. In one embodiment, the mobile communication device includes: (1) a processor configured to indicate a voice call employing the mobile communication device is a hearing impaired call and (2) a modem configured to initiate establishment of the hearing impaired call with a mobile cellular network, wherein the establishment includes providing a hearing impaired codec list to the mobile cellular network.
Type:
Application
Filed:
September 9, 2013
Publication date:
March 12, 2015
Applicant:
Nvidia Corporation
Inventors:
Alexander May-Weymann, Bruno De Smet, Flavien Delorme
Abstract: A modem and a method for handing over Internet protocol (IP) multimedia subsystem (IMS) sessions from a packet-switched network to a circuit-switched network. One embodiment of the modem includes: (1) a physical layer through which IMS packets for a plurality of IMS sessions are transmittable and receivable, and (2) a control layer configured to gain access to respective IMS session data for the plurality of IMS sessions, the respective IMS session data originating from a host IMS application.
Abstract: A device including a touch screen display may be configured to selectively filter touch input. The device may receive a plurality of touch events. The device may determine whether the touch events correspond to a scaling gesture. When the device determines that the touch events correspond to a scaling gesture, the device may apply a smoothing filter to data corresponding to the touch event. The smoothing filter may be a Kalman based filter. The device may perform a scaling operation using the filtered data. When the device determines that the touch events do not correspond to a scaling gesture, to reduce latency, the smoothing filter may not be applied.
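A rough sketch of the selective filtering idea, assuming the quantity being smoothed is the distance between two touch points and that gesture classification happens elsewhere. The scalar Kalman filter, its noise parameters `q` and `r`, and all names are illustrative assumptions; the abstract only says the filter may be Kalman based.

```python
from typing import Optional


class ScalarKalman:
    """One-dimensional Kalman filter with a constant-value model."""

    def __init__(self, q: float = 1e-3, r: float = 1e-1) -> None:
        self.q = q  # assumed process noise variance
        self.r = r  # assumed measurement noise variance
        self.x: Optional[float] = None  # current estimate
        self.p = 1.0                    # estimate variance

    def update(self, z: float) -> float:
        if self.x is None:
            self.x = z  # initialize from the first measurement
            return self.x
        self.p += self.q                  # predict step
        k = self.p / (self.p + self.r)    # Kalman gain
        self.x += k * (z - self.x)        # correct with the measurement
        self.p *= 1.0 - k
        return self.x


def process_touch(distance: float, is_scaling: bool, kf: ScalarKalman) -> float:
    """Smooth only scaling gestures; pass other input through unfiltered."""
    return kf.update(distance) if is_scaling else distance
```

Bypassing the filter for non-scaling input avoids the lag that smoothing introduces, which matches the latency motivation stated in the abstract.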
Abstract: A system, method, and computer program product are provided for using compression with programmable sample locations, where the compression is a function of the programmable sample locations. The method includes the steps of storing a first value specifying a programmed sample location within a pixel in a sample pattern table and storing, in a memory, geometric surface parameters corresponding to a first attribute at the programmed sample location within a first pixel of a display surface. An instruction to store a second value specifying the programmed sample location within the pixel in the sample pattern table is received. The attribute is reconstructed based on the geometric surface parameters and the first value.
Type:
Application
Filed:
September 11, 2013
Publication date:
March 12, 2015
Applicant:
NVIDIA Corporation
Inventors:
Eric B. Lum, Jeffrey Alan Bolz, Rui Manuel Bastos, Andrei Khodakovsky, Christian Johannes Amsinck, Bengt-Olaf Schneider
Abstract: An image is remotely processed over a network. An electronic device is characterized based on a unique identifier associated therewith and properties data, which relate to display related properties of the device. Local data is collected from the device in relation to real-time conditions and control data, which correspond to the device in relation to the characterizing. The image is remotely generated for download to the device and includes processing data. The processing data are based on the properties data and the local data.
Abstract: A circuit and method for filtering adjacent channel interferers. One embodiment of an adjacent channel filtering circuit for reducing adjacent channel interference with an in-band signal, includes: (1) a radio frequency (RF) circuit configured to receive and down-convert an RF signal to a baseband signal containing an in-band signal and adjacent channel components, (2) a controlled single pole filter electrically coupled to the RF circuit and configured to reject the adjacent channel components and cause a predetermined attenuation in the in-band signal, (3) a baseband circuit coupled to the controlled single pole filter and configured to condition the baseband signal for conversion to a digital signal, and (4) a digital circuit coupled to the baseband circuit and configured to receive the digital signal and compensate for the predetermined attenuation.
Abstract: A backward-compatible stereo image processing system and a method of generating a backward-compatible stereo image. One embodiment of the backward-compatible stereo image processing system includes: (1) first and second viewpoints for an image, (2) an intermediate viewpoint for the image, and (3) first and second output channels configured to provide respective images composed of high spatial frequency content of the intermediate viewpoint and respective low spatial frequency content of the first and second viewpoints.
Abstract: One embodiment provides a method to wake an electronic device having a central processing unit (CPU) from an idle condition. The method includes creating a worker queue in an interrupt-request (IRQ) driver module of the operating-system kernel of the device, receiving in the kernel an indication of user input in a form of an IRQ, and in response to receiving the indication of user input, posting a request in the worker queue to boost clock speed in the CPU. The request is then processed, causing an increase in the clock speed.
Type:
Application
Filed:
September 10, 2013
Publication date:
March 12, 2015
Applicant:
NVIDIA Corporation
Inventors:
Vikas Ashok Jain, Yogish Kulkarni, Li Li, Sunny Satish Shah