Patents by Inventor Dennis K. Ma

Dennis K. Ma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9996490
    Abstract: A transmitter is configured to scale up a low bandwidth delivered by a first processing element to match a higher bandwidth associated with an interconnect. A receiver is configured to scale down the high bandwidth delivered by the interconnect to match the lower bandwidth associated with a second processing element. The first processing element and the second processing element may thus communicate with one another across the interconnect via the transmitter and the receiver, respectively, despite the bandwidth mismatch between those processing elements and the interconnect.
    Type: Grant
    Filed: September 19, 2013
    Date of Patent: June 12, 2018
    Assignee: NVIDIA Corporation
    Inventors: Marvin A. Denman, Dennis K. Ma, Stephen David Glaser
  • Patent number: 9727521
    Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device, and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: August 8, 2017
    Assignee: NVIDIA Corporation
    Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
  • Patent number: 9626320
    Abstract: A transmitter is configured to scale up a low bandwidth delivered by a first processing element to match a higher bandwidth associated with an interconnect. A receiver is configured to scale down the high bandwidth delivered by the interconnect to match the lower bandwidth associated with a second processing element. The first processing element and the second processing element may thus communicate with one another across the interconnect via the transmitter and the receiver, respectively, despite the bandwidth mismatch between those processing elements and the interconnect.
    Type: Grant
    Filed: September 19, 2013
    Date of Patent: April 18, 2017
    Assignee: NVIDIA Corporation
    Inventors: Marvin A. Denman, Dennis K. Ma, Stephen David Glaser
  • Patent number: 9424227
    Abstract: Non-contiguous or tiled payload data are efficiently transferred between peers over a fabric. Specifically, a client transfers a byte enable message to a peer device via a mailbox mechanism, where the byte enable message specifies which bytes of the payload data being transferred via the data packet are to be written to the frame buffer on the peer device and which bytes are not to be written. The client transfers the non-contiguous or tiled payload data to the peer device. Upon receiving the payload data, the peer device writes bytes from the payload data into the target frame buffer for only those bytes enabled via the byte enable message. One advantage of the present invention is that non-contiguous or tiled data are transferred over a fabric with improved efficiency.
    Type: Grant
    Filed: July 3, 2012
    Date of Patent: August 23, 2016
    Assignee: NVIDIA Corporation
    Inventors: Samuel H. Duncan, Dennis K. Ma, Wei-Je Huang, Gary Ward
  • Patent number: 9164766
    Abstract: Methods and apparatus for providing additional storage, in the form of a hardware assisted stack, usable by software running in an environment with limited resources. As an example, the hardware assisted stack may provide additional stack space to VBIOS code that is accessible within its limited allocated address space.
    Type: Grant
    Filed: April 8, 2005
    Date of Patent: October 20, 2015
    Assignee: NVIDIA Corporation
    Inventors: Aron L. Wong, Dennis K. Ma, Jonah M. Alben, Mark S. Krueger, Jeffrey J. Irwin
  • Publication number: 20150082074
    Abstract: A transmitter is configured to scale up a low bandwidth delivered by a first processing element to match a higher bandwidth associated with an interconnect. A receiver is configured to scale down the high bandwidth delivered by the interconnect to match the lower bandwidth associated with a second processing element. The first processing element and the second processing element may thus communicate with one another across the interconnect via the transmitter and the receiver, respectively, despite the bandwidth mismatch between those processing elements and the interconnect.
    Type: Application
    Filed: September 19, 2013
    Publication date: March 19, 2015
    Applicant: NVIDIA Corporation
    Inventors: Marvin A. Denman, Dennis K. Ma, Stephen David Glaser
  • Publication number: 20150082075
    Abstract: A transmitter is configured to scale up a low bandwidth delivered by a first processing element to match a higher bandwidth associated with an interconnect. A receiver is configured to scale down the high bandwidth delivered by the interconnect to match the lower bandwidth associated with a second processing element. The first processing element and the second processing element may thus communicate with one another across the interconnect via the transmitter and the receiver, respectively, despite the bandwidth mismatch between those processing elements and the interconnect.
    Type: Application
    Filed: September 19, 2013
    Publication date: March 19, 2015
    Applicant: NVIDIA Corporation
    Inventors: Marvin A. Denman, Dennis K. Ma, Stephen David Glaser
  • Patent number: 8726283
    Abstract: Under some conditions, requests transmitted between different devices in a computing system may be blocked in a way that prevents the request from being processed, resulting in a deadlock condition. A skid buffer is used to allow additional requests to be queued in order to remove the blockage and end the deadlock condition. Once the deadlock condition is removed, the requests are processed and the additional buffer entries in the skid buffer are disabled.
    Type: Grant
    Filed: June 4, 2007
    Date of Patent: May 13, 2014
    Assignee: NVIDIA Corporation
    Inventors: Oren Rubinstein, Dennis K. Ma, Richard B. Kujoth
  • Publication number: 20140082120
    Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device, and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
    Type: Application
    Filed: September 14, 2012
    Publication date: March 20, 2014
    Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
  • Publication number: 20140012904
    Abstract: Non-contiguous or tiled payload data are efficiently transferred between peers over a fabric. Specifically, a client transfers a byte enable message to a peer device via a mailbox mechanism, where the byte enable message specifies which bytes of the payload data being transferred via the data packet are to be written to the frame buffer on the peer device and which bytes are not to be written. The client transfers the non-contiguous or tiled payload data to the peer device. Upon receiving the payload data, the peer device writes bytes from the payload data into the target frame buffer for only those bytes enabled via the byte enable message. One advantage of the present invention is that non-contiguous or tiled data are transferred over a fabric with improved efficiency.
    Type: Application
    Filed: July 3, 2012
    Publication date: January 9, 2014
    Inventors: Samuel H. Duncan, Dennis K. Ma, Wei-Je Huang, Gary Ward
  • Patent number: 8547993
    Abstract: Methods, apparatuses, and systems are presented for performing asynchronous communications involving using an asynchronous interface to send signals between a source device and a plurality of client devices, the source device and the plurality of client devices being part of a processing unit capable of performing graphics operations, the source device being coupled to the plurality of client devices using the asynchronous interface, wherein the asynchronous interface includes at least one request signal, at least one address signal, at least one acknowledge signal, and at least one data signal, and wherein the asynchronous interface operates in accordance with at least one programmable timing characteristic associated with the source device.
    Type: Grant
    Filed: August 10, 2006
    Date of Patent: October 1, 2013
    Assignee: NVIDIA Corporation
    Inventors: Lincoln G. Garlick, Richard A. Silkebakken, Prakash G. Apte, Paolo E. Sabella, Samuel H. Duncan, Dennis K. Ma, Sean J. Treichler
  • Patent number: 8392667
    Abstract: Deadlocks are avoided by marking read requests issued by a parallel processor to system memory as “special.” Read completions associated with read requests marked as special are routed on virtual channel 1 of the PCIe bus. Data returning on virtual channel 1 cannot become stalled by write requests in virtual channel 0, thus avoiding a potential deadlock.
    Type: Grant
    Filed: December 12, 2008
    Date of Patent: March 5, 2013
    Assignee: NVIDIA Corporation
    Inventors: Samuel H. Duncan, David B. Glasco, Wei-Je Huang, Atul Kalambur, Patrick R. Marchand, Dennis K. Ma
  • Patent number: 8108879
    Abstract: A processor having multiple independent engines can concurrently support a number of independent processes or operation contexts. The processor can independently schedule instructions for execution by the engines. The processor can independently switch the operation context that an engine supports. The processor can maintain the integrity of the operations performed and data processed by each engine during a context switch by controlling the manner in which the engine transitions from one operation context to the next. The processor can wait for the engine to complete processing of pipelined instructions of a first context before switching to another context, or the processor can halt the operation of the engine in the midst of one or more instructions to allow the engine to execute instructions corresponding to another context. The processor can affirmatively verify completion of tasks for a specific operation context.
    Type: Grant
    Filed: October 27, 2006
    Date of Patent: January 31, 2012
    Assignee: NVIDIA Corporation
    Inventors: Lincoln G. Garlick, Dennis K. Ma, Paolo E. Sabella, David W. Nuechterlein
  • Publication number: 20100153658
    Abstract: Deadlocks are avoided by marking read requests issued by a parallel processor to system memory as “special.” Read completions associated with read requests marked as special are routed on virtual channel 1 of the PCIe bus. Data returning on virtual channel 1 cannot become stalled by write requests in virtual channel 0, thus avoiding a potential deadlock.
    Type: Application
    Filed: December 12, 2008
    Publication date: June 17, 2010
    Inventors: Samuel H. Duncan, David B. Glasco, Wei-Je Huang, Atul Kalambur, Patrick R. Marchand, Dennis K. Ma
  • Patent number: 7483032
    Abstract: Circuits, methods, and apparatus that allow the elimination of a frame buffer connected directly to a graphics processing unit. The graphics processing unit includes an on-chip memory. Following system power-up or reset, the GPU initially renders comparatively low-resolution images to the on-chip memory for display. Afterward, the GPU renders images, which are typically higher resolution, and stores them in a system memory, apart from the graphics processing unit. The on-chip memory, which is no longer needed for image storage, instead stores address information, referred to as page tables, identifying the location of data stored by the GPU in the separate system memory.
    Type: Grant
    Filed: October 18, 2005
    Date of Patent: January 27, 2009
    Assignee: NVIDIA Corporation
    Inventors: Sonny S. Yeoh, Shane J. Keil, Dennis K. Ma, Peter C. Tong
  • Patent number: 7467289
    Abstract: Software can freeze portions of a pipeline operation in a processor by asserting a predetermined freeze register in the processor. The processor halts operations relating to portions of a common pipeline processing in response to an asserted freeze register. Processor resources that operate downstream from the common pipeline continue to process any scheduled instructions. The processor is prevented from initiating any context switching in which a processor resource is allocated to a different channel. The processor stops supplying any additional data to downstream resources and ensures that the interface to downstream resources is clear of previously sent data. The processor prevents state machines from making additional requests. The processor asserts an acknowledgement indication in response to the freeze assertion when the processing has reached a stable state. Software is allowed to manipulate states and registers within the processor. Clearing the freeze register allows processing to resume.
    Type: Grant
    Filed: October 27, 2006
    Date of Patent: December 16, 2008
    Assignee: NVIDIA Corporation
    Inventors: Lincoln G. Garlick, Vikramjeet Singh, David W. Nuechterlein, Shail Dave, Jeffrey M. Smith, Paolo E. Sabella, Dennis K. Ma
  • Publication number: 20080028181
    Abstract: Circuits, methods, and apparatus that reduce or eliminate system memory accesses to retrieve address translation information. In one example, these accesses are reduced or eliminated by pre-populating a graphics TLB with entries that are used to translate virtual addresses used by a GPU to physical addresses used by a system memory. Translation information is maintained by locking or restricting entries in the graphics TLB that are needed for display access. This may be done by limiting access to certain locations in the graphics TLB, by storing flags or other identifying information in the graphics TLB, or by other appropriate methods. In another example, memory space is allocated by a system BIOS for a GPU, which stores a base address and address range. Virtual addresses in the address range are translated by adding them to the base address.
    Type: Application
    Filed: March 21, 2007
    Publication date: January 31, 2008
    Applicant: NVIDIA Corporation
    Inventors: Peter C. Tong, Sonny S. Yeoh, Kevin J. Kranzusch, Gary D. Lorensen, Kaymann L. Woo, Ashish Kishen Kaul, Colyn S. Case, Stefan A. Gottschalk, Dennis K. Ma