Patents by Inventor Wei-Je Huang

Wei-Je Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9847891
    Abstract: In a system according to one embodiment of the present disclosure, the system comprises a first device, a second device, a communications link, and a memory. The memory stores instructions that when executed by the system perform a method of communications link training. This method comprises requesting a speed change to a second speed for the first device communicating with the second device at a first speed via the communications link. A saved set of parameters is accessed for at least one of the first device and the second device. A first training cycle is performed for the first device and the second device at the second speed using the saved set of parameters for the at least one of the first device and second device. The reuse of parameters from a previous successful equalization training cycle reduces the time required to perform equalization training.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: December 19, 2017
    Assignee: Nvidia Corporation
    Inventors: David Wyatt, Vishal Mehta, Michael Hopgood, Mark Taylor, Hitendra Dutt, Samuel Vincent, Wei-Je Huang
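The entry above (patent 9847891) describes shortening link retraining by reusing equalization parameters saved from a previous successful training cycle at the target speed. The C sketch below illustrates that idea under stated assumptions: the structure, field names, and preset values are hypothetical and are not drawn from the patented implementation.

```c
/* Minimal sketch, assuming hypothetical types and names: if equalization
 * parameters from an earlier successful training cycle at the target speed
 * were saved, reuse them instead of re-deriving them from scratch. */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    int  preset;           /* transmitter preset index */
    int  pre_cursor;       /* equalization coefficients */
    int  post_cursor;
    bool valid;            /* set after a successful training cycle */
} eq_params_t;

static eq_params_t saved[2];   /* one saved set per device on the link */

static bool train_at_speed(int speed_gts, const eq_params_t *hint)
{
    if (hint && hint->valid) {
        printf("training at %d GT/s with saved preset %d (fast path)\n",
               speed_gts, hint->preset);
    } else {
        printf("training at %d GT/s with full equalization search\n",
               speed_gts);
    }
    return true; /* assume training converges in this sketch */
}

int main(void)
{
    /* First change to the higher speed: no saved parameters yet. */
    train_at_speed(8, &saved[0]);
    saved[0] = (eq_params_t){ .preset = 4, .pre_cursor = 1,
                              .post_cursor = 2, .valid = true };

    /* Later speed change: the saved set shortens the training cycle. */
    train_at_speed(8, &saved[0]);
    return 0;
}
```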
  • Patent number: 9727521
    Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device, and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and a second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: August 8, 2017
    Assignee: NVIDIA Corporation
    Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
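Patent 9727521 above describes overlapping peer-to-peer reads: a second read command is issued before the first completes, and each portion of the data lands in its own portion of a single buffer. The sketch below is a purely illustrative C analogy of that pipelining pattern, with hypothetical names and a synchronous stand-in for the target device.

```c
/* Illustrative sketch (not the patented implementation) of issuing a second
 * read command before the first completes, landing each portion in its own
 * half of a single peer-to-peer buffer. */
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define WORDS 8
#define HALF  (WORDS / 2)

/* Stand-in for a read that the target device services asynchronously. */
static void issue_read(const unsigned *target, size_t offset,
                       unsigned *dest, size_t count)
{
    memcpy(dest, target + offset, count * sizeof(unsigned));
}

int main(void)
{
    unsigned target[WORDS] = { 0, 1, 2, 3, 4, 5, 6, 7 };
    unsigned buffer[WORDS];            /* one buffer, two portions */

    /* First read command covers the first half of the data words ... */
    issue_read(target, 0, buffer, HALF);
    /* ... and the second is issued before the first is known to be
     * complete, targeting the second half of the same buffer. */
    issue_read(target, HALF, buffer + HALF, HALF);

    for (size_t i = 0; i < WORDS; i++)
        printf("%u ", buffer[i]);
    printf("\n");
    return 0;
}
```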
  • Patent number: 9424227
    Abstract: Non-contiguous or tiled payload data are efficiently transferred between peers over a fabric. Specifically, a client transfers a byte enable message to a peer device via a mailbox mechanism, where the byte enable message specifies which bytes of the payload data being transferred via the data packet are to be written to the frame buffer on the peer device and which bytes are not to be written. The client transfers the non-contiguous or tiled payload data to the peer device. Upon receiving the payload data, the peer device writes bytes from the payload data into the target frame buffer for only those bytes enabled via the byte enable message. One advantage of the present invention is that non-contiguous or tiled data are transferred over a fabric with improved efficiency.
    Type: Grant
    Filed: July 3, 2012
    Date of Patent: August 23, 2016
    Assignee: NVIDIA Corporation
    Inventors: Samuel H. Duncan, Dennis K. Ma, Wei-Je Huang, Gary Ward
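Patent 9424227 above hinges on a byte enable message: a bitmask sent ahead of the payload tells the peer which payload bytes to commit to its frame buffer. Below is a hedged C sketch of that selective-write step; the mask width, buffer layout, and function names are assumptions for illustration only.

```c
/* Sketch of the byte-enable idea: a bitmask delivered via a "mailbox"
 * message selects which payload bytes the peer writes to its frame
 * buffer. Names and layout are illustrative only. */
#include <stdint.h>
#include <stdio.h>

#define PAYLOAD_BYTES 16

/* Peer side: write only the bytes enabled by the mask. */
static void apply_payload(uint8_t *frame_buffer, const uint8_t *payload,
                          uint16_t byte_enable_mask)
{
    for (int i = 0; i < PAYLOAD_BYTES; i++)
        if (byte_enable_mask & (1u << i))
            frame_buffer[i] = payload[i];
}

int main(void)
{
    uint8_t frame_buffer[PAYLOAD_BYTES] = { 0 };
    uint8_t payload[PAYLOAD_BYTES];
    for (int i = 0; i < PAYLOAD_BYTES; i++)
        payload[i] = (uint8_t)(0xA0 + i);

    /* Byte enable message: enable every other byte (a tiled pattern). */
    uint16_t byte_enable = 0x5555;
    apply_payload(frame_buffer, payload, byte_enable);

    for (int i = 0; i < PAYLOAD_BYTES; i++)
        printf("%02X ", frame_buffer[i]);
    printf("\n");
    return 0;
}
```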
  • Patent number: 9390042
    Abstract: A processing unit exchanges data with another processing unit across a data connector that supports a particular communication protocol. When the communication protocol is updated to support a new packet type, a specification of that new packet type may be stored within software registers included within the processing unit. Under circumstances that require the use of the new packet type, packet generation logic may read the packet specification of the new packet type, then generate and transmit a packet of the new type.
    Type: Grant
    Filed: July 3, 2012
    Date of Patent: July 12, 2016
    Assignee: NVIDIA Corporation
    Inventors: Wei-Je Huang, Dennis Ma, Hitendra Dutt
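Patent 9390042 above treats a packet format as data: the specification of a new packet type is stored in software registers, and generic generation logic reads that specification to build packets. The following C sketch shows one way such register-driven packet generation could look; the field names and layout are hypothetical.

```c
/* Sketch of "packet format as data": a register-resident description of a
 * new packet type that generic generation logic can consume. Field names
 * are assumptions for illustration. */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint8_t type_id;        /* value placed in the packet's type field */
    uint8_t header_dwords;  /* header length in 32-bit words */
    uint8_t payload_dwords; /* payload length in 32-bit words */
} packet_spec_t;

/* Generation logic only interprets the spec; it has no hard-coded
 * knowledge of the new packet type. */
static int build_packet(const packet_spec_t *spec, uint32_t *out)
{
    int n = 0;
    out[n++] = (uint32_t)spec->type_id << 24;       /* header word 0 */
    for (int i = 1; i < spec->header_dwords; i++)
        out[n++] = 0;                               /* reserved header */
    for (int i = 0; i < spec->payload_dwords; i++)
        out[n++] = 0xDEADBEEF;                      /* placeholder data */
    return n;                                       /* packet length */
}

int main(void)
{
    /* A "software register" update describing a packet type that did not
     * exist when the hardware was designed. */
    packet_spec_t new_type = { .type_id = 0x7E,
                               .header_dwords = 2, .payload_dwords = 4 };
    uint32_t packet[16];
    int len = build_packet(&new_type, packet);
    printf("generated %d-dword packet, type 0x%02X\n",
           len, (unsigned)(packet[0] >> 24));
    return 0;
}
```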
  • Patent number: 8949497
    Abstract: In an apparatus according to one embodiment of the present disclosure, a communications link comprises a first device and a second device communicating with each other via the communications link at a plurality of different speeds. However, prior to communicating via the communications link for the first time at a second speed, the first device and second device complete a first training cycle at the second speed. Further, during this first training cycle for the second speed, the first training cycle for the second speed will pause before the first training cycle at the second speed completes, and the first device and second device communicate at a first speed for a period of time before returning to the paused first training cycle at the second speed. When the paused first training cycle for the second speed continues, the first training cycle for the second speed will continue where it had paused.
    Type: Grant
    Filed: September 12, 2011
    Date of Patent: February 3, 2015
    Assignee: NVIDIA Corporation
    Inventors: Michael Hopgood, Wei-Je Huang, Mark Taylor, Hitendra Dutt, David Wyatt, Vishal Mehta
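Patent 8949497 above describes a training cycle at the new speed that can pause partway through, let the devices communicate at the original speed for a while, and then resume exactly where it left off. The C sketch below models that pause-and-resume behavior as a tiny state machine; the step count and names are illustrative assumptions.

```c
/* Minimal state-machine sketch of a training cycle that can pause, drop
 * back to the first speed for regular traffic, and later resume at the
 * step where it paused. Purely illustrative. */
#include <stdio.h>

enum { TRAINING_STEPS = 6 };

typedef struct {
    int next_step;   /* where the paused training cycle resumes */
} training_state_t;

/* Run training at the second speed until done or until a pause request. */
static int run_training(training_state_t *st, int pause_after_step)
{
    for (; st->next_step < TRAINING_STEPS; st->next_step++) {
        printf("second-speed training step %d\n", st->next_step);
        if (st->next_step == pause_after_step) {
            st->next_step++;          /* resume after this step later */
            return 0;                 /* paused */
        }
    }
    return 1;                          /* training complete */
}

int main(void)
{
    training_state_t st = { 0 };
    if (!run_training(&st, 2))
        printf("paused: communicating at first speed for a while\n");
    run_training(&st, -1);             /* resume where it paused */
    printf("second-speed training complete\n");
    return 0;
}
```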
  • Publication number: 20140082120
    Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device, and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and a second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
    Type: Application
    Filed: September 14, 2012
    Publication date: March 20, 2014
    Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
  • Publication number: 20140013023
    Abstract: A processing unit exchanges data with another processing unit across a data connector that supports a particular communication protocol. When the communication protocol is updated to support a new packet type, a specification of that new packet type may be stored within software registers included within the processing unit. Under circumstances that require the use of the new packet type, packet generation logic may read the packet specification of the new packet type, then generate and transmit a packet of the new type.
    Type: Application
    Filed: July 3, 2012
    Publication date: January 9, 2014
    Inventors: Wei-Je Huang, Dennis Ma, Hitendra Dutt
  • Publication number: 20140012904
    Abstract: Non-contiguous or tiled payload data are efficiently transferred between peers over a fabric. Specifically, a client transfers a byte enable message to a peer device via a mailbox mechanism, where the byte enable message specifies which bytes of the payload data being transferred via the data packet are to be written to the frame buffer on the peer device and which bytes are not to be written. The client transfers the non-contiguous or tiled payload data to the peer device. Upon receiving the payload data, the peer device writes bytes from the payload data into the target frame buffer for only those bytes enabled via the byte enable message. One advantage of the present invention is that non-contiguous or tiled data are transferred over a fabric with improved efficiency.
    Type: Application
    Filed: July 3, 2012
    Publication date: January 9, 2014
    Inventors: Samuel H. Duncan, Dennis K. Ma, Wei-Je Huang, Gary Ward
  • Publication number: 20130067127
    Abstract: In an apparatus according to one embodiment of the present disclosure, a communications link comprises a first device and a second device communicating with each other via the communications link at a plurality of different speeds. However, prior to communicating via the communications link for the first time at a second speed, the first device and second device complete a first training cycle at the second speed. Further, during this first training cycle for the second speed, the first training cycle for the second speed will pause before the first training cycle at the second speed completes, and the first device and second device communicate at a first speed for a period of time before returning to the paused first training cycle at the second speed. When the paused first training cycle for the second speed continues, the first training cycle for the second speed will continue where it had paused.
    Type: Application
    Filed: September 12, 2011
    Publication date: March 14, 2013
    Applicant: NVIDIA Corporation
    Inventors: Michael Hopgood, Wei-Je Huang, Mark Taylor, Hitendra Dutt, David Wyatt, Vishal Mehta
  • Patent number: 8392667
    Abstract: Deadlocks are avoided by marking read requests issued by a parallel processor to system memory as “special.” Read completions associated with read requests marked as special are routed on virtual channel 1 of the PCIe bus. Data returning on virtual channel 1 cannot become stalled by write requests in virtual channel 0, thus avoiding a potential deadlock.
    Type: Grant
    Filed: December 12, 2008
    Date of Patent: March 5, 2013
    Assignee: NVIDIA Corporation
    Inventors: Samuel H. Duncan, David B. Glasco, Wei-Je Huang, Atul Kalambur, Patrick R. Marchand, Dennis K. Ma
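Patent 8392667 above avoids a deadlock by tagging certain read requests as "special" and returning their completions on PCIe virtual channel 1, so they cannot be blocked behind writes queued on virtual channel 0. The C sketch below captures only that routing decision; the structures and function names are hypothetical.

```c
/* Sketch of the virtual-channel split described above: read completions
 * for requests tagged "special" return on VC1 so they cannot be stalled
 * behind writes queued on VC0. Names are illustrative. */
#include <stdbool.h>
#include <stdio.h>

enum vc { VC0 = 0, VC1 = 1 };

typedef struct {
    unsigned tag;
    bool     special;   /* set when the parallel processor reads system memory */
} read_request_t;

/* Route a completion to a virtual channel based on the request tag. */
static enum vc completion_channel(const read_request_t *req)
{
    return req->special ? VC1 : VC0;
}

int main(void)
{
    read_request_t normal  = { .tag = 1, .special = false };
    read_request_t special = { .tag = 2, .special = true  };

    printf("tag %u completes on VC%d\n", normal.tag,
           completion_channel(&normal));
    printf("tag %u completes on VC%d\n", special.tag,
           completion_channel(&special));   /* independent of VC0 writes */
    return 0;
}
```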
  • Publication number: 20130051483
    Abstract: In a system according to one embodiment of the present disclosure, the system comprises a first device, a second device, a communications link, and a memory. The memory stores instructions that when executed by the system perform a method of communications link training. This method comprises requesting a speed change to a second speed for the first device communicating with the second device at a first speed via the communications link. A saved set of parameters is accessed for at least one of the first device and the second device. A first training cycle is performed for the first device and the second device at the second speed using the saved set of parameters for the at least one of the first device and second device. The reuse of parameters from a previous successful equalization training cycle reduces the time required to perform equalization training.
    Type: Application
    Filed: August 24, 2011
    Publication date: February 28, 2013
    Applicant: NVIDIA Corporation
    Inventors: David Wyatt, Vishal Mehta, Michael Hopgood, Mark Taylor, Hitendra Dutt, Samuel Vincent, Wei-Je Huang
  • Patent number: 8031198
    Abstract: An apparatus and method for servicing multiple graphics processing channels are described. In one embodiment, a graphics processing apparatus includes a scheduler configured to direct servicing of a graphics processing channel by issuing an index related to the graphics processing channel. The graphics processing apparatus also includes a processing core connected to the scheduler. The processing core is configured to service the graphics processing channel by: (i) correlating the index with a memory location at which an instance block for the graphics processing channel is stored; and (ii) accessing the instance block stored at the memory location.
    Type: Grant
    Filed: October 31, 2006
    Date of Patent: October 4, 2011
    Assignee: Nvidia Corporation
    Inventors: Jeffrey M. Smith, Shail Dave, Wei-Je Huang, Lincoln G. Garlick, Paolo E. Sabella
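Patent 8031198 above has the scheduler hand the processing core a channel index, which the core then correlates with the memory location of that channel's instance block before servicing it. The C sketch below shows one plausible shape of that index-to-instance-block lookup; the per-channel fields and names are assumptions.

```c
/* Illustrative sketch (not the patented hardware) of servicing a channel
 * by index: the scheduler issues an index, and the core looks up the
 * memory location of that channel's instance block. */
#include <stdint.h>
#include <stdio.h>

#define NUM_CHANNELS 4

typedef struct {
    uint64_t page_table_base;   /* example per-channel state */
    uint32_t last_put_pointer;
} instance_block_t;

/* Table correlating a channel index with its instance block. */
static instance_block_t instance_blocks[NUM_CHANNELS];

static void service_channel(unsigned index)
{
    instance_block_t *ib = &instance_blocks[index];
    printf("channel %u: instance block at %p, put=%u\n",
           index, (void *)ib, (unsigned)ib->last_put_pointer);
}

int main(void)
{
    instance_blocks[2].last_put_pointer = 42;
    /* The "scheduler" simply issues an index to be serviced. */
    service_channel(2);
    return 0;
}
```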
  • Patent number: 7839885
    Abstract: A method of switching a plurality of tributaries disposed among a plurality of time slots in a frame is disclosed. The method generally includes the steps of (A) buffering the frame, (B) switching the tributaries among the time slots in response to a read address and (C) generating the read address in response to a plurality of identifications in a connection map, the connection map defining (i) at most one of the identifications for each of the tributaries and (ii) one of the identifications for each of the time slots carrying other than the tributaries.
    Type: Grant
    Filed: April 25, 2005
    Date of Patent: November 23, 2010
    Assignee: LSI Corporation
    Inventors: Ephrem C. Wu, Wei-Je Huang
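Patent 7839885 above switches tributaries among time slots by buffering the frame and reading it back out with addresses generated from a connection map. The C sketch below is a simplified, software-only analogy of that connection-map-driven readout; slot counts and labels are illustrative.

```c
/* Rough sketch of connection-map-driven switching: the buffered frame is
 * read back out using read addresses generated from per-time-slot
 * identifications, which reorders tributaries among the slots. */
#include <stdio.h>

#define SLOTS 6

int main(void)
{
    /* Buffered frame: each slot carries one tributary's data. */
    char frame[SLOTS] = { 'A', 'B', 'C', 'D', 'E', 'F' };

    /* Connection map: for each output slot, the identification of the
     * input slot whose tributary should appear there. */
    int connection_map[SLOTS] = { 3, 0, 5, 1, 4, 2 };

    char switched[SLOTS];
    for (int slot = 0; slot < SLOTS; slot++) {
        int read_address = connection_map[slot];   /* generated address */
        switched[slot] = frame[read_address];
    }

    for (int slot = 0; slot < SLOTS; slot++)
        printf("output slot %d carries tributary %c\n", slot, switched[slot]);
    return 0;
}
```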
  • Patent number: 7756123
    Abstract: A Peripheral Component Interconnect Express (PCIe) controller includes a crossbar to reorder data lanes into an order compatible with PCIe negotiation rules. A full crossbar permits an arbitrary swizzling of data lanes, permitting greater flexibility in motherboard lane routing.
    Type: Grant
    Filed: December 21, 2006
    Date of Patent: July 13, 2010
    Assignee: Nvidia Corporation
    Inventors: Wei-Je Huang, Nathan C. Myers
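Patent 7756123 above uses a full crossbar so any physical lane can feed any logical lane, letting the board route lanes in whatever order is convenient. The C sketch below models that lane swizzle as a permutation table; the mapping and lane count are illustrative assumptions.

```c
/* Sketch of a full lane crossbar: any physical lane can be mapped to any
 * logical lane, so the board can route lanes in whatever order is
 * convenient. The mapping table is illustrative. */
#include <stdio.h>

#define LANES 8

int main(void)
{
    /* crossbar_map[logical] = physical lane wired to that logical lane. */
    int crossbar_map[LANES] = { 7, 6, 5, 4, 3, 2, 1, 0 }; /* reversed */

    int physical_data[LANES] = { 10, 11, 12, 13, 14, 15, 16, 17 };
    int logical_data[LANES];

    /* A full crossbar imposes no restriction on the permutation. */
    for (int lane = 0; lane < LANES; lane++)
        logical_data[lane] = physical_data[crossbar_map[lane]];

    for (int lane = 0; lane < LANES; lane++)
        printf("logical lane %d <- physical lane %d (data %d)\n",
               lane, crossbar_map[lane], logical_data[lane]);
    return 0;
}
```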
  • Publication number: 20100153658
    Abstract: Deadlocks are avoided by marking read requests issued by a parallel processor to system memory as “special.” Read completions associated with read requests marked as special are routed on virtual channel 1 of the PCIe bus. Data returning on virtual channel 1 cannot become stalled by write requests in virtual channel 0, thus avoiding a potential deadlock.
    Type: Application
    Filed: December 12, 2008
    Publication date: June 17, 2010
    Inventors: Samuel H. Duncan, David B. Glasco, Wei-Je Huang, Atul Kalambur, Patrick R. Marchand, Dennis K. Ma
  • Patent number: 7469309
    Abstract: Methods and apparatus for peer-to-peer data transfers in a computing environment provide configurable control over the number of outstanding read requests by one peer device to another. A requesting peer device includes a control register that stores a high-watermark value associated with requests to a target peer device. Each time a read request to the target peer device is generated, the number of such requests already outstanding is compared to the high-water mark. The request is blocked if the number of outstanding requests exceeds the high-water mark and remains blocked until such time as the number of outstanding requests no longer exceeds the high-water mark. Different high-water marks can be associated with different combinations of requesting and target devices.
    Type: Grant
    Filed: December 12, 2005
    Date of Patent: December 23, 2008
    Assignee: Nvidia Corporation
    Inventors: Samuel Hammond Duncan, Wei-Je Huang, Radha Kanekal
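Patent 7469309 above throttles peer-to-peer reads with a per-target high-water mark: a new request is blocked while the number of outstanding requests to that target exceeds the configured mark. The C sketch below shows that gating logic in miniature; the structure and values are illustrative assumptions.

```c
/* Minimal sketch of the high-water-mark throttle: a request toward a
 * given target peer is blocked while the count of outstanding requests
 * to that peer exceeds the configured mark. Illustrative only. */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    unsigned outstanding;   /* reads issued but not yet completed */
    unsigned high_water;    /* per requester/target pair, from a register */
} peer_link_t;

static bool try_issue_read(peer_link_t *link)
{
    if (link->outstanding > link->high_water)
        return false;            /* blocked until completions drain */
    link->outstanding++;
    return true;
}

static void complete_read(peer_link_t *link)
{
    if (link->outstanding > 0)
        link->outstanding--;
}

int main(void)
{
    peer_link_t link = { .outstanding = 0, .high_water = 1 };

    printf("issue #1: %s\n", try_issue_read(&link) ? "sent" : "blocked");
    printf("issue #2: %s\n", try_issue_read(&link) ? "sent" : "blocked");
    printf("issue #3: %s\n", try_issue_read(&link) ? "sent" : "blocked");
    complete_read(&link);        /* a completion frees a slot */
    printf("issue #3 retry: %s\n", try_issue_read(&link) ? "sent" : "blocked");
    return 0;
}
```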
  • Patent number: 7451259
    Abstract: A method and apparatus for providing peer-to-peer data transfer through an interconnecting fabric. The method and apparatus enable a first device to read and/or write data to/from a local memory of a second device by communicating read and write requests across the interconnectivity fabric. Such data transfer can be performed even when the communication protocol of the interconnectivity fabric does not permit such transfers.
    Type: Grant
    Filed: December 6, 2004
    Date of Patent: November 11, 2008
    Assignee: NVIDIA Corporation
    Inventors: Samuel H. Duncan, Wei-Je Huang, John H. Edmondson
  • Patent number: 7426597
    Abstract: A bus permits the number of active serial data lanes of a data link to be re-negotiated in response to changes in bus bandwidth requirements. In one embodiment, one of the bus interfaces triggers a re-negotiation of link width and places a constraint on link width during the re-negotiation.
    Type: Grant
    Filed: September 16, 2005
    Date of Patent: September 16, 2008
    Assignee: NVIDIA Corporation
    Inventors: William P. Tsu, Luc R. Bisson, Oren Rubinstein, Wei-Je Huang, Michael B. Diamond
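Patent 7426597 above re-negotiates the number of active serial lanes when bandwidth requirements change, with one bus interface constraining the width during the re-negotiation. The C sketch below reduces that to a simple width-selection rule; the function and the specific constraint policy are assumptions for illustration.

```c
/* Sketch of width re-negotiation: one side requests a new link width and
 * constrains the outcome; both sides settle on a width no greater than
 * the constraint and no greater than what either supports. Illustrative. */
#include <stdio.h>

static int min3(int a, int b, int c)
{
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Renegotiate the number of active lanes under a requester's constraint. */
static int renegotiate_width(int local_max, int remote_max, int constraint)
{
    return min3(local_max, remote_max, constraint);
}

int main(void)
{
    int width = renegotiate_width(16, 16, 16);
    printf("initial link width: x%d\n", width);

    /* Bandwidth demand drops: the requester constrains the width to x1. */
    width = renegotiate_width(16, 16, 1);
    printf("renegotiated link width: x%d\n", width);

    /* Demand rises again: relax the constraint and renegotiate wider. */
    width = renegotiate_width(16, 16, 8);
    printf("renegotiated link width: x%d\n", width);
    return 0;
}
```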
  • Patent number: 7420565
    Abstract: A computer system includes an integrated graphics subsystem and a graphics connector for attaching either an auxiliary graphics subsystem or a loopback card. A first bus connection communicates data from the computer system to the integrated graphics subsystem. With a loopback card in place, data travels from the integrated graphics subsystem back to the computer system via a second bus connection. When the auxiliary graphics subsystem is attached, the integrated graphics subsystem operates in a data forwarding mode. Data is communicated to the integrated graphics subsystem via the first bus connection. The integrated graphics subsystem then forwards data to the auxiliary graphics subsystem. A portion of the second bus connection communicates data from the auxiliary graphics subsystem back to the computer system. The auxiliary graphics subsystem communicates display information back to the integrated graphics subsystem, where it is used to control a display device.
    Type: Grant
    Filed: October 11, 2005
    Date of Patent: September 2, 2008
    Assignee: Nvidia Corporation
    Inventors: Oren Rubinstein, Jonah M. Alben, Wei-Je Huang
  • Patent number: 7370132
    Abstract: A bus permits the number of active serial data lanes of a data link to be re-negotiated in response to changes in bus bandwidth requirements. In one embodiment, clock buffers not required to drive active data lanes are placed in an inactive state to reduce clock power dissipation.
    Type: Grant
    Filed: November 16, 2005
    Date of Patent: May 6, 2008
    Assignee: Nvidia Corporation
    Inventors: Wei-Je Huang, Luc R. Bisson, Oren Rubinstein, Michael B. Diamond, William B. Simms
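Patent 7370132 above saves clock power by putting clock buffers that no longer drive active data lanes into an inactive state after the link width is re-negotiated. The C sketch below models that as a simple enable table; the lane count and names are illustrative assumptions.

```c
/* Sketch of the clock-gating idea: after a width renegotiation, clock
 * buffers that no longer drive active lanes are switched off to reduce
 * clock power dissipation. Structure and names are illustrative. */
#include <stdbool.h>
#include <stdio.h>

#define MAX_LANES 16

static bool clock_buffer_enabled[MAX_LANES];

/* Enable clock buffers only for the lanes that remain active. */
static void update_clock_buffers(int active_lanes)
{
    for (int lane = 0; lane < MAX_LANES; lane++)
        clock_buffer_enabled[lane] = (lane < active_lanes);
}

int main(void)
{
    update_clock_buffers(16);          /* full-width link */
    update_clock_buffers(4);           /* renegotiated down to x4 */

    int powered = 0;
    for (int lane = 0; lane < MAX_LANES; lane++)
        powered += clock_buffer_enabled[lane];
    printf("%d of %d lane clock buffers remain powered\n",
           powered, MAX_LANES);
    return 0;
}
```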