Patents by Inventor Wei-Je Huang
Wei-Je Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9847891
Abstract: In a system according to one embodiment of the present disclosure, the system comprises a first device, a second device, a communications link, and a memory. The memory stores instructions that when executed by the system perform a method of communications link training. This method comprises requesting a speed change to a second speed for the first device communicating with the second device at a first speed via the communications link. A saved set of parameters is accessed for at least one of the first device and the second device. A first training cycle is performed for the first device and the second device at the second speed using the saved set of parameters for the at least one of the first device and second device. The reuse of parameters from a previous successful equalization training cycle reduces the time required to perform equalization training.
Type: Grant
Filed: August 24, 2011
Date of Patent: December 19, 2017
Assignee: Nvidia Corporation
Inventors: David Wyatt, Vishal Mehta, Michael Hopgood, Mark Taylor, Hitendra Dutt, Samuel Vincent, Wei-Je Huang
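The parameter-reuse idea in this abstract can be illustrated with a small sketch. All names here (`TrainingCache`, `train_link`, the step counts) are hypothetical; the patent does not specify any data structures, only the idea of consulting saved equalization parameters before running a full training cycle.

```python
# Sketch of reusing saved equalization parameters during link-speed training.
# Names and step costs are illustrative, not taken from the patent.

FULL_TRAINING_STEPS = 100   # assumed cost of equalization from scratch
FAST_TRAINING_STEPS = 10    # assumed cost when saved parameters are reused

class TrainingCache:
    """Stores the last successful equalization parameters per (device, speed)."""
    def __init__(self):
        self._saved = {}

    def save(self, device, speed, params):
        self._saved[(device, speed)] = params

    def lookup(self, device, speed):
        return self._saved.get((device, speed))

def train_link(cache, device, speed, params):
    """Return the number of training steps spent reaching `speed`."""
    if cache.lookup(device, speed) is not None:
        return FAST_TRAINING_STEPS       # reuse prior equalization results
    cache.save(device, speed, params)    # remember them for next time
    return FULL_TRAINING_STEPS

cache = TrainingCache()
first = train_link(cache, "gpu0", "8GT/s", {"preset": 7})   # full training
second = train_link(cache, "gpu0", "8GT/s", {"preset": 7})  # reuses cache
```

The second speed change to the same rate completes in a fraction of the steps, which is the time saving the abstract claims.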
-
Patent number: 9727521
Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and a second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and a second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
Type: Grant
Filed: September 14, 2012
Date of Patent: August 8, 2017
Assignee: NVIDIA Corporation
Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
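A minimal sketch of the split-buffer idea above, with entirely hypothetical names (`split_read`, `execute`): two read commands are in flight at once, each targeting a disjoint half of a single shared buffer, so completion order does not matter and no second buffer is needed.

```python
# Sketch of pipelined peer-to-peer reads into two halves of one buffer.
# The patent abstracts no API; this structure is illustrative only.

def split_read(words, buffer_size):
    """Split a read of `words` into two commands targeting disjoint
    halves of one shared buffer."""
    half = buffer_size // 2
    first = {"offset": 0, "count": min(half, len(words))}
    second = {"offset": half, "count": len(words) - first["count"]}
    return first, second

def execute(words, buffer_size):
    buffer = [None] * buffer_size
    first, second = split_read(words, buffer_size)
    # Both commands are "in flight" at once; here the second completes
    # first, which is harmless because the regions do not overlap.
    buffer[second["offset"]:second["offset"] + second["count"]] = \
        words[first["count"]:]
    buffer[first["offset"]:first["offset"] + first["count"]] = \
        words[:first["count"]]
    return buffer

result = execute([1, 2, 3, 4, 5, 6], buffer_size=8)
```

All six data words land in order within the single buffer even though the completions arrived out of order.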
-
Patent number: 9424227
Abstract: Non-contiguous or tiled payload data are efficiently transferred between peers over a fabric. Specifically, a client transfers a byte enable message to a peer device via a mailbox mechanism, where the byte enable message specifies which bytes of the payload data being transferred via the data packet are to be written to the frame buffer on the peer device and which bytes are not to be written. The client transfers the non-contiguous or tiled payload data to the peer device. Upon receiving the payload data, the peer device writes bytes from the payload data into the target frame buffer for only those bytes enabled via the byte enable message. One advantage of the present invention is that non-contiguous or tiled data are transferred over a fabric with improved efficiency.
Type: Grant
Filed: July 3, 2012
Date of Patent: August 23, 2016
Assignee: NVIDIA Corporation
Inventors: Samuel H. Duncan, Dennis K. Ma, Wei-Je Huang, Gary Ward
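The byte-enable mechanism can be sketched as a masked write: a per-byte mask (the "byte enable message") decides which payload bytes are committed to the destination. The function and variable names below are illustrative, not from the patent.

```python
# Sketch of a byte-enable write: only bytes flagged in the enable mask
# are committed to the destination frame buffer. Names are hypothetical.

def byte_enable_write(frame_buffer, offset, payload, enable_mask):
    """Write payload bytes into frame_buffer, skipping disabled bytes.

    enable_mask[i] is True if payload[i] should be written; disabled
    bytes leave the existing frame-buffer contents untouched."""
    for i, (byte, enabled) in enumerate(zip(payload, enable_mask)):
        if enabled:
            frame_buffer[offset + i] = byte
    return frame_buffer

fb = bytearray(8)                      # destination frame buffer
payload = bytes([0xAA, 0xBB, 0xCC, 0xDD])
mask = [True, False, True, False]      # tiled data: bytes 0 and 2 only
byte_enable_write(fb, 2, payload, mask)
```

This lets one contiguous packet carry non-contiguous (tiled) data, since the mask, not the packet layout, determines which destination bytes change.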
-
Patent number: 9390042
Abstract: A processing unit exchanges data with another processing unit across a data connector that supports a particular communication protocol. When the communication protocol is updated to support a new packet type, a specification of that new packet type may be stored within software registers included within the processing unit. Under circumstances that require the use of the new packet type, packet generation logic may read the packet specification of the new packet type, then generate and transmit a packet of the new type.
Type: Grant
Filed: July 3, 2012
Date of Patent: July 12, 2016
Assignee: NVIDIA Corporation
Inventors: Wei-Je Huang, Dennis Ma, Hitendra Dutt
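The register-stored packet specification can be sketched as data-driven packet assembly: the field layout lives in a table (standing in for the software registers), so a new packet type is just new data, not new logic. The field names and widths below are hypothetical.

```python
# Sketch of generating a packet from a register-stored specification.
# The field layout here is invented for illustration.

def build_packet(spec, fields):
    """Assemble a packet by laying out `fields` per `spec`.

    spec: list of (field_name, bit_width) tuples, standing in for a
    packet specification read out of software registers."""
    packet = 0
    for name, width in spec:
        value = fields[name]
        assert value < (1 << width), f"{name} does not fit in {width} bits"
        packet = (packet << width) | value   # append field MSB-first
    return packet

# A "new" packet type described entirely by data, not hardwired logic:
new_type_spec = [("opcode", 4), ("address", 8), ("payload", 8)]
pkt = build_packet(new_type_spec,
                   {"opcode": 0x3, "address": 0x10, "payload": 0xFF})
```

Adding another packet type requires only writing a new spec table, which mirrors the abstract's point about updating the protocol without new generation hardware.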
-
Patent number: 8949497
Abstract: In an apparatus according to one embodiment of the present disclosure, a first device and a second device communicate with each other via a communications link at a plurality of different speeds. Prior to communicating via the communications link for the first time at a second speed, the first device and second device complete a first training cycle at the second speed. During this first training cycle, training pauses before it completes, and the first device and second device communicate at a first speed for a period of time before returning to the paused training cycle. When the paused training cycle for the second speed resumes, it continues from where it had paused.
Type: Grant
Filed: September 12, 2011
Date of Patent: February 3, 2015
Assignee: NVIDIA Corporation
Inventors: Michael Hopgood, Wei-Je Huang, Mark Taylor, Hitendra Dutt, David Wyatt, Vishal Mehta
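The pause-and-resume behavior can be sketched as a training cycle that checkpoints its step counter: while paused, the link falls back to the old speed, and a later resume continues from the checkpoint rather than restarting. Class and attribute names are invented for illustration.

```python
# Sketch of a pausable/resumable training cycle.
# Names and step accounting are hypothetical, not from the patent.

class TrainingCycle:
    def __init__(self, total_steps):
        self.total_steps = total_steps
        self.step = 0                    # checkpoint: survives a pause
        self.active_speed = "first"      # link speed currently in use

    def run(self, budget):
        """Advance training at the second speed by up to `budget` steps;
        return True once training is complete."""
        self.active_speed = "second"
        advance = min(budget, self.total_steps - self.step)
        self.step += advance
        if self.step < self.total_steps:
            self.active_speed = "first"  # fall back to old speed while paused
            return False
        return True

cycle = TrainingCycle(total_steps=10)
done_first = cycle.run(budget=6)        # pauses at step 6
speed_while_paused = cycle.active_speed # link runs at the first speed
done_second = cycle.run(budget=6)       # resumes from step 6, finishes
```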
-
Publication number: 20140082120
Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and a second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and a second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
Type: Application
Filed: September 14, 2012
Publication date: March 20, 2014
Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
-
Publication number: 20140013023
Abstract: A processing unit exchanges data with another processing unit across a data connector that supports a particular communication protocol. When the communication protocol is updated to support a new packet type, a specification of that new packet type may be stored within software registers included within the processing unit. Under circumstances that require the use of the new packet type, packet generation logic may read the packet specification of the new packet type, then generate and transmit a packet of the new type.
Type: Application
Filed: July 3, 2012
Publication date: January 9, 2014
Inventors: Wei-Je Huang, Dennis Ma, Hitendra Dutt
-
Publication number: 20140012904
Abstract: Non-contiguous or tiled payload data are efficiently transferred between peers over a fabric. Specifically, a client transfers a byte enable message to a peer device via a mailbox mechanism, where the byte enable message specifies which bytes of the payload data being transferred via the data packet are to be written to the frame buffer on the peer device and which bytes are not to be written. The client transfers the non-contiguous or tiled payload data to the peer device. Upon receiving the payload data, the peer device writes bytes from the payload data into the target frame buffer for only those bytes enabled via the byte enable message. One advantage of the present invention is that non-contiguous or tiled data are transferred over a fabric with improved efficiency.
Type: Application
Filed: July 3, 2012
Publication date: January 9, 2014
Inventors: Samuel H. Duncan, Dennis K. Ma, Wei-Je Huang, Gary Ward
-
Publication number: 20130067127
Abstract: In an apparatus according to one embodiment of the present disclosure, a first device and a second device communicate with each other via a communications link at a plurality of different speeds. Prior to communicating via the communications link for the first time at a second speed, the first device and second device complete a first training cycle at the second speed. During this first training cycle, training pauses before it completes, and the first device and second device communicate at a first speed for a period of time before returning to the paused training cycle. When the paused training cycle for the second speed resumes, it continues from where it had paused.
Type: Application
Filed: September 12, 2011
Publication date: March 14, 2013
Applicant: NVIDIA Corporation
Inventors: Michael Hopgood, Wei-Je Huang, Mark Taylor, Hitendra Dutt, David Wyatt, Vishal Mehta
-
Patent number: 8392667
Abstract: Deadlocks are avoided by marking read requests issued by a parallel processor to system memory as “special.” Read completions associated with read requests marked as special are routed on virtual channel 1 of the PCIe bus. Data returning on virtual channel 1 cannot become stalled by write requests in virtual channel 0, thus avoiding a potential deadlock.
Type: Grant
Filed: December 12, 2008
Date of Patent: March 5, 2013
Assignee: NVIDIA Corporation
Inventors: Samuel H. Duncan, David B. Glasco, Wei-Je Huang, Atul Kalambur, Patrick R. Marchand, Dennis K. Ma
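The routing rule in this abstract is simple enough to sketch directly: "special" read completions go to a separate virtual channel, so they can never queue behind writes on the default channel. The transaction representation below is invented for illustration; only the VC0/VC1 split comes from the abstract.

```python
# Sketch of virtual-channel routing for deadlock avoidance.
# Transactions are plain dicts here; a real PCIe implementation
# operates on TLPs in hardware.

from collections import deque

# VC0 carries writes and ordinary traffic; VC1 carries "special"
# read completions so they cannot stall behind VC0 writes.
channels = {0: deque(), 1: deque()}

def route(transaction):
    """Queue a transaction on VC1 if it is marked special, else VC0."""
    vc = 1 if transaction.get("special") else 0
    channels[vc].append(transaction)
    return vc

route({"kind": "write", "addr": 0x100})                    # lands on VC0
vc_used = route({"kind": "read_completion", "special": True})  # lands on VC1
```

Because the special completion sits in its own queue, progress on VC1 does not depend on the write ahead of it draining from VC0, which is the deadlock the patent avoids.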
-
Publication number: 20130051483
Abstract: In a system according to one embodiment of the present disclosure, the system comprises a first device, a second device, a communications link, and a memory. The memory stores instructions that when executed by the system perform a method of communications link training. This method comprises requesting a speed change to a second speed for the first device communicating with the second device at a first speed via the communications link. A saved set of parameters is accessed for at least one of the first device and the second device. A first training cycle is performed for the first device and the second device at the second speed using the saved set of parameters for the at least one of the first device and second device. The reuse of parameters from a previous successful equalization training cycle reduces the time required to perform equalization training.
Type: Application
Filed: August 24, 2011
Publication date: February 28, 2013
Applicant: NVIDIA Corporation
Inventors: David Wyatt, Vishal Mehta, Michael Hopgood, Mark Taylor, Hitendra Dutt, Samuel Vincent, Wei-Je Huang
-
Patent number: 8031198
Abstract: An apparatus and method for servicing multiple graphics processing channels are described. In one embodiment, a graphics processing apparatus includes a scheduler configured to direct servicing of a graphics processing channel by issuing an index related to the graphics processing channel. The graphics processing apparatus also includes a processing core connected to the scheduler. The processing core is configured to service the graphics processing channel by: (i) correlating the index with a memory location at which an instance block for the graphics processing channel is stored; and (ii) accessing the instance block stored at the memory location.
Type: Grant
Filed: October 31, 2006
Date of Patent: October 4, 2011
Assignee: Nvidia Corporation
Inventors: Jeffrey M. Smith, Shail Dave, Wei-Je Huang, Lincoln G. Garlick, Paolo E. Sabella
-
Patent number: 7839885
Abstract: A method of switching a plurality of tributaries disposed among a plurality of time slots in a frame is disclosed. The method generally includes the steps of (A) buffering the frame, (B) switching the tributaries among the time slots in response to a read address and (C) generating the read address in response to a plurality of identifications in a connection map, the connection map defining (i) at most one of the identifications for each of the tributaries and (ii) one of the identifications for each of the time slots carrying other than the tributaries.
Type: Grant
Filed: April 25, 2005
Date of Patent: November 23, 2010
Assignee: LSI Corporation
Inventors: Ephrem C. Wu, Wei-Je Huang
-
Patent number: 7756123
Abstract: A Peripheral Component Interconnect Express (PCIe) controller includes a crossbar to reorder data lanes into an order compatible with PCIe negotiation rules. A full crossbar permits an arbitrary swizzling of data lanes, permitting greater flexibility in motherboard lane routing.
Type: Grant
Filed: December 21, 2006
Date of Patent: July 13, 2010
Assignee: Nvidia Corporation
Inventors: Wei-Je Huang, Nathan C. Myers
-
Publication number: 20100153658
Abstract: Deadlocks are avoided by marking read requests issued by a parallel processor to system memory as “special.” Read completions associated with read requests marked as special are routed on virtual channel 1 of the PCIe bus. Data returning on virtual channel 1 cannot become stalled by write requests in virtual channel 0, thus avoiding a potential deadlock.
Type: Application
Filed: December 12, 2008
Publication date: June 17, 2010
Inventors: Samuel H. Duncan, David B. Glasco, Wei-Je Huang, Atul Kalambur, Patrick R. Marchand, Dennis K. Ma
-
Patent number: 7469309
Abstract: Methods and apparatus for peer-to-peer data transfers in a computing environment provide configurable control over the number of outstanding read requests by one peer device to another. A requesting peer device includes a control register that stores a high-water mark value associated with requests to a target peer device. Each time a read request to the target peer device is generated, the number of such requests already outstanding is compared to the high-water mark. The request is blocked if the number of outstanding requests exceeds the high-water mark and remains blocked until such time as the number of outstanding requests no longer exceeds the high-water mark. Different high-water marks can be associated with different combinations of requesting and target devices.
Type: Grant
Filed: December 12, 2005
Date of Patent: December 23, 2008
Assignee: Nvidia Corporation
Inventors: Samuel Hammond Duncan, Wei-Je Huang, Radha Kanekal
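The high-water-mark throttle described above reduces to a counter compared against a register value. The sketch below uses invented names (`PeerRequester`, `try_issue_read`); the patent specifies only the comparison-and-block behavior, not an API.

```python
# Sketch of high-water-mark throttling of outstanding peer reads.
# The class and method names are illustrative, not from the patent.

class PeerRequester:
    def __init__(self, high_water_mark):
        self.high_water_mark = high_water_mark  # control-register value
        self.outstanding = 0                    # reads currently in flight

    def try_issue_read(self):
        """Issue a read only if the outstanding count is below the mark."""
        if self.outstanding >= self.high_water_mark:
            return False                 # request blocked for now
        self.outstanding += 1
        return True

    def complete_read(self):
        self.outstanding -= 1            # a completion frees a slot

req = PeerRequester(high_water_mark=2)
results = [req.try_issue_read() for _ in range(3)]  # third is blocked
req.complete_read()                                 # one read completes
unblocked = req.try_issue_read()                    # now allowed again
```

Keeping a separate `PeerRequester` per target device corresponds to the abstract's point that different high-water marks can apply to different requester/target pairs.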
-
Patent number: 7451259
Abstract: A method and apparatus for providing peer-to-peer data transfer through an interconnecting fabric. The method and apparatus enable a first device to read and/or write data to/from a local memory of a second device by communicating read and write requests across the interconnectivity fabric. Such data transfer can be performed even when the communication protocol of the interconnectivity fabric does not permit such transfers.
Type: Grant
Filed: December 6, 2004
Date of Patent: November 11, 2008
Assignee: NVIDIA Corporation
Inventors: Samuel H. Duncan, Wei-Je Huang, John H. Edmondson
-
Patent number: 7426597
Abstract: A bus permits the number of active serial data lanes of a data link to be re-negotiated in response to changes in bus bandwidth requirements. In one embodiment, one of the bus interfaces triggers a re-negotiation of link width and places a constraint on link width during the re-negotiation.
Type: Grant
Filed: September 16, 2005
Date of Patent: September 16, 2008
Assignee: NVIDIA Corporation
Inventors: William P. Tsu, Luc R. Bisson, Oren Rubinstein, Wei-Je Huang, Michael B. Diamond
-
Patent number: 7420565
Abstract: A computer system includes an integrated graphics subsystem and a graphics connector for attaching either an auxiliary graphics subsystem or a loopback card. A first bus connection communicates data from the computer system to the integrated graphics subsystem. With a loopback card in place, data travels from the integrated graphics subsystem back to the computer system via a second bus connection. When the auxiliary graphics subsystem is attached, the integrated graphics subsystem operates in a data forwarding mode. Data is communicated to the integrated graphics subsystem via the first bus connection. The integrated graphics subsystem then forwards data to the auxiliary graphics subsystem. A portion of the second bus connection communicates data from the auxiliary graphics subsystem back to the computer system. The auxiliary graphics subsystem communicates display information back to the integrated graphics subsystem, where it is used to control a display device.
Type: Grant
Filed: October 11, 2005
Date of Patent: September 2, 2008
Assignee: Nvidia Corporation
Inventors: Oren Rubinstein, Jonah M. Alben, Wei-Je Huang
-
Patent number: 7370132
Abstract: A bus permits the number of active serial data lanes of a data link to be re-negotiated in response to changes in bus bandwidth requirements. In one embodiment, clock buffers not required to drive active data lanes are placed in an inactive state to reduce clock power dissipation.
Type: Grant
Filed: November 16, 2005
Date of Patent: May 6, 2008
Assignee: Nvidia Corporation
Inventors: Wei-Je Huang, Luc R. Bisson, Oren Rubinstein, Michael B. Diamond, William B. Simms