Patents by Inventor Wei-Je Huang
Wei-Je Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9847891
Abstract: In a system according to one embodiment of the present disclosure, the system comprises a first device, a second device, a communications link, and a memory. The memory stores instructions that when executed by the system perform a method of communications link training. This method comprises requesting a speed change to a second speed for the first device communicating with the second device at a first speed via the communications link. A saved set of parameters is accessed for at least one of the first device and the second device. A first training cycle is performed for the first device and the second device at the second speed using the saved set of parameters for the at least one of the first device and second device. The reuse of parameters from a previous successful equalization training cycle reduces the time required to perform equalization training.
Type: Grant
Filed: August 24, 2011
Date of Patent: December 19, 2017
Assignee: Nvidia Corporation
Inventors: David Wyatt, Vishal Mehta, Michael Hopgood, Mark Taylor, Hitendra Dutt, Samuel Vincent, Wei-Je Huang
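The parameter-reuse idea in this abstract can be illustrated with a small sketch. All names here (`TrainingCache`, `train_link`, the step counts) are hypothetical; the patent does not specify any data structures, only the idea of consulting saved equalization parameters before running a full training cycle.

```python
# Sketch of reusing saved equalization parameters during link-speed training.
# Names and step costs are illustrative, not taken from the patent.

FULL_TRAINING_STEPS = 100   # assumed cost of equalization from scratch
FAST_TRAINING_STEPS = 10    # assumed cost when saved parameters are reused

class TrainingCache:
    """Stores the last successful equalization parameters per (device, speed)."""
    def __init__(self):
        self._saved = {}

    def save(self, device, speed, params):
        self._saved[(device, speed)] = params

    def lookup(self, device, speed):
        return self._saved.get((device, speed))

def train_link(cache, device, speed, params):
    """Return the number of training steps spent reaching `speed`."""
    if cache.lookup(device, speed) is not None:
        return FAST_TRAINING_STEPS       # reuse prior equalization results
    cache.save(device, speed, params)    # remember them for next time
    return FULL_TRAINING_STEPS

cache = TrainingCache()
first = train_link(cache, "gpu0", "8GT/s", {"preset": 7})   # full training
second = train_link(cache, "gpu0", "8GT/s", {"preset": 7})  # reuses cache
```

The second speed change to the same rate completes in a fraction of the steps, which is the time saving the abstract claims.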
-
Patent number: 9727521
Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and a second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and a second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
Type: Grant
Filed: September 14, 2012
Date of Patent: August 8, 2017
Assignee: NVIDIA Corporation
Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
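A minimal sketch of the split-buffer idea above, with entirely hypothetical names (`split_read`, `execute`): two read commands are in flight at once, each targeting a disjoint half of a single shared buffer, so completion order does not matter and no second buffer is needed.

```python
# Sketch of pipelined peer-to-peer reads into two halves of one buffer.
# The patent abstracts no API; this structure is illustrative only.

def split_read(words, buffer_size):
    """Split a read of `words` into two commands targeting disjoint
    halves of one shared buffer."""
    half = buffer_size // 2
    first = {"offset": 0, "count": min(half, len(words))}
    second = {"offset": half, "count": len(words) - first["count"]}
    return first, second

def execute(words, buffer_size):
    buffer = [None] * buffer_size
    first, second = split_read(words, buffer_size)
    # Both commands are "in flight" at once; here the second completes
    # first, which is harmless because the regions do not overlap.
    buffer[second["offset"]:second["offset"] + second["count"]] = \
        words[first["count"]:]
    buffer[first["offset"]:first["offset"] + first["count"]] = \
        words[:first["count"]]
    return buffer

result = execute([1, 2, 3, 4, 5, 6], buffer_size=8)
```

All six data words land in order within the single buffer even though the completions arrived out of order.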
-
Patent number: 9424227
Abstract: Non-contiguous or tiled payload data are efficiently transferred between peers over a fabric. Specifically, a client transfers a byte enable message to a peer device via a mailbox mechanism, where the byte enable message specifies which bytes of the payload data being transferred via the data packet are to be written to the frame buffer on the peer device and which bytes are not to be written. The client transfers the non-contiguous or tiled payload data to the peer device. Upon receiving the payload data, the peer device writes bytes from the payload data into the target frame buffer for only those bytes enabled via the byte enable message. One advantage of the present invention is that non-contiguous or tiled data are transferred over a fabric with improved efficiency.
Type: Grant
Filed: July 3, 2012
Date of Patent: August 23, 2016
Assignee: NVIDIA Corporation
Inventors: Samuel H. Duncan, Dennis K. Ma, Wei-Je Huang, Gary Ward
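The byte-enable mechanism can be sketched as a masked write: a per-byte mask (the "byte enable message") decides which payload bytes are committed to the destination. The function and variable names below are illustrative, not from the patent.

```python
# Sketch of a byte-enable write: only bytes flagged in the enable mask
# are committed to the destination frame buffer. Names are hypothetical.

def byte_enable_write(frame_buffer, offset, payload, enable_mask):
    """Write payload bytes into frame_buffer, skipping disabled bytes.

    enable_mask[i] is True if payload[i] should be written; disabled
    bytes leave the existing frame-buffer contents untouched."""
    for i, (byte, enabled) in enumerate(zip(payload, enable_mask)):
        if enabled:
            frame_buffer[offset + i] = byte
    return frame_buffer

fb = bytearray(8)                      # destination frame buffer
payload = bytes([0xAA, 0xBB, 0xCC, 0xDD])
mask = [True, False, True, False]      # tiled data: bytes 0 and 2 only
byte_enable_write(fb, 2, payload, mask)
```

This lets one contiguous packet carry non-contiguous (tiled) data, since the mask, not the packet layout, determines which destination bytes change.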
-
Patent number: 9390042
Abstract: A processing unit exchanges data with another processing unit across a data connector that supports a particular communication protocol. When the communication protocol is updated to support a new packet type, a specification of that new packet type may be stored within software registers included within the processing unit. Under circumstances that require the use of the new packet type, packet generation logic may read the packet specification of the new packet type, then generate and transmit a packet of the new type.
Type: Grant
Filed: July 3, 2012
Date of Patent: July 12, 2016
Assignee: NVIDIA Corporation
Inventors: Wei-Je Huang, Dennis Ma, Hitendra Dutt
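The register-stored packet specification can be sketched as data-driven packet assembly: the field layout lives in a table (standing in for the software registers), so a new packet type is just new data, not new logic. The field names and widths below are hypothetical.

```python
# Sketch of generating a packet from a register-stored specification.
# The field layout here is invented for illustration.

def build_packet(spec, fields):
    """Assemble a packet by laying out `fields` per `spec`.

    spec: list of (field_name, bit_width) tuples, standing in for a
    packet specification read out of software registers."""
    packet = 0
    for name, width in spec:
        value = fields[name]
        assert value < (1 << width), f"{name} does not fit in {width} bits"
        packet = (packet << width) | value   # append field MSB-first
    return packet

# A "new" packet type described entirely by data, not hardwired logic:
new_type_spec = [("opcode", 4), ("address", 8), ("payload", 8)]
pkt = build_packet(new_type_spec,
                   {"opcode": 0x3, "address": 0x10, "payload": 0xFF})
```

Adding another packet type requires only writing a new spec table, which mirrors the abstract's point about updating the protocol without new generation hardware.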
-
Patent number: 8949497
Abstract: In an apparatus according to one embodiment of the present disclosure, a first device and a second device communicate with each other via a communications link at a plurality of different speeds. Prior to communicating via the communications link for the first time at a second speed, the first device and second device complete a first training cycle at the second speed. During this first training cycle, training pauses before it completes, and the first device and second device communicate at a first speed for a period of time before returning to the paused training cycle. When the paused training cycle for the second speed resumes, it continues from where it had paused.
Type: Grant
Filed: September 12, 2011
Date of Patent: February 3, 2015
Assignee: NVIDIA Corporation
Inventors: Michael Hopgood, Wei-Je Huang, Mark Taylor, Hitendra Dutt, David Wyatt, Vishal Mehta
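The pause-and-resume behavior can be sketched as a training cycle that checkpoints its step counter: while paused, the link falls back to the old speed, and a later resume continues from the checkpoint rather than restarting. Class and attribute names are invented for illustration.

```python
# Sketch of a pausable/resumable training cycle.
# Names and step accounting are hypothetical, not from the patent.

class TrainingCycle:
    def __init__(self, total_steps):
        self.total_steps = total_steps
        self.step = 0                    # checkpoint: survives a pause
        self.active_speed = "first"      # link speed currently in use

    def run(self, budget):
        """Advance training at the second speed by up to `budget` steps;
        return True once training is complete."""
        self.active_speed = "second"
        advance = min(budget, self.total_steps - self.step)
        self.step += advance
        if self.step < self.total_steps:
            self.active_speed = "first"  # fall back to old speed while paused
            return False
        return True

cycle = TrainingCycle(total_steps=10)
done_first = cycle.run(budget=6)        # pauses at step 6
speed_while_paused = cycle.active_speed # link runs at the first speed
done_second = cycle.run(budget=6)       # resumes from step 6, finishes
```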
-
Publication number: 20140082120
Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and a second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and a second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
Type: Application
Filed: September 14, 2012
Publication date: March 20, 2014
Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
-
Publication number: 20140013023
Abstract: A processing unit exchanges data with another processing unit across a data connector that supports a particular communication protocol. When the communication protocol is updated to support a new packet type, a specification of that new packet type may be stored within software registers included within the processing unit. Under circumstances that require the use of the new packet type, packet generation logic may read the packet specification of the new packet type, then generate and transmit a packet of the new type.
Type: Application
Filed: July 3, 2012
Publication date: January 9, 2014
Inventors: Wei-Je Huang, Dennis Ma, Hitendra Dutt
-
Publication number: 20140012904
Abstract: Non-contiguous or tiled payload data are efficiently transferred between peers over a fabric. Specifically, a client transfers a byte enable message to a peer device via a mailbox mechanism, where the byte enable message specifies which bytes of the payload data being transferred via the data packet are to be written to the frame buffer on the peer device and which bytes are not to be written. The client transfers the non-contiguous or tiled payload data to the peer device. Upon receiving the payload data, the peer device writes bytes from the payload data into the target frame buffer for only those bytes enabled via the byte enable message. One advantage of the present invention is that non-contiguous or tiled data are transferred over a fabric with improved efficiency.
Type: Application
Filed: July 3, 2012
Publication date: January 9, 2014
Inventors: Samuel H. Duncan, Dennis K. Ma, Wei-Je Huang, Gary Ward
-
Publication number: 20130067127
Abstract: In an apparatus according to one embodiment of the present disclosure, a first device and a second device communicate with each other via a communications link at a plurality of different speeds. Prior to communicating via the communications link for the first time at a second speed, the first device and second device complete a first training cycle at the second speed. During this first training cycle, training pauses before it completes, and the first device and second device communicate at a first speed for a period of time before returning to the paused training cycle. When the paused training cycle for the second speed resumes, it continues from where it had paused.
Type: Application
Filed: September 12, 2011
Publication date: March 14, 2013
Applicant: NVIDIA Corporation
Inventors: Michael Hopgood, Wei-Je Huang, Mark Taylor, Hitendra Dutt, David Wyatt, Vishal Mehta
-
Patent number: 8392667
Abstract: Deadlocks are avoided by marking read requests issued by a parallel processor to system memory as “special.” Read completions associated with read requests marked as special are routed on virtual channel 1 of the PCIe bus. Data returning on virtual channel 1 cannot become stalled by write requests in virtual channel 0, thus avoiding a potential deadlock.
Type: Grant
Filed: December 12, 2008
Date of Patent: March 5, 2013
Assignee: NVIDIA Corporation
Inventors: Samuel H. Duncan, David B. Glasco, Wei-Je Huang, Atul Kalambur, Patrick R. Marchand, Dennis K. Ma
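The routing rule in this abstract is simple enough to sketch directly: "special" read completions go to a separate virtual channel, so they can never queue behind writes on the default channel. The transaction representation below is invented for illustration; only the VC0/VC1 split comes from the abstract.

```python
# Sketch of virtual-channel routing for deadlock avoidance.
# Transactions are plain dicts here; a real PCIe implementation
# operates on TLPs in hardware.

from collections import deque

# VC0 carries writes and ordinary traffic; VC1 carries "special"
# read completions so they cannot stall behind VC0 writes.
channels = {0: deque(), 1: deque()}

def route(transaction):
    """Queue a transaction on VC1 if it is marked special, else VC0."""
    vc = 1 if transaction.get("special") else 0
    channels[vc].append(transaction)
    return vc

route({"kind": "write", "addr": 0x100})                    # lands on VC0
vc_used = route({"kind": "read_completion", "special": True})  # lands on VC1
```

Because the special completion sits in its own queue, progress on VC1 does not depend on the write ahead of it draining from VC0, which is the deadlock the patent avoids.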
-
Publication number: 20130051483
Abstract: In a system according to one embodiment of the present disclosure, the system comprises a first device, a second device, a communications link, and a memory. The memory stores instructions that when executed by the system perform a method of communications link training. This method comprises requesting a speed change to a second speed for the first device communicating with the second device at a first speed via the communications link. A saved set of parameters is accessed for at least one of the first device and the second device. A first training cycle is performed for the first device and the second device at the second speed using the saved set of parameters for the at least one of the first device and second device. The reuse of parameters from a previous successful equalization training cycle reduces the time required to perform equalization training.
Type: Application
Filed: August 24, 2011
Publication date: February 28, 2013
Applicant: NVIDIA Corporation
Inventors: David Wyatt, Vishal Mehta, Michael Hopgood, Mark Taylor, Hitendra Dutt, Samuel Vincent, Wei-Je Huang
-
Patent number: 8031198
Abstract: An apparatus and method for servicing multiple graphics processing channels are described. In one embodiment, a graphics processing apparatus includes a scheduler configured to direct servicing of a graphics processing channel by issuing an index related to the graphics processing channel. The graphics processing apparatus also includes a processing core connected to the scheduler. The processing core is configured to service the graphics processing channel by: (i) correlating the index with a memory location at which an instance block for the graphics processing channel is stored; and (ii) accessing the instance block stored at the memory location.
Type: Grant
Filed: October 31, 2006
Date of Patent: October 4, 2011
Assignee: Nvidia Corporation
Inventors: Jeffrey M. Smith, Shail Dave, Wei-Je Huang, Lincoln G. Garlick, Paolo E. Sabella
-
Patent number: 7839885
Abstract: A method of switching a plurality of tributaries disposed among a plurality of time slots in a frame is disclosed. The method generally includes the steps of (A) buffering the frame, (B) switching the tributaries among the time slots in response to a read address and (C) generating the read address in response to a plurality of identifications in a connection map, the connection map defining (i) at most one of the identifications for each of the tributaries and (ii) one of the identifications for each of the time slots carrying other than the tributaries.
Type: Grant
Filed: April 25, 2005
Date of Patent: November 23, 2010
Assignee: LSI Corporation
Inventors: Ephrem C. Wu, Wei-Je Huang
-
Patent number: 7756123
Abstract: A Peripheral Component Interconnect Express (PCIe) controller includes a crossbar to reorder data lanes into an order compatible with PCIe negotiation rules. A full crossbar permits an arbitrary swizzling of data lanes, permitting greater flexibility in motherboard lane routing.
Type: Grant
Filed: December 21, 2006
Date of Patent: July 13, 2010
Assignee: Nvidia Corporation
Inventors: Wei-Je Huang, Nathan C. Myers
-
Publication number: 20100153658
Abstract: Deadlocks are avoided by marking read requests issued by a parallel processor to system memory as “special.” Read completions associated with read requests marked as special are routed on virtual channel 1 of the PCIe bus. Data returning on virtual channel 1 cannot become stalled by write requests in virtual channel 0, thus avoiding a potential deadlock.
Type: Application
Filed: December 12, 2008
Publication date: June 17, 2010
Inventors: Samuel H. Duncan, David B. Glasco, Wei-Je Huang, Atul Kalambur, Patrick R. Marchand, Dennis K. Ma
-
Patent number: 7469309
Abstract: Methods and apparatus for peer-to-peer data transfers in a computing environment provide configurable control over the number of outstanding read requests by one peer device to another. A requesting peer device includes a control register that stores a high-water mark value associated with requests to a target peer device. Each time a read request to the target peer device is generated, the number of such requests already outstanding is compared to the high-water mark. The request is blocked if the number of outstanding requests exceeds the high-water mark and remains blocked until such time as the number of outstanding requests no longer exceeds the high-water mark. Different high-water marks can be associated with different combinations of requesting and target devices.
Type: Grant
Filed: December 12, 2005
Date of Patent: December 23, 2008
Assignee: Nvidia Corporation
Inventors: Samuel Hammond Duncan, Wei-Je Huang, Radha Kanekal
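The high-water-mark throttle described above reduces to a counter compared against a register value. The sketch below uses invented names (`PeerRequester`, `try_issue_read`); the patent specifies only the comparison-and-block behavior, not an API.

```python
# Sketch of high-water-mark throttling of outstanding peer reads.
# The class and method names are illustrative, not from the patent.

class PeerRequester:
    def __init__(self, high_water_mark):
        self.high_water_mark = high_water_mark  # control-register value
        self.outstanding = 0                    # reads currently in flight

    def try_issue_read(self):
        """Issue a read only if the outstanding count is below the mark."""
        if self.outstanding >= self.high_water_mark:
            return False                 # request blocked for now
        self.outstanding += 1
        return True

    def complete_read(self):
        self.outstanding -= 1            # a completion frees a slot

req = PeerRequester(high_water_mark=2)
results = [req.try_issue_read() for _ in range(3)]  # third is blocked
req.complete_read()                                 # one read completes
unblocked = req.try_issue_read()                    # now allowed again
```

Keeping a separate `PeerRequester` per target device corresponds to the abstract's point that different high-water marks can apply to different requester/target pairs.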
-
Patent number: 7451259
Abstract: A method and apparatus for providing peer-to-peer data transfer through an interconnecting fabric. The method and apparatus enable a first device to read and/or write data to/from a local memory of a second device by communicating read and write requests across the interconnectivity fabric. Such data transfer can be performed even when the communication protocol of the interconnectivity fabric does not permit such transfers.
Type: Grant
Filed: December 6, 2004
Date of Patent: November 11, 2008
Assignee: NVIDIA Corporation
Inventors: Samuel H. Duncan, Wei-Je Huang, John H. Edmondson
-
Patent number: 7426597
Abstract: A bus permits the number of active serial data lanes of a data link to be re-negotiated in response to changes in bus bandwidth requirements. In one embodiment, one of the bus interfaces triggers a re-negotiation of link width and places a constraint on link width during the re-negotiation.
Type: Grant
Filed: September 16, 2005
Date of Patent: September 16, 2008
Assignee: NVIDIA Corporation
Inventors: William P. Tsu, Luc R. Bisson, Oren Rubinstein, Wei-Je Huang, Michael B. Diamond
-
Patent number: 7420565
Abstract: A computer system includes an integrated graphics subsystem and a graphics connector for attaching either an auxiliary graphics subsystem or a loopback card. A first bus connection communicates data from the computer system to the integrated graphics subsystem. With a loopback card in place, data travels from the integrated graphics subsystem back to the computer system via a second bus connection. When the auxiliary graphics subsystem is attached, the integrated graphics subsystem operates in a data forwarding mode. Data is communicated to the integrated graphics subsystem via the first bus connection. The integrated graphics subsystem then forwards data to the auxiliary graphics subsystem. A portion of the second bus connection communicates data from the auxiliary graphics subsystem back to the computer system. The auxiliary graphics subsystem communicates display information back to the integrated graphics subsystem, where it is used to control a display device.
Type: Grant
Filed: October 11, 2005
Date of Patent: September 2, 2008
Assignee: Nvidia Corporation
Inventors: Oren Rubinstein, Jonah M. Alben, Wei-Je Huang
-
Patent number: 7370132
Abstract: A bus permits the number of active serial data lanes of a data link to be re-negotiated in response to changes in bus bandwidth requirements. In one embodiment, clock buffers not required to drive active data lanes are placed in an inactive state to reduce clock power dissipation.
Type: Grant
Filed: November 16, 2005
Date of Patent: May 6, 2008
Assignee: Nvidia Corporation
Inventors: Wei-Je Huang, Luc R. Bisson, Oren Rubinstein, Michael B. Diamond, William B. Simms