Patents by Inventor Todd Rimmer

Todd Rimmer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

DIRECT MEMORY WRITES BY NETWORK INTERFACE OF A GRAPHICS PROCESSING UNIT

Publication number: 20220351326

Abstract: Examples described herein relate to a first graphics processing unit (GPU) with at least one integrated communications system, wherein the at least one integrated communications system is to apply a reliability protocol to communicate with a second at least one integrated communications system associated with a second GPU to copy data from a first memory region to a second memory region and wherein the first memory region is associated with the first GPU and the second memory region is associated with the second GPU.

Type: Application

Filed: June 29, 2022

Publication date: November 3, 2022

Inventors: Todd RIMMER, Mark DEBBAGE, Bruce G. WARREN, Sayantan SUR, Nayan Amrutlal SUTHAR, Ajaya DURG

COMMUNICATIONS FOR WORKLOADS

Publication number: 20220138021

Abstract: Examples described herein relate to a sender process having a capability to select from use of a plurality of connections to at least one target process, wherein the plurality of connections to at least one target process comprise a connection for the sender process and/or one or more connections allocated per job. In some examples, the connection for the sender process comprises a datagram transport for message transfers. In some examples, the one or more connections allocated per job utilize a kernel bypass datagram transport for message transfers. In some examples, the one or more connections allocated per job comprise a connection oriented transport and wherein multiple remote direct memory access (RDMA) write operations for a plurality of processes are to be multiplexed using the connection oriented transport.

Type: Application

Filed: December 24, 2021

Publication date: May 5, 2022

Inventors: Todd RIMMER, Mark DEBBAGE

PACKET FORMAT ADJUSTMENT TECHNOLOGIES

Publication number: 20220116325

Abstract: Examples described herein relate to a network interface device that includes circuitry to decide packet format of a packet including data to be transmitted based on network utilized to transmit the packet and circuitry to form the packet based on the decided packet format. In some examples, the network utilized to transmit the packet is based on an egress port of the packet. In some examples, the network utilized to transmit the packet comprises one or more of: direct interconnect, small scale-up network, or large scale-out network. In some examples, to decide packet format, the circuitry is to form the packet byte by byte to reduce overhead caused by preamble and number of header fields.

Type: Application

Filed: December 22, 2021

Publication date: April 14, 2022

Inventors: Bruce G. WARREN, Robert ZAK, Mark DEBBAGE, Todd RIMMER

RELIABLE TRANSPORT ARCHITECTURE

Publication number: 20210119930

Abstract: Examples described herein relate to technologies for reliable packet transmission. In some examples, a network interface includes circuitry to: receive a request to transmit a packet to a destination device, select a path for the packet, provide a path identifier identifying one of multiple paths from the network interface to a destination and Path Sequence Number (PSN) for the packet, wherein the PSN is to identify a packet transmission order over the selected path, include the PSN in the packet, and transmit the packet. In some examples, if the packet is a re-transmit of a previously transmitted packet, the circuitry is to: select a path for the re-transmit packet, and set a PSN of the re-transmit packet that is a current packet transmission number for the selected path for the re-transmit packet.

Type: Application

Filed: October 29, 2020

Publication date: April 22, 2021

Inventors: Mark DEBBAGE, Robert SOUTHWORTH, Arvind SRINIVASAN, Cheolmin PARK, Todd RIMMER, Brian S. HAUSAUER
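The per-path sequencing described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Sender` and `Packet` names, the round-robin path selection, and the choice to start each path's PSN at zero are all assumptions made for illustration. The only behavior taken from the abstract is that each path has its own PSN order and that a retransmitted packet receives the current PSN of the path selected for the retransmission.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    payload: bytes
    path_id: int  # identifies one of several paths to the destination
    psn: int      # Path Sequence Number: transmission order on that path


@dataclass
class Sender:
    """Hypothetical sender keeping one PSN counter per path."""
    num_paths: int
    next_psn: dict = field(default_factory=dict)
    _cursor: int = 0  # round-robin cursor (path policy is an assumption)

    def select_path(self) -> int:
        path = self._cursor % self.num_paths
        self._cursor += 1
        return path

    def send(self, payload: bytes) -> Packet:
        path = self.select_path()
        psn = self.next_psn.get(path, 0)  # current PSN for the chosen path
        self.next_psn[path] = psn + 1
        return Packet(payload, path, psn)

    def retransmit(self, lost: Packet) -> Packet:
        # A retransmission is sequenced like a new transmission: it takes
        # the current PSN of whichever path is selected for it.
        return self.send(lost.payload)
```

With two paths, three sends followed by a retransmission of the first packet place the retransmitted copy on path 1 with PSN 1, rather than reusing its original path 0 and PSN 0.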
BUFFER ALLOCATION FOR PARALLEL PROCESSING OF DATA

Publication number: 20200358721

Abstract: Examples described herein relate to receiving, at a network interface, an allocation of a first group of one or more buffers to store data to be processed by a Message Passing Interface (MPI) and based on a received packet including an indicator that permits the network interface to select a buffer for the received packet and store the received packet in the selected buffer, the network interface storing a portion of the received packet in a buffer of the first group of the one or more buffers. The indicator can permit the network interface to select a buffer for the received packet and store the received packet in the selected buffer irrespective of a tag and sender associated with the received packet.

Type: Application

Filed: July 30, 2020

Publication date: November 12, 2020

Inventors: Todd RIMMER, Sayantan SUR, Michael William HEINZ

Hierarchical/lossless packet preemption to reduce latency jitter in flow-controlled packet-based networks

Patent number: 10230665

Abstract: Methods, apparatus, and systems for implementing hierarchical and lossless packet preemption and interleaving to reduce latency jitter in flow-controlled packet-based networks. Fabric packets are divided into a plurality of data units, with data units for different fabric packets buffered in separate buffers. Data units are pulled from the buffers and added to a transmit stream in which groups of data units are interleaved. Upon receipt by a receiver, the groups of data units are separated out and buffered in separate buffers under which data units for the same fabric packets are grouped together. In one aspect, each buffer is associated with a respective virtual lane (VL), and the fabric packets are effectively transferred over fabric links using virtual lanes. VLs may have different levels of priority under which data units for fabric packets in higher-priority VLs may preempt fabric packets in lower-priority VLs.

Type: Grant

Filed: December 20, 2013

Date of Patent: March 12, 2019

Assignee: Intel Corporation

Inventors: Thomas D. Lovett, Albert Cheng, Mark S. Birrittella, James Kunz, Todd Rimmer
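The interleaving and preemption behavior in this abstract can be sketched as a small slot-based simulation in Python. Everything below is an illustrative assumption rather than the patented design: the `transmit` and `receive` function names, the one-data-unit-per-slot timing model, and the rule that the highest-priority non-empty VL buffer always sends next. The sketch only shows that a higher-priority packet can cut into a lower-priority one without any data unit being dropped.

```python
from collections import defaultdict, deque


def transmit(arrivals, priority, slots):
    """Build a transmit stream, sending one data unit per time slot.

    arrivals: {slot: [(vl, packet_id, n_units), ...]} (hypothetical model)
    priority: {vl: int}, larger value means higher priority
    """
    buffers = defaultdict(deque)  # one buffer per virtual lane
    stream = []
    for t in range(slots):
        for vl, pkt, n_units in arrivals.get(t, []):
            buffers[vl].extend((vl, pkt, i) for i in range(n_units))
        ready = [vl for vl, q in buffers.items() if q]
        if ready:
            # Data units from the highest-priority VL preempt the rest.
            vl = max(ready, key=lambda v: priority[v])
            stream.append(buffers[vl].popleft())
    return stream


def receive(stream):
    """Separate the interleaved stream back into per-VL buffers."""
    per_vl = defaultdict(list)
    for vl, pkt, unit in stream:
        per_vl[vl].append((pkt, unit))
    return dict(per_vl)


# Packet "A" (4 units, low priority) starts first; "B" arrives one slot later.
stream = transmit({0: [(0, "A", 4)], 1: [(1, "B", 2)]}, {0: 0, 1: 1}, slots=6)
```

Here packet B's two data units are sent in slots 1 and 2, between the first and second data units of packet A, and the receiver still recovers all four units of A in order.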
System, method and apparatus for improving the performance of collective operations in high performance computing

Patent number: 10015056

Abstract: System, method, and apparatus for improving the performance of collective operations in High Performance Computing (HPC). Compute nodes in a networked HPC environment form collective groups to perform collective operations. A spanning tree is formed including the compute nodes and switches and links used to interconnect the compute nodes, wherein the spanning tree is configured such that there is only a single route between any pair of nodes in the tree. The compute nodes implement processes for performing the collective operations, which includes exchanging messages between processes executing on other compute nodes, wherein the messages contain indicia identifying collective operations they belong to. Each switch is configured to implement message forwarding operations for its portion of the spanning tree. Each of the nodes in the spanning tree implements a ratcheted cyclical state machine that is used for synchronizing collective operations, along with status messages that are exchanged between nodes.

Type: Grant

Filed: July 12, 2016

Date of Patent: July 3, 2018

Assignee: Intel Corporation

Inventors: Michael Heinz, Todd Rimmer, James Kunz, Mark Debbage

Method and system for flexible credit exchange within high performance fabrics

Patent number: 9917787

Abstract: Method, apparatus, and systems for implementing flexible credit exchange within high performance fabrics. Available buffer space in a receive buffer on a receive-side of a link is managed and tracked at the transmit-side of the link using credits. Peer link interfaces coupled via a link are provided with receive buffer configuration information that specifies how the receive buffer space in each peer is partitioned and space allocated for each buffer, including a plurality of virtual lane (VL) buffers. Credits are used for tracking buffer space consumption, and credits are returned from the receive-side indicating freed buffer space. The peer link interfaces exchange credit organization information to inform the other peer of how much space each credit represents. In connection with data transfer over the link, the transmit-side de-allocates credits based on an amount of buffer space to be consumed in applicable buffers in the receive buffer.

Type: Grant

Filed: June 16, 2016

Date of Patent: March 13, 2018

Assignee: Intel Corporation

Inventors: Todd Rimmer, Thomas D. Lovett, Albert Cheng
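The transmit-side credit tracking described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patented scheme: the `TxCreditTracker` class, its method names, and the byte sizes are hypothetical, and each VL is assumed to have its own dedicated buffer with a single credit size shared by all VLs.

```python
class TxCreditTracker:
    """Hypothetical transmit-side view of the peer's receive buffer."""

    def __init__(self, vl_buffer_bytes, bytes_per_credit):
        # Both arguments stand in for what the peer advertises: how its
        # receive buffer is partitioned per VL and how much space one
        # credit represents.
        self.bytes_per_credit = bytes_per_credit
        self.available = {
            vl: size // bytes_per_credit for vl, size in vl_buffer_bytes.items()
        }

    def credits_needed(self, nbytes):
        # Round up: a partially used credit unit is still consumed.
        return -(-nbytes // self.bytes_per_credit)

    def try_send(self, vl, nbytes):
        """De-allocate credits for a transfer, or refuse if space is short."""
        need = self.credits_needed(nbytes)
        if need > self.available[vl]:
            return False
        self.available[vl] -= need
        return True

    def credit_return(self, vl, credits):
        """The receive side reports freed buffer space."""
        self.available[vl] += credits
```

For example, with a 256-byte buffer on VL 0 and 64 bytes per credit, a 100-byte transfer consumes two of the four credits, a following 200-byte transfer is held back, and it can proceed once the receiver returns two credits.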
METHOD, APPARATUS, AND SYSTEM FOR QOS WITHIN HIGH PERFORMANCE FABRICS

Publication number: 20170237671

Abstract: Method, apparatus, and systems for implementing Quality of Service (QoS) within high performance fabrics. A multi-level QoS scheme is implemented including virtual fabrics, Traffic Classes, Service Levels (SLs), Service Channels (SCs) and Virtual Lanes (VLs). SLs are implemented for Layer 4 (Transport Layer) end-to-end transfer of fabric packets, while SCs are used to differentiate fabric packets at the Link Layer. Fabric packets are divided into flits, with fabric packet data transmitted via fabric links as flits streams. Fabric switch input ports and device receive ports detect SC IDs for received fabric packets and implement SC-to-VL mappings to determine VL buffers to buffer fabric packet flits in. An SL may have multiple SCs, and SC-to-SC mapping may be implemented to change the SC for a fabric packet as it is forwarded through the fabric, while maintaining its SL. A Traffic Class may include multiple SLs, enabling request and response traffic for an application to employ separate SLs.

Type: Application

Filed: May 3, 2017

Publication date: August 17, 2017

Applicant: Intel Corporation

Inventors: Todd Rimmer, Thomas D. Lovett, Albert Cheng

Method, apparatus, and system for QoS within high performance fabrics

Patent number: 9648148

Abstract: Method, apparatus, and systems for implementing Quality of Service (QoS) within high performance fabrics. A multi-level QoS scheme is implemented including virtual fabrics, Traffic Classes, Service Levels (SLs), Service Channels (SCs) and Virtual Lanes (VLs). SLs are implemented for Layer 4 (Transport Layer) end-to-end transfer of fabric packets, while SCs are used to differentiate fabric packets at the Link Layer. Fabric packets are divided into flits, with fabric packet data transmitted via fabric links as flits streams. Fabric switch input ports and device receive ports detect SC IDs for received fabric packets and implement SC-to-VL mappings to determine VL buffers to buffer fabric packet flits in. An SL may have multiple SCs, and SC-to-SC mapping may be implemented to change the SC for a fabric packet as it is forwarded through the fabric, while maintaining its SL. A Traffic Class may include multiple SLs, enabling request and response traffic for an application to employ separate SLs.

Type: Grant

Filed: December 24, 2013

Date of Patent: May 9, 2017

Assignee: Intel Corporation

Inventors: Todd Rimmer, Thomas D. Lovett, Albert Cheng

Reliable transport of ethernet packet data with wire-speed and packet data rate match

Patent number: 9628382

Abstract: Method, apparatus, and systems for reliably transferring Ethernet packet data over a link layer and facilitating fabric-to-Ethernet and Ethernet-to-fabric gateway operations at matching wire speed and packet data rate. Ethernet header and payload data is extracted from Ethernet frames received at the gateway and encapsulated in fabric packets to be forwarded to a fabric endpoint hosting an entity to which the Ethernet packet is addressed. The fabric packets are divided into flits, which are bundled in groups to form link packets that are transferred over the fabric at the Link layer using a reliable transmission scheme employing implicit ACKnowledgements. At the endpoint, the fabric packet is regenerated, and the Ethernet packet data is de-encapsulated. The Ethernet frames received from and transmitted to an Ethernet network are encoded using 64b/66b encoding, having an overhead-to-data bit ratio of 1:32.

Type: Grant

Filed: February 5, 2014

Date of Patent: April 18, 2017

Assignee: Intel Corporation

Inventors: Mark S. Birrittella, Thomas D. Lovett, Todd Rimmer
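The 1:32 overhead-to-data ratio quoted in this abstract follows directly from the 64b/66b block layout, in which each 66-bit block carries 64 data bits plus a 2-bit sync header. A short Python check of the arithmetic (the variable names are illustrative only):

```python
from fractions import Fraction

DATA_BITS = 64   # payload bits per 64b/66b block
BLOCK_BITS = 66  # total bits per block on the wire

overhead_bits = BLOCK_BITS - DATA_BITS               # 2-bit sync header
overhead_to_data = Fraction(overhead_bits, DATA_BITS)  # reduces to 1/32
efficiency = Fraction(DATA_BITS, BLOCK_BITS)            # reduces to 32/33

print(f"overhead:data = {overhead_to_data.numerator}:{overhead_to_data.denominator}")
print(f"efficiency    = {float(efficiency):.4f}")
```

The same calculation shows that roughly 97% of the bits on the wire carry data.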
METHOD AND SYSTEM FOR FLEXIBLE CREDIT EXCHANGE WITHIN HIGH PERFORMANCE FABRICS

Publication number: 20170026300

Abstract: Method, apparatus, and systems for implementing flexible credit exchange within high performance fabrics. Available buffer space in a receive buffer on a receive-side of a link is managed and tracked at the transmit-side of the link using credits. Peer link interfaces coupled via a link are provided with receive buffer configuration information that specifies how the receive buffer space in each peer is partitioned and space allocated for each buffer, including a plurality of virtual lane (VL) buffers. Credits are used for tracking buffer space consumption, and credits are returned from the receive-side indicating freed buffer space. The peer link interfaces exchange credit organization information to inform the other peer of how much space each credit represents. In connection with data transfer over the link, the transmit-side de-allocates credits based on an amount of buffer space to be consumed in applicable buffers in the receive buffer.

Type: Application

Filed: June 16, 2016

Publication date: January 26, 2017

Applicant: Intel Corporation

Inventors: Todd Rimmer, Thomas D. Lovett, Albert Cheng

SYSTEM, METHOD AND APPARATUS FOR IMPROVING THE PERFORMANCE OF COLLECTIVE OPERATIONS IN HIGH PERFORMANCE COMPUTING

Publication number: 20160323150

Abstract: System, method, and apparatus for improving the performance of collective operations in High Performance Computing (HPC). Compute nodes in a networked HPC environment form collective groups to perform collective operations. A spanning tree is formed including the compute nodes and switches and links used to interconnect the compute nodes, wherein the spanning tree is configured such that there is only a single route between any pair of nodes in the tree. The compute nodes implement processes for performing the collective operations, which includes exchanging messages between processes executing on other compute nodes, wherein the messages contain indicia identifying collective operations they belong to. Each switch is configured to implement message forwarding operations for its portion of the spanning tree. Each of the nodes in the spanning tree implements a ratcheted cyclical state machine that is used for synchronizing collective operations, along with status messages that are exchanged between nodes.

Type: Application

Filed: July 12, 2016

Publication date: November 3, 2016

Applicant: Intel Corporation

Inventors: Michael Heinz, Todd Rimmer, James Kunz, Mark Debbage

System, method and apparatus for improving the performance of collective operations in high performance computing

Patent number: 9391845

Abstract: System, method, and apparatus for improving the performance of collective operations in High Performance Computing (HPC). Compute nodes in a networked HPC environment form collective groups to perform collective operations. A spanning tree is formed including the compute nodes and switches and links used to interconnect the compute nodes, wherein the spanning tree is configured such that there is only a single route between any pair of nodes in the tree. The compute nodes implement processes for performing the collective operations, which includes exchanging messages between processes executing on other compute nodes, wherein the messages contain indicia identifying collective operations they belong to. Each switch is configured to implement message forwarding operations for its portion of the spanning tree. Each of the nodes in the spanning tree implements a ratcheted cyclical state machine that is used for synchronizing collective operations, along with status messages that are exchanged between nodes.

Type: Grant

Filed: September 24, 2014

Date of Patent: July 12, 2016

Assignee: Intel Corporation

Inventors: Michael Heinz, Todd Rimmer, James Kunz, Mark Debbage

Method and system for flexible credit exchange within high performance fabrics

Patent number: 9385962

Abstract: Method, apparatus, and systems for implementing flexible credit exchange within high performance fabrics. Available buffer space in a receive buffer on a receive-side of a link is managed and tracked at the transmit-side of the link using credits. Peer link interfaces coupled via a link are provided with receive buffer configuration information that specifies how the receive buffer space in each peer is partitioned and space allocated for each buffer, including a plurality of virtual lane (VL) buffers. Credits are used for tracking buffer space consumption, and credits are returned from the receive-side indicating freed buffer space. The peer link interfaces exchange credit organization information to inform the other peer of how much space each credit represents. In connection with data transfer over the link, the transmit-side de-allocates credits based on an amount of buffer space to be consumed in applicable buffers in the receive buffer.

Type: Grant

Filed: December 20, 2013

Date of Patent: July 5, 2016

Assignee: Intel Corporation

Inventors: Todd Rimmer, Thomas D. Lovett, Albert Cheng

SYSTEM, METHOD AND APPARATUS FOR IMPROVING THE PERFORMANCE OF COLLECTIVE OPERATIONS IN HIGH PERFORMANCE COMPUTING

Publication number: 20160087848

Abstract: System, method, and apparatus for improving the performance of collective operations in High Performance Computing (HPC). Compute nodes in a networked HPC environment form collective groups to perform collective operations. A spanning tree is formed including the compute nodes and switches and links used to interconnect the compute nodes, wherein the spanning tree is configured such that there is only a single route between any pair of nodes in the tree. The compute nodes implement processes for performing the collective operations, which includes exchanging messages between processes executing on other compute nodes, wherein the messages contain indicia identifying collective operations they belong to. Each switch is configured to implement message forwarding operations for its portion of the spanning tree. Each of the nodes in the spanning tree implements a ratcheted cyclical state machine that is used for synchronizing collective operations, along with status messages that are exchanged between nodes.

Type: Application

Filed: September 24, 2014

Publication date: March 24, 2016

Inventors: Michael Heinz, Todd Rimmer, James Kunz, Mark Debbage

Scalable Address Resolution

Publication number: 20150264116

Abstract: One embodiment provides Subnet administrator (SA) proxy logic to be executed by a computer network node. The SA proxy logic includes provider logic that includes path record information of an associated subnet in communication with the computer network node; and provider interface logic to receive an address resolution request from at least one application that includes partial address information. The provider interface logic is also to determine at least one local port of the computer network node to enable packet routing associated with the address resolution request. The provider logic is also to determine at least one subnet associated with the address resolution request. The provider interface logic is also to determine at least one provider logic to utilize to obtain the path record information for at least one subnet associated with the address resolution request.

Type: Application

Filed: March 14, 2014

Publication date: September 17, 2015

Inventors: Ira Weiny, Mark Sean Hefty, Todd Rimmer, John Fleck, Kaike Wan

TRANSPORT OF ETHERNET PACKET DATA WITH WIRE-SPEED AND PACKET DATA RATE MATCH

Publication number: 20150222533

Abstract: Method, apparatus, and systems for reliably transferring Ethernet packet data over a link layer and facilitating fabric-to-Ethernet and Ethernet-to-fabric gateway operations at matching wire speed and packet data rate. Ethernet header and payload data is extracted from Ethernet frames received at the gateway and encapsulated in fabric packets to be forwarded to a fabric endpoint hosting an entity to which the Ethernet packet is addressed. The fabric packets are divided into flits, which are bundled in groups to form link packets that are transferred over the fabric at the Link layer using a reliable transmission scheme employing implicit ACKnowledgements. At the endpoint, the fabric packet is regenerated, and the Ethernet packet data is de-encapsulated. The Ethernet frames received from and transmitted to an Ethernet network are encoded using 64b/66b encoding, having an overhead-to-data bit ratio of 1:32.

Type: Application

Filed: February 5, 2014

Publication date: August 6, 2015

Inventors: Mark S. Birrittella, Thomas D. Lovett, Todd Rimmer

HIERARCHICAL/LOSSLESS PACKET PREEMPTION TO REDUCE LATENCY JITTER IN FLOW-CONTROLLED PACKET-BASED NETWORKS

Publication number: 20150180799

Abstract: Methods, apparatus, and systems for implementing hierarchical and lossless packet preemption and interleaving to reduce latency jitter in flow-controlled packet-based networks. Fabric packets are divided into a plurality of data units, with data units for different fabric packets buffered in separate buffers. Data units are pulled from the buffers and added to a transmit stream in which groups of data units are interleaved. Upon receipt by a receiver, the groups of data units are separated out and buffered in separate buffers under which data units for the same fabric packets are grouped together. In one aspect, each buffer is associated with a respective virtual lane (VL), and the fabric packets are effectively transferred over fabric links using virtual lanes. VLs may have different levels of priority under which data units for fabric packets in higher-priority VLs may preempt fabric packets in lower-priority VLs.

Type: Application

Filed: December 20, 2013

Publication date: June 25, 2015

Inventors: Thomas D. Lovett, Albert Cheng, Mark S. Birrittella, James Kunz, Todd Rimmer

METHOD, APPARATUS AND SYSTEM FOR QOS WITHIN HIGH PERFORMANCE FABRICS

Publication number: 20150180782

Abstract: Method, apparatus, and systems for implementing Quality of Service (QoS) within high performance fabrics. A multi-level QoS scheme is implemented including virtual fabrics, Traffic Classes, Service Levels (SLs), Service Channels (SCs) and Virtual Lanes (VLs). SLs are implemented for Layer 4 (Transport Layer) end-to-end transfer of fabric packets, while SCs are used to differentiate fabric packets at the Link Layer. Fabric packets are divided into flits, with fabric packet data transmitted via fabric links as flits streams. Fabric switch input ports and device receive ports detect SC IDs for received fabric packets and implement SC-to-VL mappings to determine VL buffers to buffer fabric packet flits in. An SL may have multiple SCs, and SC-to-SC mapping may be implemented to change the SC for a fabric packet as it is forwarded through the fabric, while maintaining its SL. A Traffic Class may include multiple SLs, enabling request and response traffic for an application to employ separate SLs.

Type: Application

Filed: December 24, 2013

Publication date: June 25, 2015

Inventors: Todd Rimmer, Thomas D. Lovett, Albert Cheng
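
The SC-to-VL and SC-to-SC mappings named in the QoS abstracts can be illustrated with a minimal Python sketch. The `FabricPacket` and `SwitchPort` names and the table contents are assumptions for illustration, not the patented data structures. The sketch shows two points from the abstract: a receive port uses the packet's SC to pick a VL buffer, and a switch may change the SC while the SL stays the same.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FabricPacket:
    sl: int  # Service Level: end-to-end, unchanged across the fabric
    sc: int  # Service Channel: link-level, may be remapped per hop


class SwitchPort:
    """Hypothetical port holding per-port mapping tables."""

    def __init__(self, sc_to_vl, sc_to_sc=None):
        self.sc_to_vl = sc_to_vl
        self.sc_to_sc = sc_to_sc or {}

    def vl_for(self, pkt):
        # The received packet's SC ID selects the VL buffer for its flits.
        return self.sc_to_vl[pkt.sc]

    def forward(self, pkt):
        # Remap the SC if the table has an entry; the SL is left untouched.
        new_sc = self.sc_to_sc.get(pkt.sc, pkt.sc)
        return replace(pkt, sc=new_sc)


port = SwitchPort(sc_to_vl={0: 0, 1: 1}, sc_to_sc={0: 1})
pkt = FabricPacket(sl=3, sc=0)
out = port.forward(pkt)  # SC changes from 0 to 1, SL remains 3
```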
