HASHING PACKET CONTENTS TO DETERMINE A PROCESSOR

Info

Publication number: 20110142050
Type: Application
Filed: Feb 21, 2011
Publication Date: Jun 16, 2011
Inventors: Yadong Li (Portland, OR), Xinan Tang (San Jose, CA)
Application Number: 13/031,368

Abstract

The disclosure includes a description of an apparatus having circuitry to determine a first hash value for a first packet tuple of a first packet traveling in a first direction of a duplex connection and determine a processor for the first packet from a set of multiple processors based, at least in part, on the first hash value. The apparatus includes circuitry to determine a second hash value for a second packet tuple of a second packet traveling in a second direction of the duplex connection and determine the same processor for the second packet from the set of multiple processors based, at least in part, on the second hash value.

Description

BACKGROUND

Networks enable computers and other devices to communicate. For example, networks can carry data representing video, audio, e-mail, and so forth. Typically, data sent across a network is divided into smaller messages known as packets. By analogy, a packet is much like an envelope you drop in a mailbox. A packet typically includes a “payload” and a “header”. The packet's “payload” is analogous to the letter inside the envelope. The packet's “header” is much like the information written on the envelope itself. The header can include information to help network devices handle the packet appropriately. For example, the header can include an address that identifies the packet's destination.

A series of related packets can form a connection. A connection is often identified by a combination of different portions of a packet known as a tuple. For example, a tuple is commonly formed by a combination of source and destination information of a packet header.

A variety of networking protocols maintain state information for a connection. For example, the Transmission Control Protocol (TCP) stores state data for a connection in a Transmission Control Block (TCB). A TCB includes state data such as the last received byte, the last successfully transmitted byte, and so forth. Typically, connection state data is accessed and, potentially, updated for each packet in a connection. In a multi-processor system, this can create contention issues between processors handling packets for the same connection. That is, for example, different processors handling data for the same connection may each attempt to access a connection's state data at the same time, creating requirements for data locking and introducing delay as the processors wait for access to the connection state data.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of a system that determines a processor for a packet using a symmetric hash.

FIG. 2 is a diagram illustrating a symmetric hash.

FIG. 3 is a diagram illustrating a network interface controller.

DETAILED DESCRIPTION

In a multi-processor system, processors may vie for access to the same connection state information. Contention between the processors, however, can be reduced by mapping respective connections to the respective processors. For example, a network interface controller (NIC) may perform a hash on a tuple of a received packet and use the hash to determine a processor to handle a given packet. Directing packets having the same tuple to the same processor can help pin down state information to the same processor. This can enable the processor to retain the state data for a connection in local processor memory (e.g., cache memory) and reduce contention between processors trying to access the same connection state data.

Intermediate nodes in a network such as a security gateway, firewall, switch, or router may handle data traveling in both directions of a duplex (i.e., bi-directional) connection. For example, FIG. 1 depicts a multi-processor (e.g., multi-core) host 100, with processors 102a-102n, handling packets of a duplex connection between nodes “A” and “B”. The processors 102a-102n may be integrated on a single die and/or may be included within the same integrated circuit package. The processors 102a-102n each may feature programmable logic such as an instruction decoder, arithmetic logic unit, and so forth. As shown, the processors 102a-102n may be coupled to and commonly service packets received by NICs 104a, 104b. Processors 102a-102n may communicate with the NICs 104a, 104b via a chipset, interconnect, or other inter-communication circuitry.

In the example shown in FIG. 1, packets (e.g., 110a) traveling from node A to node B have a source of “A” and a destination of “B” while packets (e.g., 110b) traveling from node B to node A have a source of “B” and a destination of “A”. As shown, the host 100 receives packet 110a at NIC 104a and packet 110b at NIC 104b. Both NICs 104a, 104b map received packets to a selected processor 102a-102n.

A NIC 104a, 104b may use an asymmetric hash that yields a different hash value for a packet in a connection depending on the direction the packet travels (e.g., a hash where hash(Source A, Destination B) does not equal hash(Source B, Destination A)). In this case, the NICs 104a, 104b may map packets belonging to the same connection to different processors 102a-102n due to the different hash values derived for packets traveling different directions in the same connection. This may undermine a goal of reducing contention between processors 102a-102n for connection state data. That is, if packet 110a is mapped to processor 102a and packet 110b is mapped to processor 102n, then processors 102a and 102n may both vie for access to the connection state data for the connection between nodes A and B.
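The effect can be illustrated with a minimal sketch. The order-sensitive hash below is a hypothetical stand-in rather than the hash of any particular NIC, and the processor count and the integer identifiers used for nodes A and B are assumptions chosen only for illustration.

```python
NUM_PROCESSORS = 4  # assumed processor count, for illustration only


def positional_hash(fields):
    """Toy order-sensitive (asymmetric) hash: a field's position affects the result."""
    h = 0
    for value in fields:
        h = (h * 31 + value) & 0xFFFFFFFF
    return h


def select_processor(source, destination):
    """Map a (source, destination) pair to a processor index."""
    return positional_hash((source, destination)) % NUM_PROCESSORS


# Nodes "A" and "B" are represented by arbitrary integer identifiers.
NODE_A, NODE_B = 65, 66

cpu_a_to_b = select_processor(NODE_A, NODE_B)  # direction of packet 110a
cpu_b_to_a = select_processor(NODE_B, NODE_A)  # direction of packet 110b
print(cpu_a_to_b, cpu_b_to_a)  # prints "1 3"
```

In this sketch the two directions of the same connection land on different processors, which is the situation that leads to contention for the connection state data.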

As shown in FIG. 1, NICs 104a, 104b may instead use a processor selection operation that features a symmetric hash that yields the same hash value for a packet in a connection regardless of the direction the packet travels (e.g., a hash where hash(Source A, Destination B)=hash (Source B, Destination A)). Such a hash may map packets belonging to the same duplex connection to the same processor, processor 102a in this example. In other words, due to generation of the same hash value for packets traveling in both directions of a connection despite packet data variations (e.g., different source and destination information), packets belonging to the same connection can be mapped to the same processor 102a. This can reduce cache thrash and contention between processors 102a-102n for connection state data.

FIG. 2 depicts a sample technique to generate a symmetric hash. As shown, circuitry 200 operates on different orders of the same bits of packet data. For example, in the illustration, asymmetric hash circuitry 202a and 202b operate on switched orders of source/destination data for a TCP/IP tuple. That is, hash 202a operates on a tuple formed by:

- {source IP, destination IP, source TCP port, destination TCP port}

while hash 202b operates on a tuple formed by:

- {destination IP, source IP, destination TCP port, source TCP port}.

The output of circuitry 202a and 202b is then combined. For example, the output of hash circuitry 202a and 202b may undergo a combination operation 204 such as a logical AND and/or XOR. Thus, in this sample implementation, the circuitry 200 can form a symmetric hash from asymmetric hash engines/functions 202a, 202b. This can enable the circuitry 200 to use commonly implemented asymmetric hash engines (e.g., Toeplitz hash engines) to generate a symmetric hash, lowering the design cost of the circuitry 200.
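A software sketch of this arrangement follows, assuming IPv4 addresses, an XOR combination, and a Toeplitz hash as the asymmetric function. The key `HASH_KEY` is an arbitrary value picked for illustration, and the names `toeplitz_hash`, `pack_tuple`, and `symmetric_hash` are hypothetical; none of them is taken from the disclosure or from a specific device.

```python
import ipaddress
import struct

# Arbitrary 40-byte key, assumed for illustration; a real device would use its own key.
HASH_KEY = bytes(range(1, 41))


def toeplitz_hash(data: bytes, key: bytes = HASH_KEY) -> int:
    """Asymmetric 32-bit Toeplitz hash.

    For every set bit of the input (most significant bit first), XOR in the
    32-bit window of the key that starts at the same bit offset.
    """
    key_bits = len(key) * 8
    if key_bits < len(data) * 8 + 31:
        raise ValueError("key too short for this input length")
    key_int = int.from_bytes(key, "big")
    result = 0
    for bit_index in range(len(data) * 8):
        if data[bit_index // 8] & (0x80 >> (bit_index % 8)):
            result ^= (key_int >> (key_bits - 32 - bit_index)) & 0xFFFFFFFF
    return result


def pack_tuple(first_ip, second_ip, first_port, second_port) -> bytes:
    """Serialize an IPv4/TCP tuple in the given field order (network byte order)."""
    return (ipaddress.IPv4Address(first_ip).packed
            + ipaddress.IPv4Address(second_ip).packed
            + struct.pack("!HH", first_port, second_port))


def symmetric_hash(src_ip, dst_ip, src_port, dst_port) -> int:
    """Combine two asymmetric hashes over switched field orders."""
    as_received = toeplitz_hash(pack_tuple(src_ip, dst_ip, src_port, dst_port))  # role of 202a
    switched = toeplitz_hash(pack_tuple(dst_ip, src_ip, dst_port, src_port))     # role of 202b
    return as_received ^ switched                                                # role of 204


forward = symmetric_hash("192.0.2.1", "198.51.100.2", 49152, 443)
reverse = symmetric_hash("198.51.100.2", "192.0.2.1", 443, 49152)
print(forward == reverse)  # prints "True"
```

Because XOR is commutative, swapping source and destination merely exchanges the two operands, so both directions yield the same value. One consequence of choosing XOR in this sketch is that a tuple whose source and destination fields are identical hashes to zero.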

While FIG. 2 depicts a parallel implementation of the circuitry, other implementations may vary. For example, in a serial implementation, the different sets of bits may be fed to the same hash circuitry in turn. A wide variety of other techniques may be used to generate a symmetric hash. For example, protocol data may be sorted before a hash operation. For instance, a symmetric hash can be produced by circuitry that orders IP addresses within a tuple by magnitude and TCP ports within a tuple by magnitude and feeds the single ordered set of tuple data to a single hashing circuit. Thus, in FIG. 1, both packets 110a and 110b would yield the same ordered set of data to be hashed, produce the same hash value, and may be mapped to the same processor 102a.
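The sort-then-hash variant might look like the following sketch. It assumes IPv4 addresses and uses `zlib.crc32` purely as a stand-in for the single hashing circuit; the function names are hypothetical.

```python
import ipaddress
import struct
import zlib


def ordered_tuple_bytes(src_ip, dst_ip, src_port, dst_port) -> bytes:
    """Order the two IP addresses by magnitude and the two ports by magnitude."""
    low_ip, high_ip = sorted((int(ipaddress.IPv4Address(src_ip)),
                              int(ipaddress.IPv4Address(dst_ip))))
    low_port, high_port = sorted((src_port, dst_port))
    return struct.pack("!IIHH", low_ip, high_ip, low_port, high_port)


def sorted_symmetric_hash(src_ip, dst_ip, src_port, dst_port) -> int:
    """Hash the single ordered set of tuple data (CRC-32 is an assumed stand-in)."""
    return zlib.crc32(ordered_tuple_bytes(src_ip, dst_ip, src_port, dst_port))


a_to_b = sorted_symmetric_hash("192.0.2.1", "198.51.100.2", 49152, 443)
b_to_a = sorted_symmetric_hash("198.51.100.2", "192.0.2.1", 443, 49152)
print(a_to_b == b_to_a)  # prints "True"
```

Since addresses and ports are ordered independently in this sketch, two distinct connections between the same pair of hosts that use the same two port numbers in swapped roles also share a hash value; this affects only how connections are distributed, not whether both directions of one connection agree.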

Once determined, a symmetric hash value may then be used to determine a processor mapped to a packet's connection. For example, a mask may be applied to the symmetric hash value, and the masked value may be used as a lookup value into an indirection table that associates masked hash values with processor numbers. The resulting processor number from the indirection table may be adjusted, for example, by incrementing by a base core/processor number. After a processor is determined for a packet, the packet may be queued, for example, in a processor specific queue. An interrupt may then be generated to the processor. Potentially, interrupt moderation may be used to reduce the number of interrupts signaled.
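The lookup steps can be sketched as follows. The mask width, the table contents, the base processor number, and the per-processor queues are all assumed values for illustration, and interrupt generation is not modeled.

```python
from collections import deque

HASH_MASK = 0x7F                                 # assumed: keep the low 7 bits of the hash
INDIRECTION_TABLE = [i % 4 for i in range(128)]  # assumed: 128 entries spread over 4 processors
BASE_PROCESSOR = 2                               # assumed base core/processor number

# One receive queue per processor number that the table can produce.
queues = {BASE_PROCESSOR + n: deque() for n in range(4)}


def processor_for_hash(hash_value: int) -> int:
    """Mask the hash, look up the indirection table, then add the base number."""
    index = hash_value & HASH_MASK
    return INDIRECTION_TABLE[index] + BASE_PROCESSOR


def enqueue_packet(packet: bytes, hash_value: int) -> int:
    """Place the packet on the queue of the processor selected by its hash."""
    cpu = processor_for_hash(hash_value)
    queues[cpu].append(packet)
    return cpu


print(processor_for_hash(0x12345678))            # prints "2"
print(enqueue_packet(b"payload", 0xDEADBEEF))    # prints "5"
```

Keeping the table separate from the hash lets the mapping of connections to processors be rebalanced by rewriting table entries, without changing the hash function itself.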

While FIG. 2 depicted a tuple of the source and destination IP addresses and source and destination TCP ports, other tuples may be formed. For example, a tuple may consist solely of the IP source and destination addresses. Alternately, or in addition, a tuple may include information from other header fields, headers in lower layers (e.g., Ethernet) or higher layers in a protocol stack (e.g., HTTP (Hypertext Transfer Protocol) data or eXtensible Markup Language (XML) data), a packet's payload, and/or portions thereof. Further, while the above generically referred to Internet Protocol datagrams, this term encompasses both IPv4 (Internet Protocol version 4) and IPv6 (Internet Protocol version 6) datagrams. Similarly, while the above described IP datagrams encapsulating TCP segments, other layer 3 or layer 4 protocols (e.g., User Datagram Protocol [UDP]) in OSI (Open Systems Interconnection) terminology may similarly use the techniques described above. Finally, a symmetric hash may also operate on data not found in a packet (e.g., identification of the NIC receiving a packet).

FIG. 3 depicts a sample NIC 300 implementing a symmetric hash. As shown, the NIC 300 includes a PHY 302 (physical layer device, e.g., a wired or wireless PHY) and a MAC (media access control). The NIC 300 may also feature a DMA (Direct Memory Access) engine to transfer packet data to host memory (not shown) or directly to a host processor, for example, via a chipset, interconnect, or other communication medium. In the sample shown, the NIC 300 includes symmetric hash circuitry 304 for use in determining a processor 102a-102n to handle a packet.

A NIC, such as NIC 300, can be configured to operate in either symmetric or asymmetric hash mode. For example, a NIC may be configured to use a particular hash function (e.g., Toeplitz) and/or to generate either a symmetric or an asymmetric hash. This configuration may be performed, for instance, via a network driver executed by a processor. The network driver may, for example, specify an object identifier with the desired configuration values, such as the selection of asymmetric or symmetric hash.

While FIGS. 1-3 depict sample implementations and sample environments, many other implementations are possible. For example, the system of FIG. 1 may feature a single NIC or more than two NICs that determine a symmetric hash. Further, the symmetric hash circuitry need not be located in a NIC, but may instead be located elsewhere in the host, such as in a chipset, processor 102a-102n circuitry, or instructions executed by a processor 102a-102n. Additionally, while the above described an intermediate node in a network, the techniques described above may also be used in a terminal network node (e.g., a server). Further, while described in conjunction with bi-directional connections, the techniques described above may also work with multi-casting or n-directional connections.

The term packet as used herein encompasses protocol data units (PDUs) for a wide variety of network protocols featuring a header and payload. A packet may be an encapsulated or encapsulating packet. Further, a given tuple may feature data from zero or more encapsulated packet headers and may or may not feature data from an encapsulating packet header.

The techniques described above may be implemented in a variety of software and/or hardware architectures. The term circuitry as used herein includes hardwired circuitry, digital circuitry, analog circuitry, programmable circuitry, and so forth. The programmable circuitry may operate on computer programs.

Other embodiments are within the scope of the following claims.

Claims

1-15. (canceled)

16. A method, comprising:

for a first packet received at a network interface of a system comprising multiple processors: ordering the Internet Protocol source address and the Internet Protocol destination address of the first packet by magnitude and ordering the source port and destination port of the first packet by magnitude; performing a hash based, at least in part, on the ordering of the Internet Protocol source address and the Internet Protocol destination address of the first packet by magnitude and the ordering of the source port and destination port of the first packet by magnitude; and determining a processor from the multiple processors based on the performed hash.

17. The method of claim 16, further comprising:

for a second packet to be transmitted via the network interface to a remote destination: ordering the Internet Protocol source address and the Internet Protocol destination address of the second packet by magnitude and ordering the source port and destination port of the second packet by magnitude; performing a hash based, at least in part, on the ordering of the Internet Protocol source address and the Internet Protocol destination address of the second packet by magnitude and the ordering of the source port and destination port of the second packet by magnitude.

18. The method of claim 16, wherein the determining the processor comprises using the performed hash to perform a lookup associating hash values with indications of processors.

19. The method of claim 16, wherein the hash comprises a Toeplitz hash.

20. The method of claim 16, wherein the determining the processor comprises selecting a queue associated with the processor.

21. A computer program, disposed on a non-transitory computer readable medium, comprising instructions to cause circuitry to:

for a received packet: order the Internet Protocol source address and the Internet Protocol destination address of the received packet by magnitude and order the source port and destination port of the received packet by magnitude; perform a hash, based at least in part, on the ordering of the Internet Protocol source address and the Internet Protocol destination address of the received packet by magnitude and the ordering of the source port and destination port of the received packet by magnitude; and determine a processor from a set of multiple processors based on the performed hash.

22. The computer program of claim 21, further comprising instructions for causing circuitry to:

for a second packet to be transmitted via the network interface to a remote destination: order the Internet Protocol source address and the Internet Protocol destination address of the second packet by magnitude and order the source port and destination port of the second packet by magnitude; perform a hash, based at least in part, on the ordering of the Internet Protocol source address and the Internet Protocol destination address of the second packet by magnitude and the order of the source port and destination port of the second packet by magnitude.

23. The computer program of claim 21, wherein the instructions to determine the processor comprise instructions to use the performed hash to perform a lookup associating hash values with indications of processors.

24. The computer program of claim 21, wherein the hash comprises a Toeplitz hash.

25. The computer program of claim 21, wherein the determining the processor comprises selecting a queue associated with the processor.

26. A system, comprising:

multiple processors;

at least one network interface controller coupled to the multiple processors; and

circuitry to: for a received packet: order the Internet Protocol source address and the Internet Protocol destination address of the received packet by magnitude and order the source port and destination port of the received packet by magnitude; perform a hash based, at least in part, on the ordering of the Internet Protocol source address and the Internet Protocol destination address of the received packet by magnitude and on the ordering of the source port and destination port of the received packet by magnitude; and determine a processor from the multiple processors based on the performed hash.

27. The system of claim 26, wherein the circuitry comprises circuitry to:

for a second packet to be transmitted via the network interface to a remote destination: order the Internet Protocol source address and the Internet Protocol destination address of the second packet by magnitude and order the source port and destination port of the second packet by magnitude; and perform a hash, based at least in part, on the ordering of the Internet Protocol source address and the Internet Protocol destination address of the second packet by magnitude and the ordering of the source port and destination port by magnitude.

28. The system of claim 26, wherein the circuitry to determine the processor comprises circuitry to use the performed hash to perform a lookup associating hash values with indications of processors.

29. The system of claim 26, wherein the hash comprises a Toeplitz hash.

30. The system of claim 26, wherein the circuitry to determine the processor comprises circuitry to select a queue associated with the processor.

31. The system of claim 26, wherein the circuitry comprises circuitry programmed by instructions disposed on a non-transitory computer readable medium.