SESSION-BASED FORWARDING

The present disclosure discloses a method and network device for session based forwarding. Specifically, the disclosed system receives a first packet in a session, and performs a route lookup to determine a route for the first packet. Then, the system caches a reference to the route and a neighbor in the session, and also caches a reference to the session in a tunnel within which packets in the session are to be forwarded. Based on a comparison between the route version number cached in the session and the route version number in a route table corresponding to the route referenced by a route index in the session, the system determines whether the route is stale. If so, the system performs another route lookup to update the route. Moreover, the system uses the cached reference to the session in the tunnel for forwarding subsequent packets in the session.

Description
RELATED APPLICATIONS

This application claims the benefit of priority on U.S. Provisional Patent Application 61/732,829, filed Dec. 3, 2012, the entire contents of which are incorporated by reference.

Related patent applications to the subject application include the following: (1) U.S. Patent Application entitled “System and Method for Achieving Enhanced Performance with Multiple Networking Central Processing Unit (CPU) Cores” by Janakiraman, et al., U.S. application Ser. No. 13/692,622, filed Dec. 3, 2012, attorney docket reference no. 6259P186; (2) U.S. Patent Application entitled “Ingress Traffic Classification and Prioritization with Dynamic Load Balancing” by Janakiraman, et al., U.S. application Ser. No. 13/692,608, filed Dec. 3, 2012, attorney docket reference no. 6259P191; (3) U.S. Patent Application entitled “Method and System for Maintaining Derived Data Sets” by Gopalasetty, et al., U.S. application Ser. No. 13/692,920, filed Dec. 3, 2012, attorney docket reference no. 6259P192; (4) U.S. Patent Application entitled “System and Method for Message handling in a Network Device” by Palkar, et al., U.S. application Ser. No. ______, filed Jun. 14, 2013, attorney docket reference no. 6259P189; (5) U.S. Patent Application entitled “Rate Limiting Mechanism Based on Device Load/Capacity or Traffic Content” by Nambiar, et al., U.S. application Ser. No. ______ , filed Jun. 14, 2013, attorney docket reference no. 6259P185; (6) U.S. Patent Application entitled “Control Plane Protection for Various Tables Using Storm Prevention Entries” by Janakiraman, et al., U.S. application Ser. No. ______, filed Jun. 14, 2013, attorney docket reference no. 6259P188. The entire contents of the above applications are incorporated herein by reference.

FIELD

The present disclosure relates to networking processing performance of a symmetric multiprocessing (SMP) network architecture. In particular, the present disclosure relates to a system and method for providing session-based forwarding in a pipelined forwarding model.

BACKGROUND

A symmetric multiprocessing (SMP) architecture generally is a multiprocessor computer architecture where two or more identical processors can connect to a single shared main memory. In the case of multi-core processors, the SMP architecture can apply to the CPU cores.

In an SMP architecture, multiple networking CPUs or CPU cores can receive and transmit network traffic. While receiving and transmitting the network traffic, the system may maintain a flow-based engine that transmits the network traffic on a per-flow basis. Each flow is uniquely identified by a session key. To allow for efficient forwarding of flow-based network traffic, a network routing system typically uses longest prefix match to perform route lookup. Specifically, the routing system can look up in a route table for next hop information based on which Internet Protocol (IP) address provides the longest prefix match to the destination IP address.
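As an illustration of the longest prefix match described above, the following is a minimal sketch using Python's standard `ipaddress` module. The table contents, the next hop labels, and the function name `longest_prefix_match` are assumptions chosen for the example and do not come from the disclosure; a production router would typically use a trie rather than a linear scan.

```python
import ipaddress

# Hypothetical route table: prefix -> next hop label (illustrative values only).
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "IPNH2",
    ipaddress.ip_network("1.0.0.0/8"): "IPNH1",
    ipaddress.ip_network("1.1.0.0/16"): "IPNH3",
}

def longest_prefix_match(dst, routes=ROUTES):
    """Return (prefix, next_hop) for the most specific prefix containing dst."""
    addr = ipaddress.ip_address(dst)
    best = None
    for net, next_hop in routes.items():
        # Keep the matching prefix with the greatest prefix length.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best
```

The scan visits every prefix for every packet, which is the per-packet cost that the session-based approach aims to avoid.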

Nevertheless, the longest prefix match lookup may incur excessive cost when the packets have long IP addresses, e.g., in an IPv6 network. Moreover, because the network topology and conditions can change dynamically, updating the route table to reflect the route changes in the flow table can be costly too. As a result, the rate of network convergence in the event of route changes may be slow with the conventional routing mechanisms that update the route information in the flow table.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure may be best understood by referring to the following description and accompanying drawings that are used to illustrate embodiments of the present disclosure.

FIG. 1 is a diagram illustrating an exemplary wireless network environment according to embodiments of the present disclosure.

FIG. 2 illustrates an exemplary architecture at multiple processing planes according to embodiments of the present disclosure.

FIG. 3 illustrates an exemplary network forwarding process according to embodiments of the present disclosure.

FIG. 4 is a diagram illustrating exemplary routing tables maintained in a shared memory according to embodiments of the present disclosure.

FIG. 5 illustrates exemplary layer 3 and/or layer 2 packet flow data structure according to embodiments of the present disclosure.

FIG. 6 illustrates an exemplary route cache table according to embodiments of the present disclosure.

FIGS. 7A-7C illustrate various routing tables according to embodiments of the present disclosure.

FIGS. 8A-8C illustrate various scenarios in which a network route may need to be updated according to embodiments of the present disclosure.

FIG. 9 illustrates an exemplary trie data structure used in session-based forwarding according to embodiments of the present disclosure.

FIGS. 10A-10B illustrate processes for session-based forwarding according to embodiments of the present disclosure.

FIG. 11 is a block diagram illustrating a system of session-based forwarding according to embodiments of the present disclosure.

DETAILED DESCRIPTION

In the following description, several specific details are presented to provide a thorough understanding. While the context of the disclosure is directed to SMP architecture performance enhancement, one skilled in the relevant art will recognize, however, that the concepts and techniques disclosed herein can be practiced without one or more of the specific details, or in combination with other components, etc. In other instances, well-known implementations or operations are not shown or described in detail to avoid obscuring aspects of various examples disclosed herein. It should be understood that this disclosure covers all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure.

Overview

Embodiments of the present disclosure relate to networking processing performance. In particular, the present disclosure relates to a system and method for providing efficient session-based forwarding with multiple networking central processing unit (CPU) cores. Specifically, the system achieves efficient session-based forwarding by maintaining a version associated with each route in a session table and determining whether a route is stale based on the value of the version associated with each route.

According to embodiments of the present disclosure, the conventional route cache table that enumerates all destinations on the shared memory is trimmed down to a regular neighbor table without the need for a longest prefix match (LPM) based route lookup. The packet forwarding pipeline process is optimized by performing route lookup only once per session flow (assuming that no route changes during the session). The present disclosure allows for caching a reference to a route in the session and caching a reference to the session in a tunnel or a logical interface, thus not only improving the conventional per-packet route lookup to a per-flow lookup, but also allowing direct access to route information from the tunnel or logical interface.

Specifically, with the solution provided herein, a disclosed network device receives a first packet in a session, and performs a route lookup based on a header of the first packet to determine a route for the first packet. Further, the network device caches a reference to the route in the session such that subsequent packets in the session are routed based on the cached reference in lieu of subsequent route lookups. The reference to the route comprises one or more of a route index (which may additionally include an equal cost multiple path (ECMP) index), a route version number, a neighbor index, and a neighbor version number.

For session based forwarding, the disclosed system compares a route's version number in the session against the version number in the route referred to by the index in the session. Likewise, the disclosed system also compares the neighbor entry's version number in the session against the version number in the neighbor referred to by the index in the session.
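The two version comparisons can be sketched as follows. This is a minimal illustration rather than the disclosed implementation; the class names, field names, and the `is_stale` helper are assumptions introduced for the example.

```python
from dataclasses import dataclass

@dataclass
class RouteEntry:
    version: int
    next_hop: str

@dataclass
class NeighborEntry:
    version: int
    mac: str

@dataclass
class Session:
    # References (indices) plus the versions seen when they were cached.
    route_index: int
    route_version: int
    neighbor_index: int
    neighbor_version: int

def is_stale(session, route_table, neighbor_table):
    """True if either cached version differs from the table's current version."""
    route = route_table[session.route_index]
    neighbor = neighbor_table[session.neighbor_index]
    return (route.version != session.route_version
            or neighbor.version != session.neighbor_version)
```

Because the session holds only indices and version numbers, the check is two integer comparisons and does not depend on the size of the route table.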

For tunnel based forwarding, the disclosed system can validate the reference to the session by comparing the source and destination IP addresses. Specifically, the disclosed system can check the source and/or destination IP address in the tunnel against the source and/or destination IP address in the session.
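A minimal sketch of the tunnel-side validation is shown below. It assumes that both the source and the destination address are compared, which is one of the options the text allows; the `Tunnel` and `Session` classes and the `valid_session_ref` helper are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Session:
    src_ip: str
    dst_ip: str

@dataclass
class Tunnel:
    src_ip: str
    dst_ip: str
    session_ref: Optional[Session] = None  # cached reference to the session

def valid_session_ref(tunnel):
    """The cached session is usable only if its endpoints match the tunnel's."""
    s = tunnel.session_ref
    return (s is not None
            and s.src_ip == tunnel.src_ip
            and s.dst_ip == tunnel.dst_ip)
```

If the check fails, for example because the session slot was reused for a different flow, the device would fall back to a regular session lookup.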

If the disclosed network device determines that the route is stale, the disclosed network device can perform another route lookup to update the route with one or more of an updated route index, an updated route version number, an updated neighbor index, and an updated neighbor version number. In some embodiments, however, if the disclosed network device determines that the route is stale and the session is inactive, it will delay route lookup until at least one packet is received in the session.
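The deferred update for inactive sessions can be illustrated with the sketch below, in which a route change only increments a version number and the repeated lookup happens when the next packet of the session arrives. The `Forwarder` class, its dictionary-based tables, and the single-route `route_lookup` stand-in are assumptions made to keep the example short.

```python
class Forwarder:
    def __init__(self, route_table):
        # route_table: list of {"version": int, "next_hop": str}
        self.route_table = route_table
        self.lookups = 0  # counts full route lookups, for demonstration

    def route_lookup(self, dst):
        """Stand-in for a full lookup; the toy table has only route index 0."""
        self.lookups += 1
        return 0

    def open_session(self, dst):
        idx = self.route_lookup(dst)
        return {"dst": dst, "route_index": idx,
                "route_version": self.route_table[idx]["version"]}

    def change_route(self, idx, next_hop):
        entry = self.route_table[idx]
        entry["next_hop"] = next_hop
        entry["version"] += 1  # no session is touched at this point

    def forward(self, session):
        entry = self.route_table[session["route_index"]]
        if entry["version"] != session["route_version"]:
            # Stale: repeat the lookup now that a packet has arrived.
            idx = self.route_lookup(session["dst"])
            entry = self.route_table[idx]
            session["route_index"] = idx
            session["route_version"] = entry["version"]
        return entry["next_hop"]
```

In this sketch, a session that never receives another packet never pays for a second lookup.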

In some embodiments, at least two paths with identical cost corresponding to the route are stored in the route table. Each path is identified by a unique Equal Cost Multiple Path (ECMP) index. When a new ECMP index is added to the route table, a subsequent session uses the path associated with the new ECMP index, but an existing session continues to use an existing path associated with an existing ECMP index.
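The ECMP behavior can be sketched as follows. The example assumes that the ECMP index is simply the position of a path in a list and that a new session takes the most recently added path; the `EcmpRoute` class and the helper functions are hypothetical.

```python
class EcmpRoute:
    """One route with several equal-cost paths; ECMP index = list position."""

    def __init__(self, next_hops):
        self.next_hops = list(next_hops)

    def add_path(self, next_hop):
        self.next_hops.append(next_hop)
        return len(self.next_hops) - 1  # the new ECMP index

sessions = {}  # session id -> ECMP index cached at session setup

def open_session(session_id, route):
    # Assumption: a new session takes the most recently added path.
    sessions[session_id] = len(route.next_hops) - 1
    return sessions[session_id]

def next_hop_for(session_id, route):
    # Existing sessions keep using the index they cached.
    return route.next_hops[sessions[session_id]]
```

Adding a path leaves every cached index valid, so existing sessions are not disturbed.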

In some embodiments, when at least two next hop nodes use Virtual Router Redundancy Protocol (VRRP), the route is determined to be stale based on a difference between the first neighbor version number cached in the session and the second neighbor version number corresponding to the route in the neighbor table.

In some embodiments, if the route is determined to be stale, the disclosed network device performs another route lookup to update the session with an updated route index and an updated route version number. If the updated route index and the updated route version number correspond to a shorter alternative route than the route, the disclosed network device forwards subsequent packets in the session using the shorter alternative route. In one embodiment, the shorter alternative route is stored in a patricia trie as a child node of a parent node. Specifically, the parent node corresponds to the route, and a route version number of the route corresponding to the parent node is updated/incremented in response to the child node being inserted in the patricia trie.
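The effect of incrementing the parent's version on child insertion can be sketched as below. Two simplifications are assumed: the child node is read as a more specific prefix covered by the parent, and a flat dictionary scanned with the standard `ipaddress` module stands in for the patricia trie. The `RouteTable` class and its methods are hypothetical.

```python
import ipaddress

class RouteTable:
    def __init__(self):
        self.routes = {}  # ip_network -> {"next_hop": str, "version": int}

    def parent_of(self, net):
        """Longest existing prefix that strictly covers net, or None."""
        covering = [p for p in self.routes if p != net and net.subnet_of(p)]
        return max(covering, key=lambda p: p.prefixlen, default=None)

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        parent = self.parent_of(net)
        if parent is not None:
            # Sessions that cached the parent route now see a version mismatch.
            self.routes[parent]["version"] += 1
        self.routes[net] = {"next_hop": next_hop, "version": 1}

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        matches = [p for p in self.routes if addr in p]
        best = max(matches, key=lambda p: p.prefixlen, default=None)
        return (best, self.routes[best]) if best is not None else None
```

A session that cached the parent route detects the mismatch on its next packet, repeats the lookup, and picks up the child route.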

In some embodiments, the disclosed network device encapsulates the first packet based on information returned from a bridge lookup prior to encrypting the first packet.

Furthermore, the disclosed network device identifies a network interface that the first packet is to be transmitted on. Then, the disclosed network device sends the first packet to a security engine of the network device to encrypt the first packet, and instructs the security engine to forward the encrypted first packet to the identified network interface in lieu of returning the encrypted first packet to a processor within the network device.

Computing Environment

FIG. 1 shows an exemplary wireless digital network environment according to embodiments of the present disclosure. FIG. 1 includes at least one or more network controllers (such as controller 100), one or more access points (such as access point 160), one or more client devices (such as client 170), a layer 2 or layer 3 network 110, a routing device (such as router 120), a gateway 130, Internet 140, and one or more web servers (such as web server A 150, web server B 155, and web server C 158), etc.

Controller 100 is a hardware device and/or software module that provides network management functions, which include but are not limited to, controlling, planning, allocating, deploying, coordinating, and monitoring the resources of a network, network planning, frequency allocation, predetermined traffic routing to support load balancing, cryptographic key distribution authorization, configuration management, fault management, security management, performance management, bandwidth management, route analytics and accounting management, etc.

Moreover, assume that a number of access points, such as access point 160, are interconnected with network controller 100. Each access point may be interconnected with zero or more client devices via either a wired interface or a wireless interface. In this example, for illustration purposes only, assume that client 170 is associated with access point 160 via a wireless link. An access point generally refers to a network device that allows wireless clients to connect to a wired network. Access points usually connect to a router via a wired network or can be part of a router.

Furthermore, controller 100 can be connected to router 120 through zero or more hops in a layer 3 or layer 2 network (such as L2/L3 Network 110). Router 120 can forward traffic to and receive traffic from Internet 140 through gateway 130. Router 120 generally is a network device that forwards data packets between different networks, thus creating an overlay internetwork. A router is typically connected to two or more data lines from different networks. When a data packet comes in on one of the data lines, the router reads the address information in the packet to determine its destination. Then, using information in its routing table or routing policy, the router directs the packet to the next/different network. A data packet is typically forwarded from one router to another router through the Internet until the packet gets to its destination.

Gateway 130 is a network device that passes network traffic from local subnet to devices on other subnets. In the example in FIG. 1, gateway 130 is a default gateway that often connects a local network (such as L2/L3 Network 110) to Internet 140. In some embodiments, gateway 130 may be a part of router 120 depending on the configuration of router 120.

Web servers 150, 155, and 158 are hardware devices and/or software modules that facilitate delivery of web content that can be accessed through Internet 140. For example, web server 150 may be assigned an IP address of 1.1.1.1 and used to host a first Internet website (e.g., www.yahoo.com); web server 155 may be assigned an IP address of 2.2.2.2 and used to host a second Internet website (e.g., www.google.com); and, web server 158 may be assigned an IP address of 3.3.3.3 and used to host a third Internet website (e.g., www.facebook.com).

In packet switching networks, a flow generally refers to a sequence of packets from a source network/client device to a destination network/client device, which may be another host, a multicast group, or a broadcast domain. A flow could consist of all packets in a specific session connection or media stream. Each layer 2 or layer 3 network session can be uniquely identified by a session key, which may be a layer 3 network session key or a layer 2 network session key. A layer 3 network session key generally includes information, such as a source Internet Protocol (IP) address, a destination IP address, a protocol, a layer 4 source port, a layer 4 destination port, etc. Moreover, a layer 2 network session key generally includes a source Media Access Control (MAC) address, a destination MAC address, Ethernet type, etc. The above described session keys are maintained in a session table used for session management.
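The two kinds of session key can be represented as hashable tuples that index a session table, as in the following sketch. The field names, the dictionary-based table, and the `find_or_create_session` helper are assumptions for illustration only.

```python
from typing import NamedTuple

class L3SessionKey(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int   # e.g., 6 for TCP, 17 for UDP
    src_port: int
    dst_port: int

class L2SessionKey(NamedTuple):
    src_mac: str
    dst_mac: str
    ether_type: int  # e.g., 0x0800 for IPv4

session_table = {}  # session key -> per-session state

def find_or_create_session(key):
    """Return the session for this key, creating an empty one on first use."""
    return session_table.setdefault(key, {"packets": 0})
```

Because named tuples compare and hash by value, two packets carrying the same header fields map to the same session entry.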

General Architecture

FIG. 2 illustrates a general architecture including multiple processing planes according to embodiments of the present disclosure. Specifically, FIG. 2 includes at least a control plane process 210, two or more datapath processors 220, a lockless shared memory 260 accessible by the two or more datapath processors 220, and a network interface 250.

Control plane process 210 may be running on one or more CPU or CPU cores, such as CP CPU 1 212, CP CPU 2 214, . . . CP CPU M 218. Furthermore, control plane process 210 typically handles network control or management traffic generated by and/or terminated at network devices as opposed to data traffic generated and/or terminated at client devices.

According to embodiments of the present disclosure, datapath processors 220 include a single exception processing CPU, such as a slowpath (SP) processor (e.g., Exception Processing CPU 230), and multiple forwarding CPUs, such as fastpath (FP) processors (e.g., Forwarding CPU 1 240, Forwarding CPU 2 242, . . . Forwarding CPU N 248). Only the forwarding processors are able to receive data packets directly from network interface 250. The exception processing processor, on the other hand, only receives data packets from the forwarding processors.

Lockless shared memory 260 is a flat structure that is shared by all datapath processors 220, and not tied to any particular CPU or CPUs. Any datapath processor 220 can read any memory location within lockless shared memory 260. Therefore, both the single exception processing processor (e.g., Exception Processing CPU 230) and the multiple forwarding processors (e.g., Forwarding CPU 1 240, Forwarding CPU 2 242, . . . Forwarding CPU N 248) have read access to lockless shared memory 260, but only the single exception processing processor (e.g., Exception Processing CPU 230) has write access to lockless shared memory 260. More specifically, any datapath processor 220 can have access to any location in lockless shared memory 260 in the disclosed system.

Also, control plane process 210 is communicatively coupled to exception processing CPU 230, such as a slowpath (SP) CPU, but not to the forwarding CPUs, such as fastpath (FP) processors (e.g., Forwarding CPU 1 240, Forwarding CPU 2 242, . . . Forwarding CPU N 248). Thus, whenever control plane process 210 needs information from datapath processors 220, control plane process 210 will communicate with exception processing CPU 230, such as an SP processor.

Network Forwarding

FIG. 3 illustrates an exemplary network forwarding process according to embodiments of the present disclosure. A typical pipeline process at a FP processor involves one or more of the following operations:

Port lookup;

VLAN lookup;

Port-VLAN table lookup;

Bridge table lookup;

Firewall session table lookup;

Route table lookup;

Packet encapsulation;

Packet encryption;

Packet decryption;

Tunnel de-capsulation; and/or

Forwarding; etc.

Thus, the network forwarding process illustrated in FIG. 3 includes at least a port lookup operation 300, a virtual local area network (VLAN) lookup operation 305, a port/VLAN lookup operation 310, a bridge lookup operation 315, a firewall session lookup operation 320, a route lookup operation 325, a forward lookup operation 330, an encapsulation operation 335, an encryption operation 340, a tunnel decapsulation operation 345, a decryption operation 350, and a transmit operation 360.

FIG. 4 is a diagram illustrating exemplary routing tables maintained in a shared memory according to embodiments of the present disclosure. Shared memory 380 can be used to store a variety of tables to assist software network packet forwarding. For example, the tables may include, but are not limited to, a bridge table, a session table, a user table, a station table, a tunnel table, a route table and/or route cache, etc. Specifically, in the example illustrated in FIG. 4, shared memory 400 stores at least one or more of a port table 410, a VLAN table 420, a bridge table 430, a station table 440, a route table 450, a route cache 460, a session policy table 470, a user table 480, etc. Each table is used during network forwarding operations illustrated in FIG. 3 for retrieving relevant information in order to perform proper network forwarding. For example, port table 410 is used during port lookup operation to look up a port identifier based on the destination address of a network packet. Likewise, VLAN table 420 is used during VLAN lookup operation to look up a VLAN identifier based on the port identifier and/or source/destination address(es). Note that, a table can be used by multiple network forwarding operations, and each network forwarding operation may need to access multiple routing tables.

In some embodiments, shared memory 400 is a lockless shared memory. Thus, multiple tables in shared memory 400 can be accessed by multiple FP processors while the FP processors are processing packets received from one or more network interfaces. If the FP processor determines that a packet requires any special handling, the FP processor will hand over the packet processing to the SP processor. For example, the FP processor may find that a table entry corresponding to the packet is missing, and therefore hand over the packet processing to the SP processor. As another example, the FP processor may find that the packet is a fragmented packet, and thus hand over the packet processing to the SP processor.
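The hand-over from an FP processor to the SP processor can be sketched as a simple dispatch function. The packet representation (a dictionary with `session_key` and `fragmented` fields), the queue, and the function name are assumptions for illustration; the two hand-over conditions are the ones named in the text.

```python
from collections import deque

slowpath_queue = deque()  # packets handed over to the SP processor

def fastpath_process(packet, session_table):
    """Return 'forward' if the FP can handle the packet, else queue it for the SP."""
    if packet.get("fragmented", False):
        slowpath_queue.append(packet)  # fragments need special handling
        return "slowpath"
    if packet["session_key"] not in session_table:
        slowpath_queue.append(packet)  # table miss
        return "slowpath"
    return "forward"
```

In this sketch the FP processor only reads the session table, which matches the arrangement in which only the SP processor writes to the shared memory.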

A. Packet Flows

As mentioned above, a flow generally refers to a sequence of packets from a source network/client device to a destination network/client device, which may be another host, a multicast group, or a broadcast domain. A flow could consist of all packets in a specific session connection or media stream. FIG. 5 illustrates exemplary flow packets according to embodiments of the present disclosure. Note that, a layer 2 or layer 3 packet flow may include multiple fragmented packets, which include, e.g., a first fragment (or a parent fragment) and one or more subsequent fragments (or data fragments).

Also, each packet may include multiple portions. For example, a packet with an L3 packet key 500 may include at least a network layer (layer 3 or L3) header that includes L3 source IP 510, L3 destination IP 515, and protocol 520, and a transport layer (layer 4 or L4) header that includes L4 source port 525 and L4 destination port 530. As another example, a packet with an L2 packet key 550 may include at least a media access control layer (layer 2 or L2) header that includes source media access control (MAC) address 560, destination MAC address 570, and Ethernet type 580.

Subsequent fragments include at least a network layer (layer 3 or L3) header, but do not include any transport (layer 4 or L4) header. The transport (layer 4 or L4) header is required for session-based forwarding, for example, when firewall policies need to be applied to the packet. Even though subsequent fragments do not include any transport (layer 4 or L4) header, they are typically subject to the same session policies as those applied to the first fragment.

B. Route Cache

FIG. 6 illustrates an exemplary route cache table according to embodiments of the present disclosure. In the example shown, route cache table 600 includes fields such as destination IP address 605, next hop interface 610, gateway MAC address 615, and neighbor information 620, which duplicates a corresponding neighbor entry in a neighbor table. To populate route cache table 600, a network device first performs a longest prefix match lookup in the route table based on the destination address (e.g., “1.1.1.1”) of a network packet to find out the next hop interface corresponding to the destination address (e.g., InterfaceNH1). Assume, for example, that the longest prefix match in the route table for “1.1.1.1” is “1.1.0.0,” which corresponds to the next hop interface InterfaceNH1. Note that the next hop interface may include, but is not limited to, a VLAN, a tunnel, and/or a port.

Then, the network device inserts an entry in route cache table 600 with the destination IP address, the next hop interface resulting from the route table lookup, and the default gateway MAC address (e.g., MACGW1) that is known to the network device (e.g., a network controller). The inserted entry will also include other information in its corresponding neighbor entry Entry1 from the neighbor table. In the example illustrated in FIG. 6, for destination IP address “2.2.2.2,” the next hop interface is InterfaceNH2, the default gateway MAC address is MACGW2, and the neighbor information includes neighbor entry Entry2; for destination IP address “3.3.3.3,” the next hop interface is InterfaceNH3, the default gateway MAC address is MACGW3, and the neighbor information includes neighbor entry Entry3. Note that, when multiple routes exist for a destination IP address, each route will correspond to a different next hop interface although the default gateway address may be the same. Thus, each route to the same destination IP address corresponds to a unique entry in the route cache table.

In some embodiments, route cache table 600 is maintained as a hash table by applying a hash function on the destination IP address of the packet. Route cache table 600 may introduce a few issues. First, the cost of matching longest prefix for each packet can be high. This is especially true in IPv6 network, where the IP addresses become longer than conventional IPv4 addresses. Second, as the number of hashed entries in route cache table increases, the cost for looking up route cache table 600 also increases accordingly. Third, maintaining the consistency of route cache table 600 may result in additional costs.

For example, when the next hop address in a route corresponding to “1.1.0.0” changes in the routing table, the system would have to perform a reverse lookup to search for all destination IP addresses for which “1.1.0.0” is the longest prefix match, and update the next hop address in all of those entries. Therefore, the convergence after a route change is slow because of the costs involved in maintaining the consistency of route cache table 600.
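The reverse lookup described above can be sketched as follows to show why it is costly: every cached destination has to be re-matched against the route prefixes. The prefix length of /16 for “1.1.0.0” is an assumption, as are the function name and the dictionary layout of the cache.

```python
import ipaddress

def update_route_cache(route_cache, changed_prefix, new_next_hop, route_prefixes):
    """Rewrite every cache entry whose longest matching prefix is changed_prefix.

    route_cache: destination IP string -> {"next_hop": ...}
    Returns the number of entries that had to be rewritten.
    """
    changed = ipaddress.ip_network(changed_prefix)
    nets = [ipaddress.ip_network(p) for p in route_prefixes]
    touched = 0
    for dst, entry in route_cache.items():  # full scan of the cache
        addr = ipaddress.ip_address(dst)
        best = max((n for n in nets if addr in n),
                   key=lambda n: n.prefixlen, default=None)
        if best == changed:
            entry["next_hop"] = new_next_hop
            touched += 1
    return touched
```

The work grows with the number of cached destinations, whereas a versioned route entry needs only a single increment when the route changes.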

C. Routing Tables

FIGS. 7A-7C illustrate various routing tables according to embodiments of the present disclosure. The illustrated routing tables in combination provide an alternative and an enhancement to the route cache table described above. Specifically, FIG. 7A illustrates an exemplary route table 700, which includes a field for version 702, a field for prefix/length 705, a field for next hop address 710, and an optional next hop interface for routes through a P2P interface (not shown). For example, according to route table 700, the next hop address for “1.0.0.0/8” is IPNH1, and the version of the route is 100; the next hop address for “0.0.0.0/0” is IPNH2, and the version of the route is 101; etc.

FIG. 7B illustrates an exemplary session table 720, which includes at least the following fields: source IP address 725, destination IP address 730, route index 735, route version 740, neighbor index 745, neighbor version 750, etc. Route index 735 may also include a next hop index in cases where equal cost multiple paths (ECMP) apply. In the first example shown in FIG. 7B, the route index corresponding to a route from the source IP address “100.100.100.100” to the destination IP address “1.1.1.1” is route1. Also, the corresponding route version for route1 is 100; and, the corresponding neighbor index and neighbor version are ARP1 and 10 respectively. In the second example, due to a neighbor change in route1, the neighbor index, neighbor version, and route version are updated to ARP2, 20, and 101 respectively responsive to the next hop change for route1.

Note that the version number 100 for route1 is the route version number corresponding to route entry route1 in the route table at the time when the disclosed system performs the route lookup and inserts the reference to route1 into session table 720. Likewise, the version number 10 for neighbor ARP1 is the neighbor version number corresponding to the neighbor entry ARP1 in the neighbor table at the time when the disclosed system inserts the reference to ARP1 into session table 720.

FIG. 7C illustrates an exemplary neighbor table 760, which includes at least the following fields: version 765, index 770, IP address 775, MAC address 780, VLAN identifier 790, etc. Neighbor table 760 provides a mapping between a destination IP address and a MAC address. In some embodiments, information in neighbor table 760 can be obtained by transmitting an Address Resolution Protocol (ARP) request and analyzing the corresponding ARP response. In the example shown in FIG. 7C, for the first neighbor with neighbor index number 1, the IP address is 10.10.10.10, which corresponds to VLAN 10; the corresponding MAC address is MAC1; and the version number is 10. Likewise, for the second neighbor with neighbor index number 2, the IP address is 18.18.18.18, which corresponds to VLAN 18; the corresponding MAC address is MAC2; and the version number is 20. Further, for the third neighbor with neighbor index number 3, the IP address is 15.15.15.15, which corresponds to VLAN 15; the corresponding MAC address is MAC2; and the version number is 30. The above examples are provided for illustration purposes only. Different neighbors may have the same version number, but their IP address and MAC address will be unique.

In addition, there exists a firewall session policy table, which includes information, such as permission (e.g., permit or deny access), destination network address translation (DNAT), source network address translation (SNAT), rate limiting, etc. In some embodiments, a flow table can be used for stateful firewall purposes in addition to firewall purposes. According to the present disclosure, the firewall session policy table and/or flow table can be modified to additionally include the next hop information and version information, and used for routing purposes.

During operation, the system monitors every session based on the flow-based destination IP address, which persists through the entire session. Because the value of the destination IP address does not change for a particular session, information such as that cached in the route cache can be cached in the session for easier access during the session instead of performing a lookup operation on a packet-by-packet basis. As mentioned previously in the description of FIG. 3, a typical forwarding pipeline includes a bridge lookup operation, a firewall session policy lookup operation, and subsequent routing operations. The firewall session policy lookup operation checks the firewall policy based on the destination IP address of the packets. For example, a user may be redirected to a captive portal page when the user is trying to access a destination IP address of “1.1.1.1” according to the enterprise firewall policy configurations.

More specifically, the system will be forwarding packets to “1.1.1.1” in a session (or a flow). Accordingly, the system will initiate a route lookup based on the destination IP address of the session, e.g., “1.1.1.1.” For illustration purposes only, assume that the route lookup returns a next hop IP address of “10.10.10.10.” The system will then perform a neighbor lookup based on the resulting IP address of the next hop. In this example, it is assumed that the neighbor lookup returns MAC1 (e.g., 01:02:03:04:05:06) and VLAN identifier V10.
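The two-stage lookup in this example, a route lookup followed by a neighbor lookup, can be sketched as below. The table layouts and the `resolve` function are assumptions for illustration; the values for “10.10.10.10” follow the example in the text, and "MAC2" is kept as a placeholder rather than an actual address.

```python
import ipaddress

# Hypothetical tables mirroring the example values in the text.
ROUTE_TABLE = [  # (prefix, next hop IP address)
    (ipaddress.ip_network("1.0.0.0/8"), "10.10.10.10"),
    (ipaddress.ip_network("0.0.0.0/0"), "18.18.18.18"),
]
NEIGHBOR_TABLE = {  # next hop IP address -> MAC address and VLAN identifier
    "10.10.10.10": {"mac": "01:02:03:04:05:06", "vlan": 10},
    "18.18.18.18": {"mac": "MAC2", "vlan": 18},
}

def resolve(dst):
    """Route lookup, then neighbor lookup; returns (next_hop, mac, vlan) or None."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, nh) for net, nh in ROUTE_TABLE if addr in net]
    if not matches:
        return None
    _, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    neighbor = NEIGHBOR_TABLE.get(next_hop)
    if neighbor is None:
        return None  # next hop not yet resolved to a MAC address
    return next_hop, neighbor["mac"], neighbor["vlan"]
```

With session-based forwarding, this chain runs for the first packet of a session, and later packets reuse the cached result while the versions still match.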

It is important to note that the firewall session policies include source and destination IP addresses along with other keys. Therefore, in every session for which the system performs session-based forwarding, if the session is determined to be a router session (which means that the session is not a client-to-client session for which the network system can simply forward the packets by bridging them, but a session that requires a router to route the packets to the Internet), the system can then cache the routing lookup results in the session itself.

In another example, one router is configured as the default router. Thus, every packet will be sent to the same MAC address corresponding to the default router, but the next hop address will be different depending on the destination IP address of each session. In this example, it is possible to eliminate all user VLANs and to use only one VLAN to route all packets to the default router. In addition, a guest VLAN may be configured to route all guest traffic to a network controller within the WLAN.

Therefore, the disclosed system may optimize the forwarding pipeline by caching the routing information in each session. However, it is possible that during an active session, a route to the destination IP address may change. In such scenarios, the disclosed system will update the session table to reflect the route changes. To improve the efficiency in the updating operations, rather than maintaining a copy of the relevant route information in the session table, the disclosed system maintains a reference to an entry in the route table and a version number corresponding to the route reference, as well as a reference to an entry in the neighbor table and a version number corresponding to the neighbor reference.

With this solution, the disclosed system no longer needs to perform a route lookup after a firewall session lookup, because the system can obtain route information directly from the sessions. Neither is it necessary to maintain a route cache in the system any longer, which reduces the cost for routing operations compared with other solutions.

Maintenance of Route Consistency

The disclosed system is also able to efficiently maintain the consistency of the route information in the session table (or flow table) and the route information in the route table and the neighbor table (or ARP table). Because the reference and version number to each route is maintained in the session, any time when a route changes, the system will be able to quickly detect the route change based on one or more of a change in the route reference, route version, neighbor reference, and/or neighbor version. Moreover, the system can easily update the session with the route change by updating one or more of the route reference, route version, neighbor reference, and/or neighbor version to reflect the route change.

Specifically, to detect a route change, for every packet being forwarded in a session, the system de-references the route index and the neighbor index in the session entry to retrieve the corresponding entries in the route table and the neighbor table. The system then obtains the current version number corresponding to the route index in the route table and the current version number corresponding to the neighbor index in the neighbor table. Next, the system determines whether any condition indicating that the route in the session table is stale has occurred. For example, the system can determine whether the route version number, the neighbor version number, and/or the neighbor index maintained in the session entry match the current values obtained above. If either version number in the session table is different from its corresponding version number in the route table or the neighbor table, or if the neighbor index in the session table is different from its corresponding neighbor index in the neighbor table, then the session entry is stale. In that case, the system will perform a route lookup using longest prefix match of the destination IP address of the session, and update the route information based on the results from the route lookup.
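The per-packet version comparison described above can be sketched as follows. The class and field names are hypothetical, and the sketch covers only the version-number comparisons and the case of a missing entry; the neighbor-index comparison is omitted because the description does not specify how the current neighbor index is derived.

```python
from dataclasses import dataclass

@dataclass
class RouteEntry:          # hypothetical layout of a route table entry
    version: int
    prefix: str
    next_hop: str

@dataclass
class NeighborEntry:       # hypothetical layout of a neighbor table entry
    version: int
    ip: str
    mac: str
    vlan: str

@dataclass
class SessionEntry:        # the session stores references, not copies
    dst_ip: str
    route_index: int
    route_version: int
    neighbor_index: int
    neighbor_version: int

def is_stale(session, route_table, neighbor_table):
    """De-reference both indexes and compare cached and current versions."""
    route = route_table.get(session.route_index)
    neighbor = neighbor_table.get(session.neighbor_index)
    if route is None or neighbor is None:
        return True        # a referenced entry no longer exists
    if route.version != session.route_version:
        return True
    return neighbor.version != session.neighbor_version
```

Because the check is two table reads and two integer comparisons, it is much cheaper than a longest prefix match, which is only needed when the check reports a stale entry.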

Note that, in the route table, a version number corresponding to a route index changes whenever there is a change in the route, e.g., a change in the next hop address. On the other hand, in the neighbor table, a version number corresponding to a neighbor index changes whenever the mapping between the IP address and the MAC address of a network node (e.g., a default gateway or a next hop for a particular VLAN) changes, for example, when the Ethernet interface changes.

The following sections describe a few exemplary scenarios in which a route could be changed during an active session, and how the system will detect the route change in each scenario. These examples are provided for illustration only. They are not intended to be an exhaustive list of all possible scenarios. One skilled in the art can apply the techniques disclosed herein to detect other types of route changes without departing from the spirit of the invention.

A. Route Change

In this scenario, the default gateway's IP address may change during a session because the route gets modified, for example, from “10.10.10.10” at time point t1 to “18.18.18.18” at time point t2. Assume that, at t1, the route entry in the route table has a version number of 100, an index value of 1, a prefix/length value of “1.0.0.0/8,” and a next hop IP address of “10.10.10.10.”

Accordingly, in the session table, at time point t1, the session entry has values as shown in session entry 760 in session table 720, e.g., having a source IP address of “100.100.100.100,” a destination IP address of “1.1.1.1,” a route index corresponding to route1, a route version of 100, a neighbor index corresponding to ARP1, and a neighbor version of 10.

Based on the change above, at time point t2, the same route entry has the version number of 101, index value of 1, prefix/length value of “1.0.0.0/8,” and next hop IP address of “18.18.18.18.”

When the system receives the first packet in the session after the time point t2, the system will detect that the route version in the session table (e.g., 100) does not match the route version in the route table (e.g., 101). Thus, the system will deem the session entry as stale and perform a route lookup to update the session entry.

After the update, the session entry will have the values as shown in session entry 765 in session table 720, e.g., having a source IP address of “100.100.100.100,” a destination IP address of “1.1.1.1,” a route index corresponding to route1, a route version of 101, a neighbor index corresponding to ARP2, and a neighbor version of 20. Note that the route version has changed from 100 to 101; the neighbor index has changed from ARP1 (corresponding to “10.10.10.10” in neighbor table 760) to ARP2 (corresponding to “18.18.18.18” in neighbor table 760); and the neighbor version has changed from 10 (corresponding to ARP1 in neighbor table 760) to 20 (corresponding to ARP2 in neighbor table 760).

Similarly, when a route is deleted rather than modified, the system will also detect a mismatch in the version numbers between the session table and the route table, because the corresponding route entry in the route table is missing. Therefore, the system will initiate a route lookup to search for a new route to the destination IP address based on the longest prefix match to the current route table (in which the original route was deleted), and update the session table with the information regarding the new route.
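The route change scenario above can be traced with a short sketch that uses the example values from time points t1 and t2. The dictionary layouts and the function names lookup and refresh_if_stale are hypothetical, and the linear scan stands in for whatever longest prefix match structure an embodiment uses.

```python
import ipaddress

# Example state at time point t1 (values taken from the text).
route_table = {1: {"version": 100, "prefix": "1.0.0.0/8", "next_hop": "10.10.10.10"}}
neighbor_table = {
    "ARP1": {"version": 10, "ip": "10.10.10.10"},
    "ARP2": {"version": 20, "ip": "18.18.18.18"},
}

def lookup(dst_ip):
    """Longest prefix match followed by neighbor resolution."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [(i, r) for i, r in route_table.items()
               if addr in ipaddress.ip_network(r["prefix"])]
    if not matches:
        return None
    r_idx, route = max(
        matches, key=lambda m: ipaddress.ip_network(m[1]["prefix"]).prefixlen)
    n_idx = next((i for i, n in neighbor_table.items()
                  if n["ip"] == route["next_hop"]), None)
    if n_idx is None:
        return None
    return {"route_index": r_idx, "route_version": route["version"],
            "neighbor_index": n_idx,
            "neighbor_version": neighbor_table[n_idx]["version"]}

def refresh_if_stale(session):
    """Run on each packet; re-resolve only when a cached version mismatches."""
    route = route_table.get(session["route_index"])
    neighbor = neighbor_table.get(session["neighbor_index"])
    stale = (route is None or neighbor is None
             or route["version"] != session["route_version"]
             or neighbor["version"] != session["neighbor_version"])
    if stale:
        fresh = lookup(session["dst_ip"])
        if fresh is not None:
            session.update(fresh)
    return stale
```

A deleted route is handled by the same path: the de-reference returns nothing, the entry is treated as stale, and the lookup searches the remaining routes.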

Also, note that the route lookup is performed when the system receives a packet in the session after time point t2. Thus, if no packet is received, which indicates that the client is in an idle state, then the session entry will remain stale even though a change has occurred in the route table at time point t2. This is because when the client is idle, the client is not using the route, and thus there is no need to spend any resources on updating the route for an idle client. Furthermore, if the session entry remains stale for an excessive amount of time because no client is using the session, the session entry will eventually be removed without ever being updated with the route change.

B. Equal Cost Multiple Paths (ECMP)

FIG. 8A illustrates a use case scenario of ECMP, where there exist multiple paths with equal cost. In this example, a packet in a session flow from node A 800 to a destination address in L2/L3 Network 880 needs to be forwarded by the disclosed system. Further, node A 800 is interconnected with two different next hop nodes, which are next hop A 810 and next hop B 820. Each of the two different next hop nodes corresponds to a unique path to the destination address with equal cost.

For illustration purposes, assume that next hop A 810 is associated with the IP address of “10.10.10.10” and is pre-existing at time point t1. Thus, at time point t1, in the route table, the route entry has a version number of 101, an index value of 1, a prefix/length value of “1.0.0.0/8,” and a next hop IP address of “10.10.10.10” corresponding to next hop A 810.

In addition, assume that, at time point t2, a new route with equal cost from the source address to the destination address through next hop B 820 becomes available during the session, and that next hop B 820 is associated with the IP address of “18.18.18.18.”

Accordingly, the route table will be updated. Thus, the same route entry now has the version number of 101, index value of 1, prefix/length value of “1.0.0.0/8,” and next hop IP address of “10.10.10.10; 18.18.18.18” corresponding to both next hop A 810 and next hop B 820. In this example, each next hop IP address corresponds to a unique ECMP index. In addition to caching the route index, the session may also cache the ECMP index in the cases involving ECMPs. Note that, the version number remains the same, but the ECMP index is increased with the additional ECMP route becoming available.

Subsequently, any new session to a destination address for which “1.0.0.0/8” provides the longest prefix match will use the new ECMP route corresponding to next hop B 820 with the IP address of “18.18.18.18.” Nevertheless, any existing session will continue to use next hop A 810 with the IP address of “10.10.10.10,” because the route version number has not changed. As a result, the system will determine that the route through next hop A 810 is not stale and does not need to be changed or updated.
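The ECMP behavior described above can be sketched as follows. The data layout and the function names are hypothetical. In particular, having a new session select the most recently added ECMP index is an assumption made to mirror the example; the description does not specify a selection policy.

```python
# Route entry at time point t1, using the example values from the text.
route = {"version": 101, "prefix": "1.0.0.0/8", "next_hops": ["10.10.10.10"]}

def add_ecmp_path(route, next_hop):
    """Add an equal-cost path. The route version is deliberately unchanged."""
    route["next_hops"].append(next_hop)
    return len(route["next_hops"]) - 1      # ECMP index of the new path

def new_session(route):
    """Cache the route version and an ECMP index in a new session.

    Assumed policy: pick the most recently added path.
    """
    return {"route_version": route["version"],
            "ecmp_index": len(route["next_hops"]) - 1}

def next_hop_for(session, route):
    """Return the cached path unless the route version has changed."""
    if session["route_version"] != route["version"]:
        raise LookupError("session is stale; a route lookup is required")
    return route["next_hops"][session["ecmp_index"]]
```

Because adding a path leaves the version number untouched, a session created at t1 passes the staleness check at t2 and stays on next hop A, while a session created after t2 caches the ECMP index of next hop B.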

This scheme can be particularly useful when traffic from a private IP network is to be forwarded to two or more networks corresponding to different uplink service providers (e.g., AT&T® and Verizon®) via two or more different IP addresses, such as IP1 and IP2. It is desirable that the traffic to the first service provider is only transmitted via IP1, and the traffic to the second service provider is only transmitted via IP2. Therefore, the traffic from users of different service providers will not get mixed with each other.

C. Virtual Router Redundancy Protocol (VRRP)

Virtual Router Redundancy Protocol (VRRP) is a computer networking protocol that provides for automatic assignment of available IP routers to participating hosts. This increases the availability and reliability of routing paths via automatic default gateway selections on an IP sub-network. The VRRP protocol achieves this by creation of virtual routers, which are an abstract representation of multiple routers, i.e., master and backup routers, acting as a group. The default gateway of a participating host is assigned to the virtual router instead of a physical router. If the physical router that is routing packets on behalf of the virtual router fails, another physical router is selected to automatically replace it. The physical router that is forwarding packets at any given time is called the master router. Thus, at any given time, there is only one physical router that is actively forwarding the traffic.

FIG. 8B illustrates a use case scenario of VRRP, where there exist two alternative paths with alternative next hop nodes. In this example, a packet in a session flow from node A 800 to a destination address in L2/L3 Network 880 needs to be forwarded by the disclosed system. Further, node A 800 is interconnected with two different next hop nodes, which are next hop A 810 and next hop B 820. Each of the two different next hop nodes corresponds to the same virtual router under VRRP 815. Therefore, at any given time, traffic from node A 800 will be able to reach next hop C 830 via either next hop A 810 or next hop B 820 but not both, and the traffic will continue to be forwarded to next hop C 830 via L2/L3 network 880.

Assume that, at time point t1, next hop A 810 is active under VRRP 815 and corresponds to the IP address of “10.10.10.10.” Thus, in the neighbor table, the neighbor entry has a version number of 10, an index value of 1, an IP address of “10.10.10.10,” a MAC address of MACA (corresponding to next hop A 810), and a VLAN identifier value of V10.

At time point t2, assume that next hop A 810 fails and next hop B 820 starts to function as the virtual router. Therefore, in the neighbor table, the same neighbor entry now has a version number of 11, an index value of 1, an IP address of “10.10.10.10,” a MAC address of MACB (corresponding to next hop B 820), and a VLAN identifier value of V10.

Because the neighbor version has changed from 10 to 11, the system will determine that the route has become stale. Consequently, the system will perform a route lookup and update the session entry in the session table with the new neighbor version number.
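The VRRP failover above touches only the neighbor table, which the following sketch illustrates with the example values. The function names are hypothetical, and incrementing the version by one on each MAC change is an assumption consistent with the 10 to 11 transition in the example.

```python
# Neighbor entry at time point t1 (values taken from the text).
neighbor_table = {
    1: {"version": 10, "ip": "10.10.10.10", "mac": "MACA", "vlan": "V10"},
}

def update_neighbor_mac(index, new_mac):
    """Record a new IP-to-MAC mapping and bump the entry's version."""
    entry = neighbor_table[index]
    if entry["mac"] != new_mac:
        entry["mac"] = new_mac
        entry["version"] += 1

def neighbor_is_stale(session):
    """Compare the neighbor version cached in the session with the table."""
    entry = neighbor_table.get(session["neighbor_index"])
    return entry is None or entry["version"] != session["neighbor_version"]
```

The virtual router's IP address, and therefore the route entry, is unchanged by the failover, so the neighbor version is the only signal that tells the session to pick up MACB.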

D. Shorter Alternative Route

FIG. 8C illustrates a use case scenario in which an alternative shorter route is added during the session. In this example, a packet in a session flow from node A 800 to a destination node D 890 is initially forwarded through next hop B 820 and L2/L3 Network 880. Subsequently, assume that node A 800 becomes interconnected with an additional next hop node, which is next hop A 810, and that next hop A 810 is further interconnected with destination node D 890. Here, for illustration purposes only, it is assumed that next hop A is associated with an IP address of “1.1.0.0” and that next hop B is associated with an IP address of “1.0.0.0.” Therefore, there are two alternative routes between node A 800 and destination node D 890. Specifically, traffic from node A 800 will be able to reach destination node D 890 via next hop A 810, which provides a shorter route than the original route via next hop B 820 and L2/L3 Network 880.

In some embodiments, route information can be maintained in a patricia trie. A patricia trie generally refers to a space-optimized trie data structure, where each node with only one child is merged with its child. As a result, every internal node has at least two children. Unlike in regular tries, edges can be labeled with sequences of elements as well as single elements. This makes patricia tries much more efficient for small sets (especially if the strings are long) and for sets of strings that share long prefixes.

FIG. 9 illustrates an exemplary patricia trie used in session-based forwarding according to the present disclosure. In this given example, patricia trie 900 includes at least root node 920, node 940, and node 960, which correspond to “0.0.0.0/0,” “1.0.0.0/8,” and “1.1.0.0/16” respectively. Note that, node 940 is a child node of root node 920. Also, node 960 is a child node of node 940 and a grandchild node of root node 920. Therefore, in the patricia trie illustrated in FIG. 9, a child node always has a longer prefix match than its parent node.

Furthermore, when a new route is inserted into a patricia trie as a new node (e.g., assuming that node 960 is inserted as a child node of node 940), the disclosed system performs at least two operations: First, the system will add the new route (e.g., “1.1.0.0/16”) to the route table with a new route index that is different from the route index of the original route corresponding to the parent node in the patricia trie. Second, the system will increase, in the route table, the version number of the route corresponding to the parent node of the inserted node (e.g., the version number of route “1.0.0.0/8” will be increased from 100 to 101; note that the route “1.0.0.0/8” corresponds to parent node 940 of the inserted node 960 in this example).

Because the version number of the route corresponding to the parent node in the patricia trie gets updated, the corresponding route entry (e.g., “1.0.0.0/8”) becomes stale due to the difference in the route version number in the session table (e.g., 100) and in the route table (e.g., 101). Thus, the system will perform a route lookup to update the route information. As a result, the route lookup will return the shorter route, e.g., “1.1.0.0/16”, with a new route index instead of the original route (e.g., “1.0.0.0/8”). Thus, subsequent traffic from the same source node to the same destination node will be forwarded through the updated shorter route.
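The two insertion operations described above (assign a new route index, then increase the parent's version number) can be sketched as follows. For brevity, this sketch keeps the routes in a flat dictionary and finds the parent as the longest existing prefix that covers the new one; this stands in for the parent relation of the patricia trie and is not a trie implementation. All names are hypothetical.

```python
import ipaddress

routes = {}        # prefix string -> {"index": int, "version": int}
_next_index = 1

def find_parent(prefix):
    """Longest existing prefix that strictly covers the given prefix."""
    net = ipaddress.ip_network(prefix)
    covering = [p for p in routes
                if p != prefix and net.subnet_of(ipaddress.ip_network(p))]
    return max(covering,
               key=lambda p: ipaddress.ip_network(p).prefixlen, default=None)

def insert_route(prefix, version=1):
    """Add a route with a new index and bump the parent's version."""
    global _next_index
    parent = find_parent(prefix)
    routes[prefix] = {"index": _next_index, "version": version}
    _next_index += 1
    if parent is not None:
        routes[parent]["version"] += 1   # invalidates sessions on the parent
    return routes[prefix]["index"]

def longest_prefix_match(dst_ip):
    """Return the most specific prefix containing dst_ip, or None."""
    addr = ipaddress.ip_address(dst_ip)
    hits = [p for p in routes if addr in ipaddress.ip_network(p)]
    return max(hits,
               key=lambda p: ipaddress.ip_network(p).prefixlen, default=None)
```

Only sessions that reference the parent route need to re-resolve, because any destination now covered by the new, more specific prefix was previously matched by that parent.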

Note that, as mentioned above, when traffic from a private IP network is to be forwarded to two or more networks corresponding to different uplink service providers (e.g., AT&T® and Verizon®) via two or more different IP addresses, such as IP1 and IP2, it is desirable that the traffic to the first service provider is only transmitted via IP1, and the traffic to the second service provider is only transmitted via IP2. In such scenarios, typically a network address translation (NAT) of either source IP address or destination IP address of the packets is involved. Nevertheless, when no NAT operation is involved and a shorter route is found during a session as in the example illustrated above in the description of FIG. 8C and FIG. 9, it would be desirable to switch to a shorter route during an existing session.

Note that the route lookup is performed only when the system receives a packet in a session. Thus, if the new route “1.1.0.0/16” could be used by a second session, but no packet is received in the second session because the client is idle, then the session entry will remain stale even though a new and shorter route has been inserted in the route table. This is because when the client is idle, the client is not using the route, and thus there is no need to spend any resources on updating the route for an idle client. Furthermore, if the session entry remains stale for an excessive amount of time because no client is using the session, the session entry will eventually be removed without ever being updated with the route change.

It is important to note that, in the present disclosure, only active purging is used, and no background purging is involved. Therefore, a session is updated only when there is active traffic in the session. No background process is used to update the session entries, because there is no need to spend resources on a session while the session is idle.

Caching Session Information in Secured Tunnels

In some embodiments, a computing environment as illustrated in FIG. 1 may have a L2/L3 network between controller 100 and access point 160. In such embodiments, a L2 or L3 tunnel (e.g., a Generic Routing Encapsulation (GRE) tunnel) can be established between controller 100 and access point 160 for transmission of packets to and from client 170. A tunnel is generally represented as (source IP address, destination IP address, protocol, L4 attributes), and is usually identified by a unique session key.

As illustrated in FIG. 3, a typical network forwarding process includes, inter alia, a bridge lookup 315 (e.g., on an Ethernet packet formatted as IEEE 802.3 packet), a firewall session lookup 320, a route lookup 325, a forwarding lookup 330, etc. Then, if it is determined that the packet needs to be transmitted via a tunnel between a controller and an access point, the packet is further encapsulated with an outer header, e.g., a GRE header, to be converted to an IEEE 802.11 packet format. The encapsulated packet in IEEE 802.11 format again goes through the same series of network forwarding process, e.g., a firewall session lookup 320, a route lookup 325, a forwarding lookup 330, etc., before it can be properly forwarded to its destination.

In a system as illustrated in FIG. 2, one of datapath processors 220 (e.g., FP CPU 1 240, FP CPU 2 242, . . . , or FP CPU N 248) will perform a session and/or route lookup, then send the packet to a security engine (not shown) for encryption. The encrypted (and encapsulated) packet will be returned back to datapath processors 220, which will then forward the encrypted and encapsulated packet to its corresponding network interface 250.

In some embodiments, datapath processors 220 may perform encapsulation prior to sending the packet to the security engine for encryption. In addition, datapath processors 220 may instruct the security engine which destination network interface 250 is associated with the packet. Therefore, after the security engine completes the encryption of the packet, the security engine can directly forward the packet to its corresponding destination network interface 250 without returning the encapsulated packet to datapath processors 220.

Specifically, the system will first perform a route and neighbor (e.g., ARP) lookup, which will return a MAC address and a VLAN identifier corresponding to the destination IP address in the packet. Next, based on the combination of MAC and VLAN identifier, the system performs a bridge lookup, which will return a destination network interface that can be either a port identifier or a tunnel identifier.

Based on the MAC address corresponding to the destination IP address, the system can determine whether the packet is a unicast packet or a multicast packet. If the packet is a unicast packet, the system can use a unicast key to encrypt the packet. On the other hand, if the packet is a multicast packet, the system can use a tunnel or multicast key to encrypt the packet.
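One way to make the unicast versus multicast determination above is to test the group bit of the destination MAC address, i.e., the least significant bit of the first octet, which is set for multicast and broadcast addresses. The following is a sketch only; the function names and the key placeholders are hypothetical.

```python
def is_multicast_mac(mac):
    """True if the group bit (LSB of the first octet) is set."""
    first_octet = int(mac.split(":")[0], 16)
    return (first_octet & 0x01) == 0x01

def select_key(mac, unicast_key, multicast_key):
    """Choose the encryption key according to the destination MAC type."""
    return multicast_key if is_multicast_mac(mac) else unicast_key
```

For instance, “00:1a:2b:3c:4d:5e” selects the unicast key, whereas “01:00:5e:00:00:01” selects the tunnel or multicast key.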

Furthermore, based on the destination network interface, the system can determine whether the packet needs to be encapsulated. For example, if the destination network interface returned from the bridge lookup is associated with a GRE tunnel, then the system can determine that the packet will need to be encapsulated with the GRE headers before being forwarded to its destination. Typically, in order to perform an encapsulation, the system needs to know the tunnel information for the packet, which includes the source and destination IP addresses (available in the header of the packet), the transmission protocol (which can be determined based on the tunnel identifier), and L4 attributes associated with the packet (which are usually cached in the tunnel). Therefore, upon successful bridge lookup, the system would be able to perform an encapsulation of the packet based on the information returned from the bridge lookup.

Note that, if the system identifies that a packet needs to be encrypted and encapsulated, the system can perform the encapsulation prior to the encryption, thereby avoiding the need for the packet to be returned to the datapath processors after encryption. This simplified packet flow within the system, e.g., from the FP processors to the security engine and directly on to the network interface, without the packet being returned to the FP processors by the security engine, allows for a substantial performance enhancement in a high performance controlling and switching system.

After session entries are cached in a tunnel, the system can combine the tunnel encapsulation operations with the L2/L3 lookups (such as firewall session lookup 320, route lookup 325, forwarding lookup 330, etc.), and thereby avoid feeding the network forwarding process twice with the same packet (formatted as IEEE 802.3 the first time and as IEEE 802.11 the second time). In one embodiment, a link from the tunnel to the session (e.g., the index value of the corresponding session entry in the session table) is maintained, where the session further includes routing information as described above. The link provides quick access to the routing information stored in the session, thereby allowing the system to determine whether it can leverage the session information for simplified session-based forwarding (e.g., where no complex firewall operations are required for the session).

Furthermore, for subsequent packets within the same session, the system can use the cached link to the session entry to retrieve the routing information, thus avoiding feeding the packets through the session forwarding pipeline process. In summary, rather than feeding every packet through the session forwarding pipeline twice (first in IEEE 802.3 format and second in IEEE 802.11 format), the present disclosure allows the first packet in a flow to be sent through the session forwarding pipeline once, with the encryption and encapsulation operations combined into the pipeline process, and allows any subsequent packets to bypass the session forwarding pipeline by providing a direct link from the tunnel to the corresponding session entry, which caches the corresponding routing information returned from the route lookup performed for the first packet in the flow.

Processes for Session-Based Forwarding

FIGS. 10A-10B are flowcharts illustrating exemplary processes for session-based forwarding. Specifically, FIG. 10A illustrates an exemplary session-based forwarding process in which route references are cached in the session to avoid per-packet based route lookup. During operation, the disclosed system receives a first data packet in a session (operation 1000). The system then performs a route lookup to determine a route for the first packet (operation 1005). In addition, for session-based forwarding, the disclosed system caches a reference to the route and the neighbor in the session (operation 1010). Further, the disclosed system optionally caches a reference to the session and the neighbor in a tunnel in cases where tunnel-based forwarding mechanism is used (operation 1015). The reference to the route includes one or more of a route index, a route version number, a neighbor index, and a neighbor version number. In some embodiments, the tunnel can be a GRE tunnel within which packets in the session are to be forwarded. Note that, caching the reference to the session in the tunnels allows for direct access to the route information from the tunnel.

Moreover, the disclosed system compares a first route version number cached in the session with a second route version number stored in the route table for the route (operation 1020), and then determines whether the route is stale (operation 1025). In some embodiments, the system further compares a first neighbor index and version number cached in the session with a second neighbor index and version number corresponding to the route in a neighbor table, and determines that the route is stale if the first neighbor index or version number is different from the second neighbor index or version number.

If the system determines that the route is stale, the system will perform another route lookup to update the route (operation 1030). Specifically, the system may update the route with one or more of an updated route index, an updated route version number, an updated neighbor index, and an updated neighbor version number. Nevertheless, in some embodiments, if the system determines that the route is stale but the session is inactive, the system will delay the route lookup until at least one packet is received in the session.

In some embodiments, at least two paths with identical cost corresponding to the route are stored in the route table; and, each path is identified by a unique Equal Cost Multiple Path (ECMP) index. When a new ECMP index is added to the route table, a subsequent session uses the path associated with the new ECMP index, but an existing session continues to use an existing path associated with an existing ECMP index.

In some embodiments in which at least two next hop nodes use Virtual Router Redundancy Protocol (VRRP), the route is determined to be stale based on the difference between a first neighbor version number cached in the session and a second neighbor version number corresponding to the route in the neighbor table.

Next, when a tunnel-based forwarding mechanism is used, the system can use the cached reference to the session in the tunnel for forwarding subsequent packets in the session (operation 1035). Thus, the system only needs to perform a route lookup for the first packet in a session, unless there are route changes during the session that prompt another route lookup to update the route.

In other embodiments, when a route is determined to be stale, the system performs another route lookup to update the session with an updated route index and an updated route version number. Such updated route index and updated route version number may correspond to a shorter alternative route than the original route. If so, the system will forward subsequent packets in the session using the shorter alternative route. In one embodiment, the shorter alternative route is stored in a patricia trie as a child node of a parent node. Specifically, the parent node corresponds to the route; and, a route version number corresponding to the parent node is increased when a child node is inserted in the patricia trie.

FIG. 10B illustrates an exemplary session-based forwarding process in which a packet is encapsulated prior to being forwarded to a security engine, such that the security engine can forward the encrypted packet directly to the corresponding network interface. During operation, the disclosed system performs a bridge lookup based on a received packet (operation 1040). Then, the system encapsulates the packet based on information returned from the bridge lookup (operation 1045). Also, the system identifies a network interface that the packet is to be transmitted on (operation 1050). Then, the system sends the packet to a security engine for encryption (operation 1055). Furthermore, the system instructs the security engine to forward the encrypted packet to the identified network interface (operation 1060). Thus, unlike in a conventional forwarding process, the security engine does not need to return the packet to a process within the system and can directly forward the encrypted packet to a network via the identified network interface.
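The ordering of operations 1040 through 1060 can be sketched as follows. Everything here is a placeholder chosen for illustration: the bridge table contents, the “GRE|” byte prefix standing in for a GRE header, and the XOR transform standing in for encryption (it is not real cryptography).

```python
# Hypothetical bridge table: (MAC, VLAN) -> destination network interface.
BRIDGE_TABLE = {
    ("01:02:03:04:05:06", "V10"): {"interface": "tunnel-7", "is_tunnel": True},
}
transmitted = []   # (interface, bytes) pairs handed to the network

def security_engine(packet, out_interface):
    """Encrypt, then transmit directly on the interface named by the caller."""
    encrypted = bytes(b ^ 0x5A for b in packet)   # placeholder, NOT encryption
    transmitted.append((out_interface, encrypted))

def process(payload, mac, vlan):
    dest = BRIDGE_TABLE[(mac, vlan)]               # operation 1040
    if dest["is_tunnel"]:
        packet = b"GRE|" + payload                 # operation 1045 (stub header)
    else:
        packet = payload
    out_interface = dest["interface"]              # operation 1050
    security_engine(packet, out_interface)         # operations 1055 and 1060
```

Note that process never sees the encrypted bytes: because the packet is already encapsulated and the output interface is already known, the security engine is the last stage before transmission.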

System for Session-Based Forwarding

FIG. 11 is a block diagram illustrating a network device system for session-based forwarding according to embodiments of the present disclosure. Network device 1100 includes at least a network interface 1110 capable of communicating to a wired network, a shared memory 1120 capable of storing data, an exception processing processor core 1130 capable of processing network data packets, and one or more forwarding processor cores, including forwarding processor core 1142, forwarding processor core 1144, . . . , forwarding processor core 1148, which are capable of processing network data packets. Moreover, network device 1100 may be used as a network switch, network router, network controller, network server, etc. Further, network device 1100 may serve as a node in a distributed or a cloud computing environment.

Network interface 1110 can be any communication interface, which includes but is not limited to, a modem, token ring interface, Ethernet interface, wireless IEEE 802.11 interface (e.g., IEEE 802.11n, IEEE 802.11ac, etc.), cellular wireless interface, satellite transmission interface, or any other interface for coupling network devices. In some embodiments, network interface 1110 may be software-defined and programmable, for example, via an Application Programming Interface (API), thus allowing for remote control of the network device 1100.

Shared memory 1120 can include storage components, such as, Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), etc. In some embodiments, shared memory 1120 is a flat structure that is shared by all datapath processors (including, e.g., exception processing processor core 1130, forwarding processor core 1142, forwarding processor core 1144, . . . , forwarding processor core 1148, etc.), and not tied to any particular CPU or CPU cores. Any datapath processor can read any memory location within shared memory 1120. Shared memory 1120 can be used to store various tables to assist session-based packet forwarding. For example, the tables may include, but are not limited to, a bridge table, a session table, a user table, a station table, a tunnel table, a route table and/or route cache, etc. It is important to note that there is no locking mechanism associated with shared memory 1120. Any datapath processor can have access to any location in lockless shared memory in network device 1100.

Exception processing processor core 1130 typically includes a networking processor core that is capable of processing network data traffic. Exception processing processor core 1130 is a single dedicated CPU core that typically handles table management. Note that exception processing processor core 1130 only receives data packets from one or more forwarding processor cores, such as forwarding processor core 1142, forwarding processor core 1144, . . . , forwarding processor core 1148. In other words, exception processing processor core 1130 does not receive data packets directly from any line cards or network interfaces. Only the plurality of forwarding processor cores can send data packets to exception processing processor core 1130. Moreover, exception processing processor core 1130 is the only processor core having write access to shared memory 1120, and thereby will not cause any data integrity issues even without a locking mechanism in place for shared memory 1120.
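
The single-writer access rule described above can be illustrated with a minimal sketch. Python is used for illustration only, and the names `SharedTables`, `EXCEPTION_CORE_ID`, `read`, and `write` are hypothetical and do not appear in the disclosure. The sketch models only the access rule (any datapath core may read, only the exception processing core may write); it does not model real lockless memory.

```python
# Hypothetical identifier, borrowed from reference numeral 1130 for readability.
EXCEPTION_CORE_ID = 1130


class SharedTables:
    """Toy model of shared memory 1120: many readers, one writer."""

    def __init__(self):
        # Table names follow examples given in the description.
        self._tables = {"route": {}, "session": {}, "tunnel": {}}

    def read(self, core_id, table, key):
        # Any datapath core may read any location; no lock is taken.
        return self._tables[table].get(key)

    def write(self, core_id, table, key, value):
        # Only the exception processing core may modify shared state.
        if core_id != EXCEPTION_CORE_ID:
            raise PermissionError(f"core {core_id} has read-only access")
        self._tables[table][key] = value


tables = SharedTables()
tables.write(EXCEPTION_CORE_ID, "route", "10.0.0.0/8", {"next_hop": "192.0.2.1"})
route_seen_by_forwarder = tables.read(1142, "route", "10.0.0.0/8")
```

In this model a forwarding core that needs a table change would hand the packet to the exception processing core rather than call `write` itself, which is why no lock is required around the tables.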

Forwarding processor cores 1142-1148 also include networking processor cores that are capable of processing network data traffic. However, by definition, forwarding processor cores 1142-1148 only perform “fast” packet processing. Thus, forwarding processor cores 1142-1148 do not block themselves and wait for other components or modules during the processing of network packets. Any packets requiring special handling or a wait by a processor core will be handed over by forwarding processor cores 1142-1148 to exception processing processor core 1130.

Each of forwarding processor cores 1142-1148 maintains one or more counters. The counters are defined as a regular data type, for example, unsigned integer, unsigned long long, etc., in lieu of an atomic data type. When a forwarding processor core 1142-1148 receives a packet, it may increment or decrement the values of the counters to reflect network traffic information, including but not limited to, the number of received frames, the number of received bytes, error conditions and/or error counts, etc. A typical pipeline process at forwarding processor cores 1142-1148 includes one or more of: port lookup; VLAN lookup; port-VLAN table lookup; bridge table lookup; firewall session table lookup; route table lookup; packet encapsulation; packet encryption; packet decryption; tunnel de-capsulation; forwarding; etc.
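
The per-core counters described above might be sketched as follows. This is an illustrative Python model, not the disclosed implementation; `CoreCounters`, `account_packet`, and the field names are assumptions. The point it shows is that each forwarding core updates only its own counter block, so plain integers suffice in place of atomic types.

```python
from dataclasses import dataclass


@dataclass
class CoreCounters:
    """Plain (non-atomic) counters owned by exactly one forwarding core."""
    rx_frames: int = 0
    rx_bytes: int = 0
    rx_errors: int = 0


def account_packet(counters, frame_len, ok=True):
    # Only the owning core calls this for its own counters, so no atomic
    # read-modify-write is needed in this model.
    counters.rx_frames += 1
    counters.rx_bytes += frame_len
    if not ok:
        counters.rx_errors += 1


# One counter block per forwarding core (core ids reuse the reference numerals).
per_core = {core_id: CoreCounters() for core_id in (1142, 1144, 1148)}
account_packet(per_core[1142], 1500)
account_packet(per_core[1142], 64, ok=False)
account_packet(per_core[1144], 512)
```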

Moreover, forwarding processor cores 1142-1148 each can maintain a fragment table. Upon receiving a data fragment without information necessary for session processing (e.g., a transport layer or L4 header), forwarding processor cores 1142-1148 will queue the data fragments in their own fragment table, and perform various fragment table management tasks.
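
One way to picture the fragment table is the sketch below. It is a simplified Python model under stated assumptions: the class `FragmentTable`, the flow key layout, and the release policy (queued fragments are released once the fragment carrying the L4 header arrives) are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


class FragmentTable:
    """Per-core table that holds fragments until the L4 header is known."""

    def __init__(self):
        self._pending = defaultdict(list)  # flow key -> queued fragments
        self._l4_header = {}               # flow key -> L4 header once seen

    def receive(self, key, fragment_offset, payload, l4_header=None):
        """Return the fragments that can now go through session processing."""
        if l4_header is not None:
            # This fragment carries the L4 header: release it together
            # with anything queued earlier for the same datagram.
            self._l4_header[key] = l4_header
            ready = self._pending.pop(key, [])
            ready.append((fragment_offset, payload))
            return sorted(ready)
        if key in self._l4_header:
            # Header already seen, so the fragment need not be queued.
            return [(fragment_offset, payload)]
        # No L4 information yet: queue the fragment.
        self._pending[key].append((fragment_offset, payload))
        return []


table = FragmentTable()
key = ("198.51.100.7", "203.0.113.9", 4242, 17)  # src, dst, IP id, protocol
held = table.receive(key, 1480, b"tail")
released = table.receive(key, 0, b"head", l4_header={"sport": 5000, "dport": 53})
```

A fuller model would also age out entries whose header-bearing fragment never arrives, which is one of the management tasks the paragraph alludes to.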

Periodically, exception processing processor core 1130 may receive a query corresponding to one or more forwarding processor cores 1142-1148 from a control plane process. Exception processing processor core 1130 identifies one or more memory locations in the shared memory storing data for the one or more forwarding processor cores 1142-1148 corresponding to the query, retrieves one or more data values at the identified memory locations, and responds to the query. In some embodiments, exception processing processor core 1130 can further aggregate retrieved data values to generate an aggregated data value, and respond to the query based on the aggregated data value.
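
The query handling described above can be sketched as a read of one value per forwarding core followed by an optional aggregation. The Python below is a hedged illustration; `answer_query`, the dictionary layout of `shared_memory`, and the counter names are assumptions made for the example.

```python
def answer_query(shared_memory, core_ids, counter_name):
    """Read one counter for each requested core; return the values and their sum."""
    values = {core: shared_memory[core][counter_name] for core in core_ids}
    return {"per_core": values, "aggregate": sum(values.values())}


# Hypothetical layout: one dictionary of counters per forwarding core.
shared_memory = {
    1142: {"rx_frames": 10, "rx_bytes": 9000},
    1144: {"rx_frames": 4, "rx_bytes": 2560},
    1148: {"rx_frames": 0, "rx_bytes": 0},
}
reply = answer_query(shared_memory, [1142, 1144], "rx_frames")
```

Because the aggregation happens at query time, the forwarding cores never need to coordinate with one another when they update their own values.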

According to embodiments of the present disclosure, network services provided by network device 1100, solely or in combination with other wireless network devices, include, but are not limited to, an Institute of Electrical and Electronics Engineers (IEEE) 802.1x authentication to an internal and/or external Remote Authentication Dial-In User Service (RADIUS) server; a MAC authentication to an internal and/or external RADIUS server; a built-in Dynamic Host Configuration Protocol (DHCP) service to assign wireless client devices IP addresses; an internal secured management interface; Layer-3 forwarding; Network Address Translation (NAT) service between the wireless network and a wired network coupled to the network device; an internal and/or external captive portal; an external management system for managing the network devices in the wireless network; etc.

The present disclosure may be realized in hardware, software, or a combination of hardware and software. The present disclosure may be realized in a centralized fashion in one computer system or in a distributed fashion where different elements are spread across several interconnected computer systems coupled to a network. A typical combination of hardware and software may be an access point with a computer program that, when being loaded and executed, controls the device such that it carries out the methods described herein.

The present disclosure also may be embedded in non-transitory fashion in a computer-readable storage medium (e.g., a programmable circuit; a semiconductor memory such as a volatile memory such as random access memory “RAM,” or non-volatile memory such as read-only memory, power-backed RAM, flash memory, phase-change memory or the like; a hard disk drive; an optical disc drive; or any connector for receiving a portable memory device such as a Universal Serial Bus “USB” flash drive), which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.

As used herein, “digital device” generally includes a device that is adapted to transmit and/or receive signaling and to process information within such signaling such as a station (e.g., any data processing equipment such as a computer, cellular phone, personal digital assistant, tablet devices, etc.), an access point, data transfer devices (such as network switches, routers, controllers, etc.) or the like.

As used herein, “access point” (AP) generally refers to receiving points for any known or convenient wireless access technology which may later become known. Specifically, the term AP is not intended to be limited to IEEE 802.11-based APs. APs generally function as an electronic device that is adapted to allow wireless devices to connect to a wired network via various communications standards.

As used herein, the term “interconnect” or used descriptively as “interconnected” is generally defined as a communication pathway established over an information-carrying medium. The “interconnect” may be a wired interconnect, wherein the medium is a physical medium (e.g., electrical wire, optical fiber, cable, bus traces, etc.), a wireless interconnect (e.g., air in combination with wireless signaling technology) or a combination of these technologies.

As used herein, “information” is generally defined as data, address, control, management (e.g., statistics) or any combination thereof. For transmission, information may be transmitted as a message, namely a collection of bits in a predetermined format. One type of message, namely a wireless message, includes a header and payload data having a predetermined number of bits of information. The wireless message may be placed in a format as one or more packets, frames or cells.

As used herein, “wireless local area network” (WLAN) generally refers to a communications network that links two or more devices using some wireless distribution method (for example, spread-spectrum or orthogonal frequency-division multiplexing radio), usually providing a connection through an access point to the Internet, and thus providing users with the mobility to move around within a local coverage area and still stay connected to the network.

As used herein, the term “mechanism” generally refers to a component of a system or device to serve one or more functions, including but not limited to, software components, electronic components, electrical components, mechanical components, electro-mechanical components, etc.

As used herein, the term “embodiment” generally refers to an embodiment that serves to illustrate by way of example but not limitation.

It will be appreciated by those skilled in the art that the preceding examples and embodiments are exemplary and not limiting to the scope of the present disclosure. It is intended that all permutations, enhancements, equivalents, and improvements thereto that are apparent to those skilled in the art upon a reading of the specification and a study of the drawings are included within the true spirit and scope of the present disclosure. It is therefore intended that the following appended claims include all such modifications, permutations and equivalents as fall within the true spirit and scope of the present disclosure.

While the present disclosure has been described in terms of various embodiments, the present disclosure should not be limited to only those embodiments described, but can be practiced with modification and alteration within the spirit and scope of the appended claims. Likewise, where a reference to a standard is made in the present disclosure, the reference is generally made to the current version of the standard as applicable to the disclosed technology area. However, the described embodiments may be practiced under subsequent development of the standard within the spirit and scope of the description and appended claims. The description is thus to be regarded as illustrative rather than limiting.

Claims

1. A method comprising:

receiving, by a network device, a first packet in a session;
performing, by the network device, a route lookup based on a header of the first packet to determine a route for the first packet; and
caching, by the network device, a reference to the route and a neighbor in the session such that subsequent packets in the session are routed based on the cached reference in lieu of subsequent route lookups.

2. The method of claim 1, wherein the reference to the route comprises one or more of: a route index, a route version number, a neighbor index, and a neighbor index number.

3. The method of claim 1, further comprising:

comparing, by the network device, a first route version number cached in the session and a second route version number in a route table corresponding to the route referenced by a route index in the session; and
determining, by the network device, that the route is stale in response to the first route version number being different from the second route version number.

4. The method of claim 3, further comprising:

comparing, by the network device, a first neighbor index and version number cached in the session with a second neighbor index and version number in a neighbor table corresponding to the route referenced by the route index in the session; and
determining, by the network device, that the route is stale in response to the first neighbor index or version number being different from the second neighbor index or version number.

5. The method of claim 4, further comprising:

in response to determining that the route is stale, performing another route lookup to update the route with one or more of an updated route index, an updated route version number, an updated neighbor index, and an updated neighbor version number.

6. The method of claim 4, further comprising:

in response to determining that the route is stale and the session is inactive, delaying route lookup until at least one packet is received in the session.

7. The method of claim 3, wherein at least two paths with identical cost corresponding to the route are stored in the route table, each path being identified by a unique Equal Cost Multiple Path (ECMP) index.

8. The method of claim 7, wherein, when a new ECMP index is added to the route table, a subsequent session uses the path associated with the new ECMP index and an existing session continues to use an existing path associated with an existing ECMP index.

9. The method of claim 4, wherein, when at least two next hop nodes use Virtual Router Redundancy Protocol (VRRP), the route is determined to be stale based on difference between the first neighbor version number cached in the session and the second neighbor version number corresponding to the route in the neighbor table.

10. The method of claim 3, further comprising:

in response to the route determined to be stale, performing another route lookup to update the session with an updated route index and an updated route version number;
in response to the updated route index and the updated route version corresponding to a shorter alternative route than the route, forwarding subsequent packets in the session using the shorter alternative route.

11. The method of claim 10, wherein the shorter alternative route is stored in a patricia trie as a child node of a parent node, wherein the parent node corresponds to the route, and wherein a route version number of the route corresponding to the parent node is increased in response to the child node being inserted in the patricia trie.

12. The method of claim 1, further comprising:

caching, by the network device, a reference to the session in a tunnel within which packets in the session are to be forwarded, thereby allowing direct access to the route from the tunnel.

13. The method of claim 1, further comprising:

encapsulating, by the network device, the first packet based on information returned from a bridge lookup prior to encrypting the first packet;
identifying, by the network device, a network interface that the first packet is to be transmitted on;
sending the first packet to a security engine of the network device to encrypt the first packet; and
instructing the security engine to forward encrypted first packet to the identified network interface in lieu of returning the encrypted first packet to a processor within the network device.

14. A network device having a symmetric multiprocessing architecture, the network device comprising:

a plurality of CPU cores;
a network interface to receive one or more data packets; and
a memory whose access is shared by the dedicated CPU core and the plurality of CPU cores;
wherein the plurality of CPU cores are to: receive a first packet in a session; perform a route lookup based on a header of the first packet to determine a route for the first packet; and cache a reference to the route and a neighbor in the session such that subsequent packets in the session are routed based on the cached reference in lieu of subsequent route lookups.

15. The network device of claim 14, wherein the reference to the route comprises one or more of: a route index, a route version number, a neighbor index, and a neighbor index number.

16. The network device of claim 14, wherein the plurality of CPU cores are further to:

compare a first route version number cached in the session and a second route version number in a route table corresponding to the route referenced by a route index in the session; and
determine that the route is stale in response to the first route version number being different from the second route version number.

17. The network device of claim 16, wherein the plurality of CPU cores are further to:

compare a first neighbor index and version number cached in the session with a second neighbor index and version number in a neighbor table corresponding to the route referenced by the route index in the session; and
determine that the route is stale in response to the first neighbor index or version number being different from the second neighbor index or version number.

18. The network device of claim 17, wherein the plurality of CPU cores are further to:

perform another route lookup to update the route with one or more of an updated route index, an updated route version number, an updated neighbor index, and an updated neighbor version number in response to determining that the route is stale.

19. The network device of claim 17, wherein the plurality of CPU cores are further to:

delay route lookup until at least one packet is received in the session in response to determining that the route is stale and the session is inactive.

20. The network device of claim 16, wherein at least two paths with identical costs corresponding to the route are stored in the route table, each path being identified by a unique Equal Cost Multiple Path (ECMP) index.

21. The network device of claim 20, wherein, when a new ECMP index is added to the route table, a subsequent session uses the path associated with the new ECMP index and an existing session continues to use an existing path associated with an existing ECMP index.

22. The network device of claim 17, wherein, when at least two next hop nodes use Virtual Router Redundancy Protocol (VRRP), the route is determined to be stale based on difference between the first neighbor version number cached in the session and the second neighbor version number corresponding to the route in the neighbor table.

23. The network device of claim 16, wherein the plurality of CPU cores are further to:

perform another route lookup to update the session with an updated route index and an updated route version number in response to the route determined to be stale;
forward subsequent packets in the session using the shorter alternative route in response to the updated route index and the updated route version corresponding to a shorter alternative route than the route.

24. The network device of claim 23, wherein the shorter alternative route is stored in a patricia trie as a child node of a parent node, wherein the parent node corresponds to the route, and wherein a route version number of the route corresponding to the parent node is increased in response to the child node being inserted in the patricia trie.

25. The network device of claim 14, wherein the plurality of CPU cores are further to:

cache a reference to the session in a tunnel within which packets in the session are to be forwarded, thereby allowing direct access to the route from the tunnel.

26. The network device of claim 14, wherein the plurality of the CPU cores are further to:

encapsulate the first packet based on information returned from a bridge lookup prior to encrypting the first packet;
identify a network interface that the first packet is to be transmitted on;
send the first packet to a security engine of the network device to encrypt the first packet; and
instruct the security engine to forward encrypted first packet to the identified network interface in lieu of returning the encrypted first packet to a processor within the network device.

27. A non-transitory computer-readable storage medium storing embedded instructions for a plurality of operations that are executed by one or more mechanisms implemented within a network device having a symmetric multiprocessing architecture, the plurality of operations comprising:

receiving a first packet in a session;
performing a route lookup to determine a route for the first packet;
caching a reference to the route in the session;
caching a reference to the session and a neighbor in a tunnel within which packets in the session are forwarded;
comparing a first route version number cached in the session with a second route version number in a route table corresponding to the route referenced by a route index in the session;
determining whether the route is stale based on the first and second route version numbers;
performing another route lookup to update the route in response to determining that the route is stale; and
using cached reference to the session in the tunnel for forwarding subsequent packets in the session.
Patent History
Publication number: 20140153577
Type: Application
Filed: Jun 14, 2013
Publication Date: Jun 5, 2014
Inventors: Ramsundar Janakiraman (Sunnyvale, CA), Ravinder Verma (San Jose, CA), Bhanu S. Gopalasetty (San Ramon, CA)
Application Number: 13/918,748
Classifications
Current U.S. Class: Processing Of Address Header For Routing, Per Se (370/392)
International Classification: H04L 12/745 (20060101)