START-UP DELAY FOR EVENT-DRIVEN VIRTUAL LINK AGGREGATION
Embodiments of the invention relate to virtual link aggregation. One embodiment includes a system including a first aggregation switch. A second aggregation switch is connected to the first aggregation switch with a first link. An access switch is connected to the first aggregation switch with a second link and the second aggregation switch with a third link. The first access switch establishes a first virtual link with the first aggregation switch and the second aggregation switch. A first detection module is connected with the first aggregation switch. The first detection module maintains multiple networking ports in a first disabled state, determines a link-status of the first link, and activates a first event upon the determination of a first status indication for the first link, and switches at least one of the multiple networking ports to a first enabled state upon detecting the first event.
The present invention relates to network switches and switching, and more particularly, this invention relates to start-up delay for event-driven virtual link aggregation.
In a data center comprising one or more access switches, each access switch connects to two aggregation switches for redundancy. Link aggregation uses available bandwidth across a switch boundary at an aggregation layer.
BRIEF SUMMARY
Embodiments of the invention relate to virtual link aggregation. One embodiment includes preventing network traffic loss upon rebooting of a first networking device based on: maintaining a first group of networking ports in a first disabled state, determining a link-status of a first link between the first networking device and a second networking device, activating a delay timer for delaying the rebooting of the first network device upon the determination of a first status indication for the first link, and switching at least one of the first group of networking ports to a first enabled state upon expiration of the delay timer.
Another embodiment comprises a virtual aggregation link system that includes a first aggregation switch. A second aggregation switch is coupled to the first aggregation switch with a first link. At least one access switch is coupled to the first aggregation switch with a second link and the second aggregation switch with a third link. The first access switch establishes a first virtual link with each one of the first aggregation switch and the second aggregation switch. A first detection module is coupled with the first aggregation switch. The first detection module maintains a first plurality of networking ports in a first disabled state, determines a link-status of the first link, and activates a first event upon the determination of a first status indication for the first link, and switches at least one of the first plurality of networking ports to a first enabled state upon detecting the first event.
One embodiment comprises a non-transitory computer-useable storage medium for virtual link aggregation, the computer-useable storage medium having a computer-readable program. The program upon being processed on a computer causes the computer to implement: coupling a first networking system to a second networking system using a first link. The first networking system includes a first plurality of networking ports, and the second networking system includes a second plurality of networking ports. The computer further implements: rebooting the first networking system, maintaining the first plurality of networking ports in a first disabled state, and determining a link-status of the first link. The link-status includes a first up-state to indicate that the first networking system is able to communicate with the second networking system via the first link, and a first down-state to indicate that the first networking system is not able to communicate to the second networking system via the first link. A first event is activated upon the determination of a first up-state, and at least one of the first plurality of networking ports is switched to a first enabled state upon detecting the first event.
Other aspects and embodiments of the present invention will become apparent from the following detailed description, which, when taken in conjunction with the drawings, illustrate by way of example the principles of the invention.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as “logic,” a “circuit,” “module,” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a non-transitory computer readable storage medium. A non-transitory computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the non-transitory computer readable storage medium include the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a portable compact disc read-only memory (CD-ROM), a Blu-ray disc read-only memory (BD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a non-transitory computer readable storage medium may be any tangible medium that is capable of containing, or storing a program or application for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a non-transitory computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device, such as an electrical connection having one or more wires, an optical fibre, etc.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fibre cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on a user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer or server may be connected to the user's computer through any type of network, including a local area network (LAN), storage area network (SAN), and/or a wide area network (WAN), or the connection may be made to an external computer, for example through the Internet using an Internet Service Provider (ISP).
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatuses (systems), and computer program products according to various embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that may direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
Referring now to the drawings,
In use, the gateway 101 serves as an entrance point from the remote networks 102 to the proximate network 108. As such, the gateway 101 may function as a router, which is capable of directing a given packet of data that arrives at the gateway 101, and a switch, which furnishes the actual path in and out of the gateway 101 for a given packet.
Further included is at least one data server 114 coupled to the proximate network 108, which is accessible from the remote networks 102 via the gateway 101. It should be noted that the data server(s) 114 may include any type of computing device/groupware. Coupled to each data server 114 is a plurality of user devices 116. Such user devices 116 may include a desktop computer, laptop computer, handheld computer, printer, and/or any other type of logic-containing device. It should be noted that a user device 111 may also be directly coupled to any of the networks in some embodiments.
A peripheral 120 or series of peripherals 120, e.g., facsimile machines, printers, scanners, hard disk drives, networked and/or local storage units or systems, etc., may be coupled to one or more of the networks 104, 106, 108. It should be noted that databases and/or additional components may be utilized with, or integrated into, any type of network element coupled to the networks 104, 106, 108. In the context of the present description, a network element may refer to any component of a network.
According to some approaches, methods and systems described herein may be implemented with and/or on virtual systems and/or systems which emulate one or more other systems, such as a UNIX system that emulates an IBM z/OS environment, a UNIX system that virtually hosts a MICROSOFT WINDOWS environment, a MICROSOFT WINDOWS system that emulates an IBM z/OS environment, etc. This virtualization and/or emulation may be enhanced through the use of VMWARE software in some embodiments.
In other examples, one or more networks 104, 106, 108, may represent a cluster of systems commonly referred to as a “cloud.” In cloud computing, shared resources, such as processing power, peripherals, software, data, servers, etc., are provided to any system in the cloud in an on-demand relationship, therefore allowing access and distribution of services across many computing systems. Cloud computing typically involves an Internet connection between the systems operating in the cloud, but other techniques of connecting the systems may also be used, as known in the art.
In one example, the workstation may have resident thereon an operating system, such as the MICROSOFT WINDOWS Operating System (OS), a MAC OS, a UNIX OS, etc. It will be appreciated that other examples may also be implemented on platforms and operating systems other than those mentioned. Such other examples may include operating systems written using JAVA, XML, C, and/or C++ language, or other programming languages, along with an object oriented programming methodology. Object oriented programming (OOP), which has become increasingly used to develop complex applications, may also be used.
Virtual link aggregation group (vLAG) is a feature that uses all available bandwidth without sacrificing redundancy and connectivity. Link aggregation is extended by vLAG across the switch boundary at the aggregation layer. Therefore, an access switch 320 has all uplinks in a LAG, while the aggregation switches 330, 335 cooperate with each other to maintain this vLAG. The vLAG domain 350 comprises virtual links to the primary aggregator switch 330 and the secondary aggregator switch 335 that may comprise, for example, combinations of physical links 331 and 336 to establish virtual links.
Since vLAG is an extension to standard link aggregation, layer 2 and layer 3 features may be supported on top of vLAG. In the system 300 shown in
In system 300, when either of the primary aggregator switch 330 or the secondary aggregator switch 335 is starting up/rebooting, vLAG networking traffic suffers bandwidth loss during the start-up period because network traffic improperly flows to the networking ports of the aggregator switches 330/335.
In one embodiment, the detection module 440 determines a link-status of the ISL 308, activates a first event (e.g., a first delay timer) upon the determination of a first status indication for the ISL 308 (e.g., the ISL 308 is enabled/UP), and switches at least one of the networking ports (e.g., networking port 332) to a first enabled state (e.g., enabled/UP) upon detecting that the timer has elapsed. Similarly, the detection module 445 determines a link-status of the ISL 308, activates a first event (e.g., a first programmable delay timer) upon the determination of a first status indication for the ISL 308 (e.g., the ISL 308 is enabled/UP), and switches at least one of the networking ports (e.g., networking port 337) to a first enabled state (e.g., enabled/UP) upon detecting that the timer has elapsed. In one example, the health check 460 message/status provides the primary aggregator switch 330 and the secondary aggregator switch 335 updates on the health status of each other (e.g., responsive/non-responsive, error/fault, etc.).
In one example, the delay value of the timer is programmable based on the typical amount of time that the type of aggregator switch employed as either the primary aggregator switch 330 or the secondary aggregator switch 335 takes to start up/reboot (e.g., 5 sec., 10 sec., 30 sec., etc.).
In one example, the first status indication (e.g., enabled/UP) for the ISL 308 determined by either of the detection modules 440/445 indicates that the primary aggregator switch 330 is able to communicate with the secondary aggregator switch 335 via the ISL 308. In another example, the link-status may be determined by the detection module 440/445 to be in a first down-state status (e.g., DOWN/disabled) that indicates the primary aggregator switch 330 is not able to communicate with the secondary aggregator switch 335 via the ISL 308.
In one embodiment, the detection modules 440/445 provide the delay timer in order to allow the primary aggregator switch 330 or the secondary aggregator switch 335 to get the necessary protocols ready through the ISL 308. Once the networking ports (e.g., networking ports 332/337) of a vLAG are in an UP/enabled state, all of the networking traffic through the links will flow correctly. In one example, the delay timer allows the aggregator switch 330/335 to get any routing protocols (e.g., Border Gateway Protocol (BGP), Open Shortest Path First (OSPF)) ready on the uplinks for traffic forwarding.
In one embodiment, the state of the vLAG networking ports (e.g., networking ports 332/337) may be changed by the detection module 440/445 based on different network operations within system 400. In one example, when one of the vLAG switches (e.g., primary aggregator switch 330 or secondary aggregator switch 335) reboots, and before the ISL 308 is in an UP/enabled state, all of the networking ports of the vLAG (e.g., networking ports 332 or 337) underlying the vLAG switch are placed in a first disabled state (e.g., ERRDISABLED). When the ISL 308 attains an UP status, the startup-delay timer of the detection module 440/445 is started. After the startup-delay timer expires, the vLAG networking ports state is modified to an UP/enabled state, one after another.
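The reboot sequence described above can be sketched as a small state machine. This is a minimal illustration only, not the disclosed implementation: the class name `DetectionModule`, its method names, the integer port identifiers, and the use of caller-supplied timestamps in place of a real timer are all assumptions made for the example.

```python
from enum import Enum


class PortState(Enum):
    UP = "UP"
    DOWN = "DOWN"
    ERRDISABLED = "ERRDISABLED"


class DetectionModule:
    """Hypothetical model of a detection module (440/445) on one vLAG switch."""

    def __init__(self, port_ids, delay_seconds):
        self.delay_seconds = delay_seconds
        self.ports = {pid: PortState.DOWN for pid in port_ids}
        self.timer_deadline = None  # None means the timer is not running

    def on_reboot(self):
        # Before the ISL is UP, hold every vLAG port in ERRDISABLED.
        for pid in self.ports:
            self.ports[pid] = PortState.ERRDISABLED
        self.timer_deadline = None

    def on_isl_up(self, now):
        # The ISL reaching UP is the event that starts the startup-delay timer.
        self.timer_deadline = now + self.delay_seconds

    def tick(self, now):
        # On expiry, bring the held ports UP one after another.
        if self.timer_deadline is None or now < self.timer_deadline:
            return []
        self.timer_deadline = None
        enabled = []
        for pid in sorted(self.ports):
            if self.ports[pid] is PortState.ERRDISABLED:
                self.ports[pid] = PortState.UP
                enabled.append(pid)
        return enabled


module = DetectionModule(port_ids=[1, 2, 3], delay_seconds=30)
module.on_reboot()
module.on_isl_up(now=100)
early = module.tick(now=110)  # timer still running, ports stay held
late = module.tick(now=130)   # timer expired, held ports are released
```

Passing the current time in explicitly keeps the sketch deterministic; a real switch would presumably drive the same transitions from a hardware or operating-system timer.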
In one example, if the vLAG networking ports are shut down in the established configuration, then the vLAG networking ports are out of the control of the startup-delay timer, the vLAG networking port states are modified to disabled after boot up, and the vLAG networking ports can only become UP by a no shut down request. If, while the vLAG switch (e.g., primary aggregator switch 330 or secondary aggregator switch 335) is rebooting, the detection module 440/445 does not identify the peer switch (either primary aggregator switch 330 or secondary aggregator switch 335), for example, in a scenario where both the primary aggregator switch 330 and secondary aggregator switch 335 reboot, the startup-delay timer is cancelled and the vLAG networking ports are brought up immediately.
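The two exceptions in this example (ports shut down in the configuration, and a peer switch that is not identified) can be illustrated with a short sketch. The function names, the string state labels, and the `peer_identified` flag are hypothetical and chosen only for this illustration.

```python
def states_after_boot(port_ids, shut_down_in_config, peer_identified):
    """Return ({port: state}, timer_applies) for a vLAG switch that just booted."""
    states = {}
    for pid in port_ids:
        if pid in shut_down_in_config:
            # Configured shut-down ports are outside the timer's control.
            states[pid] = "DISABLED"
        elif peer_identified:
            # Normal case: hold the port until the startup-delay timer expires.
            states[pid] = "ERRDISABLED"
        else:
            # Peer not identified (e.g., both switches rebooted): no delay.
            states[pid] = "UP"
    return states, peer_identified


def no_shutdown(states, pid):
    """Apply a 'no shut down' request to one administratively disabled port."""
    if states[pid] == "DISABLED":
        states[pid] = "UP"
    return states


normal, timer_applies = states_after_boot([1, 2, 3], {3}, peer_identified=True)
both_rebooted, timer_still_applies = states_after_boot([1, 2, 3], {3}, peer_identified=False)
```

In both calls port 3 stays disabled regardless of the timer, which matches the statement that such ports can only become UP through a no shut down request.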
Operations on Networking Ports of System 400:
In one scenario, the vLAG networking ports are either operationally or administratively shut down. In one example, the vLAG networking ports states are changed to disabled by the detection module 440/445 based on detecting that the vLAG networking ports are shut down. In one embodiment, the detection module 440/445 starts the delay timer. When the delay timer expires, the ports are maintained in the disabled state. In another example, based on the above example, an operational or administrative request is made that the vLAG networking ports are not shut down (e.g., a no shut down request). In this example, the detection module 440/445 changes the vLAG networking ports states to UP and starts the delay timer. When the delay timer expires, the vLAG networking ports status is unchanged.
In another scenario, a cable of one or more of the vLAG networking ports is removed. In one embodiment, the detection module 440/445 does not change the state of the vLAG networking ports and starts the delay timer. Upon an insertion of a cable on the vLAG networking ports with a delay timer running, the detection module 440/445 maintains the state of the vLAG networking ports in the ERRDISABLED state. When the delay timer expires, the vLAG networking port states are changed to UP if the vLAG networking ports can be linked up.
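The port operations in this section can be summarized in a short sketch. The `Port` and `PortOperations` names and the boolean `timer_running` flag are assumptions made for illustration. Because the text does not say what happens to an ERRDISABLED port that cannot be linked up when the timer expires, the sketch leaves such a port unchanged.

```python
class Port:
    def __init__(self, state="UP"):
        self.state = state


class PortOperations:
    """Hypothetical handler for port operations on one vLAG switch."""

    def __init__(self):
        self.timer_running = False

    def shut_down(self, port):
        # Shut-down ports go to disabled, and the delay timer is started.
        port.state = "DISABLED"
        self.timer_running = True

    def no_shut_down(self, port):
        # A no shut down request brings the port UP and starts the timer.
        port.state = "UP"
        self.timer_running = True

    def cable_removed(self, port):
        # The port state is left unchanged; only the timer starts.
        self.timer_running = True

    def cable_inserted(self, port):
        # With the timer running, an ERRDISABLED port stays held as it is.
        return port.state

    def timer_expired(self, port, can_link_up):
        # Only held ports change on expiry; DISABLED and UP ports are kept.
        self.timer_running = False
        if port.state == "ERRDISABLED" and can_link_up:
            port.state = "UP"


ops = PortOperations()
held = Port("ERRDISABLED")
ops.cable_removed(held)
ops.cable_inserted(held)
state_while_running = held.state
ops.timer_expired(held, can_link_up=True)

shut = Port("UP")
ops.shut_down(shut)
ops.timer_expired(shut, can_link_up=True)
```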
Operations on Trunk Portion of the System 400:
In one scenario, a new trunk is enabled to a vLAG with the delay timer running. In one embodiment, all of the vLAG networking ports (e.g., networking ports 332/337) in the trunk should be DOWN, and the detection module 440/445 changes the state of those ports to an ERRDISABLED state if the vLAG networking ports in the trunk are not disabled (i.e., shut down). In another scenario, a trunk is disabled from a vLAG with the delay timer running. In one embodiment, the vLAG networking ports of the trunk should be linked up if their state is ERRDISABLED, and the states of those vLAG networking ports are changed to UP if they can be linked up, or changed to DOWN if they cannot be linked up.
In another scenario, new networking ports (e.g., networking ports 332/337) are added to the vLAG with the delay timer running. In one embodiment, the detection module 440/445 changes the vLAG networking port states to ERRDISABLED if they are not disabled. In one scenario, vLAG networking ports are removed from the vLAG with the delay timer running. In one embodiment, the removed vLAG networking ports are linked up if the vLAG networking port state is not disabled, and the states of those vLAG networking ports are changed by the detection module 440/445 to UP if they can be linked up, or DOWN if they cannot be linked up.
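The membership changes described for ports joining or leaving a vLAG while the delay timer is running can be sketched as two pure functions. The function names and string state labels are hypothetical. The text covers only the case where the timer is running, so returning the state unchanged when it is not running is an assumption of this sketch.

```python
def on_added_to_vlag(state, timer_running):
    """State of a port that joins the vLAG (directly or through a trunk)."""
    if timer_running and state != "DISABLED":
        # Hold the new member until the startup-delay timer expires.
        return "ERRDISABLED"
    return state  # assumption: no change when the timer is not running


def on_removed_from_vlag(state, timer_running, can_link_up):
    """State of a port that leaves the vLAG (directly or through a trunk)."""
    if timer_running and state != "DISABLED":
        # The port is no longer governed by the timer, so try to link it up.
        return "UP" if can_link_up else "DOWN"
    return state  # assumption: no change when the timer is not running


joined = on_added_to_vlag("UP", timer_running=True)
joined_shut = on_added_to_vlag("DISABLED", timer_running=True)
left_ok = on_removed_from_vlag("ERRDISABLED", timer_running=True, can_link_up=True)
left_bad = on_removed_from_vlag("ERRDISABLED", timer_running=True, can_link_up=False)
```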
Operations on ISL 308 and Health Check 460 of the System 400:
For operations on the ISL 308 and Health Check 460, a single-failure scenario refers to the ISL 308 failure and a double-failure scenario refers to the ISL 308 failure and health check 460 failure.
ISL 308 Single-Failure Scenario:
In one embodiment, for the primary aggregator switch 330, the detection module 440 maintains the status of the networking ports 332 as ERRDISABLED, and the delay timer is cancelled if it is running.
ISL 308 is UP after ISL 308 Single-Failure Scenario:
In one embodiment, after a single-failure of the ISL 308, when the ISL 308 is brought to an UP state, there is no change of the networking ports 332 of the primary aggregator switch 330. For the secondary aggregator switch 335, the detection module 445 brings all of the vLAG networking ports (e.g., networking ports 337) to an UP state, one after the other, starts the start-up delay timer, and upon the delay timer expiring, does not change the status of the vLAG networking ports.
ISL 308 and Health Check 460 Double-Failure Scenario:
In one embodiment, for the networking ports 332/337 of the primary aggregator switch 330 and the secondary aggregator switch 335, no status changes are made to the vLAG networking ports. If the detection module 440/445 has the start-up delay timer running, the start-up delay timer is cancelled and the vLAG networking ports that were maintained in the ERRDISABLED state are brought to an UP state, one after another.
The ISL 308 is UP after a Double-Failure Scenario:
In one embodiment, for the networking ports 332/337 of the primary aggregator switch 330 and the secondary aggregator switch 335, no status changes are made to the vLAG networking ports. The detection module 440/445 starts the start-up delay timer, and upon expiration of the delay timer, the vLAG networking ports that were maintained in the ERRDISABLED state are brought to an UP state, one after another.
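The double-failure handling described above, and the subsequent recovery of the ISL 308, can be sketched as follows. The `VlagSwitch` class, its method names, and the boolean `timer_running` flag are hypothetical and model only the two double-failure paragraphs, not the single-failure cases.

```python
class VlagSwitch:
    """Hypothetical model of one aggregator switch's vLAG ports."""

    def __init__(self, port_states):
        self.ports = dict(port_states)
        self.timer_running = False

    def _release_held_ports(self):
        # Bring ERRDISABLED ports UP one after another, in port order.
        released = []
        for pid in sorted(self.ports):
            if self.ports[pid] == "ERRDISABLED":
                self.ports[pid] = "UP"
                released.append(pid)
        return released

    def on_double_failure(self):
        # ISL and health check both failed: no change unless the timer runs.
        if not self.timer_running:
            return []
        self.timer_running = False
        return self._release_held_ports()

    def on_isl_up_after_double_failure(self):
        # Ports are left as they are; only the start-up delay timer starts.
        self.timer_running = True

    def on_timer_expired(self):
        self.timer_running = False
        return self._release_held_ports()


booting = VlagSwitch({1: "ERRDISABLED", 2: "ERRDISABLED"})
booting.timer_running = True
released_on_failure = booting.on_double_failure()

recovering = VlagSwitch({1: "UP", 2: "ERRDISABLED"})
no_change = recovering.on_double_failure()
recovering.on_isl_up_after_double_failure()
released_on_expiry = recovering.on_timer_expired()
```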
As shown in
In process block 640, a first event (e.g., a delay timer) for delaying the startup of the first network device is activated upon the determination of a first status indication (e.g., an UP state) for the first link (e.g., ISL 308). In process block 650, a state of at least one of the first number of networking ports is switched to a first enabled state (e.g., UP) upon expiration of the delay timer.
In one example, process 600 may further include switching at least one of the first number of networking ports to a disabled state upon detecting a port operation, and upon expiration of a first delay value of the delay timer after detecting the port operation, at least one of the first number of networking ports is maintained in a disabled state. In another example, process 600 may include switching at least one of the first number of networking ports to an ERRDISABLED state upon detecting a first trunk operation and the at least one port was previously in an enabled state, and switching at least one of the first number of networking ports to an UP state or a DOWN state upon detecting a second trunk operation and the at least one port was previously in an ERRDISABLED state.
In another example, process 600 may include maintaining a state of at least one of the first number of networking ports upon detecting a first link failure (e.g., ISL 308 failure), and upon expiration of the first delay value of the delay timer after detecting the first link failure, all of the first number of networking ports in an ERRDISABLED state are changed to an UP/enabled state. A state of at least one of the first number and a second number of networking ports is maintained upon detecting the first link failure and a health-check failure, and upon expiration of the first delay value after detecting the first link failure and the health-check failure, all of the first number and second number of networking ports in an ERRDISABLED state are changed to an UP/enabled state.
According to various embodiments, the process 600 may be performed by a system, computer, or some other device capable of executing commands, logic, etc., as would be understood by one of skill in the art upon reading the present descriptions.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It should be emphasized that the above-described embodiments of the present invention, particularly, any “preferred” embodiments, are merely possible examples of implementations, merely set forth for a clear understanding of the principles of the invention.
Many variations and modifications may be made to the above-described embodiment(s) of the invention without departing substantially from the spirit and principles of the invention. All such modifications and variations are intended to be included herein within the scope of this disclosure and the present invention and protected by the following claims.
Claims
1-9. (canceled)
10: A virtual aggregation link system comprising:
- a first aggregation switch;
- a second aggregation switch coupled to the first aggregation switch with a first link; and
- at least one access switch coupled to the first aggregation switch with a second link and the second aggregation switch with a third link, the first access switch establishes a first virtual link with each one of the first aggregation switch and the second aggregation switch;
- a first detection module coupled with the first aggregation switch, wherein the first detection module maintains a first plurality of networking ports in a first disabled state, determines a link-status of the first link, and activates a first event upon the determination of a first status indication for the first link, and switches at least one of the first plurality of networking ports to a first enabled state upon detecting the first event.
11: The system of claim 10, wherein the first status indication comprises a first up-state status that indicates the first networking element is able to communicate with the second networking element via the first link.
12: The system of claim 11, wherein the link-status further comprises a first down-state status that indicates the first networking element is not able to communicate to the second networking element via the first link.
13: The system of claim 12, further comprising:
- a third networking element coupled to the first networking element using a second link, wherein the third networking element includes a third plurality of networking ports, and a third link couples the second networking element to the third networking; and
- a first virtual link established between the third networking element and each one of the first networking element and the second networking element using the third and second links.
14: The system of claim 10, wherein the first event is activated based at least in part on a first delay value.
15: The system of claim 14, further comprising:
- a second detection module coupled with the second aggregation switch, wherein the second detection module maintains the second plurality of networking ports in the first disabled state, determines the link-status of the first link, and activates the first event upon the determination of the first status indication for the first link.
16: A non-transitory computer-useable storage medium for virtual link aggregation, the computer-useable storage medium having a computer-readable program, wherein the program upon being processed on a computer causes the computer to implement:
- coupling a first networking system to a second networking system using a first link, wherein the first networking system includes a first plurality of networking ports, and the second networking system includes a second plurality of networking ports;
- rebooting the first networking system;
- maintaining the first plurality of networking ports in a first disabled state;
- determining a link-status of the first link, wherein the link-status includes a first up-state to indicate that the first networking system is able to communicate with the second networking system via the first link, and a first down-state to indicate that the first networking system is not able to communicate to the second networking system via the first link;
- activating a first event upon the determination of a first up-state; and
- switching at least one of the first plurality of networking ports to a first enabled state upon detecting the first event.
17: The program of claim 16, coupling the first networking system to a third networking system using a second link, wherein the third networking system includes a third plurality of networking ports;
- coupling the second networking system to the third networking system using a third link; and
- establishing a first virtual link between the third networking system and each one of the first networking system and the second networking system using the third and second links.
18: The program of claim 16, wherein the activating of the first event is based at least in part on a first programmable delay value.
19: The program of claim 18, further causing the computer to implement:
- switching at least one of the first plurality of networking ports to a disabled state upon detecting a port operation, and upon expiration of the first programmable delay value after detecting the port operation, the at least one of the first plurality of networking ports remains in the disabled state;
- switching at least one of the first plurality of networking ports to an errdisabled state upon detecting a first trunk operation and the at least one port was previously in an enabled state;
- switching at least one of the first plurality of networking ports to an up state or a down state upon detecting a second trunk operation and the at least one port was previously in an errdisabled state;
- maintaining a state of at least one of the first plurality of networking ports upon detecting a first link failure, and upon expiration of the first programmable delay value after detecting the first link failure, all of the first plurality of networking ports in an errdisabled state change to an enabled state; and
- maintaining a state of at least one of the first plurality and the second plurality of networking ports upon detecting the first link failure and a health-check failure, and upon expiration of the first programmable delay value after detecting the first link failure and the health-check failure, all of the first plurality and second plurality of networking ports in an errdisabled state change to an enabled state.
Type: Application
Filed: Jan 4, 2013
Publication Date: Jul 10, 2014
Applicant: International Business Machines Corporation (Armonk, NY)
Inventors: Lei Bao (Wuxi), Gangadhar Hariharan (Santa Clara, CA), Tamanna Z. Sait (San Jose, CA), Venkatesan Selvaraj (Sunnyvale, CA)
Application Number: 13/734,654
International Classification: H04L 12/24 (20060101);