System management controller (SMC) negotiation protocol for determining the operational mode of SMCs
A computer system module includes a system management controller to negotiate with other system management controllers to determine the controller's initial operational state. In an embodiment, negotiation with other system management controllers is based at least in part on one of controller capability, user configured preferences, module type, and geographical address.
This application is a continuation of U.S. patent application Ser. No. 10/092,793, filed on Mar. 8, 2002, now U.S. Pat. No. 7,058,703, the entire disclosure of which is incorporated herein by reference.
FIELD OF THE INVENTION
Embodiments of the present invention relate to system management. In particular, embodiments of the present invention relate to a negotiation protocol for determining the operational mode of system management controllers.
BACKGROUND
Computers and other electronic systems contain various components that may malfunction during the life of the system. In order to reduce and/or remedy such malfunctions, some systems include built-in features such as the ability to monitor and control the “health” or performance of the system hardware. Such features are sometimes referred to as system management, but also may be referred to by other names such as management, hardware management, platform management, etc. System management features may include, for example, the monitoring of elements such as temperatures, voltages, fans, power supplies, bus errors, system physical security, etc. In addition, system management features may also include determining information that helps identify a failed hardware component, and issuing an alert specifying that a component has failed.
One of the components that may be used to handle system management functions is a system management controller (also referred to herein as a “controller”). A system management controller may be a microprocessor, micro-controller, application specific integrated circuit (ASIC), or other type of processing unit that controls system management tasks. A system management controller may perform tasks such as receiving system management information, sending messages to control system performance, logging system management information, etc. For example, a management controller may receive an indication from a temperature sensor that system temperature is rising, may send a command to increase fan speed, and may log the temperature reading.
One of the management controllers in a system may perform the role of the central system management controller and perform central management functions such as, for example, logging events, collecting field replaceable unit (FRU) inventory information, providing a user interface, providing a host CPU interface, etc. The central management controller for a system may be referred to as the baseboard management controller (BMC) for the system. Other non-central management controllers may be referred to as satellite management controllers (SMCs). An SMC may perform system management for a particular part or feature of a system. For example, a computer system may contain a number of circuit boards and other components that are connected by busses, with one board containing a BMC for that system and other boards containing SMCs that perform other system management functions.
Some system management controllers have the ability to operate in a BMC mode or in an SMC mode (i.e., to perform in the role of a BMC or an SMC). In some prior systems, a system management controller that is attached to a circuit board may adapt its functionality based on the slot in which that board is inserted. In such a system, a specific slot in a system chassis may be reserved for a board that performs the BMC functionality for that system and may have a pin that provides such an indication to the resident module. In this case, a system management controller may determine upon reset if it is in the BMC slot, and if so may set itself to act as the BMC (i.e., set itself to BMC mode). In such systems, a person assembling the system or changing circuit boards may need to determine which slot is the BMC slot and to ensure that a board with desired BMC capabilities is placed in the appropriate BMC slot.
DESCRIPTION OF THE DRAWINGS
According to embodiments of the present invention, a system management controller negotiates with other system management controllers to determine an initial operational mode (e.g., the mode after a reset or other initiation). Such negotiation may be accomplished, for example, by sending messages between system management controllers. In an embodiment, a system management controller determines after a reset that its initial operational mode is central management controller mode (e.g., BMC mode) based upon the lack of a response to one or more controller mode requests sent by that system management controller. In another embodiment, the initial mode for the system management controller may be based upon the content of a response received by that system management controller.
Embodiments of the present invention provide a controller mode negotiation protocol. In an embodiment, each system management controller in the system is adapted to perform the negotiation protocol. The negotiation protocol may be performed for events such as system initiation or when a single system management controller performs a reset. For example, when a system is powered on, each system management controller in the system may send a controller mode request to other system management controllers according to the negotiation protocol, and may transition to an initial mode based upon a response to the controller mode request. The negotiation protocol may also define the protocol for a system management controller to respond to a mode request that it receives. In embodiments of the present invention, controllers transition through a series of negotiation states which may include: request, wait, SMC, standby BMC, and active BMC. In embodiments, the negotiation may be based at least in part on criteria such as controller capability, user configured preference, module type, and geographic (physical) address.
Each module shown in system 100 contains a system management controller (113, 123, 133, 143) and a computer readable medium (115, 125, 135, 145). Each system management controller may be a processor that is capable of performing system management functions as discussed above. Each computer readable medium may be any type of medium capable of storing instructions, such as a read only memory (ROM), a programmable read only memory (PROM), or an erasable programmable read only memory (EPROM). In an embodiment, the computer readable medium is a non-volatile memory. Each computer readable medium in
In one example of the operation of an embodiment of the present invention, system management controller 113 may execute mode negotiation protocol instructions 117 to negotiate with system management controllers 123, 133, and 143 to determine the initial system management mode for one or more of system management controllers 113, 123, 133, and 143. In an embodiment, possible system management modes for a controller may be active-BMC mode, standby-BMC mode, and SMC mode. In this embodiment, the active BMC may perform the BMC functions for the system, while the standby-BMC may be available to become the active-BMC in case of failure of the current active-BMC (e.g., may receive and log the same management information as the active-BMC). In other embodiments, there may be more or fewer possible management modes.
In the simplest embodiment, which is discussed with reference to
Management controllers, such as those shown in system 200, may be capable of operating in one, some, or all of BMC mode, standby-BMC mode, or SMC mode. For example, BMC 215 may also be capable of operating as a standby-BMC or an SMC, standby-BMC 225 may also be capable of operating as a BMC or an SMC, and SMC 235 may only be capable of operating as an SMC. In other embodiments, for example, SMC 235 may be capable of operating as a BMC, and/or BMC 215 may not be capable of operating as an SMC.
Fan tray module 240 is shown as including a new system management controller 245. This controller has been labeled as “new” for the purposes of illustration to show a case where one of the system management controllers is being initialized while the other system management controllers have already assumed an operational mode. A management controller may be initialized, for example, when the entire system is turned on or reset, or (in the case of
As shown in
The negotiation protocol may also define the action taken upon receipt of a mode response (or failure to receive a mode response). For example, the protocol may provide that a controller transitions to the SMC state upon receipt of a GoToSMC command. As another example, which is discussed with reference to
In the embodiment shown in
If a response is not received within a timeout period (e.g., 100 ms) (304 and 306), then the controller may determine if a retry limit has been reached (307). If the retry limit has not been reached, then the controller may transition back to the request state, may send another controller mode request, and may wait as discussed above. In an embodiment, the retry limit may be three retries. Of course, other timeout periods and retry limits may be used. If the retry limit has been reached, the controller may set itself to active-BMC mode (308). After assuming BMC mode, the controller may then process requests from other controllers (309) in addition to performing the BMC functions. Thus, in this embodiment, if a controller does not receive a response to a controller mode request, it may assume the BMC mode. The priority may be based on many different factors such as, for example, those discussed below with reference to
In the example discussed above, all of the controllers but one have previously assumed an operational mode. However, the method shown in
In a further embodiment, there is an absence of a response to a mode request if a threshold number of requests have been sent by the controller without receiving a response within a timeout period.
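The timeout-and-retry behavior described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `negotiate_initial_mode`, the two callback parameters, and the string labels are hypothetical, and the sketch assumes that "three retries" means one initial request followed by up to three further requests. The 100 ms timeout and the retry limit of three are the example values given in the text.

```python
from typing import Callable, Optional

TIMEOUT_S = 0.1   # 100 ms, the example timeout period given in the text
RETRY_LIMIT = 3   # three retries, the example retry limit given in the text


def negotiate_initial_mode(
    send_request: Callable[[], None],
    wait_for_response: Callable[[float], Optional[str]],
    timeout_s: float = TIMEOUT_S,
    retry_limit: int = RETRY_LIMIT,
) -> str:
    """Return the first response received, or "ACTIVE_BMC" if none arrives.

    `send_request` and `wait_for_response` are hypothetical transport hooks;
    `wait_for_response` returns None when the timeout period expires.
    """
    # Assumption: one initial request plus up to `retry_limit` retries.
    for _attempt in range(1 + retry_limit):
        send_request()
        response = wait_for_response(timeout_s)
        if response is not None:
            return response
    # No controller answered within the retry limit: assume active-BMC mode.
    return "ACTIVE_BMC"
```

A caller would pass in whatever bus-specific send and receive routines the module provides; the loop itself does not depend on the transport.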
According to the embodiment shown in
Thus, according to an embodiment of the invention, a response that is sent back to the sender of the controller mode request may be based at least in part on the current state of the receiver. The response may be based at least in part on the controller mode capabilities of the receiver and may be based at least in part on a user-configured mode preference. The method shown in
As shown in
In an embodiment, the capability field 512 may indicate the system management mode capabilities of the controller that sends controller mode request 510. In an embodiment, the available capability sets are BMC-Only, BMC/SMC, and SMC-Only. In a further embodiment, BMC-Only is the highest priority and SMC-Only is the lowest priority. In an embodiment, the only module which can be BMC-Only is a module that is dedicated to be the central management agent for the chassis, which may be referred to as a “Chassis Management Module” (CMM), and is designed for star or hybrid topologies.
In an embodiment, controllers with the BMC/SMC capability set (i.e., controllers that may act as either BMC or SMC) may optionally implement a user configuration feature to allow a user to specify a preference of BMC, SMC, or no preference. A user may input such a preference using, for example, a BIOS set-up option, a software setting, a DIP switch, a jumper setting, or running or loading software. This information may be included in user preference field 513 of controller mode request 510. In an embodiment, modules that do not implement the user configuration preference feature, including BMC-Only and SMC-Only modules, may report no preference. In an embodiment, BMC-only is the highest priority, no preference is the middle priority, and SMC only is the lowest priority. Because different module types may have different geographic address domains, in embodiments module type may be used in determining prioritization. In an embodiment, different available values for the module type field 514, in order from lowest to highest priority, are power module, other chassis specific types, fan tray, node board, switch board, and dedicated CMM. Of course, other module types and other orders of priority may be used.
The geographic address field 515 may contain the geographic address (e.g., slot address) for the module of which the controller is a part. In an embodiment, when a comparison of other criteria results in a tie, the controller with the lower geographic address is determined to have the higher priority. In a further embodiment, controllers in the BMC states may also use the geographic address to decide how to respond. For example, BMCs may use geographic address to determine which module should be active after an initial power up.
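As an illustration of how the four request fields might be combined into a relative priority, the following sketch compares them in the order capability, user preference, module type, geographic address. The class and function names are hypothetical. The strict field-by-field ordering of the first three criteria is an assumption (the text fixes only that geographic address breaks ties, with the lower address winning), as is the reading that a user preference of BMC ranks above no preference, which ranks above a preference of SMC.

```python
from dataclasses import dataclass
from enum import IntEnum


class Capability(IntEnum):       # field 512; larger value = higher priority
    SMC_ONLY = 0
    BMC_SMC = 1
    BMC_ONLY = 2


class Preference(IntEnum):       # field 513; ordering is an assumed reading
    SMC = 0
    NONE = 1
    BMC = 2


class ModuleType(IntEnum):       # field 514; lowest to highest, as listed
    POWER_MODULE = 0
    OTHER_CHASSIS_SPECIFIC = 1
    FAN_TRAY = 2
    NODE_BOARD = 3
    SWITCH_BOARD = 4
    DEDICATED_CMM = 5


@dataclass(frozen=True)
class ModeRequest:
    capability: Capability
    preference: Preference
    module_type: ModuleType
    geographic_address: int      # field 515, e.g. a slot address

    def priority_key(self, use_geographic_address: bool = True) -> tuple:
        """Tuple that sorts higher for a higher-priority controller."""
        key = (self.capability, self.preference, self.module_type)
        if use_geographic_address:
            # Lower address wins a tie, so negate it for comparison.
            key += (-self.geographic_address,)
        return key


def higher_priority(a: ModeRequest, b: ModeRequest) -> bool:
    """True if controller `a` outranks controller `b`."""
    return a.priority_key() > b.priority_key()
```

Passing `use_geographic_address=False` gives the "priority without geographic address" comparison that some responses rely on.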
As discussed above, controllers receiving a controller mode request may respond based on their current state and the requestor's priority relative to their own.
In an embodiment, when a controller comes out of reset, it enters the request state 620. In the request state, the controller may broadcast a controller mode request and wait for responses. Other controllers receiving the controller mode request may respond according to their current state and their priority relative to the requestor. According to an embodiment, the negotiation protocol supports prioritization so modules that are not capable of acting as SMCs will take precedence as the BMC over modules that are capable of acting as SMCs. BMC priority may be based on capabilities, preference settings, module type, and geographic address. In an embodiment, if no response is received to the controller mode request (after retries), the requestor may set itself to the active-BMC mode. Otherwise the requestor may be told what mode to run in via a GoToSMC response. In an embodiment, the controller may also receive an unsolicited message which was not sent in response to a particular controller mode request and which requests that the controller assume a certain mode. Such an unsolicited message may be referred to as a set mode command. In an embodiment, set mode commands are sent by a BMC during operation of the system to make changes to controller modes after initial modes have been assumed.
The various state transitions according to embodiments of the invention will now be described in more detail. After a controller in the request state broadcasts a controller mode request it may receive one or more responses such as a GoToSMC response (622) or a wait response (621). If a GoToSMC response is received, the controller may transition to the SMC state (640). If a wait response is received, the controller may transition to the Wait state (630). If no response is received after timeouts and retries, the controller may transition to the Active BMC state (624). In addition, a controller in the request state may receive one or more set mode commands that may instruct the controller to go to the standby-BMC mode (623) or may instruct the controller to go to the SMC mode (622).
In the embodiment shown, a controller that is in the Wait state 630 may wait to receive a set mode command or a GoToSMC response. If a GoToSMC response is received, the controller may transition to the SMC state (632). If a set mode command is received the controller may transition to the appropriate state specified in the set mode command (e.g., transitions to standby-BMC state 633 or transitions to SMC state 632). In this embodiment, if neither a GoToSMC response nor a set mode command is received within a timeout period, the controller may transition back to the request state (631), where it may re-broadcast the controller mode request.
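The transitions out of the request and Wait states described above can be written as a small lookup table. This is a sketch only: the enum and function names are hypothetical, and leaving the state unchanged for any unlisted state/event pair is an assumption made here, not something the text specifies.

```python
from enum import Enum, auto


class State(Enum):
    REQUEST = auto()
    WAIT = auto()
    SMC = auto()
    STANDBY_BMC = auto()
    ACTIVE_BMC = auto()


class Event(Enum):
    GO_TO_SMC_RESPONSE = auto()
    WAIT_RESPONSE = auto()
    NO_RESPONSE_AFTER_RETRIES = auto()
    SET_MODE_STANDBY = auto()   # unsolicited set mode command
    SET_MODE_SMC = auto()       # unsolicited set mode command
    TIMEOUT = auto()            # Wait-state timeout


# (current state, event) -> next state, for the request and Wait states only.
TRANSITIONS = {
    (State.REQUEST, Event.GO_TO_SMC_RESPONSE): State.SMC,
    (State.REQUEST, Event.WAIT_RESPONSE): State.WAIT,
    (State.REQUEST, Event.NO_RESPONSE_AFTER_RETRIES): State.ACTIVE_BMC,
    (State.REQUEST, Event.SET_MODE_STANDBY): State.STANDBY_BMC,
    (State.REQUEST, Event.SET_MODE_SMC): State.SMC,
    (State.WAIT, Event.GO_TO_SMC_RESPONSE): State.SMC,
    (State.WAIT, Event.SET_MODE_STANDBY): State.STANDBY_BMC,
    (State.WAIT, Event.SET_MODE_SMC): State.SMC,
    (State.WAIT, Event.TIMEOUT): State.REQUEST,
}


def next_state(state: State, event: Event) -> State:
    """Look up the next state; unlisted pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Keeping the transitions in a table rather than nested conditionals makes it straightforward to add the SMC, standby-BMC, and active-BMC rows for a particular embodiment.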
A controller in the SMC state may act as a satellite management controller. As shown in
A controller in the Standby BMC state may act as a standby BMC. As discussed above, in an embodiment a controller in the standby BMC state may maintain synchronized state information with the active BMC and may perform a watchdog function for the active BMC. In a further embodiment, the standby BMC shall transition to the active BMC state (652) if the active BMC fails. Depending on the management topology and installed modules, a new standby BMC may be selected upon a failure of the active BMC. As shown in
In the active-BMC state, the controller may perform normal BMC functions. In an embodiment, the active BMC may select a standby BMC that is appropriate for the topology, and may synchronize state information with the standby BMC. In the embodiment discussed above, the BMC(s) are ultimately responsible for telling the other negotiating controllers to go to the SMC state. In one embodiment, for example where a dual bus topology is used, the controllers may be told to go to SMC mode by a BMC only after a standby BMC has been established. In this embodiment, controllers are prevented from reaching the SMC state before a standby BMC is established. If the active BMC were to fail before establishing a standby and all other controllers had reached the SMC state, the system may be left without a BMC. In an embodiment, CMMs that are specifically designed for star or hybrid topologies may tell other non-CMM modules to go to the SMC state prior to establishing a standby CMM, because only another star or hybrid CMM can be the standby BMC. In an embodiment, the active-BMC may transition to the standby-BMC state upon receipt of a standby set mode command (662), which may occur for example when there is a user-triggered switch of the standby-BMC to active BMC mode (which may be known as a “failover”), if a controller of a priority higher than the standby-BMC is hot-swapped in, or for other reasons.
According to an embodiment, controllers receiving a controller mode request (i.e., the receiver) may respond to the requestor (i.e., the controller that sent the request) as shown in the following Table I. This table shows 15 different cases. As shown below, the response may be based on the receiver's state and the requestor's relative priority. In some cases the response depends upon the requestor's relative geographic priority, and in some cases the response depends upon whether a standby-BMC has already been established. In Table I, the designation “X” indicates that for this case the content of the response is not based on this criterion. Relative controller priorities may be determined based on capability, user preference, module type, and geographic address (GA) as discussed, for example, with regard to
In the first three cases in Table I, the receiver of the controller mode request is the active-BMC. If the requestor's priority (without geographic address) is higher than the receiver's priority, then a wait response may be sent. Examples of situations where the requestor may have a higher priority than the active BMC are where the requestor was hot-swapped in or where the requestor took a relatively long time to come out of reset. A requestor with a higher priority than the active BMC may be sent to the Wait state, rather than directly becoming the active-BMC, so that it may become synced before changing to the active-BMC. If the requestor's priority (without geographic address) is equal to or lower than the receiver's priority, and a standby-BMC has been established, then a wait response or GoToSMC response may be sent. The wait response may be issued if the requestor is to become the new standby-BMC. If a standby-BMC has not been established, then in an embodiment the receiver may only issue a GoToSMC response if the active BMC is a CMM specifically designed for a star or hybrid topology and the requestor is not a CMM; otherwise, the receiver may issue a wait response.
In cases 4-6 of Table I, the receiver is in the standby-BMC state (which by definition means that a standby-BMC was established). In the embodiment shown, a wait response may be sent if the requestor's priority (without geographic address) is higher than the receiver's priority, and a GoToSMC response may be sent if the requestor's priority (without geographic address) is lower than the receiver's priority. A GoToSMC response will generally be sent if the requestor's priority (without geographic address) is equal to the receiver's priority, but a wait response may be sent in this case when, for example, there has been a decision made to change the standby-BMC.
In the remaining cases 7-15 in Table I, the response is not dependent upon whether a standby has been established. In case 7, the receiver is in the SMC state, and no response is sent regardless of relative priority. Thus, in this embodiment a controller in the SMC state does not respond to a controller mode request. In cases 8-11, the receiver is in the Wait state, and the geographic priority is used to break ties. In these cases, no response is sent if the requestor has a higher priority, and a wait response is sent if the requestor has a lower priority. Finally, in cases 12-15, the receiver is in the request state, and the geographic priority is used to break ties. In these cases, if the requestor has a higher priority, no response is sent and the receiver sets itself to the Wait state. If the receiver is in the request state and the requester has a lower priority, a wait response is sent.
Table I represents only one embodiment of a negotiation protocol according to the present invention. In other embodiments, other receiver states may be available, and the responses may be different in one or more of the cases.
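The response rules described above for each receiver state can be condensed into a single function, sketched below. Table I itself is not reproduced; the sketch follows only the prose description of its cases. The function name, parameter names, and string labels are hypothetical. Two simplifications are assumptions: the star/hybrid CMM condition is treated as sufficient (not merely necessary) for a GoToSMC response when no standby exists, and the choice to issue a wait response to a prospective new standby is passed in as a flag rather than derived.

```python
from typing import Optional, Tuple


def select_response(
    receiver_state: str,
    rel_priority: int,            # requestor vs receiver, without geographic address
    rel_priority_with_ga: int,    # same comparison, with geographic address tie-break
    standby_established: bool = False,
    receiver_is_star_cmm: bool = False,
    requester_is_cmm: bool = False,
    requester_becomes_standby: bool = False,
) -> Tuple[Optional[str], str]:
    """Return (response or None, receiver's next state).

    Relative priority is > 0 if the requestor is higher, 0 if equal,
    and < 0 if the requestor is lower.
    """
    if receiver_state == "ACTIVE_BMC":
        if rel_priority > 0:
            return "WAIT", receiver_state
        if standby_established:
            reply = "WAIT" if requester_becomes_standby else "GO_TO_SMC"
            return reply, receiver_state
        if receiver_is_star_cmm and not requester_is_cmm:
            return "GO_TO_SMC", receiver_state
        return "WAIT", receiver_state

    if receiver_state == "STANDBY_BMC":
        if rel_priority > 0:
            return "WAIT", receiver_state
        if rel_priority == 0 and requester_becomes_standby:
            return "WAIT", receiver_state
        return "GO_TO_SMC", receiver_state

    if receiver_state == "SMC":
        return None, receiver_state          # an SMC never responds

    if receiver_state in ("WAIT", "REQUEST"):
        if rel_priority_with_ga > 0:
            # Higher-priority requestor: stay silent; a receiver in the
            # request state also drops back to the Wait state.
            return None, "WAIT"
        return "WAIT", receiver_state

    raise ValueError(f"unknown receiver state: {receiver_state}")
```

Returning the receiver's next state alongside the response captures the one case in which handling a request also changes the receiver's own state.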
In embodiments disclosed above, a controller that does not receive any responses to the controller mode request and to retries (and has not set itself to the Wait state) may set itself to the active BMC state. In the embodiment shown, controllers are only told to go to SMC mode by a BMC when there is an established standby-BMC to prevent controllers from reaching the SMC state before a standby BMC can be established. If the active BMC were to fail before establishing a standby, and all other controllers had reached the SMC state, the system may be left without a BMC. Use of the mode negotiation protocol disclosed in embodiments of the present invention may automatically determine which controllers will be the active and standby BMCs while avoiding conflicts between controllers.
Several examples of embodiments of the present invention are specifically illustrated and/or described herein. However, it will be appreciated that modifications and variations of the present invention are covered by the above teachings and within the purview of the appended claims without departing from the spirit and intended scope of the invention. For example, the priority determination and protocol for responding to a request may differ from that shown above. As another example, the system management negotiation protocol may be embodied in hardware or software.
Claims
1. A system comprising: a first system management controller capable of negotiating with at least a second system management controller to determine an initial operational mode of said first system management controller, said first system management controller comprising an input/output port and being capable of sending mode request messages that comply with the Intelligent Platform Management Interface specification via said input/output port to negotiate with said second system management controller to determine said initial operational mode of said first system management controller.
2. The system of claim 1, wherein an available initial operational mode for said first system management controller includes active baseboard management controller mode, standby baseboard management controller mode, and satellite management controller mode.
3. The system of claim 1, wherein said initial operational mode of said first system management controller is based on, at least in part, one of controller mode capability, user configuration preference, module type, and geographical address.
4. The system of claim 1, wherein said initial operational mode of said first system management controller is based on, at least in part, a response, or a lack of response, to said mode request sent via said input/output port.
5. The system of claim 1, comprising a plurality of system management controllers, said first system management controller determining said initial operational mode based on, at least in part, responses, or lack of responses, to said mode request by said plurality of system management controllers.
6. The system of claim 1, wherein the first system management controller further comprises a second input/output port to send a duplicate copy of system management messages to other system management controllers.
7. A machine-readable medium having stored thereon instructions to be executed by a first system management controller, when executed, said instructions causing the first system management controller to:
- transition from a reset to a request state; and
- send a controller mode request to other system management controllers to determine an initial operational mode of said first system management controller, said controller mode request comprising a message that complies with the Intelligent Platform Management Interface specification.
8. The machine-readable medium of claim 7, wherein said instructions also include instructions to select an initial operational mode based on, at least in part, a received reply, or a lack of a received reply, to said controller mode request from said other system management controllers.
9. The machine-readable medium of claim 8, wherein said instructions include instructions to select an active baseboard mode of operation if no reply to said controller mode request is received.
10. The machine-readable medium of claim 8, wherein said instructions include instructions to select a satellite management controller mode of operation in response to a go to satellite management controller mode reply received from an active baseboard management controller or from a standby baseboard management controller.
11. The machine-readable medium of claim 7, wherein said instructions also include instructions to cause the system management controller to:
- determine that a mode request has been received from a second system management controller;
- send a wait response to the second system management controller if the second system management controller has a lower priority than the first system management controller and the first system management controller is in either a request state or a wait state; and
- send a wait response to the second system management controller if the second system management controller has a higher priority than the first system management controller and the first system management controller is in either an active baseboard management controller state or a standby baseboard management controller state.
12. The machine-readable medium of claim 11, wherein the instructions also include instructions to cause the system management controller to:
- determine the relative priority of the first system management controller and second system management controller based on at least one of controller mode capability, user-configured preference, module type, and geographical address.
13. The machine-readable medium of claim 11, wherein the instructions also include instructions to cause the first system management controller to send one of a wait response and a go to satellite management controller mode to the second system management controller if the second system management controller has an equal or lower priority than the first system management controller and the first system management controller is in either active baseboard management controller state or standby baseboard management controller state.
14. The machine-readable medium of claim 11, wherein the instructions also include instructions to cause the system management controller to:
- determine that no response should be sent to the second system management controller if the second system management controller has a higher priority than the first system management controller and the first system management controller is in either a request state or a wait state; and
- determine that no response should be sent to the second system management controller if the first system management controller is in a satellite management controller state.
15. A method of determining an initial operational mode of a first system management controller comprising:
- transitioning said first system management controller from a reset state to a request state;
- sending a mode request from said first system management controller to at least a second system management controller, said mode request sent as a message that complies with the Intelligent Platform Management Interface specification; and
- selecting an initial operational mode of said first system management controller based on, at least in part, a response, or a lack of a response, from said second system management controller to said mode request, said response or lack of response being based on, at least in part, a negotiation protocol state of said second system management controller.
16. The method of claim 15, wherein said response, or lack of response, from said second system management controller is based on, at least in part, the relative priority of said first and second system management controllers.
17. The method of claim 16, wherein said relative priority of said first and second system management controllers is based on at least one of a controller mode capability, a user preference, a controller's module type, or a controller geographical address.
18. The method of claim 15, wherein said initial operation mode of said first system management controller is one of an active baseboard management controller mode, a standby baseboard management controller mode, and a satellite management controller mode.
19. The method of claim 18, wherein an active baseboard management controller mode of operation is selected in the event of a lack of response to the mode request.
20. The method of claim 18, wherein a satellite management controller mode of operation is selected in the event of a go to satellite management controller response to said mode request that is sent by said second system management controller having an active baseboard management controller mode or a standby baseboard management controller mode.
Type: Application
Filed: Jun 6, 2006
Publication Date: Oct 5, 2006
Applicant:
Inventor: Peter Hawkins (San Luis Obispo, CA)
Application Number: 11/447,399
International Classification: G06F 15/177 (20060101);