SYSTEM WITH FABRIC MODULES

In some examples, a chassis contains a fabric module and a plurality of node modules that are arranged in a plurality of rows. The fabric module is positioned in a space between a first row and a second row of the plurality of rows, and the fabric module is connected to at least two node modules of the plurality of node modules to provide communications connectivity between the at least two node modules, the chassis to accept longitudinal insertion in a longitudinal direction of the plurality of node modules and the fabric module, the fabric module being removable in the longitudinal direction from the chassis by moving the fabric module in the space between the first row and the second row without first removing the node modules in the plurality of rows.

Description
CROSS REFERENCE TO RELATED APPLICATIONS

This is a continuation of U.S. application Ser. No. 13/808,507, filed Jan. 4, 2013, which is a national stage application under 35 U.S.C. §371 of PCT/US2010/048970, filed Sep. 15, 2010, both hereby incorporated by reference.

BACKGROUND

A typical blade system includes a chassis for holding several “blades”. Each blade can include one or more processor nodes, each of which includes one or more processors and associated memory. The chassis can include a backplane that provides power and input/output (I/O) connectivity, including network connectivity and inter-blade connectivity. In some blade systems, front connector bars spanning two or more blades provide or supplement inter-blade connectivity. In some blade systems, the inter-blade connectivity provides for cache coherent operation among processor nodes associated with different blades. This allows a set of blades to operate as a single more powerful computer rather than as a network of separate computers that happen to be in the same chassis. Blade systems can be upgraded conveniently by swapping previous-generation blades with more capable newer-generation blades. In this sense, blade systems provide a hedge against obsolescence and allow a customer's investment to be amortized over a longer time, decreasing the overall cost of ownership.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a modular computer system in accordance with an embodiment.

FIG. 2 is a schematic diagram of a fabric module of the computer system of FIG. 1.

FIG. 3 is a flow chart of a process implementable in the computer system of FIG. 1.

FIG. 4 is a flow diagram for a module swap operation and showing front and side views of a blade computer system in accordance with an embodiment.

FIG. 5 is a front schematic view of a chassis of the computer system of FIG. 4.

FIG. 6 is a schematic view of a blade of the computer system of FIG. 4.

FIG. 7 is a schematic view of a fabric module of the computer system of FIG. 4.

FIG. 8 is a schematic view of a fabric module of the computer system of FIG. 4.

FIG. 9 is a schematic view of a fabric module of the computer system of FIG. 4.

FIG. 10 is a flow chart of a process implementable in the system of FIG. 4.

FIG. 11 is a schematic diagram for a fabric module installable in the system of FIG. 4.

FIG. 12 is a schematic diagram for a fabric module installable in the system of FIG. 4.

DETAILED DESCRIPTION

A modular computer system 100, shown in FIG. 1, includes a chassis 101, node modules N1 and N2, and a fabric module 107 that provides routing 109 over which node modules N1 and N2 communicate with each other. To this end, fabric module 107 includes connectors C1 and C2 on its top face 109 for connecting to respective node modules, e.g., node modules N1 and N2, as shown in FIG. 2. Fabric module 107 can be inserted or removed at a process segment 301, in the same longitudinal dimension (into or out of the page given the front view of FIG. 1) in which node modules N1 and N2 can be removed. Node modules N1 and N2 can be run cooperatively, e.g., as a single computer, with fabric module 107 installed at a process segment 302 (which can occur before a removal or after an insertion of process segment 301).

Herein, “module” refers to a hardware entity that can be inserted into a chassis. The terms “node module” and “fabric module” are defined relative to each other so that a “fabric module” is a module configured to provide communications connectivity between or among node modules. The node modules can include general-purpose or application-specific computer modules (providing processing, storage, and communications, e.g., I/O and networking), modules that emphasize one data handling function, e.g., storage modules, connectivity modules (e.g., network switches dedicated to one or more network layers), and application-specific hardware (e.g., sensors and controllers). A “processor module” is a computer module including one or more processor nodes (each of which can include one or more processors and memory). “Blade” herein refers to a type of node module having a physically “thin” configuration.

Unlike systems in which connectivity is provided by a backplane or a midplane, system 100 permits connectivity to be upgraded without replacing an entire chassis. Fabrics typically can accommodate a small number of computer (e.g., processor) upgrades, but eventually are outstripped by the capabilities of the newer computers. Using replaceable fabric modules allows the chassis to be retained, e.g., for a decade or more rather than for just a few years, through more generations of upgrades. Furthermore, fabric module 107 does not impede front-to-back airflow. Relative to systems that provide inter-blade connectivity using front connector bars, serviceability is improved as fabric module 107 does not have to be removed to replace a connected node module.

A blade system 400 includes a chassis 401, a vertically adjacent pair of rows of blades (node modules) B10 and B20, and fabric modules FM1, FM2, and FM3, as shown in FIG. 4. Row B10 includes blades B11, B12, B13, and B14. Row B20 includes blades B21, B22, B23, and B24. Other embodiments provide for different numbers of rows and different numbers of blades per row. Also, alternative embodiments provide for different numbers of fabric modules for a given number of blade rows. For example, top and bottom fabric modules, such as FM1 and FM3 in FIG. 4, can be omitted.

As shown in the bottom half of FIG. 4 for blade B24 and fabric module FM2, blades B11-B14, blades B21-B24, and fabric modules FM1-FM3 are all removable and insertable through the front 403 of chassis 401 using longitudinal motions. Herein, a horizontal-vertical-longitudinal coordinate system is used. “Horizontal” is the dimension in which the blades of a row are spaced; for example, blades B11-B14 are arranged left to right along the horizontal dimension. “Vertical” is the dimension in which a fabric module is spaced from a row of blades connected to it; for example, fabric module FM1, row B10, fabric module FM2, row B20, and fabric module FM3 are arranged bottom to top along the vertical dimension. “Longitudinal” refers to the dimension of insertion for modules. In the illustrated embodiment, these dimensions are substantially orthogonal to each other; in other words, each pair of dimensions defines an acute or right angle of more than 45°, so that they are more orthogonal than aligned. Herein, terms such as “front”, “rear”, “top”, “bottom”, “left”, “right”, “above”, and “below” are to be interpreted in the context of the coordinate system.
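The “substantially orthogonal” criterion can be illustrated numerically. The following is a minimal sketch, assuming each dimension is represented by a direction vector; the function names and the sample vectors are hypothetical and are not taken from the drawings.

```python
import math


def angle_between_deg(u, v):
    """Return the acute-or-right angle (0 to 90 degrees) between two axes."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # An axis has no sign, so |cos| folds the angle into the 0..90 range.
    cos_theta = min(1.0, abs(dot) / norm)
    return math.degrees(math.acos(cos_theta))


def substantially_orthogonal(axes):
    """True if every pair of axes meets at an angle of more than 45 degrees."""
    return all(
        angle_between_deg(axes[i], axes[j]) > 45.0
        for i in range(len(axes))
        for j in range(i + 1, len(axes))
    )


# Hypothetical direction vectors for the three dimensions.
horizontal = (1.0, 0.0, 0.0)
vertical = (0.0, 1.0, 0.0)
longitudinal = (0.0, 0.2, 1.0)  # insertion axis assumed slightly tilted
```

Under these assumed vectors, a slightly tilted insertion axis still satisfies the criterion, whereas two axes only about 27° apart would not.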

Chassis 401, shown separately in FIG. 5, provides blade (node module) slots 501 and fabric-module slots 503 for receiving, guiding, and securing blades and fabric modules. In some embodiments, other node modules, e.g., network modules, can be installed in the vertical blade slots, with computer-network linkages through fabric modules. In addition, chassis 401 provides power connections 505 for all modules (including blades) and data connections, e.g., with peripherals and networks, for all blades. Other embodiments include cam and clamping mechanisms for securing modules and for effecting connections between blades and fabric modules. Also, in some embodiments, the chassis includes integrated cooling, e.g., fans or liquid cooling features. Also, in some embodiments, management features are provided, e.g., virtual ports.

Blades B12-B14 and B21-B24 are similar to B11, described with reference to FIG. 6. Blade B11 is a processor module including two processor nodes P1 and P2. In alternative embodiments, blades can include one processor node or more than two processor nodes; different blades can have different numbers of processors and amounts of memory per node. Processor node P1 includes four processors CP1, memory (RAM) ME1, and storage ST1. Processor node P2 includes four processors CP2, memory (RAM) ME2, and storage ST2. Processor nodes P1 and P2 are coupled so that they can operate coherently, i.e., as though memories ME1 and ME2 constituted a unified memory that can be addressed directly by any processor of nodes P1 and P2.

Blade B11 includes a top connector 611 and a bottom connector 613 for connecting to vertically adjacent fabric modules, e.g., modules FM2 and FM1, respectively, in FIG. 4. Each blade can be configured so that zero, one, or both of its top and bottom connectors are active. Blade B11 also includes on its rear face 635 power and network connectors 621 and 623, respectively, for receiving power and establishing specialized connectivity, e.g., for management. Connectors 611 and 613 are on the top 631 and bottom 633 of blade B11, respectively, more toward the rear face 635 of blade B11 than its front face 637.

Various embodiments employ different techniques to achieve connectivity with communication signals: (a) card edge connector, where the card edge of the blade module slides between two adjacent connectors, (b) cam down or up actuation, where the blade slides in and is pressed down against a traditional connector on the shelf via a cam action, (c) flex-cable connector, where the blade slides in and the connector's flex cable extension is pressed down or up against a shelf using a cam action, and (d) optically interconnected systems where the shelf contains optical “traces” (e.g., waveguides) and the connection may be made with a slide-by, self-aligning set of optical connectors.

Fabric module FM2 is represented in FIG. 7. Routing glue logic 701 provides a switched star topology for full 8×8 routing. (Connectors shown using dashed lines are on the bottom of fabric module FM2). A particular routing can be selected via a front panel of fabric module FM2 or by a management station connected to blade system 400. In addition to routing, routing glue logic 701 provides other glue functions such as signal buffering and snoop filtering.

Fabric module FM1 is represented in FIG. 8. It is adapted to couple the two rightmost blades and the two leftmost blades in a row. Fabric module FM3 is depicted in FIG. 9. It is adapted to connect four blades in a row. In an alternative embodiment, a fabric module has the pattern and connectors shown in FIG. 8 on its top face and the pattern and connectors of FIG. 9 on its bottom face; other alternative embodiments can have different patterns on their top and bottom faces. In some such embodiments, a connection configuration can be changed by inverting a fabric module with different patterns (and connectors) on its top and bottom faces even though the fabric module itself is not dynamically reconfigurable.
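The effect of inverting a fixed-pattern fabric module can be sketched in code. This is a minimal illustration only, assuming four slots per row numbered 0 to 3 from left to right; the pattern encodings, class name, and helper functions are hypothetical. Because inversion is a 180° rotation about a longitudinal axis, the sketch also mirrors each pattern left to right, which makes no difference for the symmetric patterns used here.

```python
from dataclasses import dataclass

SLOTS_PER_ROW = 4  # slots 0..3 run left to right, as for blades B11-B14

# Hypothetical encodings of the face patterns described for FIGS. 8 and 9.
PAIRS_PATTERN = ((0, 1), (2, 3))   # two leftmost blades, two rightmost blades
ROW_PATTERN = ((0, 1, 2, 3),)      # all four blades in the row


def mirrored(pattern):
    """Mirror a pattern left to right, as seen after a 180-degree roll."""
    groups = [tuple(sorted(SLOTS_PER_ROW - 1 - s for s in g)) for g in pattern]
    return tuple(sorted(groups))


@dataclass(frozen=True)
class FixedFabricModule:
    """A non-reconfigurable module with one connection pattern per face."""
    top: tuple
    bottom: tuple

    def inverted(self):
        """Swap the faces; the rotation also mirrors each pattern."""
        return FixedFabricModule(top=mirrored(self.bottom),
                                 bottom=mirrored(self.top))


def blade_groups(face_pattern, row):
    """Translate slot groups on one face into groups of blade names."""
    return [tuple(row[s] for s in group) for group in face_pattern]
```

For example, a module built as `FixedFabricModule(top=PAIRS_PATTERN, bottom=ROW_PATTERN)` groups a row as two pairs through its top face, and after `inverted()` presents the four-blade pattern on that face instead.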

In practice, computer system 400, FIG. 4, could be operated with the blades communicatively connected to fabric module FM2 and not communicatively coupled to fabric modules FM1 and FM3, e.g., at a process segment 1001 of a process 1000 flow charted in FIG. 10. To prepare for repairing or replacing fabric module FM2, process segment 1002 provides for breaking the connection between fabric module FM2 and any blades connected to it. If it is desired to run some of these blades coherently, they can be reconnected at process segment 1003 using other fabric modules, e.g., fabric modules FM1 and FM3. This may not result in the exact configuration provided by fabric module FM2, but may still improve upon running blades individually.

Fabric module FM2 can be removed longitudinally at process segment 1004. The removed fabric module can be repaired, inverted, or replaced at process segment 1005. Once the repair is complete, the module is inverted, or a replacement is found, the resulting fabric module can be inserted longitudinally at process segment 1006. To effect coherent processing via connections of the newly inserted fabric module, communicative connections to the other fabric modules can be broken and communicative connections to the newly inserted module established at process segment 1007.
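The sequence of process segments 1002 through 1007 can be summarized as a small simulation. This is a sketch under stated assumptions, not a description of any management firmware: the function name, the dictionary-based bookkeeping, and the name given to the replacement module are all hypothetical.

```python
def swap_fabric_module(installed, links, target, fallback_for, replacement):
    """Walk through segments 1002-1007 for one fabric module swap.

    installed    -- set of fabric module names currently in the chassis
    links        -- dict mapping blade name to its active fabric module
    target       -- fabric module to be removed
    fallback_for -- dict mapping blade name to an interim fabric module
    replacement  -- name of the repaired, inverted, or new fabric module
    Returns the list of process segment numbers performed, in order.
    """
    log = []
    affected = sorted(b for b, fm in links.items() if fm == target)

    for blade in affected:          # 1002: break connections to the target
        links[blade] = None
    log.append(1002)

    for blade in affected:          # 1003: optionally reconnect elsewhere
        links[blade] = fallback_for.get(blade)
    log.append(1003)

    installed.remove(target)        # 1004: remove the target longitudinally
    log.append(1004)

    log.append(1005)                # 1005: repair, invert, or replace (offline)

    installed.add(replacement)      # 1006: insert the resulting module
    log.append(1006)

    for blade in affected:          # 1007: move connections to the new module
        links[blade] = replacement
    log.append(1007)
    return log
```

In this sketch the blades stay in the chassis throughout; only the bookkeeping of which fabric module each blade is actively connected to changes.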

A couple of other fabric modules FM4 and FM5 are shown in FIGS. 11 and 12, respectively. Fabric module FM4 provides four-point connections between vertical pairs of two-node blades. Thus, for example, the two processor nodes of blade B11 and the two processor nodes of blade B21 are grouped to form a four processor-node set that runs coherently. Fabric module FM5 provides eight-point connections for left and right blocks of four blades.
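The groupings attributed to fabric modules FM4 and FM5 can be enumerated as follows. This is an illustrative sketch that assumes two processor nodes per blade and reads “left and right blocks of four blades” as two-by-two blocks spanning both rows; the node labels such as `B11.P1` and the function names are hypothetical.

```python
ROW_B10 = ["B11", "B12", "B13", "B14"]
ROW_B20 = ["B21", "B22", "B23", "B24"]


def nodes(blade):
    """Return hypothetical labels for the two processor nodes of a blade."""
    return [f"{blade}.P{i}" for i in (1, 2)]


def fm4_groups(row_a, row_b):
    """Four-point groups: one vertical pair of two-node blades per group."""
    return [nodes(a) + nodes(b) for a, b in zip(row_a, row_b)]


def fm5_groups(row_a, row_b):
    """Eight-point groups: left and right blocks of four blades each."""
    half = len(row_a) // 2
    left = row_a[:half] + row_b[:half]
    right = row_a[half:] + row_b[half:]
    return [[n for blade in block for n in nodes(blade)]
            for block in (left, right)]
```

With the rows above, `fm4_groups(ROW_B10, ROW_B20)` yields four sets of four processor nodes, the first pairing blades B11 and B21, while `fm5_groups(ROW_B10, ROW_B20)` yields two sets of eight.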

Blade system 400 provides for fabric modules of many other glued and glueless configurations. Also, communicative connections can be electrical, optical or both. Some fabric modules may provide for different connection configurations by inverting the fabric module vertically, i.e., installing it after a 180° rotation about a longitudinal axis. Some fabric modules may have fixed configurations; others may be reconfigurable while installed. Connections can be made using more than one fabric module by using both top and bottom blade connectors. Such connections can be made through the processor nodes of a blade. Also, a specially designed blade can allow a direct connection between upper and lower fabric modules. For example, blade B21 of FIG. 4 provides for a direct connection between fabric modules FM2 and FM3.

System 400 thus provides for a customer installable fabric module “shelf”; in general, this can mean that the cost for high speed blade-to-blade connectivity is only paid when it is used. Connected blades can be independently serviced or upgraded. One chassis can be enabled for a variety of configurations depending on the nature of the fabric modules installed. Also, in its horizontal orientation, a fabric module does not impede front-to-back airflow, unlike a traditional front-plane or mid-plane. Furthermore, the interconnect shelf itself can be independently serviced or upgraded.

In other blade systems, blades may be swappable through the front face of a chassis while fabric modules are swappable through the rear. This configuration can allow connections between fabric and blades that are effected by abutting instead of sliding by each other. Some blade systems use one-sided patterns; to connect to blades both above and below, a pair of fabric modules can be inserted back-to-back. The back-to-back connection can provide inter-layer connections between the fabric modules and thus between rows of blades. Some blade systems use both horizontal and vertical fabric modules (and corresponding slots), defining a rectangular array of “cubbie holes”, each of which can hold multiple compute and network blades.

In some embodiments, a chassis configured to hold processor modules and one or more fabric modules is unpopulated, i.e., does not presently hold any processor modules or fabric modules. In other embodiments, such a chassis is partially or completely populated by other types of modules, e.g., modules having primary purposes, e.g., data input or output or both, other than data processing or inter-module routing. “Coherently” herein applies to a multi-processor system for which multiple processors treat a common or collective memory as a unified whole having a single consistent state. “Cooperatively” means working interactively (either coherently or non-coherently) to accomplish a goal.

Herein, a “system” is a set of interacting elements, wherein the elements can be, by way of example and not of limitation, mechanical components, electrical elements, atoms, the physically encoded forms of instructions encoded in storage media, and process segments. In this specification, related art is discussed for expository purposes. Related art labeled “prior art”, if any, is admitted prior art. Related art not labeled “prior art” is not admitted prior art. The illustrated and other described embodiments, as well as modifications thereto and variations thereupon are within the scope of the following claims.

Claims

1. A system comprising:

a chassis;
a fabric module contained inside the chassis; and
a plurality of node modules contained inside the chassis and arranged in a plurality of rows, the fabric module positioned in a space between a first row and a second row of the plurality of rows, the fabric module physically connected to at least two node modules of the plurality of node modules to provide communications connectivity between the at least two node modules, the chassis to accept longitudinal insertion in a longitudinal direction of the plurality of node modules and the fabric module, the fabric module being removable in the longitudinal direction from the chassis by moving the fabric module in the space between the first row and the second row without first removing the node modules in the plurality of rows.

2. The system of claim 1, wherein the chassis is to provide power to the plurality of node modules when held by the chassis.

3. The system of claim 1, wherein the node modules include processor modules, and the fabric module provides for coherent processing between at least two of the processor modules.

4. The system of claim 1, wherein the chassis is to hold vertically adjacent pairs of horizontal rows of node modules and plural fabric modules.

5. The system of claim 1, wherein the at least two node modules comprise processor modules, and the communications connectivity provided by the fabric module between the processor modules provides for coherent processing between the processor modules.

6. The system of claim 5, wherein the fabric module is to provide for different groupings of coherent processor modules in response to commands without being removed from the chassis.

7. The system of claim 5, wherein the fabric module is to provide for glueless coherent processing between processor modules of different rows of the plurality of rows.

8. The system of claim 1, wherein the fabric module is to provide for different groupings of coherent processor modules according to a physical inversion state of the fabric module.

9. The system of claim 1, wherein the fabric module is a first fabric module, the system further comprising a second fabric module, and wherein after the first fabric module is removed from the chassis and disconnected from the at least two node modules, the at least two node modules are communicatively connected to the second fabric module to establish communications connectivity with the second fabric module.

10. A method comprising:

providing, in a chassis, a fabric module;
providing, in the chassis, a plurality of node modules arranged in a plurality of rows;
positioning the fabric module in a space between a first row and a second row of the plurality of rows; and
connecting the fabric module to at least two node modules of the plurality of node modules to provide communications connectivity between the at least two node modules, the chassis to accept longitudinal insertion in a longitudinal direction of the plurality of node modules and the fabric module, the fabric module being removable in the longitudinal direction from the chassis by moving the fabric module in the space between the first row and the second row without first removing the node modules in the plurality of rows.

11. The method of claim 10, wherein the first and second rows are vertically adjacent rows of node modules.

12. The method of claim 10, further comprising, while the fabric module is in the space, reconfiguring the fabric module so as to change a grouping of node modules.

13. The method of claim 10, further comprising communicatively coupling the fabric module and another fabric module through one of the plurality of node modules disposed between the fabric modules.

14. The method of claim 10, further comprising after removing a first fabric module of the fabric modules, reinserting, into the chassis, the first fabric module inverted relative to its previous orientation.

15. The method of claim 9, wherein at least some of the node modules are processor modules, and the fabric module provides for glueless coherent running of a group of the processor modules.
Patent History
Publication number: 20160113143
Type: Application
Filed: Dec 21, 2015
Publication Date: Apr 21, 2016
Inventors: Martin Goldstein (Campbell, CA), Dale C. Morris (Steamboat Springs, CO), Michael R. Krause (Boulder Creek, CO)
Application Number: 14/977,095
Classifications
International Classification: H05K 7/14 (20060101); B23P 19/00 (20060101);