Continuing from my last blog about InfiniBand building blocks, let's now review the network switches used inside Oracle's Engineered Systems in a little more detail. This will help you understand the overall integration, network design, architecture, and troubleshooting in later articles.
Two categories of network switches are used to build the computing environment inside the rack.
InfiniBand Switch
- Sun Oracle 36-port InfiniBand Switch
- Sun Oracle InfiniBand Gateway Switch
Ethernet Switch - primarily for management purposes
- Cisco Catalyst 4948
The following table will get you started quickly and save me a lot of writing.
Let me first give you some more insight into the InfiniBand switches, and then we will talk about the Cisco Catalyst 4948. The following picture shows the 36-port IB switch. The Gateway switch looks similar, with a slight difference for the EoIB ports on the extreme right.
Common information that applies to both of these InfiniBand switches
As you might have figured out by now, the IB Gateway switch is almost a superset of the 36-port switch in terms of features and capabilities.
Differences between 36-port and Gateway InfiniBand switches
Compared with the Gateway switch, the 36-port switch has four additional IB ports. On the Gateway switch these four ports are consumed internally to enable Ethernet over InfiniBand (EoIB) functionality. I am sure you are wondering how this is done. The simple explanation is that there are two additional hardware devices installed inside the IB Gateway switch. These are called Bridge-X, and each of them connects internally to the InfiniBand fabric via two IB ports. Hence the math of 36-4=32 that I showed in the table above. Towards the external world, they expose the EoIB ports as 0A-ETH and 1A-ETH in QSFP+ form factor. But not all devices in the Ethernet world understand QSFP+, and 40Gbps Ethernet is not commonly used either, so each of these is split into four (4) SFP+ links at a 10Gbps signalling rate each. That's why the final port labels on the EoIB side are 0A-ETH-[N] and 1A-ETH-[N], where N is a fixed value from 1 to 4.
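To make the port arithmetic and the label scheme concrete, here is a minimal Python sketch. It only restates the numbers given above; the constant and function names are my own illustrative assumptions and do not come from any Oracle tool or API.

```python
# Illustrative sketch only: names below are assumptions, not an Oracle API.
TOTAL_IB_PORTS = 36                 # IB ports on the 36-port switch
CONNECTOR_PREFIXES = ("0A", "1A")   # one external QSFP+ EoIB connector per Bridge-X
IB_PORTS_PER_BRIDGE = 2             # each Bridge-X consumes two IB ports internally
LINKS_PER_CONNECTOR = 4             # each QSFP+ is split into four 10Gbps SFP+ links


def gateway_ib_ports() -> int:
    """IB ports that remain available externally on the Gateway switch."""
    return TOTAL_IB_PORTS - len(CONNECTOR_PREFIXES) * IB_PORTS_PER_BRIDGE


def eoib_port_labels() -> list[str]:
    """Labels of the form 0A-ETH-[N] and 1A-ETH-[N], with N from 1 to 4."""
    return [
        f"{prefix}-ETH-{n}"
        for prefix in CONNECTOR_PREFIXES
        for n in range(1, LINKS_PER_CONNECTOR + 1)
    ]


if __name__ == "__main__":
    print(gateway_ib_ports())
    print(eoib_port_labels())
```

Running it prints the remaining IB port count followed by the eight EoIB port labels, four per QSFP+ connector.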
Why do we have two Ethernet ports on the InfiniBand switches?
For those who have seen or will get their hands on these two InfiniBand switches, let me clarify something about the Ethernet management port. Visually, you will see two RJ45 ports on the switch, but there is only one target interface inside. A small bridge inside the switch connects to the management Ethernet interface and provides two connections to the outside world. No, this is not for redundancy or high availability. It is there to let you create a linear bus topology if you need one. In simple terms, you can daisy-chain more than one such switch.
What about these Leaf and Spine switches?
Okay, now that I have talked about these two InfiniBand switches, let me introduce you to two keywords that you will be hearing a lot. This will set the ground for further discussions.
These are roles of a switch in the topology or connectivity layout. I may write more about topologies later, but for now let's keep this blog short, concise, and in the context of Oracle's Engineered Systems.
A switch to which hosts are directly connected takes up the role of Leaf Switch.
A switch that has no hosts directly attached, but does have inter-switch links (ISLs) to provide alternate paths or to expand the fabric, takes up the role of Spine Switch.
In Exadata and SuperCluster racks, both roles are provided by 36-port InfiniBand switches.
In Exalogic racks, the Leaf role is provided by Gateway switches, whereas the Spine role is provided by a 36-port switch.
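The two role definitions boil down to a simple rule, which can be sketched in Python as follows. The `Switch` class, the `role` function, and the switch and host names are all hypothetical and exist only for illustration; the example layout mirrors the Exalogic case described above.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Hypothetical model of a switch; not taken from any Oracle tooling."""
    name: str
    hosts: list[str] = field(default_factory=list)  # directly attached hosts
    isls: list[str] = field(default_factory=list)   # peer switches reached via ISLs


def role(switch: Switch) -> str:
    """Leaf if hosts are directly attached; spine if it only carries ISLs."""
    if switch.hosts:
        return "leaf"
    if switch.isls:
        return "spine"
    return "unconnected"


# Exalogic-style example: two Gateway switches as leaves, one 36-port switch as spine.
gw1 = Switch("gateway-1", hosts=["host-A", "host-B"], isls=["spine-36p"])
gw2 = Switch("gateway-2", hosts=["host-A", "host-B"], isls=["spine-36p"])
spine = Switch("spine-36p", isls=["gateway-1", "gateway-2"])

if __name__ == "__main__":
    for sw in (gw1, gw2, spine):
        print(sw.name, role(sw))
```

Note that the role depends on how a switch is cabled, not on its model, which is why a 36-port switch can act as either a leaf or a spine.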
How is the InfiniBand connectivity and topology built out?
Assume every host has one dual-port HCA installed in a PCI-E slot. Connect port-1 of each host to the designated leaf switch-1 with an IB cable. When you are done, you have a star topology. Now repeat the same with port-2, but this time use the designated leaf switch-2. Each host is now connected to two leaf switches via independent ports. This sets up your dual star topology. But wait, we also need some inter-switch links. Why? To guarantee communication when the topology becomes asymmetric. For example, host A may be using port-1 while host B has switched to port-2 for some reason. Without a path between the two leaf switches, those two ports could not reach each other.
Inter-switch links may be as simple as cables between two leaf switches, or they may go through another switch, which is known as the Spine switch. I will not go into micro-level details here; you can read more about how ISLs are chosen for the various rack configurations in the respective product guides.
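The asymmetric case is easy to check with a toy model. The sketch below is only an illustration under simplifying assumptions: two leaf switches, a single direct ISL between them (no spine), and host ports treated as endpoints that do not forward traffic. The function names and node names are hypothetical.

```python
from collections import deque


def build_fabric(hosts, isl=True):
    """Return an adjacency map for a dual-star fabric with two leaf switches."""
    graph = {"leaf-1": set(), "leaf-2": set()}
    for host in hosts:
        # port-1 always goes to leaf-1, port-2 always goes to leaf-2
        for port, leaf in (("port-1", "leaf-1"), ("port-2", "leaf-2")):
            node = f"{host}:{port}"
            graph.setdefault(node, set()).add(leaf)
            graph[leaf].add(node)
    if isl:
        # a single direct inter-switch link between the two leaves
        graph["leaf-1"].add("leaf-2")
        graph["leaf-2"].add("leaf-1")
    return graph


def reachable(graph, src, dst):
    """Breadth-first search: is there any path from src to dst?"""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in graph[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False


if __name__ == "__main__":
    for isl in (False, True):
        fabric = build_fabric(["host-A", "host-B"], isl=isl)
        ok = reachable(fabric, "host-A:port-1", "host-B:port-2")
        print(f"ISL present: {isl}, A port-1 reaches B port-2: {ok}")
```

In this model, host A on port-1 can reach host B on port-2 only when the ISL is present, which is exactly the situation the inter-switch links are there to cover.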
Cisco Catalyst 4948
Each host and endpoint has a management network port, which is always Ethernet based. The Cisco 4948 switch connects all such management ports inside the rack. Everything is pre-wired, and all you need to do is connect an uplink from this Cisco switch to your data center access switch. Be careful: do not connect two cables to your data center access switch without planning for Spanning Tree Protocol, because two uplinks can form a layer-2 loop. This switch is fully managed and also provides VLAN capabilities based on the 802.1Q specification. By default, all hosts inside the rack that are connected to this switch are on the same VLAN.
Overall Network Design
At a very high level, we have the following setup:
Next time, I will talk more about the virtual networks that are carried over this physical network. Thanks for reading and I welcome all your comments and questions.