High-availability Networking for Solaris Containers
By Jeff Victor-Oracle on Mar 21, 2008
Here's another example of Containers that can manage their own affairs.
Sometimes you want to closely manage the devices that a Solaris Container uses. This is easy to do from the global zone: by default a Container does not have direct access to devices. It does have indirect access to some devices, e.g. via a file system that is available to the Container.
By default, zones use NICs that they share with the global zone, and perhaps with other zones. In the past these were just called "zones." Starting with Solaris 10 8/07, these are now referred to as "shared-IP zones." The global zone administrator manages all networking aspects of shared-IP zones.
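For contrast, a shared-IP zone's networking is defined entirely in the global zone. A minimal sketch of what that looks like in zonecfg (the zone name, NIC and address here are hypothetical, not from a real system):

```
# Shared-IP zone: the global zone assigns the NIC and address;
# the zone itself cannot change them.
zonecfg:myzone> add net
zonecfg:myzone:net> set physical=bge0
zonecfg:myzone:net> set address=192.168.1.50/24
zonecfg:myzone:net> end
```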
Sometimes it would be easier to give direct control of a Container's devices to its owner. An excellent example of this is the option of allowing a Container to manage its own network interfaces. This enables it to configure IP Multipathing for itself, as well as IP Filter and other network features. Using IPMP increases the availability of the Container by creating redundant network paths to the Container. When configured correctly, this can prevent the failure of a network switch, network cable or NIC from blocking network access to the Container.
As described at docs.sun.com, to use IP Multipathing you must choose two network devices of the same type, e.g. two Ethernet NICs. Those NICs are placed into an IPMP group with the ifconfig(1M) command. Usually this is done by placing the appropriate ifconfig parameters into files named /etc/hostname.<NIC-instance>, e.g. /etc/hostname.bge0.
An IPMP group is associated with an IP address. Packets leaving any NIC in the group have a source address of the IPMP group. Packets with a destination address of the IPMP group can enter through either NIC, depending on the state of the NICs in the group.
Delegating network configuration to a Container requires use of the new IP Instances feature. It's easy to create a zone that uses this feature, making this an "exclusive-IP zone." One new line in zonecfg(1M) will do it:
    zonecfg:twilight> set ip-type=exclusive

Of course, you'll need at least two network devices in the IPMP group. Using IP Instances will dedicate these two NICs to this Container exclusively. Also, the Container will need direct access to the two network devices. Configuring all of that looks like this:
    global# zonecfg -z twilight
    zonecfg:twilight> create
    zonecfg:twilight> set zonepath=/zones/roots/twilight
    zonecfg:twilight> set ip-type=exclusive
    zonecfg:twilight> add net
    zonecfg:twilight:net> set physical=bge1
    zonecfg:twilight:net> end
    zonecfg:twilight> add net
    zonecfg:twilight:net> set physical=bge2
    zonecfg:twilight:net> end
    zonecfg:twilight> add device
    zonecfg:twilight:device> set match=/dev/net/bge1
    zonecfg:twilight:device> end
    zonecfg:twilight> add device
    zonecfg:twilight:device> set match=/dev/net/bge2
    zonecfg:twilight:device> end
    zonecfg:twilight> exit

As usual, the Container must be installed and booted with zoneadm(1M):
    global# zoneadm -z twilight install
    global# zoneadm -z twilight boot

Now you can login to the Container's console and answer the usual configuration questions:
    global# zlogin -C twilight
    <answer questions>
    <the zone automatically reboots>

After the Container reboots, you can configure IPMP. There are two methods: one uses link-based failure detection and one uses probe-based failure detection.
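If you would rather skip the interactive questions, a sysidcfg(4) file dropped into the zone before its first boot answers them automatically, just as it does for a standalone system. A sketch, in which every value is a placeholder to adapt (the network_interface is set to NONE because we will configure IPMP by hand afterward):

```
# Hypothetical sysidcfg; place it at /zones/roots/twilight/root/etc/sysidcfg
# before the first boot. Timezone and password hash are placeholders.
system_locale=C
terminal=xterm
security_policy=NONE
name_service=NONE
nfs4_domain=dynamic
timezone=US/Eastern
root_password=<encrypted-password-from-/etc/shadow>
network_interface=NONE
```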
Link-based detection requires the use of a NIC which supports this feature. Some NICs that support this are hme, eri, ce, ge, bge, qfe and vnet (part of Sun's Logical Domains). They are able to detect failure of the link immediately and report that failure to Solaris. Solaris can then take appropriate steps to ensure that network traffic continues to flow on the remaining NIC(s).
Other NICs do not support link-based failure detection, and must use probe-based detection instead. This method sends ICMP packets ("pings") from each NIC in the IPMP group to detect failure of a NIC. It requires one test IP address per NIC, in addition to the data IP address of the group; the test addresses do not fail over.
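Probe-based detection is performed by the in.mpathd daemon, whose behavior can be tuned in /etc/default/mpathd. A sketch showing the settings I believe are the Solaris 10 defaults:

```
# /etc/default/mpathd
# Time (in ms) taken to detect an interface failure.
FAILURE_DETECTION_TIME=10000
# Fail back to a repaired interface automatically.
FAILBACK=yes
# Only probe interfaces that belong to an IPMP group.
TRACK_INTERFACES_ONLY_WITH_GROUPS=yes
```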
Regardless of the method used, configuration can be accomplished manually or via files /etc/hostname.<NIC-instance>. First I'll describe the manual method.
Link-based Detection

Using link-based detection is easiest. The commands to configure IPMP look like these, whether they're run in an exclusive-IP zone for itself, or in the global zone, for its NICs and for NICs used by shared-IP Containers:
    # ifconfig bge1 plumb
    # ifconfig bge1 twilight group ipmp0 up
    # ifconfig bge2 plumb
    # ifconfig bge2 group ipmp0 up

Note that those commands only achieve the desired network configuration until the next time that Solaris boots. To make the same configuration take effect at every boot, you must put the same parameters into configuration files. That is also easy:
/etc/hostname.bge1:

    twilight group ipmp0 up

/etc/hostname.bge2:

    group ipmp0 up

Those two files will be used to configure networking the next time that Solaris boots. Of course, an IP address entry for twilight is required in /etc/inet/hosts.
If you have entered the ifconfig commands directly, you are finished. You can test your IPMP group with the if_mpadm(1M) command; run it in the global zone to test a global-zone IPMP group, or in an exclusive-IP zone to test one of that zone's groups:
    # ifconfig -a
    ...
    bge1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 4
            inet 10.4.1.81 netmask ffff0000 broadcast 10.4.255.255
            groupname ipmp0
            ether 0:14:4f:f8:9:1d
    bge2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5
            inet 0.0.0.0 netmask ff000000
            groupname ipmp0
            ether 0:14:4f:fb:ca:b
    ...
    # if_mpadm -d bge1
    # ifconfig -a
    ...
    bge1: flags=289000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,OFFLINE,CoS> mtu 0 index 4
            inet 0.0.0.0 netmask 0
            groupname ipmp0
            ether 0:14:4f:f8:9:1d
    bge2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5
            inet 0.0.0.0 netmask ff000000
            groupname ipmp0
            ether 0:14:4f:fb:ca:b
    bge2:1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5
            inet 10.4.1.81 netmask ffff0000 broadcast 10.4.255.255
    ...
    # if_mpadm -r bge1
    # ifconfig -a
    ...
    bge1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 4
            inet 10.4.1.81 netmask ffff0000 broadcast 10.4.255.255
            groupname ipmp0
            ether 0:14:4f:f8:9:1d
    bge2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5
            inet 0.0.0.0 netmask ff000000
            groupname ipmp0
            ether 0:14:4f:fb:ca:b
    ...

If you are using link-based detection, that's all there is to it!
Probe-based Detection

As mentioned above, using probe-based detection requires more IP addresses:
/etc/hostname.bge1:

    twilight netmask + broadcast + group ipmp0 up \
    addif twilight-test-bge1 deprecated -failover netmask + broadcast + up
/etc/hostname.bge2:

    twilight-test-bge2 deprecated -failover netmask + broadcast + group ipmp0 up

Three entries for hostname and IP address pairs will, of course, be needed in /etc/inet/hosts.
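Those three entries might look like this in /etc/inet/hosts (the addresses are hypothetical; the two test addresses must be on the same subnet as the data address):

```
# /etc/inet/hosts (hypothetical addresses)
10.4.1.81   twilight             # data address of the IPMP group
10.4.1.82   twilight-test-bge1   # test address for bge1 (does not fail over)
10.4.1.83   twilight-test-bge2   # test address for bge2 (does not fail over)
```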
All that's left is a reboot of the Container. If a reboot is not practical at this time, you can accomplish the same effect by using ifconfig(1M) commands:
    twilight# ifconfig bge1 plumb
    twilight# ifconfig bge1 twilight netmask + broadcast + group ipmp0 up \
        addif twilight-test-bge1 deprecated -failover netmask + broadcast + up
    twilight# ifconfig bge2 plumb
    twilight# ifconfig bge2 twilight-test-bge2 deprecated -failover netmask + \
        broadcast + group ipmp0 up
Whether link-based failure detection or probe-based failure detection is used, we have a Container with these network properties:
- Two network interfaces
- Automatic failover between the two NICs