IPMP Development Update
By meem on Feb 06, 2007
A number of people have sent me emails asking for updates on the Next-Generation IPMP work. In short, there's a lot to do, but development is progressing smoothly and early-access bits are on the horizon. At this point, one can:
- Create, destroy, and reconfigure IPMP groups with arbitrary numbers of interfaces and IP addresses, using either the legacy or new administrative model.
- Load-spread inbound and outbound traffic across the interfaces and addresses. As per the new model, all IP addresses are hosted on "IPMP" interfaces and the kernel handles the binding of IP addresses to interfaces in the group internally. There is no longer a visible concept of failover or failback.
- Use in.mpathd to track the failure and repair of interfaces. It notifies the kernel of these changes so that the kernel can update its interface-to-address bindings.
- Use if_mpadm to offline and undo-offline interfaces. Again, this causes the kernel to update its interface-to-address bindings.
To illustrate where I'm at, let me use last night's build to show the lay of the land. (What's been implemented is almost identical to what was proposed in the high-level design document -- so please consult that document for additional background.) For starters, one can use the old IPMP administrative commands as before -- e.g., to create a two-interface group with two IP data addresses:
# ifconfig ce0 plumb group ipmp0 10.8.57.34/24 up
# ifconfig ce1 plumb group ipmp0 10.8.57.210/24 up

But what you end up with looks a bit different:
# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 0.0.0.0 netmask ff000000
        groupname ipmp0
        ether 0:3:ba:94:3b:74
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000
        groupname ipmp0
        ether 0:3:ba:94:3b:75
ipmp0: flags=8001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,IPMP> mtu 1500 index 3
        inet 10.8.57.34 netmask ffffff00 broadcast 10.8.57.255
        groupname ipmp0
ipmp0:1: flags=8001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,IPMP> mtu 1500 index 3
        inet 10.8.57.210 netmask ffffff00 broadcast 10.8.57.255

Above, we can see that ifconfig has created an ipmp0 interface for IPMP group ipmp0 and placed the two data addresses that we configured onto it. The ce0 and ce1 interfaces have no actual addresses configured on them (though they would if we'd configured test addresses), but are marked UP so that they can be used to send and receive traffic. Note that ipmp0 is marked with a special IPMP flag to indicate that it is an IP interface that represents an IPMP group.
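Since nothing above summarizes group membership directly, here's a quick throwaway sketch (my own, not part of the IPMP tooling) that pulls the members of a group out of ifconfig -a output. The sample data is hard-coded from the transcript above, and the groupname/IPMP-flag layout is assumed to be exactly as shown there:

```shell
#!/bin/sh
# Sample `ifconfig -a` output, copied from the transcript above (abbreviated).
ifconfig_out='ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 0.0.0.0 netmask ff000000
        groupname ipmp0
        ether 0:3:ba:94:3b:74
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000
        groupname ipmp0
        ether 0:3:ba:94:3b:75
ipmp0: flags=8001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,IPMP> mtu 1500 index 3
        inet 10.8.57.34 netmask ffffff00 broadcast 10.8.57.255
        groupname ipmp0'

# Print the underlying interfaces in group ipmp0, skipping the IPMP
# interface itself (identified by the IPMP flag on its header line).
printf '%s\n' "$ifconfig_out" | awk -v group=ipmp0 '
  /^[a-z]/ { iface = $1; sub(/:$/, "", iface); ipmp = ($0 ~ /IPMP/) }
  $1 == "groupname" && $2 == group && !ipmp { print iface }'
```

On the sample data this prints ce0 and ce1, the two interfaces placed into the group.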
Though the legacy configuration works, we will recommend configuring IPMP through the new model, since it better expresses the intent. The same configuration as above would be achieved instead by doing:
# ifconfig ipmp0 ipmp 10.8.57.34/24 up addif 10.8.57.210/24 up
# ifconfig ce0 plumb group ipmp0 up
# ifconfig ce1 plumb group ipmp0 up

Note the presence of the ipmp keyword, which tells ifconfig that the interface represents an IPMP group. Because of this keyword, an IPMP interface can actually be given any valid unused IP interface name -- e.g., ifconfig xyzzy0 ipmp will create an IPMP interface named xyzzy0. This follows the Project Clearview tenet that IP interface names must not be tied to the interface type -- which in turn allows one to roll out new networking technologies without disturbing the system's higher-level network configuration.
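For persistence across reboots, the new model would presumably fit the existing /etc/hostname.&lt;interface&gt; boot-time files, whose contents are handed to ifconfig at startup. A sketch only -- the file names follow today's convention, but whether this build wires up the ipmp keyword at boot is my assumption, not something shown above:

```
# /etc/hostname.ipmp0 (hypothetical -- creates the IPMP interface and data addresses)
ipmp 10.8.57.34/24 up addif 10.8.57.210/24 up

# /etc/hostname.ce0 (hypothetical -- places ce0 into the group)
group ipmp0 up

# /etc/hostname.ce1 (hypothetical -- places ce1 into the group)
group ipmp0 up
```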
In general, an IPMP interface can be used like any other IP interface -- e.g., to create a default route through ipmp0, we can do:
# route add default 10.8.57.248 -ifp ipmp0

We can also examine the ARP table to see how ipmp0's IP addresses are currently distributed across the IP interfaces in the group (once development is complete, the new ipmpstat utility will make this easier):
# arp -an | grep ipmp0
ipmp0 10.8.57.34 255.255.255.255 SPLA 00:03:ba:94:3b:74
ipmp0 10.8.57.210 255.255.255.255 SPLA 00:03:ba:94:3b:75

Here, we see that 10.8.57.34 is using ce0's hardware address, and 10.8.57.210 is using ce1's hardware address. If we offline ce0, the kernel changes the bindings:
# if_mpadm -d ce0
# arp -an | grep ipmp0
ipmp0 10.8.57.34 255.255.255.255 SPLA 00:03:ba:94:3b:75
ipmp0 10.8.57.210 255.255.255.255 SPLA 00:03:ba:94:3b:75

One interesting consequence of the new design is that it's possible to remove all of the interfaces in a group and still preserve the IPMP group configuration. For instance:
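To make the rebinding easy to eyeball, a one-liner (again my own sketch, with the post-offline arp output above hard-coded as sample data) can tally how many data addresses each hardware address carries -- after the offline, both land on ce1's MAC:

```shell
#!/bin/sh
# Sample `arp -an | grep ipmp0` output after ce0 is offlined (copied from above).
arp_after='ipmp0 10.8.57.34 255.255.255.255 SPLA 00:03:ba:94:3b:75
ipmp0 10.8.57.210 255.255.255.255 SPLA 00:03:ba:94:3b:75'

# Tally data addresses per hardware address (field 5 of each line).
printf '%s\n' "$arp_after" | awk '{ count[$5]++ }
  END { for (mac in count) printf "%s %d\n", mac, count[mac] }'
```

On the sample data this prints "00:03:ba:94:3b:75 2" -- both data addresses are now bound to ce1's hardware address.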
# ifconfig ce0 unplumb
# ifconfig ce1 unplumb
# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ipmp0: flags=8001000803<UP,BROADCAST,MULTICAST,IPv4,IPMP> mtu 1500 index 3
        inet 10.8.57.34 netmask ffffff00 broadcast 10.8.57.255
        groupname ipmp0
ipmp0:1: flags=8001000803<UP,BROADCAST,MULTICAST,IPv4,IPMP> mtu 1500 index 3
        inet 10.8.57.210 netmask ffffff00 broadcast 10.8.57.255

Since all of the network configuration (e.g., the routing table) is tied to ipmp0 rather than to the underlying interfaces, it is unaffected. Of course, no network traffic can flow through ipmp0 until another interface is placed back into the group -- as evidenced by the fact that the RUNNING flag has been cleared on ipmp0.
Those familiar with the existing IPMP implementation may be wondering what's left to do. The answer is "quite a bit". Notable current omissions include:
- Broadcast and multicast support on IPMP interfaces.
- IPv6 traffic on IPMP interfaces.
- Probe-based failure detection.
- DR support of interfaces using IPMP through RCM.
- MIB and kstat support on IPMP interfaces.
- DHCP over IPMP interfaces.
Of the above, the first four are supported by the existing IPMP implementation and are (with minor exceptions) requirements for any early-access candidate. That said, as I mentioned earlier, development is proceeding at a good clip -- especially now that the hairy IP configuration multithreading model has been tamed, and several lethal bugs in IP have been nailed. So stay tuned.
Footnotes

If you're a Sun customer interested in kicking the tires in a pre-production environment, please send an email to meem AT eng DOT sun DOT com.
 For instance, see http://mail.opensolaris.org/pipermail/clearview-discuss/2007-February/000685.html or http://mail.opensolaris.org/pipermail/clearview-discuss/2007-February/000661.html.