IPMP Development Update #2
By meem on Apr 25, 2007
Several folks have again (understandably) asked for updates on the Next-Generation IPMP work. Significant progress has been made since my last update. Notably:
- Probe-based failure detection is operational (in addition to the earlier support for link-based failure detection).
- DR support of interfaces using IPMP through RCM works. Thanks to the new architecture, the code is almost 1000 lines more compact than Solaris's current implementation -- and more robust.
- Boot support is now complete. That is, any number of interfaces (including all of them) can be missing at boot and then transparently repaired during operation.
- At long last, ipmpstat. As discussed in the high-level design document, this is a new utility that allows the IPMP subsystem to be compactly examined.
Since ipmpstat allows other aspects of the architecture to be succinctly examined, let's take a quick look at a simple two-interface group on my test system:
# ipmpstat -g
GROUP       GROUPNAME   STATE     FDT       INTERFACES
net57       a           ok        10000ms   ce1 ce0
As we can see, the "-g" (group) output mode tells us all the basics about the group: the group interface name and group name (these will usually be the same, but differ above for illustrative purposes), its current state ("ok", indicating that all of the interfaces are operational), the maximum time needed to detect a failure (10 seconds), and the interfaces that comprise the group.
We can get a more detailed look at the IPMP health and configuration of the interfaces under IPMP using the "-i" (interface) output mode:
# ipmpstat -i
INTERFACE   ACTIVE  GROUP       FLAGS     LINK      PROBE     STATE
ce1         yes     net57       ------    up        ok        ok
ce0         yes     net57       ------    up        disabled  ok
Here, we can see that ce0 has probe-based failure detection disabled. We can also see issues that prevent an interface from being used (aka being "active") -- e.g., suppose we enable standby on ce0:
# ifconfig ce0 standby
# ipmpstat -i
INTERFACE   ACTIVE  GROUP       FLAGS     LINK      PROBE     STATE
ce1         yes     net57       ------    up        ok        ok
ce0         no      net57       si----    up        disabled  ok
We can see that ce0 is now no longer active, because it's an inactive standby (indicated by the "i" and "s" flags). This means that all of the addresses in the group must be restricted to ce1 (unless ce1 becomes unusable), which we can see via the "-a" (address) output mode ("-n" turns off address-to-hostname resolution):
# ipmpstat -an
ADDRESS         GROUP       STATE   INBOUND     OUTBOUND
10.8.57.210     net57       up      ce1         ce1
10.8.57.34      net57       up      ce1         ce1
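Since the "-i" output is a plain whitespace-separated table, it is easy to script against. Here is a minimal Python sketch that picks out inactive standby interfaces. It assumes the seven-column layout shown in this post, and it relies only on the "s" and "i" flags described above. The function names are mine, not part of ipmpstat, and the layout of the real utility may well change before the work integrates.

```python
# Sketch: parse "ipmpstat -i"-style text as it appears in this post.
# Assumption: a header row followed by rows of seven whitespace-separated fields.

SAMPLE = """\
INTERFACE   ACTIVE  GROUP       FLAGS     LINK      PROBE     STATE
ce1         yes     net57       ------    up        ok        ok
ce0         no      net57       si----    up        disabled  ok
"""


def parse_interfaces(text):
    """Return one dict per interface row, keyed by lowercased column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].lower().split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def inactive_standbys(rows):
    """Names of interfaces whose FLAGS field has both 's' (standby) and 'i' (inactive)."""
    return [r["interface"] for r in rows
            if "s" in r["flags"] and "i" in r["flags"]]


rows = parse_interfaces(SAMPLE)
print(inactive_standbys(rows))
```

Note that the same split-on-whitespace approach would not work unchanged for the "-a" output, where a single column can list more than one interface.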
For fun, we can offline ce1 and observe the failover to ce0:
# if_mpadm -d ce1
# ipmpstat -i
INTERFACE   ACTIVE  GROUP       FLAGS     LINK      PROBE     STATE
ce1         no      net57       ----d-    disabled  disabled  offline
ce0         yes     net57       s-----    up        disabled  ok
[ In addition to the "offline" state, the "d" flag also indicates that all of the addresses on ce1 are down, preventing it from receiving any traffic. ]
# ipmpstat -an
ADDRESS         GROUP       STATE   INBOUND     OUTBOUND
10.8.57.210     net57       up      ce0         ce0
10.8.57.34      net57       up      ce0         ce0
We can also convert ce0 back to a "normal" interface, online ce1 and observe the load spreading configurations:
# ifconfig ce0 -standby
# if_mpadm -r ce1
# ipmpstat -i
INTERFACE   ACTIVE  GROUP       FLAGS     LINK      PROBE     STATE
ce1         yes     net57       ------    up        ok        ok
ce0         yes     net57       ------    up        disabled  ok
# ipmpstat -an
ADDRESS         GROUP       STATE   INBOUND     OUTBOUND
10.8.57.210     net57       up      ce0         ce1 ce0
10.8.57.34      net57       up      ce1         ce1 ce0
In particular, this indicates that incoming traffic to 10.8.57.210 will go to ce0 and inbound traffic to 10.8.57.34 will go to ce1 (as per the ARP mappings). However, outbound traffic will potentially flow over either interface (though to sidestep packet ordering issues, a given connection will remain latched unless the interface becomes unusable).
This also highlights another aspect of the new IPMP design: the kernel is responsible for spreading the IP addresses across the interfaces (rather than the administrator). The current algorithm simply attempts to keep the number of IP addresses "evenly" distributed over the set of interfaces, but more sophisticated policies (e.g., based on load measurements) could be added in the future.
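To make the "evenly distributed" idea concrete, here is a small Python sketch of one way such a policy could work: each address goes to whichever usable interface currently holds the fewest. This is only an illustration under my own assumptions (the function name and the tie-breaking rule are hypothetical), not the kernel's actual code.

```python
def spread_addresses(addresses, interfaces):
    """Assign each address to the interface currently holding the fewest addresses.

    Illustrative only: a greedy "keep the counts even" policy, not the
    kernel implementation.
    """
    if not interfaces:
        raise ValueError("no usable interfaces in the group")
    assignment = {ifname: [] for ifname in interfaces}
    for addr in addresses:
        # min() returns the first interface among ties, so results are stable.
        target = min(interfaces, key=lambda ifname: len(assignment[ifname]))
        assignment[target].append(addr)
    return assignment


# Two usable interfaces: one address lands on each.
print(spread_addresses(["10.8.57.210", "10.8.57.34"], ["ce0", "ce1"]))

# Only ce1 usable (e.g., ce0 is an inactive standby): ce1 hosts everything.
print(spread_addresses(["10.8.57.210", "10.8.57.34"], ["ce1"]))
```

A load-based policy would only need to change the key passed to min(), for instance from an address count to a measured traffic figure.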
To round out the ipmpstat feature set, one can also monitor the targets and probes used during probe-based failure detection:
# ipmpstat -tn
INTERFACE   MODE      TESTADDR      TARGETS
ce1         mcast     10.8.57.12    10.8.57.237 10.8.57.235 10.8.57.254 10.8.57.253 10.8.57.207
ce0         disabled  --            --
Above, we can see that ce1 is using "mcast" (multicast) mode to discover its probe targets, and we can see the targets it has decided to probe, in firing order. We can also look at the probes themselves, in real-time:
# ipmpstat -pn
TIME      INTERFACE   PROBE   TARGET          RTT       RTTAVG    RTTDEV
1.15s     ce1         112     10.8.57.237     1.09ms    1.14ms    0.11ms
2.33s     ce1         113     10.8.57.235     1.11ms    1.18ms    0.13ms
3.94s     ce1         114     10.8.57.254     1.07ms    2.10ms    2.00ms
5.38s     ce1         115     10.8.57.253     1.08ms    1.14ms    0.10ms
6.19s     ce1         116     10.8.57.207     1.43ms    1.20ms    0.19ms
7.73s     ce1         117     10.8.57.237     1.04ms    1.13ms    0.11ms
9.47s     ce1         118     10.8.57.235     1.04ms    1.16ms    0.13ms
10.67s    ce1         119     10.8.57.254     1.06ms    1.97ms    1.76ms
^C
Above, the inflated RTT average and standard deviation for 10.8.57.254 indicate that something went wrong with 10.8.57.254 in the not-too-distant past. (As an aside: "-p" also revealed a subtle longstanding bug in in.mpathd that was causing inflated jitter times for probe targets; see 6549950.)
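To see why one slow probe can leave RTTAVG and RTTDEV elevated for a while afterwards, here is a Python sketch of a smoothed RTT estimator. I am not describing how in.mpathd computes these columns; as a stand-in, this uses the familiar TCP-style exponential smoothing from RFC 6298 (gains of 1/8 and 1/4), and the class name and sample values are made up for illustration.

```python
class RttEstimator:
    """TCP-style smoothed RTT average and deviation (RFC 6298 formulas).

    Assumption: used here purely to illustrate smoothing behavior; the
    real in.mpathd calculation may differ.
    """

    def __init__(self, alpha=0.125, beta=0.25):
        self.alpha = alpha  # gain applied to the running average
        self.beta = beta    # gain applied to the running deviation
        self.avg = None
        self.dev = None

    def update(self, rtt_ms):
        if self.avg is None:
            # First sample seeds both estimates.
            self.avg = rtt_ms
            self.dev = rtt_ms / 2
        else:
            # Deviation is updated against the previous average, then the average.
            self.dev = (1 - self.beta) * self.dev + self.beta * abs(self.avg - rtt_ms)
            self.avg = (1 - self.alpha) * self.avg + self.alpha * rtt_ms
        return self.avg, self.dev


est = RttEstimator()
# Twenty steady 1.0ms probes, one 10ms outlier, then three more steady probes.
for sample in [1.0] * 20 + [10.0] + [1.0] * 3:
    avg, dev = est.update(sample)
print(f"avg={avg:.2f}ms dev={dev:.2f}ms")
```

Even though the last three samples are back to 1.0ms, both estimates remain well above their steady-state values, which is the same kind of lingering signature visible for 10.8.57.254 above.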
Anyway, hopefully all this gives you not only a feel for ipmpstat, but also a feel for how development is progressing. It should be noted that several key features are still missing, such as:
- Broadcast and multicast support on IPMP interfaces.
- IPv6 traffic on IPMP interfaces.
- IP Filter support on IPMP interfaces.
- MIB and kstat support on IPMP interfaces.
- DHCP on IPMP interfaces.
- Sun Cluster support.