Thursday Oct 18, 2007

NWAM Phase 1 SMF restarter work

We've been working on refreshing the Phase 1 design for NWAM (see here) in the light of what we've learnt during implementation so far. I was tasked with reworking nwamd (1M) to become an SMF delegated restarter, and have tried to describe the design here. As the name suggests, a delegated restarter's job is to take over many of the duties that the master SMF restarter, svc.startd, carries out in managing child instances. In our case, these child instances are Network Configuration Units (NCUs) representing network links and IP interfaces, environments or locations (which specify which name service to use, etc) and External Network Modifiers (which are external-to-NWAM entities that carry out network configuration, such as a VPN client). The reason a delegated restarter is required is that these entitities don't really fit into the standard start/stop/refresh service model - for example a link needs to respond to events such as a WiFi connection dropping, an IP interface needs to respond to events such as acquiring an IP address through DHCP etc. Since it's nwamd's job to monitor for such events, making it a delegated restarter makes sense - it can then handle such events and run appropriate methods and transition the relevant services through appropriate states.

A delegated restarter has to handle events from the master restarter, svc.startd, and these include:

  • Notification from svc.startd of new instances to handle (these will be instances that specify the delegated restarter as their restarter, you can see examples of how this is done in any inetd service manifest, e.g. echo.xml)
  • Notification that an administrator has initiated an action, e.g. disable, enable
  • Notification that the dependencies of an instance are satisfied, i.e. it is ready to run

One area we're particularly interested in is feedback on the naming of instances representing links and IP interfaces. I've put some suggestions in the design doc mentioned above, but nothing is cast in stone at this stage of course, so any feedback on is welcome. At present, my suggestion is to have all NWAM-managed services grouped under the "network/auto" FMRI, with separate services for NCUs, environments and ENMs, i.e.

  • svc:/network/auto/ncu
  • svc:/network/auto/env
  • svc:/network/auto/enm

note that the use of env(ironment) to describe name service and other configuration may change, this is being discussed on nwam-discuss at present.

In terms of NCUs in particular, my suggestion is to prefix the instance names by their type/class, i.e. for link NCUs, the prefix is "link-", for IPv4 interfaces "ipv4-", and for IPv6 "ipv6-". I had been thinking about having different services for links and IP interfaces, but then realized that in order to use the same instance name for IPv4 and IPv6 interface instances, then i'd need separate IPv4 and IPv6 services. So it seems better to have all NCUs grouped by service, and prefix them, allowing the user to run "svcs \*bfe" say to see all bfe instances

As an example, here's the service listing for the prototype restarter on my laptop, which has a builtin "iwi" wireless card and a builtin "bfe" wired interface:

# svcs ncu
online 7:51:13 svc:/network/auto/ncu:link-bfe0
online 7:51:16 svc:/network/auto/ncu:link-iwi0
online 7:51:21 svc:/network/auto/ncu:ipv4-iwi0
offline 7:51:13 svc:/network/auto/ncu:ipv6-bfe0
offline 7:51:15 svc:/network/auto/ncu:ipv4-bfe0
offline 7:51:16 svc:/network/auto/ncu:ipv6-iwi0

Note that once Clearview vanity naming integrates, the link name ("iwi0" or "bfe0" or whatever) will be dropped in favour of the vanity names for those links.

So, is this reasonable? Is there a better name than "auto"? Let us know!

Thursday Aug 23, 2007

daemon debugging using DTrace

Having taken a DTrace course recently, i've been thinking about ways that it can help debugging during development. here's a simple example: nwamd(1M) uses LOG_ERR syslog(3C) messages to log errors when they occur, but they can of course occur for a variety of reasons. We often want access to more information than we record in the log, so here's a simple DTrace script that prints out the syslog() caller's stack on getting a LOG_ERR (3) message:

#!/usr/sbin/dtrace -s

/ arg0 == 3 /

This can be enhanced to match to a specific syslog() error message (using strstr(copyinstr(arg1), "errmsg") as an additional predicate clause), but even better is speculative tracing - we pick a starting point in the code to start a speculation, and if we hit the error condition, we commit the data we've recorded, otherwise discard it. This is particularly handy in cases where functions have void return values, since if the return value indicates the error, we just need pid$target::function:return /arg1 == errcode/

NWAM Phase 0 code walkthrough part 1

I've been deep in the bowels of the NWAM implementation for awhile and thought it might be useful to provide a code walkthrough. There's a lot of interactions to juggle in trying to understand what NWAM is doing, so it's probably best to start with a brief introduction and describe what NWAM is up to. I'll follow up with more specifics later. Just to be clear, this is the NWAM implementation currently in Nevada, which represents the first phase of the ongoing project detailed at the NWAM project page.

Firstly, the code for the nwam daemon is here

It's launched by the "nwam" instance of the network/physical service. The "default" instance of this service does traditional network configuration, and to switch on nwam, it should be disabled, and nwam should be enabled, i.e. svcadm disable network/physical:default
svcadm enable network/physical:nwam

The nwam method script launches the NWAM daemon, nwamd.

The general design of nwamd is a set of threads that listen for and feed events into a state machine which alters system state. Signal handling is carried out by a dedicated thread, which blocks on SIGWAIT and handles various signal- driven events for the daemon as a whole (as otherwise we would have to ensure each thread does signal handling as there are no guarantees which thread gets the signal delivered to it). When forking child processes, we restore the signal mask.

The key concepts NWAM phase 0 introduces are:

Link Layer Profiles (LLPs): these specify, for each IP interface, how it attains an address ("static"'ally or via "dhcp"). In the former case, the static address must be specified. The /etc/nwam/llp file is used to store LLPs, and it can either be hand-crafted, where the IP interfaces are listed in preferred order, or it is created automatically by nwamd (in this case, wired interfaces are ordered first). For the first phase of NWAM (what we call phase 0), only one LLP can be active at a time.

Upper Layer Profiles (ULPs): these are tied to LLPs, and once an LLP is active, nwamd looks for the file /etc/nwam/ulp/check-conditions. If this file exists and is executable, it is run. This should print a single line of output, which is the name of the profile that the user wishes to activate based on the current conditions. If such a line is read successfully (foo in this example), then /etc/nwam/ulp/foo/bringup is executed. Likewise, when the interface gets torn down for whatever reason, /etc/nwam/ulp/foo/teardown is executed. The "bringup" and "teardown" scripts are invoked via pfexec(1) with default basic privileges. Such scripts can be used to initiate VPN connection, etc. No upper layer profiles are provided by default.

For more details, see nwamd(1M).


main.c: initialization, signal handling, main event loop
events.c: code to enqueue/dequeue events, routing events thread
interface.c: IP interface handling, bringing up/down interfaces etc.
llp.c: code for creating/manipulating /etc/nwam/llp file, switching LLPs, finding the best available LLP, etc.
state_machine.c: code to handle events dequeued by the event handler
util.c: general utility functions, logging etc.
wireless.c: wireless handling code


The program consists of a number of threads:

the main event loop thread: takes events off the event queue and calls state_machine() to process them.

the signal handling thread: signals are blocked everywhere else, and it handles SIGALARM timers, shutdown and SIGHUP (refresh) signals.

routing event thread: opens a RTSOCK AF_UNIX socket and read()s from it to get RTM_IFINFO (interface flag changes) and RTM_NEWADDR (new address for interface) events.

periodic wireless scan thread: checks if the signal level has dropped below the minimum acceptable or if the AP has disconnected, and initiates wireless scans.

gather interface info thread: run when a cable is plugged in or a wireless interface is being configured. runs DHCP on wired interfaces or does a wireless scan.


A timer is initialized when DHCP is started on an interface. Timers are started via the alarm() system call, and on expiry, the SIGALRM signal is caught in the signal handling thread. From here we walk all interfaces to find which interfaces timer expiry occured for, and we create EV_TIMER events for those interfaces that expiry occured for. When a timer expires on an interface, we check if DHCP has succeeded - if not, we select the best available interface (one which DHCP has not failed on), and make that the active LLP.


To get an idea of what NWAM is up to, let's take a simple example: a laptop running NWAM for the first time, with a wireless interface and a wired interface that is not plugged in.

All IP interfaces will be brought down by initialize_interfaces(), and dhcpagent is killed. Then "ifconfig -a plumb" is carried out to plumb all interfaces. We then add these interfaces to our interfaces list. The /etc/nwam/llp file is initialized to contain the wired and wireless interfaces, with wired appearing before wireless, and specifying address sources for both as "dhcp". We then start event collection, creating the routing event thread and the periodic wireless scan thread. Then the gather_interface_info() thread is fired off for each IP interface that is IFF_RUNNING or wireless. Note that some drivers will show IFF_RUNNING even if a cable is not plugged in - they do not support DL_NOTE_LINK_UP notifications. If the wired driver supports link state notifications, nothing is done with it, and the wireless scan kicks in and initiates a scan on the interface. We will be prompted with a list of wireless networks, and select one, possibly entering a WEP key if required. If connection succeeds, we generate a "newif" event, which the event handling thread uses to evaluate the best LLP given the current scenario.

If the driver does not support link state notifications, it will show IFF_RUNNING regardless of whether a cable is plugged in or not, and an attempt to start DHCP on the interface will be made. A timer is started, and we add a "newif" (new interface) event to the event queue. In response to this, the state machine gets the preferred LLP of those available (in this case the wired one), and compares it to the current LLP, and swaps if necessary, making the wired profile active. From here the Upper Layer Profile conditions are evaluated, and the appropriate ULP is activated.

I'll follow up with more details about initialization and the main event loop.




« April 2014