NWAM Phase 0 code walkthrough part 1
By user12820842 on Aug 23, 2007
I've been deep in the bowels of the NWAM implementation for awhile and thought it might be useful to provide a code walkthrough. There's a lot of interactions to juggle in trying to understand what NWAM is doing, so it's probably best to start with a brief introduction and describe what NWAM is up to. I'll follow up with more specifics later. Just to be clear, this is the NWAM implementation currently in Nevada, which represents the first phase of the ongoing project detailed at the NWAM project page.
Firstly, the code for the nwam daemon is here
It's launched by the "nwam" instance of the network/physical service. The "default" instance of this service does traditional network configuration, and to switch on nwam, it should be disabled, and nwam should be enabled, i.e.svcadm disable network/physical:default
svcadm enable network/physical:nwam
The nwam method script launches the NWAM daemon, nwamd.
The general design of nwamd is a set of threads that listen for and feed events into a state machine which alters system state. Signal handling is carried out by a dedicated thread, which blocks on SIGWAIT and handles various signal- driven events for the daemon as a whole (as otherwise we would have to ensure each thread does signal handling as there are no guarantees which thread gets the signal delivered to it). When forking child processes, we restore the signal mask.
The key concepts NWAM phase 0 introduces are:
Link Layer Profiles (LLPs): these specify, for each IP interface, how it attains an address ("static"'ally or via "dhcp"). In the former case, the static address must be specified. The /etc/nwam/llp file is used to store LLPs, and it can either be hand-crafted, where the IP interfaces are listed in preferred order, or it is created automatically by nwamd (in this case, wired interfaces are ordered first). For the first phase of NWAM (what we call phase 0), only one LLP can be active at a time.
Upper Layer Profiles (ULPs): these are tied to LLPs, and once an LLP is active, nwamd looks for the file /etc/nwam/ulp/check-conditions. If this file exists and is executable, it is run. This should print a single line of output, which is the name of the profile that the user wishes to activate based on the current conditions. If such a line is read successfully (foo in this example), then /etc/nwam/ulp/foo/bringup is executed. Likewise, when the interface gets torn down for whatever reason, /etc/nwam/ulp/foo/teardown is executed. The "bringup" and "teardown" scripts are invoked via pfexec(1) with default basic privileges. Such scripts can be used to initiate VPN connection, etc. No upper layer profiles are provided by default.
For more details, see nwamd(1M).
main.c: initialization, signal handling, main event loop
events.c: code to enqueue/dequeue events, routing events thread
interface.c: IP interface handling, bringing up/down interfaces etc.
llp.c: code for creating/manipulating /etc/nwam/llp file, switching LLPs, finding the best available LLP, etc.
state_machine.c: code to handle events dequeued by the event handler
util.c: general utility functions, logging etc.
wireless.c: wireless handling code
The program consists of a number of threads:
the main event loop thread: takes events off the event queue and calls state_machine() to process them.
the signal handling thread: signals are blocked everywhere else, and it handles SIGALARM timers, shutdown and SIGHUP (refresh) signals.
routing event thread: opens a RTSOCK AF_UNIX socket and read()s from it to get RTM_IFINFO (interface flag changes) and RTM_NEWADDR (new address for interface) events.
periodic wireless scan thread: checks if the signal level has dropped below the minimum acceptable or if the AP has disconnected, and initiates wireless scans.
gather interface info thread: run when a cable is plugged in or a wireless interface is being configured. runs DHCP on wired interfaces or does a wireless scan.
A timer is initialized when DHCP is started on an interface. Timers are started via the alarm() system call, and on expiry, the SIGALRM signal is caught in the signal handling thread. From here we walk all interfaces to find which interfaces timer expiry occured for, and we create EV_TIMER events for those interfaces that expiry occured for. When a timer expires on an interface, we check if DHCP has succeeded - if not, we select the best available interface (one which DHCP has not failed on), and make that the active LLP.
To get an idea of what NWAM is up to, let's take a simple example: a laptop running NWAM for the first time, with a wireless interface and a wired interface that is not plugged in.
All IP interfaces will be brought down by initialize_interfaces(), and dhcpagent is killed. Then "ifconfig -a plumb" is carried out to plumb all interfaces. We then add these interfaces to our interfaces list. The /etc/nwam/llp file is initialized to contain the wired and wireless interfaces, with wired appearing before wireless, and specifying address sources for both as "dhcp". We then start event collection, creating the routing event thread and the periodic wireless scan thread. Then the gather_interface_info() thread is fired off for each IP interface that is IFF_RUNNING or wireless. Note that some drivers will show IFF_RUNNING even if a cable is not plugged in - they do not support DL_NOTE_LINK_UP notifications. If the wired driver supports link state notifications, nothing is done with it, and the wireless scan kicks in and initiates a scan on the interface. We will be prompted with a list of wireless networks, and select one, possibly entering a WEP key if required. If connection succeeds, we generate a "newif" event, which the event handling thread uses to evaluate the best LLP given the current scenario.
If the driver does not support link state notifications, it will show IFF_RUNNING regardless of whether a cable is plugged in or not, and an attempt to start DHCP on the interface will be made. A timer is started, and we add a "newif" (new interface) event to the event queue. In response to this, the state machine gets the preferred LLP of those available (in this case the wired one), and compares it to the current LLP, and swaps if necessary, making the wired profile active. From here the Upper Layer Profile conditions are evaluated, and the appropriate ULP is activated.
I'll follow up with more details about initialization and the main event loop.