Monitoring zone boot and shutdown using DTrace

Several people have asked for a way to monitor zone state transitions such as zone boot or shutdown events. Currently there is no way to get notified when a zone is booted or shut down. One option is to run zoneadm list -p at regular intervals and parse the output, but this has some drawbacks that make it less than ideal:

  • it is inefficient because you are polling for events,
  • you will probably start at least two processes for each polling cycle (zoneadm(1M) and nawk(1)),
  • more importantly, you could miss transitions if your polling interval is too large. Since a zone reboot might take only seconds, you would need to poll often in order not to miss a state change.
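For comparison, the polling approach could be sketched roughly as follows. This is only an illustration of what the list above describes: the function names zone_states, zone_changes and poll_zones are made up for this example, the 5-second default interval is arbitrary, and it assumes a POSIX shell plus an awk that supports -v (nawk on Solaris). It also assumes zoneadm list -cp, where -c includes zones that are not running, so that a halted zone stays in the list.

```shell
# Hypothetical polling helpers; none of these names come from zoneadm(1M).

# Reduce `zoneadm list -p` style output (colon-separated fields, with the
# zone name in field 2 and the state in field 3) to sorted "name state" lines.
zone_states() {
        awk -F: '{ print $2, $3 }' | sort
}

# Compare two snapshots produced by zone_states and print one line for
# every zone whose state in the newer snapshot differs from the older one.
zone_changes() {
        old=$1
        new=$2
        printf '%s\n' "$new" | while read -r name state; do
                [ -z "$name" ] && continue
                before=$(printf '%s\n' "$old" | awk -v n="$name" '$1 == n { print $2 }')
                if [ "$before" != "$state" ]; then
                        printf 'Zone %s: %s -> %s\n' "$name" "${before:-unknown}" "$state"
                fi
        done
}

# Poll forever, reporting changes between consecutive snapshots.
# The interval in seconds is the first argument (default: 5).
poll_zones() {
        interval=${1:-5}
        prev=$(zoneadm list -cp | zone_states)
        while sleep "$interval"; do
                cur=$(zoneadm list -cp | zone_states)
                zone_changes "$prev" "$cur"
                prev=$cur
        done
}
```

Calling poll_zones 5 would start the loop. Every cycle starts several processes, and a reboot that completes between two samples goes unnoticed, which is exactly the weakness described above.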

A better, much more efficient solution can be built using DTrace, the 'Swiss Army knife of system observability'. As mentioned in this message on the DTrace forum, the zone_boot() function looks like a promising way to get notifications when a zone is booted. Listing all FBT probes with the string 'zone_' in their name (dtrace -l -P fbt | grep zone_) turns up another interesting function: zone_shutdown(). To verify that these probes fire when a zone is booted or shut down, let's enable both probes:

# dtrace -n 'fbt:genunix:zone_boot:entry, fbt:genunix:zone_shutdown:entry {}'
dtrace: description 'fbt:genunix:zone_boot:entry, fbt:genunix:zone_shutdown:entry ' matched 2 probes

When zoneadm -z zone1 boot is executed we see that the zone_boot:entry probe fires:

CPU     ID                    FUNCTION:NAME
  0   6722                  zone_boot:entry

The zone_shutdown:entry probe fires when the zone is shut down (either by zoneadm -z zone1 halt or by running init 0 from within the zone):

  0   6726              zone_shutdown:entry

This gives us the basic 'plumbing' for the monitoring script. By instrumenting the zone_boot() and zone_shutdown() functions with the FBT provider we can wait for zone boot and shutdown with almost zero overhead. What is left is finding out the name of the zone that was booted or shut down. This requires some knowledge of the implementation and access to the source (anyone interested can take a look at the source after OpenSolaris is launched, so stay tuned).

A quick look at the source shows that we can get the zone name by instrumenting a third function, zone_find_all_by_id(), which is called by both zone_boot() and zone_shutdown(). This function returns a pointer to a zone_t structure (defined in /usr/include/sys/zone.h). The DTrace script below uses a common DTrace idiom: in the :entry probe we set a thread-local variable, trace, that is used as a predicate in the :return probes (the :return probes have the information we're after). The FBT provider's :return probe stores the function return value in args[1], so we can access the zone name as args[1]->zone_name in fbt:genunix:zone_find_all_by_id:return and save it for later use in fbt:genunix:zone_boot:return and fbt:genunix:zone_shutdown:return.

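#!/usr/sbin/dtrace -qs

/* Zone name captured from zone_find_all_by_id() on the current thread. */
self string name;

/* Mark this thread as being inside zone_boot(). */
fbt:genunix:zone_boot:entry
{
        self->trace = 1;
}

/* Report only when zone_boot() returned 0 (success). */
fbt:genunix:zone_boot:return
/self->trace && args[1] == 0/
{
        printf("Zone %s booted\n", self->name);
}

/* Mark this thread as being inside zone_shutdown(). */
fbt:genunix:zone_shutdown:entry
{
        self->trace = 1;
}

/* Report only when zone_shutdown() returned 0 (success). */
fbt:genunix:zone_shutdown:return
/self->trace && args[1] == 0/
{
        printf("Zone %s shutdown\n", self->name);
}

/*
 * Clauses for the same probe run in program order, so this one runs after
 * the reporting clauses above and clears the thread-local state even when
 * the call failed.
 */
fbt:genunix:zone_boot:return,
fbt:genunix:zone_shutdown:return
/self->trace/
{
        self->trace = 0;
        self->name = 0;
}

/* While tracing, remember the name of the zone that was looked up. */
fbt:genunix:zone_find_all_by_id:return
/self->trace/
{
        self->name = stringof(args[1]->zone_name);
}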

Starting the script, then booting and shutting down some zones, gives the following result:

# ./zonemon.d
Zone aap booted
Zone noot booted
Zone noot shutdown
Zone noot booted
Zone aap shutdown

So there you have it, a simple DTrace script that will efficiently wait for zone boot and shutdown events. Enjoy.


Comments:

Very cool!

I've been looking for this for quite a while. Not having access to the source made the "zone-id -> zone-name" piece somewhat difficult.

Was wondering if you were open to an RFE for this script...

Many HA solutions have an engine log that basically shows the "transitions" that resource groups etc. go through, which can be quite helpful when it comes to (post-mortem) debugging, root-cause analysis, etc.

From "zone.h" we can see that local-zones can also go through transitions... on the way up, down, sideways...

/* zone_status */
typedef enum {
        ZONE_IS_UNINITIALIZED = 0,
        ZONE_IS_READY,
        ZONE_IS_BOOTING,
        ZONE_IS_RUNNING,
        ZONE_IS_SHUTTING_DOWN,
        ZONE_IS_EMPTY,
        ZONE_IS_DOWN,
        ZONE_IS_DYING,
        ZONE_IS_DEAD
} zone_status_t;

Do you think that instead of (in addition to?) tying into "zone_boot" and "zone_shutdown" it would make sense to tie into something like "zone_status_wait", where you basically monitor the state change for ANY zone and report on it as the zone goes through those transitions?

Either way, many thanks for the script so far, it was very much needed.

-- MikeE

Posted by guest on May 25, 2005 at 09:29 AM CEST #

Mike,

Glad you like it :-) Yes, the script could be extended/changed to watch more state transitions than just boot and shutdown (but I haven't had the time). Also, there is an RFE open for a generic way to get zone state transitions (5052723). You might want to have a call record added to that.

Posted by Menno Lageman on May 25, 2005 at 10:33 AM CEST #

Cool!

I'll have our SUN guys append us to RFE "5052723", thank you. Not having source (waiting for OpenSol?) it may be a little difficult for us to modify your dtrace script to catch transitions.

If you find some "spare time", perhaps you can show us the way :-)

Either way, this is very useful already.

Nogmaals bedankt,

-- MikeE

Posted by guest on May 25, 2005 at 04:19 PM CEST #

Now that was easy! To see all transitions it is sufficient to instrument just one function:

#!/usr/sbin/dtrace -qs

BEGIN
{
        state[0] = "Uninitialized";
        state[1] = "Ready";
        state[2] = "Booting";
        state[3] = "Running";
        state[4] = "Shutting down";
        state[5] = "Empty";
        state[6] = "Down";
        state[7] = "Dying";
        state[8] = "Dead";
}

/* args[0] is the zone_t pointer, args[1] the new zone_status_t value. */
zone_status_set:entry
{
        printf("Zone %s status %s\n", stringof(args[0]->zone_name),
                state[args[1]]);
}

Bouncing a zone (boot, ready, shutdown, left, right) gives us:

# ./zonestatus.d
Zone aap status Ready
Zone aap status Booting
Zone aap status Running
Zone aap status Shutting down
Zone aap status Down
Zone aap status Empty
Zone aap status Dying
Zone aap status Ready
Zone aap status Dead
Zone aap status Booting
Zone aap status Running
Zone aap status Shutting down
Zone aap status Empty
Zone aap status Down
Zone aap status Dead

Nice and simple, eh?

Posted by Menno Lageman on May 26, 2005 at 09:52 AM CEST #

awesome!

Simpler than the previous one even.
I should be able to add time-stamps to this myself, and we're in business.

Thanks again for the quick help.

-- MikeE

Posted by guest on May 26, 2005 at 04:17 PM CEST #
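As a follow-up to the time-stamp idea: one way to get time-stamps without touching the D script is to pipe its output through a small shell filter. The sketch below is only an assumption about how that could be done; stamp_lines is a made-up name, and the time shown is the moment the line reaches the filter, which can lag slightly behind the actual kernel event because DTrace hands over its output in batches.

```shell
# Hypothetical helper: prefix every line read from standard input with the
# current local date and time.
stamp_lines() {
        while IFS= read -r line; do
                printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
        done
}
```

With the function defined in the current shell, the status script could then be run as ./zonestatus.d | stamp_lines.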
