SMF / Predictive Self-Healing: svcadm(1M)

Today we'll take a look at SMF's main administrative tool, svcadm(1M).  With this tool, you will enable, disable, and maintain your services.


enable / disable

Let's start with a commonly used service, ssh.

[straylight] % svcs network/ssh
STATE          STIME    FMRI
online         Sep_24   svc:/network/ssh:default

From another machine, we can verify that everything is running fine:

[proxima-centauri] % ssh straylight

As you can see, ssh is enabled and running on my machine.  Now let's disable it.

[straylight] % svcadm disable network/ssh
[straylight] % svcs network/ssh
STATE          STIME    FMRI
disabled       0:56:48  svc:/network/ssh:default

Now the service reads as disabled, and we can see:

[proxima-centauri] % ssh straylight
ssh: connect to host straylight port 22: Connection refused

...that it actually is.  Turning the service on is just as easy.

[straylight] % svcadm enable network/ssh
[straylight] % svcs network/ssh
STATE          STIME    FMRI
online         0:58:07  svc:/network/ssh:default

Note that it's now online.  (The STIME column shows when the service entered its current state, so it changes each time an administrative action moves the service to a new state.)
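If you want to make the same check from a script, svcs output is meant for people, so svcprop(1) is the better fit.  Here is a minimal sketch, assuming a POSIX-style shell (ksh or bash) and that the instance state is readable from the restarter/state property.  The function names svc_state and svc_is_online are my own, not part of SMF.

```shell
# Print the current state of one service instance.
# $1 should be a full instance FMRI, e.g. svc:/network/ssh:default
svc_state() {
    svcprop -p restarter/state "$1"
}

# Succeed only if the instance currently reports "online".
svc_is_online() {
    [ "$(svc_state "$1")" = "online" ]
}
```

For example, svc_is_online svc:/network/ssh:default && echo "ssh is up" prints the message only while the service is online.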

Enable and disable have some extra options that are quite useful.  Say we enabled ssh, but then noticed that it was offline:

[straylight] % svcs network/ssh
STATE          STIME    FMRI
offline        1:00:47  svc:/network/ssh:default

We know from my previous post that offline means that the service is enabled, but something it depends on is missing (either disabled, or offline).  Let's see what is wrong:

[straylight] % svcs -d network/ssh:default
STATE          STIME    FMRI
disabled       1:00:24  svc:/system/cryptosvc:default
online         Sep_24   svc:/network/loopback:default
online         Sep_24   svc:/system/filesystem/usr:default

Ah hah, cryptosvc isn't enabled.  You might have a service with lots of dependencies that are disabled, or you might have dependencies disabled many levels deep.

Do you want to walk through all those services, find out why they're not on, and enable every dependency by hand?  Of course you don't.  So svcadm has a "recursive enable" option that goes through and enables everything that your service depends on.

[straylight] % svcadm enable -r network/ssh

[straylight] % svcs network/ssh
STATE          STIME    FMRI
online         1:02:23  svc:/network/ssh:default

[straylight] % svcs -d network/ssh:default
STATE          STIME    FMRI
online         Sep_24   svc:/network/loopback:default
online         Sep_24   svc:/system/filesystem/usr:default
online         1:02:22  svc:/system/cryptosvc:default

As you can see, we recursively enabled not only ssh, but everything it depended on, allowing it to come online.
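Before reaching for -r, you may want to see which dependencies are actually holding a service back.  The sketch below filters the output of svcs -d down to the FMRIs that are not online.  It assumes a POSIX-style shell with awk available, and blocked_deps is a name I made up for illustration.  It only looks at direct dependencies, so anything disabled further down the chain won't show up here.

```shell
# List the direct dependencies of a service that are not online.
# -H drops the header line; -o state,fmri keeps just the two columns we need.
blocked_deps() {
    svcs -H -o state,fmri -d "$1" | awk '$1 != "online" { print $2 }'
}
```

With the offline ssh service shown earlier, blocked_deps network/ssh:default would have listed only the cryptosvc FMRI.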

One last option of note for enable/disable is the "temporary" option.  Say that you want to enable/disable a service just for this session, but have it revert to its previous state on reboot, in case there are problems.  If ssh is disabled and you issue:

[straylight] % svcadm enable -t network/ssh 

The enable will only be temporary.  If you reboot the machine, the service will once again be disabled.


refresh

Refresh serves two purposes.  First, if you've changed any of your service's properties (say you've added a dependency or changed the start timeout), refreshing the service makes the new properties active.  Second, in addition to "start" and "stop", there's an optional method called "refresh" that you can define.  If your daemon re-reads its configuration file when sent a HUP signal, you put that in the refresh method, and it is called whenever you refresh the service.

A good example of this is DHCP.  If you change one of the parameters in dhcpsvc.conf, you issue:

[straylight] % svcadm refresh network/dhcp-server 

... and your changes become active.
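The first purpose of refresh (making changed properties active) can be sketched the same way.  This example assumes the start method's timeout is stored in the start/timeout_seconds property of the service, which is worth confirming with svcprop on your own system before changing anything.  The helper name set_start_timeout is hypothetical.

```shell
# Change a service's start timeout, then refresh so the change takes effect.
# $1 = service (e.g. network/ssh), $2 = timeout in whole seconds
set_start_timeout() {
    service=$1 seconds=$2
    # Reject anything that is not a plain non-negative integer.
    case "$seconds" in
        ''|*[!0-9]*)
            echo "timeout must be a whole number of seconds" >&2
            return 2
            ;;
    esac
    # Assumption: the timeout lives in start/timeout_seconds (type count).
    svccfg -s "$service" setprop start/timeout_seconds = count: "$seconds" &&
        svcadm refresh "$service"
}
```

Until the refresh runs, the new value sits in the repository but the running service still uses the old one, which is why the two commands are chained.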


restart

Restart is pretty self-evident.  Restarting a service means that you stop it and start it again.  Where in the past you might have issued /etc/init.d/sendmail stop followed by /etc/init.d/sendmail start, now you would use:

[straylight] % svcadm restart network/smtp:sendmail 

... which will restart sendmail.
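One thing to keep in mind in scripts: svcadm hands the request to the restarter and returns, so the service may not be back yet when your next command runs.  Here is a rough sketch of a restart that waits, assuming a POSIX-style shell and assuming that restarter/state_timestamp changes once the restart has gone through.  The name restart_and_wait and the one-second polling interval are my own choices.

```shell
# Restart an instance and wait until it is online again.
# $1 = instance FMRI, $2 = maximum number of one-second polls (default 30)
restart_and_wait() {
    fmri=$1 tries=${2:-30}
    # Remember when the instance last changed state.
    before=$(svcprop -p restarter/state_timestamp "$fmri")
    svcadm restart "$fmri" || return 1
    while [ "$tries" -gt 0 ]; do
        now=$(svcprop -p restarter/state_timestamp "$fmri")
        state=$(svcprop -p restarter/state "$fmri")
        # Done once the state has changed since the request and reads online.
        if [ "$now" != "$before" ] && [ "$state" = "online" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

Comparing timestamps, rather than just checking for "online", avoids declaring success while the service is still in its old online state.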

mark (degraded | maintenance)

Mark is used to force a service into a certain state.  (The states are covered in my previous post, if you've forgotten them.)  An administrator might want to force a service into the maintenance state to let other administrators know that there's something wrong with it that needs to be addressed before it's started again.  You can force a service into either maintenance (which will shut the service down) or degraded (which will leave it running, but let others know that it's running in a degraded state).

Keeping with our earlier example of ssh:

[straylight] % svcadm mark maintenance network/ssh

[straylight] % svcs network/ssh
STATE          STIME    FMRI
maintenance    1:12:47  svc:/network/ssh:default


clear

Clear is used to "reset" the state of a service and have it re-evaluated.  For example, say that syslog is in maintenance:

[straylight] % svcs system/system-log 
STATE          STIME    FMRI 
maintenance    1:15:33  svc:/system/system-log:default

You debug the problem, and realize that syslog failed to start because someone had accidentally deleted syslog.conf, which syslog needs to start.  It attempted to start, saw that the conf file was missing, and fell into maintenance.  You repair the file, and issue a clear:

[straylight] % svcadm clear system/system-log

[straylight] % svcs system/system-log
STATE          STIME    FMRI
online         1:25:07  svc:/system/system-log:default
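If you script this, it's worth checking that the service really is in maintenance before clearing it.  A minimal sketch, again assuming a POSIX-style shell and the restarter/state property; clear_if_maintenance is a made-up helper name.

```shell
# Clear an instance only if it is currently in the maintenance state.
# $1 = instance FMRI, e.g. svc:/system/system-log:default
clear_if_maintenance() {
    state=$(svcprop -p restarter/state "$1") || return 1
    if [ "$state" = "maintenance" ]; then
        svcadm clear "$1"
    else
        echo "$1 is $state, nothing to clear" >&2
        return 1
    fi
}
```

Remember that a clear only helps after you've fixed the underlying problem.  If syslog.conf were still missing, the service would simply fall back into maintenance.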


So now you know how to perform basic maintenance on a Solaris 10 machine using SMF.  I hope it's clear that this system of administration is quite easy, and incredibly powerful.  No longer do you have to hunt around for daemons and init scripts: every service is given a unique FMRI and administered through a unified framework.  This, combined with explicit states and dependencies, gives administrators flexibility and power that is unavailable in other Unix distributions.

My next post will be about manifests, which are the XML files used to describe each service.  We'll examine a manifest in depth, and take a look at the properties and the dependencies that make it up.  As always, questions and suggestions are welcome.


[Trackback] News on SMF: Tobin Coziahr's blog

Posted by on September 27, 2004 at 08:14 PM PDT #

It looks like the time and date are printed in a locale-dependent format. Is there a way to specify ISO 8601 as the date format so that it is parsable in any locale or time zone, e.g. 2004-09-28 14:20:58-07:00 or 20040928T182058Z?

Posted by Anonymous on September 28, 2004 at 04:21 AM PDT #

@Anonymous: svcs(1) provides human-readable output so, yes, times are localized (and parsing is discouraged). For programming purposes, times are stored in the repository as fractional UNIX seconds. For instance,
$ svcprop -p restarter/state_timestamp system/system-log:default
For shell programming, svcprop(1) would be the supported approach to getting parsable information. - Stephen

Posted by Stephen Hahn on September 28, 2004 at 07:22 AM PDT #

To disable a service once a system is up and running, "svcadm disable" does the job well. What would be the suggested way of disabling a service before the system is up and running, in order to guarantee that the service never runs? To clarify, this would be from a finish script when a system is being built using Jumpstart (which might traditionally use something like rm /a/etc/rc3.d/S55thing), or when creating a zone (which might use rm /zoneroot/zonename/root/etc/rc3.d/S55thing). Neither of these runs in the true context of the end system.

Posted by Brian on November 03, 2004 at 04:18 AM PST #

Brian: Services don't run automatically. In their manifests, you set whether or not they're enabled by default.
In fact, all of our services come disabled by default, and then something called a "profile", which is a list of all services to enable on boot, is shipped with the system. You can always modify the profile, and when you're creating new services, have them simply disabled by default. You don't need any special scripting to do this.
Soon, I'll be posting about how manifests work, and I'll go over this. Let me know if you have any more questions.

Posted by Tobin Coziahr on November 03, 2004 at 04:28 PM PST #

At what point is /var/svc/profile/generic.xml processed? Is it only when the system is initially built and booted for the very first time, or does that include when it's booted immediately after a jumpstart from a flash archive? Thanks, Bri

Posted by Bri on November 04, 2004 at 04:09 PM PST #

I asked around, and none of us can really think of anything machine-specific that's stored in the repository, so jumpstarting from a flash archive with an SMF-enabled machine should behave normally.

In this case, it wouldn't re-evaluate the generic profile, because the master machine that you have cloned the flash archive off of would have already applied the profile when it booted for the first time.

To answer your first question, the profile is evaluated on first boot, or at any point that you manually tell it to read a profile.

Posted by Tobin Coziahr on November 04, 2004 at 05:28 PM PST #

In the case of jumpstarting using a flar: editing generic.xml and platform.xml is discouraged (yes?), but the facility to create 'site.xml' as a profile exists. If the flar were created from a system jumpstarted without site.xml, and a new 'site.xml' were created as part of the jumpstart, the new profile changes (from site.xml only) would apply, yes?

Posted by TSK on November 12, 2004 at 03:56 AM PST #
