Thursday Jul 09, 2015

Cron Begone: Predictable Task Scheduling

cron has had a long reign as the arbiter of scheduled system tasks on
Unix systems.  However, it has some critical flaws that make its use
somewhat fraught.  Delivering cronjobs is Hard.  Inserting fragments
into crontab in a reliable and predictable fashion is tricky.  The
fragment-based delivery provided by /etc/cron.d helps, but it’s not perfect.

It’s also far too easy to create systemic problems on a large scale.  If
your cron fragment is packaged and deployed to multiple systems, the
result is that every system runs that job at exactly the same time.  For
jobs that hit a single server, like checking for package updates, the
result is that every machine connects to the server at the same time,
often overloading it.  Like job delivery, it’s possible to work around
the problem, but it takes some gymnastics to do so.
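
To give a sense of those gymnastics, here is a minimal sketch of one
common workaround: each host derives a stable delay from its own
hostname and sleeps that long before doing the real work.  The helper
name splay_seconds and the update_check command are made up for
illustration; they aren't part of cron or Solaris.

```shell
#!/bin/sh
# splay_seconds NAME MAX: print a stable delay in [0, MAX) seconds derived
# from NAME. Hypothetical helper for illustration, not part of cron.
splay_seconds() {
    name=$1
    max=$2
    # cksum prints the CRC as its first field; awk copes with either
    # space- or tab-separated output.
    crc=$(printf '%s' "$name" | cksum | awk '{ print $1 }')
    echo $((crc % max))
}

# Possible use inside a wrapper script run from cron (update_check is a
# hypothetical command):
#   sleep "$(splay_seconds "$(hostname)" 3600)" && update_check
```

The same host always gets the same delay, so the job stays predictable
per machine while the fleet as a whole is spread across the hour.  Every
packaged job needs a wrapper like this, which is exactly the kind of
bookkeeping you would rather not own.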

cron also lacks validation, error handling, dependency management, and a
host of other features.

Enter the Periodic Restarter and Scheduled Services.

The Periodic Restarter is a delegated restarter, at
svc:/system/svc/periodic-restarter:default, that allows the creation of
SMF services that represent scheduled or periodic tasks.  Scheduled
Services have all the benefits that other SMF services have, including
error handling, consistent logging, dependencies, predictable lifecycle
management, and more.  We’ve baked in some extra goodies on top of
that, which I’ll explain in a bit.

As you can imagine, scheduled services have a slightly different model
than other services.  The most visible difference is that instances of
scheduled services can be in the “online” state even when the task isn’t
running.  For these services, the “online” state means that the service
has all of its dependencies met, is scheduled, and is ready to execute
when the appropriate time comes.  The service will stay in the “online”
state while the job is executing, as well.  Other than that, though,
scheduled services behave like any other SMF service.

So what other features do scheduled services have that make them
superior to ordinary cronjobs?

The first is that scheduled services have a mechanism to execute jobs
that have been missed due to system downtime.  Simply by setting a flag
in the service configuration, the periodic restarter will execute your
task when the system comes back up, once all of its dependencies have
been met.  If your task was set to run at 1:35 in the morning, but the
system was down or rebooting at that time, your task will still execute.

The other nice feature is that scheduled services have built-in
randomness.  The task has an execution interval that can be constrained
as necessary, but only as necessary.  Everything else is automatically
randomized.  Maybe a task has to be run once a month, but it doesn’t
matter at all when during the month.  For such a task, the only thing
that must be specified is that the task should execute once a month.
From there, the periodic restarter will pick a time during the month to
execute the job.  Another task, such as log rotation, may need to run
during off hours every day.  In that case, you can constrain the
schedule so that it runs during a particular hour, but the exact time
during that hour will be randomized.  This feature enables the
deployment of scheduled tasks on a wide scale without having to consider
load balancing, both between systems and between tasks on the same system.

In subsequent updates, we’ll discuss how to create scheduled services as
well as how to inspect some of the extra state that comes along with

Friday Feb 03, 2012

Changes to svccfg import and delete

The behavior of svccfg import and svccfg delete fmri has changed in S11 for manifests stored in SMF's standard locations. The standard locations are /lib/svc/manifest and /var/svc/manifest, with /lib/svc/manifest being the preferred one. If your manifest is stored under one of these two directories, you shouldn't be using svccfg import at all, and you should only use svccfg delete fmri when you are prohibited from removing the manifest for some reason and still want to remove the service from the system. Instead, the preferred action is:

        # svcadm restart manifest-import


The reason is that in S11, SMF keeps the repository in sync with the files in the standard locations, and the manifest-import service is the mechanism for maintaining this synchronization. So instead of using svccfg import, copy your manifest to a standard location and type:

        # svcadm restart manifest-import

Instead of using svccfg delete, remove your manifest from its location and restart manifest-import.


In each case, manifest-import will detect any changes in the standard directories and update the repository accordingly. Note that the manifest-import service runs asynchronously from the svcadm command, so it may take a short amount of time for the changes to take effect.
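
Because of that asynchrony, a script that restarts manifest-import and then immediately queries the service can race the import. One way to cope, sketched here, is to poll. wait_until is a hypothetical helper, not an SMF command, and the sketch assumes that svcs exits non-zero while the pattern matches nothing.

```shell
#!/bin/sh
# wait_until TRIES CMD [ARGS...]: run CMD about once per second until it
# succeeds or TRIES attempts have been made. Returns 0 on success, 1 on
# giving up. Hypothetical helper for illustration.
wait_until() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        # Discard the command's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        # Pause between attempts, but not after the last one.
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# Possible use after "svcadm restart manifest-import":
#   wait_until 30 svcs mysvc || echo "mysvc did not appear" >&2
```

The retry count is arbitrary; pick whatever bound suits your deployment.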

The manifest-import service detects more than file additions and removals; it also detects changes to manifests and profiles. If you are a service provider, this gives you an upgrade path when your manifest changes: simply deposit your new manifest over the old one and make sure that manifest-import is restarted. Restarting manifest-import is usually handled by the packaging service.

Let's look at some examples. First, let's get the manifest for our new service imported.

# cp mysvc.xml /lib/svc/manifest/site
# svcadm restart manifest-import
# svcs mysvc
STATE          STIME    FMRI
online         15:19:41 svc:/mysvc:default

Now delete the service:

# rm /lib/svc/manifest/site/mysvc.xml
# svcadm restart manifest-import
# svcs mysvc
svcs: Pattern 'mysvc' doesn't match any instances
STATE          STIME    FMRI


Now let's look at what happens if you stray from this advice and use svccfg delete. First, reinstall the manifest just as we did before.

# cp mysvc.xml /lib/svc/manifest/site
# svcadm restart manifest-import
# svcs mysvc
STATE          STIME    FMRI
online         15:34:41 svc:/mysvc:default

Now the fun begins.

# svccfg delete -f svc:/mysvc
# svcs mysvc
svcs: Pattern 'mysvc' doesn't match any instances
STATE          STIME    FMRI

It looks as if the service has been removed from the repository, but it really hasn't been. Because the manifest file is still on the file system, the service is merely masked in the repository. This can lead to confusion: even if you modify the manifest and restart manifest-import, svcs will not find the service. This is because the masking is done at the administrative layer (see Sean Wilcox's discussion of layers). Changing the manifest does not remove the masking, although manifest-import will record the changes in the repository.


How can we find a masked service?

# svccfg listcust -M | grep svc:/mysvc
svc:/mysvc manifest MASKED
svc:/mysvc:default manifest MASKED

The first line of output shows that the service is masked. Masking a service also masks its instances, which is why we see the second line.
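
If you want just the names of the masked entities in a script, a small filter is enough. This sketch assumes listcust -M output shaped like the lines shown here, with MASKED as the last field; masked_entities is a made-up helper name.

```shell
#!/bin/sh
# masked_entities: read "svccfg listcust -M" style output on stdin and
# print the first field of every line whose last field is MASKED.
# Hypothetical helper; it assumes the output format shown in this post.
masked_entities() {
    awk '$NF == "MASKED" { print $1 }'
}

# Possible use:
#   svccfg listcust -M | masked_entities
```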


So if you accidentally mask a service, how can you unmask it? We enter svccfg interactive mode, select the service and then use the delcust command.

# svccfg
svc:> select mysvc
svc:/mysvc> delcust
 Deleting customizations for service: mysvc
svc:/mysvc> quit
# svcs mysvc
STATE          STIME    FMRI
online         15:50:46 svc:/mysvc:default

The svcs command shows that the service is unmasked.


