SMF manifest examples for Apache 1 and MySQL

In the Service Management in a Day workshop (and the earlier Migrating a Legacy RC Service module from the Solaris Deep Dives) we examine the migration of MySQL from an RC script to a fully managed SMF service.

Why MySQL? Well, it's a convenient way to point out that MySQL is included in Solaris 10. But the real reason is that it is rather simple and makes a great platform to show what SMF can really do for us - and it's certainly more than a one-trick pony.

So let's set up MySQL and see where this goes. You will find the instructions in /etc/sfw/mysql/README.solaris.mysql, but be careful as there is a small error. The last time I looked, chmod -R requires two arguments, not one.

# /usr/sfw/bin/mysql_install_db
# groupadd mysql
# useradd -g mysql mysql
# chgrp -R mysql /var/mysql
# chmod -R 770 /var/mysql
# installf SUNWmysqlr /var/mysql d 770 root mysql
# cp /usr/sfw/share/mysql/my-medium.cnf /var/mysql/my.cnf

Let's start the database manually and make sure that all is well.

# /etc/sfw/mysql/mysql.server start
Starting mysqld daemon with databases from /var/mysql

# /usr/sfw/bin/mysqladmin status
Uptime: 32  Threads: 1  Questions: 1  Slow queries: 0  Opens: 6  
Flush tables: 1  Open tables: 0  Queries per second avg: 0.031

Time for the first SMF value - resilient services. Let's terminate mysqld and see what happens.

# pkill mysql
# mysqladmin status
mysqladmin: connect to server at 'localhost' failed
error: 'Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)'
Check that mysqld is running and that the socket: '/tmp/mysql.sock' exists!

This is what we expect. When mysqld terminates, nobody is watching and it remains down until the next reboot (or transition back to run level 3).

So what can SMF do for me here? Paying attention to a non-transient service is a good start.

What we need now is a manifest for MySQL. You can take a look at mine, or if you follow the RC Service Migration howto you will come up with something very close. Put mysql.xml somewhere under /var/svc/manifest (application or local both work, with local probably the best choice). Then reboot, or run the manifest-import service method, to make SMF aware of the new service definition.
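For readers who want a starting point, here is a minimal sketch of what such a manifest might look like, wrapped in a small shell function that writes it to a file. It is not the manifest linked above. The service name and the start/stop methods come from the output shown in this post; the dependency names, the filesystem dependency, the 60 second timeouts, the stability level and the template text are all assumptions you should adjust.

```shell
# Sketch only: write a minimal MySQL manifest to the path given as $1.
# Dependency names, timeouts, stability and template text are assumptions.
write_mysql_manifest() {
    cat > "$1" <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type='manifest' name='mysql'>
<service name='application/mysql' type='service' version='1'>
    <create_default_instance enabled='false' />
    <single_instance />
    <!-- Assumed: wait for local filesystems before starting -->
    <dependency name='filesystem' grouping='require_all'
        restart_on='none' type='service'>
        <service_fmri value='svc:/system/filesystem/local' />
    </dependency>
    <!-- The service stays offline until the configuration file exists -->
    <dependency name='config_file' grouping='require_all'
        restart_on='none' type='path'>
        <service_fmri value='file://localhost/var/mysql/my.cnf' />
    </dependency>
    <exec_method type='method' name='start'
        exec='/etc/sfw/mysql/mysql.server start' timeout_seconds='60' />
    <exec_method type='method' name='stop'
        exec='/etc/sfw/mysql/mysql.server stop' timeout_seconds='60' />
    <stability value='Unstable' />
    <template>
        <common_name>
            <loctext xml:lang='C'>MySQL database server</loctext>
        </common_name>
    </template>
</service>
</service_bundle>
EOF
}
```

Calling write_mysql_manifest /var/svc/manifest/local/mysql.xml and then running svccfg validate on that file should catch XML mistakes before the import.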

# svcs mysql
svcs: Pattern 'mysql' doesn't match any instances
STATE          STIME    FMRI

# /lib/svc/method/manifest-import
Loaded 1 smf(5) service descriptions

# svcadm enable mysql
# svcs mysql
STATE          STIME    FMRI
online         22:39:54 svc:/application/mysql:default

# mysqladmin status
Uptime: 4  Threads: 1  Questions: 1  Slow queries: 0  Opens: 6
  Flush tables: 1  Open tables: 0  Queries per second avg: 0.250

Now, let's try that pkill thing again.

# pkill mysqld

# svcs mysql
STATE          STIME    FMRI
online         22:45:45 svc:/application/mysql:default

Now, if we watch the service log file, conveniently located at /var/svc/log/application-mysql:default.log, we see svc.startd notice that all of the processes have terminated even though this isn't a transient service. That indicates a problem, so the service is restarted.

[ Feb  8 16:53:36 Stopping because all processes in service exited. ]
[ Feb  8 16:53:36 Executing stop method ("/etc/sfw/mysql/mysql.server stop") ]
No mysqld pid file found. Looked for /var/mysql/
[ Feb  8 16:53:36 Method "stop" exited with status 0 ]
[ Feb  8 16:53:36 Executing start method ("/etc/sfw/mysql/mysql.server start") ]
[ Feb  8 16:53:36 Method "start" exited with status 0 ]

This is pretty cool. We've made MySQL somewhat more available than it would have been straight out of the box. Does this eliminate the requirement for High Availability Clusters? No, but it does open an interesting discussion.

In this example my observation of MySQL's availability is rather naive - if it's running it must be OK. For something like a database server you may want to connect and manipulate some tables to see if the service is really running. We should also note that SMF doesn't really handle the platform availability issues - so HA Clusters are still needed. But it's also interesting to note that many HA scripts only provide coverage for a subset of critical services, usually a database, but ignore the dozens of other services that are also required for proper operation of the service being clustered. Without a sophisticated dependency framework, a node failover occurs when one of these other services fails.

SMF provides such a framework, including the watchdog monitor (svc.startd) - and it does so with very little effort on the part of the administrator or application packager.
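As for the naive availability check mentioned earlier, a slightly deeper probe can start small. The sketch below uses mysqladmin ping, which only shows that the server accepts connections; it does not touch any tables. The function names are made up for illustration, and the default path is the one used elsewhere in this post.

```shell
# Path used in this post; override MYSQLADMIN to point somewhere else.
MYSQLADMIN=${MYSQLADMIN:-/usr/sfw/bin/mysqladmin}

# Returns 0 when the server answers a ping, non-zero otherwise.
mysql_is_alive() {
    "$MYSQLADMIN" ping >/dev/null 2>&1
}

# Prints a one-line verdict suitable for a cron job or monitoring hook.
report_mysql_health() {
    if mysql_is_alive; then
        echo "mysql: responding"
    else
        echo "mysql: not responding"
        return 1
    fi
}
```

A real check for a production database would probably go further and run a query against a known table.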

But wait, there's more.

In a recent discussion over service minimization (the idea that you don't install software that you have no intention of running) a more subtle value of SMF can be observed. Solaris 10 allows us to separate the question of installation from activation. It's quite easy to install software and then verify that it is disabled. In fact a routine scan of service properties and a comparison against a baseline is a good idea.
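A baseline comparison of this kind can be sketched in a few lines of shell. This version compares only service state rather than full property listings, and the function names and file locations are assumptions, not anything Solaris provides.

```shell
# Print "state fmri" for every service, sorted so comm(1) can compare it.
snapshot_services() {
    svcs -a -H -o state,fmri | sort
}

# Print the lines in the current snapshot ($2) that are absent from the
# baseline ($1): new services, or services whose state has changed.
services_changed() {
    comm -13 "$1" "$2"
}
```

Typical use would be to save the output of snapshot_services once as the baseline, save it again later to a second file, and review whatever services_changed reports for the pair.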

Here is where a bit of creativity can give us additional safeguards. A well developed SMF manifest will allow us to make an additional distinction. We can now observe the installation of a service, the configuration of a service, and whether or not that service has actually been activated.

How is this done? A dependency on a configuration file is a good start. Let's look at the MySQL manifest and see how this was done.

                <service_fmri value='file://localhost/var/mysql/my.cnf' />   

This is a dependency on a particular configuration file, in this case /var/mysql/my.cnf. If this file is missing then MySQL will not transition to online. If enabled it will immediately transition to the offline state, and a check of svcs -l mysql will show the missing configuration file.

Now this is very cool indeed. For this service to be activated it must be installed, configured and enabled. Failing to configure the service (consider the case of sshd which you probably don't want to run without a configuration file) will provide an obvious and easily observed error condition. This may change the way you look at service minimization.

The takeaway from this exercise is that as you plan your RC service migrations to SMF, add a dependency on an easily observed indication that the service has been properly configured, such as a configuration file.

This brings me to my next example, an Apache 1 service manifest. I started by copying the Apache 2 service manifest at /var/svc/manifest/network/http-apache2.xml, as that seemed like a good place to begin. I changed the service name, the documentation block, and the start/stop methods as before.

There's one new wrinkle - take a look at the following property group:

<property_group name='httpd' type='application'>
        <stability value='Evolving' />
        <propval name='ssl' type='boolean' value='false' />
</property_group>

If we look at the Apache 2 start method /lib/svc/method/http-apache2, we see a query for this service property:

        ssl=`svcprop -p httpd/ssl svc:/network/http:apache2`
        if [ "$ssl" = false ]; then

So this is how we enable SSL support for Apache 2. If we want to do something similar for Apache 1 then we will have to modify the start script /etc/init.d/apache. The other solution would be to remove the property group from the manifest and modify the start definition to call either /etc/init.d/apache start or /etc/init.d/apache startssl.
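A third option, sketched below, is a small wrapper method that reads the property and picks the argument for the existing script, leaving /etc/init.d/apache untouched. The FMRI svc:/network/http:apache1 and both function names are hypothetical; use whatever name you gave the service in your manifest.

```shell
# Hypothetical FMRI for the Apache 1 instance; adjust to match your manifest.
APACHE1_FMRI=${APACHE1_FMRI:-svc:/network/http:apache1}

# Map the value of the httpd/ssl property to an /etc/init.d/apache argument.
# Anything other than "true" falls back to a plain start.
apache_start_arg() {
    if [ "$1" = "true" ]; then
        echo startssl
    else
        echo start
    fi
}

# Start method body: query the property, then call the legacy script.
start_apache1() {
    ssl=`svcprop -p httpd/ssl "$APACHE1_FMRI"`
    /etc/init.d/apache `apache_start_arg "$ssl"`
}
```

Keeping the decision in apache_start_arg means the mapping can be checked without a running SMF repository.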

After you import this new manifest, please remember to unlink the start and stop links from all run level directories (there's one start in rc3.d and one kill in each run level).
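To find those links before removing them, a short sketch like this one may help. The function name is made up, and the first argument is an alternate root directory (pass an empty string for the live system) so the function can be tried against a scratch copy first.

```shell
# List the legacy start/kill links for script name $2 under root $1.
# Matches names such as S<nn>name and K<nn>name in every rc?.d directory.
list_rc_links() {
    for link in "$1"/etc/rc?.d/[SK][0-9][0-9]"$2"; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$link" ] && echo "$link"
    done
    return 0
}
```

Running list_rc_links "" apache prints the candidates; review the list, then remove each entry with rm once the SMF service is confirmed working.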

This brings me to my last recommendation - using a configuration file dependency to help keep service instances separated. This is particularly important for the http service as all the executables are named httpd. By adding a dependency to the configuration file you have added an important documentation item that will come in handy when diagnosing service failures. If the instance fails and ends up in a maintenance state, a quick look at svcs -l will tell you which instance you need to investigate.

Where can I learn more about this? The OpenSolaris SMF Community would be a good place to look. In addition to the excellent articles on Solaris Service Management, there is a repository of contributed manifests that might help you get started. And you are invited to contribute manifests for your converted services - you might even receive a nice OpenSolaris trinket for your efforts.


Bob Netherton is a Principal Sales Consultant for the North American Commercial Hardware group, specializing in Solaris, Virtualization and Engineered Systems. Bob is also a contributing author of Solaris 10 Virtualization Essentials.

This blog will contain information about all three, but primarily focused on topics for Solaris system administrators.
