Wednesday Dec 24, 2008

Timezone aware cron finally pushed to OpenSolaris

With this “push” yesterday:

changeset:   8439:51a23ac0d2a6
user:        Chris Gerhard <Chris.Gerhard@sun.com>
date:        Tue Dec 23 15:44:14 2008 +0000
files:       usr/src/cmd/cron/Makefile usr/src/cmd/cron/cron.c usr/src/cmd/cron/cron.h usr/src/cmd/cron/crontab.c usr/src/cmd/cron/funcs.c
description:
PSARC/2007/503 crontab entry environment variables
6518038 cron & crontab should support multiple timezones

OpenSolaris finally contains a version of cron that understands and correctly handles having different timezones. You can also specify a different home directory (useful when you don't want NFS to get involved in your cron job for any reason) and shell to run jobs in. It should be in build 106 of OpenSolaris & Nevada.

This brings crontab in line with at(1), which has been timezone aware for some time.

To use it, simply set the variables HOME, TZ and SHELL in your crontab file, and all subsequent lines will use those values until the next HOME, TZ or SHELL line is found:

SHELL=/usr/bin/ksh
TZ=Africa/Abidjan
HOME=/tmp/cg13442
23 * 1-9,11-26,28-29,31 2-10,12 * exec /var/tmp/cron/crontest.sh.1.1 23 \* 1-9,11-26,28-29,31 2-10,12 \* Africa/Abidjan
HOME=/var/tmp/cg13442
3 0-7,9-10,12-22 1-6,8-9,11-21,24-26,28 1-7,9-10,12 * exec /var/tmp/cron/crontest.sh.1.1 3 0-7,9-10,12-22 1-6,8-9,11-21,24-26,28 1-7,9-10,12 \* Africa/Abidjan
TZ=Africa/Accra
37 0-2,4-5,7-17,20-22 * * * exec /var/tmp/cron/crontest.sh.1.1 37 0-2,4-5,7-17,20-22 \* \* \* Africa/Accra
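
To make the "until the next line" rule concrete, here is a minimal sketch in POSIX sh. It is not cron's code: show_effective_env is a hypothetical helper that walks a crontab on standard input and prints each job with the TZ, HOME and SHELL in force at that point. The "(default)" placeholder stands for whatever cron uses before any variable line is seen, and the sample jobs are made up.

```shell
#!/bin/sh
# Hypothetical helper (not part of cron): read a crontab on stdin and
# print every job line prefixed by the TZ, HOME and SHELL in effect.
show_effective_env() {
    tz='(default)' home='(default)' shell='(default)'
    while IFS= read -r line; do
        case $line in
            TZ=*)    tz=${line#TZ=} ;;
            HOME=*)  home=${line#HOME=} ;;
            SHELL=*) shell=${line#SHELL=} ;;
            ''|'#'*) ;;   # skip blank lines and comments
            *) printf 'TZ=%s HOME=%s SHELL=%s | %s\n' \
                   "$tz" "$home" "$shell" "$line" ;;
        esac
    done
}

# Made-up crontab with the same shape as the example above.
show_effective_env <<'EOF'
SHELL=/usr/bin/ksh
TZ=Africa/Abidjan
HOME=/tmp/cg13442
23 * 1 2 * echo first
HOME=/var/tmp/cg13442
3 0 1 1 * echo second
TZ=Africa/Accra
37 0 * * * echo third
EOF
```

Each variable keeps its value until it is reassigned, so the second job still runs with TZ=Africa/Abidjan and only its HOME differs from the first.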

Saturday Feb 09, 2008

cron code delivered.

At last I have handed over the cron changes to support different timezones to Darren who is sponsoring the effort. I've learned a lot in the process so far of trying to do this work from “outside” of Sun. Mostly that the time required to do even a very small project like this is very great and there are times when you can't just put it down if you are busy. This makes it very difficult when doing this in your own “spare” time and can lead to some spectacularly late nights. The other problems were around keeping a build system running at home. The sometimes long times between working on this resulted in considerable effort to keep up with the various flag days. I also had some tangles with mercurial that did not help.

The ARC case was quite painless even if there were elements of Bike Shed Syndrome in it with real dangers of even greater feature creep. Having actually experienced ARCs internally I was probably better prepared for this than a real external engineer.

I got some really great feedback during the code reviews, which has led to a better end result.

Now I'm just sitting back and waiting.

Monday Jan 28, 2008

cron not setting root's PATH correctly

My home server continues to run the timezone enabled cron daemon, and after the last upgrade to build 81 I started getting mails to root saying:

Your "cron" job on pearson
/tank/fs/local/snapshot minute tank/fs

produced the following output:

/tank/fs/local/snapshot[2]: zfs:  not found

Which was odd as the script had worked perfectly for years, well months. So why did root's path no longer contain “/usr/sbin”? Here I made a big mistake. I assumed (always a bad thing) that the bug was introduced by my code. Needless to say the timing could not have been worse. I had just put the “final” code changes out for code review, so finding a new bug in that code was a real fly in the ointment. Then, to add more confusion, if you used crontab -e to edit the crontab, for example to add a cron job like this:


* * * * * echo $PATH > /tmp/.root_path

to help debug the problem, the problem would go away. At that point you forget about it until you reboot the system (to help diagnose 6653187), and then when the job runs it has the wrong PATH again.

After a few minutes staring at the code it was obvious what was wrong. We are using an uninitialized variable to choose which PATH to use. The question was: what had I done to cause this? I then spent a few hours staring at the code, running it under libumem and under the debugger, to see how I could have introduced the bug. I could not see how this could ever have worked. Finally I decided to check whether there had been any recent changes to cron, in the hope that this was not my fault. So it was off to Martin's “Mercurial for TeamWare users” page to find out how to do this with mercurial:


changeset:   5581:aa8f6b1ea400
user:        basabi
date:        Mon Dec 03 14:32:45 2007 -0800
summary:     6636777 *cron* coredumps on NULL home directory

changeset:   5558:0976be4b75d2
user:        basabi
date:        Thu Nov 29 21:09:22 2007 -0800
summary:     6416652 *cron* suffers from amnesia if name services aren't there at boot time

changeset:   1315:45f0335a274a
user:        basabi
date:        Tue Jan 24 07:11:42 2006 -0800
summary:     6270017 cron/at-jobs log warning about not obtaining latest contract from popen(3c)

Perhaps one of those last two putbacks introduced the bug. Time to try the unmodified cron binary (yes, the time to try that binary was hours ago, but there is no point in being smart after the event). Sure enough the bug is there, so I did not introduce it. Time to file the bug and move on.


Bug ID: 6655359
Synopsis: cron assumes malloc returns zeros memory and then sets root's path by luck rather than judgement


Introduced by: 6416652 *cron* suffers from amnesia if name services aren't there at boot time

Moral. Always check the putback logs.

Tuesday Nov 13, 2007

Timezones for cron code review request

I've got the webrevs ready for cron supporting time zones. The slight delay was in finding a version of webrev that was aware of mercurial, which I found at http://cr.grommit.com/~stevel/webrev_fixes/raw_files/new/usr/src/tools/scripts/webrev.sh. Even this leaves something to be desired, though, as mercurial insists that any new file it sees is part of the putback. So if you run it twice it produces a webrev of the webrev. Somewhat confusing, but I'm sure it will get sorted out as mercurial moves into the mainstream. For similar reasons I had to pull a new source tree and merge my changes so that mercurial really only saw the things I had really touched and not every file that I had built.

Anyway back on topic. The webrev for:

PSARC 2007/503: crontab entry environment variables
6518038 cron & crontab should support multiple timezones

Is here: http://cr.opensolaris.org/~cjg/cron/webrev/. The code review request has been posted to the OpenSolaris Code alias. Please respond to the review there.

Thursday Nov 08, 2007

Cron Progress

My case to have cron be timezone aware has been approved by PSARC: 2007/503. So now I have to work to bring my workspace up to date and then get the changes back into Nevada.

It is nice to have the ball back in my court.

Tuesday Sep 04, 2007

Timezones for cron progress

At last some real progress on the addition of timezone support to cron that I have been working on in my “spare” time.

Yesterday the proposal went for Architectural review. It is PSARC 2007/503 and has been submitted as a fast track. The full proposal includes some feature creep in that it also proposes support for two other environment variables to affect the behaviour of cron. These are:

  • HOME. This will allow you to have cron run your jobs from a directory other than your home directory. This is particularly useful for jobs running on NFS clients when using an authentication flavour that is not SYS.

  • SHELL. This lets you have your job run by the specified shell rather than the Bourne shell.

The reason for limiting the variables to just three is that only these three affect cron. Any other variable could be set on the command line and the shell in use would take advantage of it.

Finally whether you get this behaviour or not is controlled by the presence of a line in /etc/default/cron. This way OpenSolaris distributions can choose if they want the “correct” behaviour according to the standards or the more useful extended behaviour.

Tuesday Jun 19, 2007

Where are all the log files?

Today's question was:

Is there a list of all the default log files that are used in Solaris?

Not that I know of. Since most software can be configured to log anywhere you wish, it would be an impossible task to come up with a complete list that was of any practical benefit.

However there are some places to go looking for log files:

  1. The file /etc/syslog.conf will contain the names of logfiles written to via syslog.

  2. The directory /var/svc/log is the default location for log files from SMF. These files are connected to each daemon's standard output and standard error, so they can grow.

  3. Then the files in /etc/default will define logfiles for services that are not using syslog. For example /var/adm/sulog.

So having ticked off those log files, you need to decide upon a strategy for maintaining them. Mine is to keep 100k of log for each of the logs in /var/svc/log and let logadm(1M) look after them. I keep sulog forever and clean it by hand as I'm paranoid. Configuring logadm to look after the SMF logs is easy:

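# -w records an entry for each log in logadm's configuration file;
# -C1 keeps one old copy, -c copies and truncates, -s100k rotates at 100k.
for i in /var/svc/log/*.log
do
        logadm -w "$i" -C1 -c -s100k
done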

So how can I be sure that there are no more log files out there? You could use find to find all the files modified in the last 24 hours; however, this will get you a lot of false positives. Since what is really interesting are the active log files that are in the “/” and “/var” file systems, I can use dtrace to find them by running this script for a few hours:

/* Count appending writes to files on / and /var, ignoring writes by ksh to file descriptor 63. */
syscall::write:entry
/ ! (execname == "ksh" && arg0 == 63 ) &&
    fds[arg0].fi_oflags & O_APPEND &&
    (fds[arg0].fi_mount == "/" || fds[arg0].fi_mount == "/var" )/
{
        @logs[fds[arg0].fi_pathname] = count();
        logfiles[ fds[arg0].fi_pathname]++;
}
/* Print each such file once, when its counter has just reached 1. */
syscall::write:entry
/ logfiles[ fds[arg0].fi_pathname] == 1 &&
    ! (execname == "ksh" && arg0 == 63 ) &&
    fds[arg0].fi_oflags & O_APPEND &&
    (fds[arg0].fi_mount == "/" || fds[arg0].fi_mount == "/var" )/
{
        printf("%s %s", fds[arg0].fi_fs, fds[arg0].fi_pathname);
}

in half an hour gives me:

# dtrace -s /home/cjg/lang/d/log.d
dtrace: script '/home/cjg/lang/d/log.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  0   4575                      write:entry ufs /var/cron/log
  0   4575                      write:entry ufs /var/adm/wtmpx
  0   4575                      write:entry ufs /var/adm/sulog
  0   4575                      write:entry ufs /var/adm/messages
  0   4575                      write:entry ufs /var/apache2/logs/access_log
  0   4575                      write:entry ufs /var/svc/log/system-filesystem-autofs:default.log
  0   4575                      write:entry ufs /var/log/syslog
  0   4575                      write:entry ufs /var/log/exim/mainlog
^C

  /var/adm/messages                                                1
  /var/adm/sulog                                                   2
  /var/adm/wtmpx                                                   2
  /var/svc/log/system-filesystem-autofs:default.log                4 [1]
  /var/apache2/logs/access_log                                     7
  /var/log/exim/mainlog                                           28
  /var/log/syslog                                                 42
  /var/cron/log                                                 1677 [2]
# 

Clearly there is still scope for false positives, for example files in /var/tmp that are opened O_APPEND, or if you use a different shell, but it gives a very good starting point.
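
For comparison, the find approach mentioned above might look something like this. It is only a sketch: recently_modified is a hypothetical helper name, and it assumes a find that supports the standard -xdev and -mtime primaries.

```shell
#!/bin/sh
# Hypothetical helper: list regular files modified within the last
# 24 hours under each directory given, without descending into other
# mounted file systems (-xdev).
recently_modified() {
    for dir in "$@"; do
        find "$dir" -xdev -type f -mtime -1 -print
    done
}

# Example (run as root to see everything):
#   recently_modified / /var
```

Unlike the dtrace script, this lists every file that changed recently, not just those being appended to, which is where the false positives come from. If “/var” is not a separate file system its files will be listed twice.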



[1] The autofs log file has been written to thanks to me using the well hidden feature of being able to turn automounter verbose mode on and off by accessing, as root, the file “=v” in the root of an indirect mount point. Typically this is “/net/=v”. Equally you can set the trace level by accessing “/net/=N” where N is any integer.

[2] Cron is so busy as I am still running the test jobs for the timezone aware cron.

Wednesday Feb 14, 2007

Testing timezone enabled cron

My proof of concept for cron to understand timezones has moved to being closer to production code and I'm in the process of trying to get it put back in to Solaris. The biggest sticking point at the moment appears to be that the UNIX standards all say you can't do this in a crontab file.

Meanwhile following Adam's point that it's tested or it's broken I have started work on a test suite to test the new functionality in the blind optimism that a way around the standards can be found.

First I wrote a script that understands the five time fields used by cron and a timezone and can tell you if it is running when it should not. It is called crontest and can be run from cron like this:

15,45 * 1-18,20 2 * exec lang/sh/crontest 15,45 \* 1-18,20 2 \* Africa/Addis_Ababa

When it thinks things have gone wrong it prints an error with the process id and the parent process id so that it can be tied back to the entry in /var/cron/log.
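
The crontest script itself is not shown, so the following is only a sketch of the kind of check it has to make, written in POSIX sh. field_matches is a hypothetical function name. It handles just the forms that appear in these test crontabs: “*”, single numbers, ranges and comma-separated lists of those.

```shell
#!/bin/sh
# Hypothetical helper: succeed if VALUE is allowed by a cron time FIELD.
# usage: field_matches VALUE FIELD
field_matches() {
    value=$1
    rest=$2
    [ "$rest" = '*' ] && return 0
    while [ -n "$rest" ]; do
        part=${rest%%,*}              # first comma-separated element
        case $rest in
            *,*) rest=${rest#*,} ;;   # drop it and keep the remainder
            *)   rest= ;;
        esac
        case $part in
            *-*)                      # a range such as 1-18
                if [ "$value" -ge "${part%-*}" ] &&
                   [ "$value" -le "${part#*-}" ]; then
                    return 0
                fi
                ;;
            *)                        # a single number
                [ "$value" -eq "$part" ] && return 0
                ;;
        esac
    done
    return 1
}

# A job specified for minutes 23,25 that finds itself running at minute 50:
if ! field_matches 50 '23,25'; then
    echo "Bad min: Should be 23,25 is 50"
fi
```

A full test would apply the same check to each of the five fields, using the current time in the timezone passed as the last argument.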


Now I just need a crontab file that contains all the possible cron entries. Hmm that is not so easy. However if I can fill the crontab file with random entries I should get good coverage. So I have another script, crontab_create which generates random, but legal, crontab entries which call the crontest script. So I end up with a crontab that looks like this:


TZ=Asia/Tbilisi
42,50 0 1-25,27-28,30 1 * exec lang/sh/crontest 42,50 0 1-25,27-28,30 1 \* Asia/Tbilisi
37,45 0-21,23 1-20,22-23,25 * * exec lang/sh/crontest 37,45 0-21,23 1-20,22-23,25 \* \* Asia/Tbilisi
32,40 0-16,18-19,21 * 1 * exec lang/sh/crontest 32,40 0-16,18-19,21 \* 1 \* Asia/Tbilisi
27,35 * 1-10,12-13,15-21,23 1-11 * exec lang/sh/crontest 27,35 \* 1-10,12-13,15-21,23 1-11 \* Asia/Tbilisi
22,30,57,59 0-6,8-9,11-17,19 1-5,7-8,10-16,18-28,30 1-6,8-9,11 * exec lang/sh/crontest 22,30,57,59 0-6,8-9,11-17,19 1-5,7-8,10-16,18-28,30 1-6,8-9,11 \* Asia/Tbilisi
TZ=Asia/Tehran
17,25,52,54-56 0-1,3-4,6-12,14 2-3,5-11,13-23,25 1,3-4,6 * exec lang/sh/crontest 17,25,52,54-56 0-1,3-4,6-12,14 2-3,5-11,13-23,25 1,3-4,6 \* Asia/Tehran
12,20,47,49-51 1-7,9-19,21 * * * exec lang/sh/crontest 12,20,47,49-51 1-7,9-19,21 \* \* \* Asia/Tehran
7,15,42,44-46 0-2,4-14,16 1,3-13,15-22,24,26-27,29 1-2,4 * exec lang/sh/crontest 7,15,42,44-46 0-2,4-14,16 1,3-13,15-22,24,26-27,29 1-2,4 \* Asia/Tehran
2,10,37,39-41 0-9,11-18,20,22 1-8,10-17,19,21-22,24 1-9,11 * exec lang/sh/crontest 2,10,37,39-41 0-9,11-18,20,22 1-8,10-17,19,21-22,24 1-9,11 \* Asia/Tehran
5,32,34-36 0-4,6-13,15,17-18,20 1-3,5-12,14,16-17,19 1-4,6 * exec lang/sh/crontest 5,32,34-36 0-4,6-13,15,17-18,20 1-3,5-12,14,16-17,19 1-4,6 \* Asia/Tehran
TZ=Asia/Thimphu
0,27,29-31 1-8,10,12-13,15 1-7,9,11-12,14 * * exec lang/sh/crontest 0,27,29-31 1-8,10,12-13,15 1-7,9,11-12,14 \* \* Asia/Thimphu
22,24-26 * 1-2,4,6-7,9 1-3,5,7-8,10 * exec lang/sh/crontest 22,24-26 \* 1-2,4,6-7,9 1-3,5,7-8,10 \* Asia/Thimphu
17,19-21 0,2-3,5 1-2,4-26,28-30 2-3,5 * exec lang/sh/crontest 17,19-21 0,2-3,5 1-2,4-26,28-30 2-3,5 \* Asia/Thimphu
12,14-16 0-22 * * * exec lang/sh/crontest 12,14-16 0-22 \* \* \* Asia/Thimphu
7,9-11 * 1-16,18-20,22 1 * exec lang/sh/crontest 7,9-11 \* 1-16,18-20,22 1 \* Asia/Thimphu
TZ=Asia/Tokyo
2,4-6,56,59 0-12,14-16,18 1-11,13-15,17 1 * exec lang/sh/crontest 2,4-6,56,59 0-12,14-16,18 1-11,13-15,17 1 \* Asia/Tokyo
0-1,51,54 0-7,9-11,13 1-6,8-10,12 1-7,9-11 * exec lang/sh/crontest 0-1,51,54 0-7,9-11,13 1-6,8-10,12 1-7,9-11 \* Asia/Tokyo
46,49,55 0-2,4-6,8 1,3-5,7 1-2,4-6,8 * exec lang/sh/crontest 46,49,55 0-2,4-6,8 1,3-5,7 1-2,4-6,8 \* Asia/Tokyo
41,44,50 0-1,3 2 1,3 * exec lang/sh/crontest 41,44,50 0-1,3 2 1,3 \* Asia/Tokyo
36,39,45 0 1 * * exec lang/sh/crontest 36,39,45 0 1 \* \* Asia/Tokyo
TZ=Asia/Ulaanbaatar

I needed some tuning so it would not be starting more than 100 jobs in any one minute, or at least not every minute, as cron would then delay some jobs, which in turn throws up false positives. I now have it with 551 timezones and 2756 cron entries. There is probably (certainly) an optimization to be made to the script that generates the cron entries so that they only run in the current month, or at least for the likely length of the test.
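
The crontab_create script is not shown here. As a much simplified sketch of the idea, assuming an awk with srand() and rand(), the hypothetical random_entry function below emits one random but legal entry that passes its own time fields to crontest. Unlike the real entries it picks single values rather than lists and ranges, and it limits the day of the month to 1-28 so that every entry can actually fire.

```shell
#!/bin/sh
# Hypothetical helper: print one random crontab line for timezone TZNAME.
# usage: random_entry SEED TZNAME
random_entry() {
    awk -v seed="$1" -v tz="$2" 'BEGIN {
        srand(seed)
        min  = int(rand() * 60)        # 0-59
        hour = int(rand() * 24)        # 0-23
        dom  = 1 + int(rand() * 28)    # 1-28, legal in every month
        mon  = 1 + int(rand() * 12)    # 1-12
        # The day-of-week field is left as "*"; it is passed to the
        # test script as \* so the shell does not expand it.
        printf "%d %d %d %d * exec lang/sh/crontest %d %d %d %d \\* %s\n",
            min, hour, dom, mon, min, hour, dom, mon, tz
    }'
}

# Emit a TZ line followed by one entry for each of two timezones.
n=0
for tz in Asia/Tbilisi Asia/Tokyo; do
    n=$((n + 1))
    echo "TZ=$tz"
    random_entry "$n" "$tz"
done
```

Seeding from a counter keeps a run repeatable on a given awk, which helps when a failure needs to be reproduced.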


It has been running for 24 hours on my home server. I can see the entries run from the log and I've not had any email telling me things have been submitted at the wrong time. So I am reasonably happy. Just to prove that it does what I think it does I added a line where the arguments to the script did not match the time specification of an entry that is running. Sure enough I get email (in the crontab entry I changed the 25 minute to 50):


Your "cron" job on pearson
exec lang/sh/crontest 23,25 0-11,13-18,20 1-10,12-17,19 \* \* GMT

produced the following output:

17182:24396: Bad min: Should be 23,25 is 50:

This covers the easy cases to test that jobs that run run at the correct time. The harder part is to verify that every job that should run does run. Need to think that one through.

Tuesday Jan 23, 2007

Multiple time zones for cron

I've never worked out why things happen in pairs, but last Friday two timezone related issues came up. One, a server running in some US timezone while serving the whole world, prompted my blog post suggesting that you run your servers in GMT. However there was also a question about cron, which featured in the follow-up to that post: cron runs in the timezone of the system, not the timezone of the user or an arbitrary timezone chosen by the user. Would it not be cool to be able to specify a timezone in your crontab file? Better yet, to specify multiple timezones in a crontab file, e.g.:

TZ=US/Pacific
* 11 * * * (/usr/bin/date ; /usr/bin/date -u) > /tmp/cron.out
* 19 * * * (/usr/bin/date ; /usr/bin/date -u) > /tmp/cron.out19
* 15 * * * (/usr/bin/date ; /usr/bin/date -u) > /tmp/cron.out15
* 8 * * * (/usr/bin/date ; /usr/bin/date -u) > /tmp/cron.out8
TZ=GMT
* 15 * * * (/usr/bin/date ; /usr/bin/date -u) > /tmp/cron.out15gmt


So the first four crontab lines would run in US/Pacific and the last in GMT. Each user can use as many timezones as they wish and the TZ environment variable is propagated to the child.


Now as it happens I am travelling, so that has meant a few hours in airports, and I got wondering how easy it would be to get cron to be more timezone friendly.

The answer to this is that it takes just over one Tecra M2 battery life to get it working. Clearly this needs some more testing, and the crontab command needs to validate the TZ strings rather than just pass them through, but the diffs are here. The output file shows one of the runs from US/Pacific:

6 # cat /tmp/cron.out8
Tue Jan 23 08:25:00 PST 2007
Tue Jan 23 16:25:00 GMT 2007
7 # 

Now I need to file an RFE and see if this can be putback.

About

This is the old blog of Chris Gerhard. It has mostly moved to http://chrisgerhard.wordpress.com
