By user12625760 on Jan 28, 2008
My home server continues to run the timezone-enabled cron daemon, but after the last upgrade to build 81 I started getting mails to root saying:
Your "cron" job on pearson
/tank/fs/local/snapshot minute tank/fs

produced the following output:

/tank/fs/local/snapshot: zfs: not found
Which was odd, as the script had worked perfectly for years (well, months). So why did root's path no longer contain “/usr/sbin”? Here I made a big mistake: I assumed (always a bad thing) that the bug was introduced by my code. Needless to say, the timing could not have been worse. I had just put the “final” code changes out for code review, so finding a new bug in that code was a real fly in the ointment. Then, to add more confusion, if you used crontab -e to edit the crontab, for example to add a cron job like this:
* * * * * echo $PATH > /tmp/.root_path
to help debug the problem, the problem would go away, at which point you would forget about it until you rebooted the system (to help diagnose 6653187), and then the job would run with the wrong PATH again.
After a few minutes staring at the code it was obvious what was wrong: we were using an uninitialized variable to choose which PATH to use. The question was, what had I done to cause this? I then spent a few hours staring at the code, running it under libumem and under the debugger, to see how I could have introduced the bug. I could not see how this could ever have worked. Finally I decided to check whether there had been any recent changes to cron, in the hope that this was not my fault. So it was off to Martin's “Mercurial for TeamWare users” page to find out how to do this with Mercurial:
changeset: 5581:aa8f6b1ea400
user: basabi
date: Mon Dec 03 14:32:45 2007 -0800
summary: 6636777 *cron* coredumps on NULL home directory

changeset: 5558:0976be4b75d2
user: basabi
date: Thu Nov 29 21:09:22 2007 -0800
summary: 6416652 *cron* suffers from amnesia if name services aren't there at boot time

changeset: 1315:45f0335a274a
user: basabi
date: Tue Jan 24 07:11:42 2006 -0800
summary: 6270017 cron/at-jobs log warning about not obtaining latest contract from popen(3c)
Perhaps one of the two most recent putbacks introduced the bug. Time to try the unmodified cron binary (yes, the time to try that binary was hours ago, but there is no point in being smart after the event). Sure enough, the bug is there, so I did not introduce it. Time to file the bug and move on.
Bug ID: 6655359
Synopsis: cron assumes malloc returns zeroed memory and then sets root's path by luck rather than judgement
Introduced by: 6416652 *cron* suffers from amnesia if name services aren't there at boot time
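For anyone who has not been bitten by this class of bug, here is a minimal sketch of the pattern. It is not the cron source: the `struct usr` layout, the `usr_new` and `choose_path` functions and the two PATH strings are all made up for illustration. The broken form appears only in a comment, since actually reading memory that malloc has not initialized is undefined behaviour.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-user record; NOT the real structure used by cron. */
struct usr {
    char name[32];
    int  is_root;   /* decides which default PATH a job gets */
};

/* Illustrative values only, not cron's actual defaults. */
#define ROOT_PATH "/usr/sbin:/usr/bin"
#define USER_PATH "/usr/bin:"

/*
 * The broken pattern looks like this:
 *
 *     struct usr *u = malloc(sizeof (*u));
 *     snprintf(u->name, sizeof (u->name), "%s", name);
 *     return (u);            // is_root was never assigned
 *
 * malloc() makes no promise about the contents of the memory, so
 * is_root holds whatever was left in the heap. It may happen to be
 * right for months and then change after an unrelated putback.
 */

/* Safer version: zero the record, then set every field explicitly. */
struct usr *
usr_new(const char *name)
{
    struct usr *u;

    assert(name != NULL);
    u = calloc(1, sizeof (*u));     /* every byte starts as zero */
    if (u == NULL)
        return (NULL);
    snprintf(u->name, sizeof (u->name), "%s", name);
    u->is_root = (strcmp(name, "root") == 0);
    return (u);
}

/* Pick the default PATH from an initialized flag, not from luck. */
const char *
choose_path(const struct usr *u)
{
    return (u->is_root ? ROOT_PATH : USER_PATH);
}
```

With this sketch, `choose_path(usr_new("root"))` returns the root PATH on every run, whatever happened to be in the heap beforehand. Using calloc alone would hide the symptom; assigning the field explicitly is what makes the choice deliberate.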
Moral: always check the putback logs.