cron vs. 'at': run command at varying intervals with self-registering 'at'

I wanted to check every 6 hours whether our business-critical web server is up.

At first I thought I would run a ping every 6 hours via 'cron'. But I want to ping more often once the server is detected down, until it comes back up.

So I came up with this self-registering 'at' script. While the server is up, the script re-registers itself to run again in 6 hours. Once the server is detected down, it pings at incrementally increasing intervals: 1 minute, then 2 minutes, then 3, 4, 5... If you replace "+1" with "*2", it does exponential backoff instead: 1, 2, 4, 8, 16...
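$ cat ~/misc/myServerPing.at
# THISFILE should be a full path, or a path relative to $HOME
# Start it from bash with ". {this file}"
THISFILE=misc/myServerPing.at
INTERVAL=1

# Fetch the page and check that the expected text is present
curl --silent --connect-timeout 8 http://ourserver.sun.com | grep "Our critical page" > /dev/null

if [ $? -ne 0 ]; then
  # Server is down: send mail, then re-register a copy of this file
  # with INTERVAL incremented, to run after the current INTERVAL
  date | mailx -s "Failed ping to ourserver" my.mail.address@sun.com
  sed "s/^\(INTERVAL=\)[0-9][0-9]*$/\1$(($INTERVAL+1))/" $THISFILE | at now + $INTERVAL minutes > /dev/null 2>&1
else
  # Server is up: re-register the unmodified file (INTERVAL=1) to run in 6 hours
  at now + 360 minutes < $THISFILE > /dev/null 2>&1
fi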
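
To try the backoff logic without queuing any 'at' jobs, here is a minimal sketch that only computes the wait times and rewrites the INTERVAL line. The helper names next_interval and rewrite_interval are my own for illustration and are not part of the script above; the sketch assumes a POSIX shell and a standard sed.

```shell
#!/bin/sh
# Sketch only: next_interval and rewrite_interval are hypothetical helpers.

# Print the next wait time in minutes for the given mode and current value.
next_interval() {
  mode=$1
  current=$2
  case "$mode" in
    linear)      echo $(( $current + 1 )) ;;   # 1, 2, 3, 4, 5...
    exponential) echo $(( $current * 2 )) ;;   # 1, 2, 4, 8, 16...
    *)           echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Copy a script from stdin to stdout, replacing the line INTERVAL=<digits>
# with INTERVAL=<first argument>. All other lines pass through unchanged.
rewrite_interval() {
  sed "s/^INTERVAL=[0-9][0-9]*\$/INTERVAL=$1/"
}

# Print the first five wait times of each schedule.
for mode in linear exponential; do
  interval=1
  schedule=""
  for i in 1 2 3 4 5; do
    schedule="$schedule $interval"
    interval=$(next_interval "$mode" "$interval")
  done
  echo "$mode:$schedule"
done

# Show the line that would be piped to 'at' after a failure at INTERVAL=4.
printf 'INTERVAL=1\n' | rewrite_interval "$(next_interval exponential 4)"
```

Running it prints "linear: 1 2 3 4 5", "exponential: 1 2 4 8 16", and "INTERVAL=8". In the real script the output of the sed step goes to "at now + $INTERVAL minutes" instead of the terminal.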
Comments:

Isn't this the role of a tool like Nagios? What's the best one?

Posted by gerard on November 07, 2009 at 09:21 AM JST #

If the server could be down for 6 hours and you would not hear about it, it's not very mission critical. Typically, mission-critical stuff is checked every 1-5 minutes at least, everywhere I have worked.

Posted by James Dickens on November 07, 2009 at 03:40 PM JST #
