Sad shell programming problem

While I was away, an email like this ended up in my mailbox:

I have a question on scripting ...

I want a "timedwait" for a background process started from a shellscript, i.e. do something like:

while [ 1 ]; do
        start_bg_job &
        "timedwait $! 30" || kill -9 $!
        cleanup
done

i.e. wait for the background job termination - but no more than 30 secs, if it didn't finish then kill it hard.

There were a couple of replies, but none of them were “pure” shell scripts, and it was weeks ago, so who cares. Well, it nagged at me last night as I cycled home. There has to be a way to do this from the Korn shell, and as usual the answer came to me just outside Fairoaks Airport (that is, the answer came to me, as usual, while cycling home, rather than outside the airport. If all the answers came to me outside the airport I would be there a lot).


The trick (hack may be more appropriate) is to use a co-process: start both the command you wish to run and a sleep command in the background, inside a subshell that is the co-process. When each of the commands returns, the subshell echoes an instruction back to the parent process, which reads from the co-process and takes the appropriate action.


So I ended up with this shell function:


#!/bin/ksh -p

# run_n_wait "command" [timeout]
function run_n_wait
{
        typeset com command time pid arg

        command="$1"
        time=${2:-60}

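        # Co-process: run the command and a timer side by side. Each one
        # reports back to the parent with a keyword (pid, Done or kill).
        ( ( ( $command ) > /dev/null 2>&1 & \
                echo pid $! ; wait $! ; \
                echo Done $? ) & \
         (sleep $time ; echo kill ) & ) |&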

         while read -p com arg
         do
                case $com in
                kill)  if [[ "${pid}" != "" ]]
                        then
                                kill ${pid} > /dev/null 2>&1
                                wait ${pid}
                        fi
                        return -1 ;;
                pid) pid=$arg ; arg="" ;;
                Done) return $arg ;;
                esac
        done
}


x=$SECONDS
run_n_wait "/bin/false" 3
echo Slept for $(($SECONDS - $x)) seconds ret $?
x=$SECONDS
run_n_wait "sleep 5 " 
echo Slept for $(($SECONDS - $x)) seconds ret $?
x=$SECONDS
run_n_wait "sleep 60" 3
echo Slept for $(($SECONDS - $x)) seconds ret $?

Yes, there are lots of shells that could just do this out of the box, but that was not the question. If you have a better way (Korn shell or Bourne shell only), let me know.


Comments:

Well it's not better, but how about:

while [ 1 ]; do
        start_bg_job &
        sleep 30
        kill -9 $!
        cleanup
done

Which is one way I've solved this problem (and it works fine for things that tend to wait forever if they go wrong). For CPU-bound programs, just setting the CPU limit and letting the OS wipe them out is also a possible solution.

(The downside to this solution is that, because it unconditionally kills the pid, there's no guarantee that the real process hasn't finished and the pid has been taken by another process [the pids having wrapped around]. But for short waits, it works fine.)

Posted by Peter Tribble on September 01, 2005 at 02:17 PM BST #
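
The CPU-limit idea in the comment above can be sketched as follows. This is only an illustration under some assumptions: run_cpu_limited is a hypothetical helper name, not something from the post, and it relies on the shell's ulimit builtin accepting -t (CPU seconds), which the Korn shell does but which is not guaranteed for every Bourne-style shell. It only helps for jobs that burn CPU; a job that blocks forever never reaches the limit.

```shell
#!/bin/sh
# run_cpu_limited SECONDS COMMAND [ARGS...]   (hypothetical helper)
# Run COMMAND with a CPU-time limit and let the OS kill it if it overruns.
run_cpu_limited() {
    cpu_secs=$1
    shift
    (
        # The limit is set inside a subshell, so the calling shell is unaffected.
        ulimit -t "$cpu_secs" || exit 125
        exec "$@"
    )
}

# A busy loop that would otherwise spin forever.
run_cpu_limited 1 sh -c 'while :; do :; done'
echo "exit status $?"
```

A status above 128 indicates the job was ended by a signal rather than exiting on its own.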

The other problem with that solution is that you always sleep for 30 seconds even if the job completes in less.

Posted by Chris Gerhard on September 02, 2005 at 07:07 AM BST #
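
One way round the always-sleep-30-seconds problem, without a co-process, is to poll. The following is a minimal sketch, assuming one-second granularity is good enough; timed_wait and its tw_ variables are hypothetical names, and the 255 status for a timeout is an arbitrary choice. It narrows the PID-reuse window mentioned above but does not close it.

```shell
#!/bin/sh
# timed_wait PID SECONDS   (hypothetical helper, not from the post)
# Poll once a second until PID exits or SECONDS have elapsed.
timed_wait() {
    tw_pid=$1
    tw_limit=$2
    tw_waited=0
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    while kill -0 "$tw_pid" 2>/dev/null; do
        if [ "$tw_waited" -ge "$tw_limit" ]; then
            kill -9 "$tw_pid" 2>/dev/null
            wait "$tw_pid" 2>/dev/null    # reap the killed job
            return 255                    # arbitrary "timed out" status
        fi
        sleep 1
        tw_waited=$((tw_waited + 1))
    done
    wait "$tw_pid"                        # job finished: return its exit status
}

sleep 2 &
timed_wait $! 10
echo "returned with status $?"
```

The cost is up to a second of extra latency after the job finishes, which is the trade-off for staying within plain Bourne-style syntax.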

Here's another way that doesn't use a co-process, although it does "borrow" fd 9. This is a variation of a script I wrote a while back that does a "timed read", i.e. prompt a user for input but bail out after N seconds.

I flipped the args to run_n_wait around so the command could get its own args properly (e.g. with whitespace and all the other icky stuff).

#!/bin/ksh

function run_n_wait
{
    typeset -i delay=$1
    shift

    ###
    # Run the command in the background and get the PID.
    ###

    "$@" &
    typeset -i cmd_pid=$!

    ###
    # Move stderr out of the way so if the kill in the sleep job goes
    # off we do not get a "Terminated" message written.
    ###

    exec 9>&2 2>/dev/null

    ###
    # Start the sleep job that waits for the requested amount of time
    # then kills the background command.
    ###

    ( sleep $delay ; kill -9 $cmd_pid ) < /dev/null > /dev/null 2>&1 &
    typeset -i sleep_pid=$!

    ###
    # Wait for either the background process to terminate normally or
    # to have been killed by the sleep job.
    ###

    wait $cmd_pid
    typeset -i cmd_status=$?

    ###
    # Get rid of the sleep job if it is still around.
    ###

    kill $sleep_pid
    wait $sleep_pid

    ###
    # Put stderr back the way it should be and return the status from
    # the background command.
    ###

    exec 2>&9 9>&-
    return $cmd_status
}

x=$SECONDS
run_n_wait 3 /bin/false
echo Slept for $(($SECONDS - $x)) seconds ret $?
# ptree $$

x=$SECONDS
run_n_wait 10 sleep 5
echo Slept for $(($SECONDS - $x)) seconds ret $?
# ptree $$

x=$SECONDS
run_n_wait 3 sleep 10
echo Slept for $(($SECONDS - $x)) seconds ret $?
# ptree $$

x=$SECONDS
run_n_wait 3 echo "a    string    with      whitespace"
echo Slept for $(($SECONDS - $x)) seconds ret $?
# ptree $$

exit 0

Posted by John R. Jackson on September 14, 2005 at 04:27 PM BST #
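
The point about flipping the arguments so the command gets its own args properly can be seen in isolation. This is a small sketch with hypothetical helper names (count_args, run_string, run_args): passing the command as one string and expanding it unquoted splits on whitespace and treats quote characters as literal text, while "$@" hands each argument through untouched.

```shell
#!/bin/sh
# All three helpers are hypothetical, for illustration only.
count_args() { echo "$#"; }

# Command passed as a single string and expanded unquoted, as in the
# original run_n_wait: the shell splits it on whitespace.
run_string() {
    cmd_string=$1
    $cmd_string
}

# Command passed as separate arguments, as in the version above.
run_args() { "$@"; }

run_string 'count_args "a   b" c'    # the quotes are just characters here
run_args count_args "a   b" c        # "a   b" stays one argument
```

The first call reports three arguments and the second reports two.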

About

This is the old blog of Chris Gerhard. It has mostly moved to http://chrisgerhard.wordpress.com
