Introducing run-one and run-this-one

I love cronjobs!  They wake me up in the morning, fetch my mail, back up my data, sync my mirrors, update my systems, check the health of my hardware and RAIDs, transcode my MythTV recordings, and so many other things...

The robotic precision of cron ensures that each scheduled job runs on time, every time.  But it doesn't check that the previous run of that same job has completed first -- and that can cause big trouble.

This often happens to me when I'm traveling and my backup cronjob fires while I'm on a slow uplink.  It's no good when an hourly rsync takes longer than an hour to run.  That has sent my system into a nasty downward spiral, with 2 or 3 or 10 rsyncs soon all running at the same time.  Dang.

For this reason, I found myself putting almost all of my cronjobs in a wrapper script that manages and respects a pidfile lock, in the typical UNIX sysvinit daemon fashion.  That quickly led to duplicated (and sometimes buggy) lock-handling code spread across my workstations and servers...  :-/
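A minimal sketch of that kind of pidfile wrapper, with hypothetical file names and a placeholder job, might look like this:

```shell
#!/bin/sh
# Hypothetical pidfile-locking wrapper of the sort described above;
# the pidfile path and the job itself are illustrative, not my actual scripts.
PIDFILE="${TMPDIR:-/tmp}/backup-job.pid"

# If a pidfile exists and that process is still alive, another copy is running.
if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo "previous run still active, exiting" >&2
else
    echo $$ > "$PIDFILE"          # take the lock
    # ... the real job (rsync, etc.) would run here ...
    rm -f "$PIDFILE"              # release the lock
    echo "job finished"
fi
```

Note the classic weakness of this approach: if the script is killed between taking and releasing the lock, a stale pidfile is left behind, which is exactly the sort of handling that kept getting duplicated and subtly broken.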

I'm proud to say, however, that I have now solved this problem on all of my servers, at least for myself, and perhaps for you too!

In Ubuntu 11.04 (Natty), you can now find a pair of utilities in the 'run-one' package: 'run-one' and 'run-this-one'.

run-one

You can simply prepend the 'run-one' utility to any command (just like 'time' or 'sudo').  The tool calculates the md5sum $HASH of the command and its arguments ($@), and then tries to obtain a lock on a file in $HOME/.cache/$HASH using flock(1).  If it obtains the lock, your command is executed, and the lock is released when it finishes.  If not, another copy of your command is already running, and it quietly exits non-zero.
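You can reproduce that scheme by hand with md5sum(1) and flock(1); the exact hashing and lock path below are my approximation of the layout, not necessarily run-one's:

```shell
# Compute a hash of the command + arguments, roughly as run-one does
# (the exact hash input and lock path here are assumptions)
CMD="rsync -azP $HOME example.com:/srv/backup"
HASH=$(printf '%s' "$CMD" | md5sum | cut -d' ' -f1)

LOCKDIR="${HOME:-/tmp}/.cache"
mkdir -p "$LOCKDIR"
LOCKFILE="$LOCKDIR/$HASH"

# flock -n: take the lock if it is free, fail immediately if it is not
if flock -n "$LOCKFILE" true; then
    echo "lock was free, command would run"
fi
```

The key point is that the lock file name is derived from the full invocation, so two *different* commands never contend for the same lock, while two identical invocations always do.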

I can now be safely assured that there will only ever be one copy of this cronjob running on my local system as $USER at a time:
  0 * * * *   run-one rsync -azP $HOME example.com:/srv/backup >> $HOME/backup.log

If a copy of "rsync -azP $HOME example.com:/srv/backup" is already running, subsequent calls of the same invocation will quietly exit non-zero.
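You can watch that refusal happen with flock(1) directly; this little demo (lock path illustrative) holds a lock in the background, then shows the second copy failing fast:

```shell
LOCKFILE="${TMPDIR:-/tmp}/demo.lock"

# First invocation grabs the lock and holds it for a couple of seconds
flock -n "$LOCKFILE" sleep 2 &

sleep 1   # give the first copy time to acquire the lock

# Second invocation finds the lock held and exits non-zero immediately
if flock -n "$LOCKFILE" true; then
    second="ran"
else
    second="refused"
fi
echo "second copy: $second"

wait      # let the first copy finish and release the lock
```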

run-this-one

'run-this-one' is a slightly more forceful take on the same idea.  Using pgrep(1), it first finds and kills any matching invocations owned by the user in the process table, and then proceeds exactly as 'run-one' does (establishing the lock and executing your command).
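The kill-first step can be sketched with pgrep(1) and kill(1); this is my approximation of the behaviour, not run-this-one's actual code:

```shell
# Approximate sketch of run-this-one's first step (assumed behaviour):
# kill any of this user's processes whose full command line exactly
# matches the command we are about to run.
CMD="sleep 300"

# -u: this user only; -f -x: match the entire command line exactly
pgrep -u "$(id -un)" -f -x "$CMD" | while read -r pid; do
    kill "$pid"
done

# ...at this point it would continue as run-one does:
# take the flock and execute "$CMD" fresh.
```

The exact match (-x against the full command line) matters: it keeps the sweep from killing unrelated processes that merely contain the same substring.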

I rely on a handful of ssh(1) tunnels and proxies, but I often suspend and resume my laptop many times a day, which can cause those ssh(1) connections to go stale and hang around for a while before the connection times out.  For these, I want to kill any old instances of the command+arguments invocation, and then start a fresh one.

Now, I use this snippet in a wrapper script to establish my ssh(1) SOCKS proxy and a pair of local port forwards for my squid(1) and bip(1) proxies:
  run-this-one ssh -N -C -D 1080 -L 3128:localhost:3128 -L 7778:localhost:7778 example.com

Have you struggled with this before?  Do you have a more elegant solution?  Would you use 'run-one' and/or 'run-this-one' to solve a similar problem?

Cheers,
:-Dustin
