A */1 rsync took our staging box to a load average of 41 one afternoon, and it took me longer than I want to admit to understand why. The sync normally finished in about twenty seconds. That day the backup target's NFS mount went sluggish, the sync started taking ninety seconds, and cron — which does not know or care whether the last run is still going — launched a fresh copy every single minute on top of it. Inside ten minutes there were a half-dozen rsyncs all reading the same tree off the same slow disk, each one making the disk slower, each new minute adding another. The box wasn't under attack. It was attacking itself, one polite copy at a time.
The fix is to make the job refuse to run while a copy of itself is already running. People reach for a PID file first — write $$ to /var/run/job.pid, check if it exists on the next run — and it almost works, until the day a run gets kill -9'd or the box reboots mid-job and leaves a stale PID file behind. Now every future run sees a "lock" pointing at a process that died on Tuesday, and the job never runs again. There's also a quiet race between checking the file and writing it. PID files are a lock you have to remember to clean up, and the times you most need the lock are exactly the times cleanup didn't happen.
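flock has none of that. The lock isn't a file you create and delete; it's a lock the kernel holds on an open file descriptor, and the kernel releases it automatically once that descriptor is closed. The process exiting closes the descriptor. So does crashing. So does kill -9. The lock file itself stays on disk, but an unlocked file blocks nothing, so there is no state to clean up. The one thing to watch is a background child that inherits the descriptor: it keeps the lock alive until it exits too.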
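One way to check that claim on a Linux box with util-linux flock installed is the short experiment below. It is a sketch rather than part of the real job: the temporary lock file, the 30-second sleep standing in for real work, and the variable names are all placeholders.

```shell
#!/usr/bin/env bash
# Experiment: does the lock survive kill -9? (All names here are placeholders.)
lock="$(mktemp)"

# Holder: open the lock file on fd 9, take the lock, then become a long sleep.
# `exec sleep` keeps the same PID, so killing $holder kills the lock owner.
bash -c 'exec 9>"$1"; flock -n 9 || exit 1; exec sleep 30' _ "$lock" &
holder=$!
sleep 1   # give the holder time to acquire the lock

# While the holder is alive, a second non-blocking attempt is refused.
while_held=0
flock -n "$lock" true || while_held=$?

# Kill the holder the rude way; it gets no chance to clean anything up.
kill -9 "$holder"
wait "$holder" 2>/dev/null || true

# The kernel closed fd 9 when the process died, so the lock should be free.
after_kill=0
flock -n "$lock" true || after_kill=$?

echo "while held: $while_held, after kill -9: $after_kill"
rm -f "$lock"
```

A non-zero status while the holder is alive and a zero status after the kill is the behaviour the paragraph above describes; a PID file would have been left pointing at a dead process at that point.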
The single-instance pattern
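A minimal sketch of such a script follows. The exec and flock lines are the ones this section discusses; the LOCK_FILE default, the run_job function, and the commented-out rsync paths are illustrative assumptions, not the original job.

```shell
#!/usr/bin/env bash
# Single-instance cron script (sketch).
# Assumption: the lock lives in a temp directory here so the sketch runs
# anywhere; on a server a path such as /run/lock/sync.lock is more typical.
LOCK_FILE="${LOCK_FILE:-${TMPDIR:-/tmp}/sync.lock}"

# Open the lock file on fd 200. The descriptor stays open until the
# script exits, however it exits.
exec 200>"$LOCK_FILE"

# Try to take an exclusive lock without waiting.
if ! flock -n 200; then
    echo "another run holds $LOCK_FILE, skipping" >&2
    exit 0   # a skipped run is normal, not a failure
fi

# Placeholder for the real work.
run_job() {
    # rsync -a /srv/data/ /mnt/backup/data/   # hypothetical paths
    echo "job ran under lock"
}

run_job
```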
The two lines doing the work are exec 200>"$LOCK_FILE" and flock -n 200. The first opens the lock file on a descriptor that stays open for the life of the process. The second tries to grab the lock without waiting; if a sibling process already holds it, flock returns non-zero, we log it to stderr and exit 0 — a skipped run is normal, not an error, so we don't want it lighting up cron's mail.
Notice there is no cleanup. No trap to remove a PID file, no rm at the end. When this script exits for any reason, fd 200 closes and the lock is gone. That "for any reason" is the whole point: the failure modes that strand a PID file are the ones flock shrugs off.
If you don't want to edit the script
You can lock a command without touching it at all, straight from the crontab line:
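A crontab entry of roughly the shape shown in the first comment below would do it; the script path /usr/local/bin/sync.sh is an assumption. The rest of the block is a small simulation of two overlapping minutes, using a throwaway lock file and sleep in place of the real sync.

```shell
#!/usr/bin/env bash
# Crontab entry (script path is a placeholder):
#   */1 * * * * flock -n /run/lock/sync.lock /usr/local/bin/sync.sh

# Simulation of what that entry does when runs overlap.
lock="$(mktemp)"

# "Minute 1": a slow run that is still going when the next one fires.
flock -n "$lock" sleep 3 &
sleep 1   # let it take the lock

# "Minute 2": the lock is busy, so flock exits without running the command.
second_status=0
flock -n "$lock" echo "second run executed" || second_status=$?

# Once the slow run has finished, the next attempt goes through.
wait
third_status=0
flock -n "$lock" echo "third run executed" || third_status=$?

echo "second: $second_status, third: $third_status"
rm -f "$lock"
```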
With -n, flock runs sync.sh only if it can grab /run/lock/sync.lock; if a previous minute's run is still holding it, this minute's flock exits immediately with status 1 and never starts the script. This is the fastest retrofit for a job that's already misbehaving: you don't even have to redeploy the script.
-n skips; -w 30 waits up to 30 seconds then gives up. Pick -n for frequent jobs where a skipped run is harmless (a metrics push, a sync that catches up next minute), and -w for jobs that must eventually run but can tolerate a short queue. Never use a bare flock with no -n and no -w on a fast cron: it blocks until the lock is released, however long that takes, so your "skipped" runs quietly pile up as waiting processes.
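The difference between a wait that is too short and one that is long enough can be seen in a few lines. This is a sketch with made-up timings: a 3-second sleep plays the running job, and the 1-second and 5-second waits are chosen only so that one times out and the other does not.

```shell
#!/usr/bin/env bash
# -w in practice (timings are illustrative, not recommendations).
lock="$(mktemp)"

# A running job that holds the lock for about 3 seconds.
flock -n "$lock" sleep 3 &
sleep 1   # let it take the lock

# Wait at most 1 second: the holder needs longer, so flock gives up.
short_wait=0
flock -w 1 "$lock" true || short_wait=$?

# Wait at most 5 seconds: the holder finishes first, so this one runs.
long_wait=0
flock -w 5 "$lock" true || long_wait=$?

wait
echo "short wait: $short_wait, long wait: $long_wait"
rm -f "$lock"
```

A timed-out -w reports the same non-zero status as a refused -n, so the calling script can treat both as a skipped run.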
The load-41 afternoon ended the moment I wrapped that rsync in flock -n. The slow NFS mount was still slow, but now exactly one sync ran at a time and the extras skipped harmlessly until the disk recovered. The interesting part is that locking didn't fix the slow disk — it stopped a transient slow disk from turning into a self-inflicted outage. That's the difference between a script that works when you run it and a script that survives unattended.
A lock alone isn't enough, though. If the locked job itself hangs — a sync blocked on a dead socket that never returns — it holds the lock forever and every future run skips, so the job silently stops running and you find out days later. That's why locking pairs with bounding a command's runtime with timeout, and with retrying transient failures so a single network blip doesn't kill the run. The Hardened Cron Wrapper Generator stitches all three together for you, and the full reasoning for which jobs need which guard is in Bash Scripts That Survive Cron.
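As a rough idea of how the three guards fit together, here is a hand-rolled sketch. It is not the output of the generator mentioned above; the function names retry and guarded, the attempt count, and the delays are all hypothetical choices.

```shell
#!/usr/bin/env bash
# Lock + timeout + retry in one wrapper (sketch; names and numbers are made up).

# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Reruns COMMAND until it succeeds or ATTEMPTS is used up.
retry() {
    local attempts="$1" delay="$2" n=1
    shift 2
    until "$@"; do
        if (( n >= attempts )); then
            return 1
        fi
        n=$(( n + 1 ))
        sleep "$delay"
    done
}

# guarded LOCK_FILE LIMIT_SECONDS COMMAND...
# Skips if another run holds the lock; otherwise runs COMMAND with a
# per-attempt time limit, retrying up to 3 times, 5 seconds apart.
guarded() {
    local lock="$1" limit="$2"
    shift 2
    (
        # fd 9 is opened on the lock file for the life of this subshell.
        flock -n 9 || { echo "already running, skipping" >&2; exit 0; }
        # timeout kills a hung attempt, so the lock cannot be held forever.
        retry 3 5 timeout "$limit" "$@"
    ) 9>"$lock"
}

# Example call (paths are placeholders):
# guarded /run/lock/sync.lock 300 rsync -a /srv/data/ /mnt/backup/data/
```

The timeout limit applies to each attempt, so the worst case for one run is roughly the attempt count times the limit plus the delays; that total should stay comfortably under the cron interval or the next run will simply skip.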
Need to test this on a real box without risking a production server? Spin up a throwaway droplet, schedule the job at */1, and watch flock skip the overlaps. The rest of the unattended-automation toolkit is at bashsnippets.xyz — the Cron Job Builder for the schedule itself, and Bash error handling for the layer underneath all of this.