I need a locking mechanism for RMAN backups.
The script dbf.sh backs up a database and can run simultaneously for different databases, but not for the same one.
Hence running dbf.sh for PROD1 and PROD2 at the same time is VALID, while running it for PROD1 and PROD1 at the same time is NOT VALID.
While dbf.sh is running, arc.sh (the archivelog backup) should not be running for that database.
This was prompted by Laurent Schneider’s post, Can you restore from a full online backup?
http://laurentschneider.com/wordpress/2015/05/can-you-restore-from-a-full-online-backup.html
First test: while dbf.sh is running, arc.sh should not run.
The key is to base the lock on the database SID (PROD1), not on the script, so both scripts share one lock file per database.
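To make the per-SID idea concrete, here is a minimal sketch. The helper name run_locked and the LOCK_DIR variable are my own illustration, not part of the scripts in this post, and sleep/true stand in for dbf.sh and arc.sh. It assumes flock from util-linux is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper (not from the original scripts): run a command under
# a lock file named after the database SID.
# usage: run_locked SID WAIT_SECONDS command [args...]
LOCK_DIR=$(mktemp -d)   # the post uses /tmp; a scratch directory keeps the demo isolated

run_locked() {
    sid=$1
    wait_secs=$2
    shift 2
    # Same SID -> same lock file -> mutually exclusive.
    # Different SIDs -> different lock files -> no blocking.
    flock -w "$wait_secs" "$LOCK_DIR/$sid" "$@"
}

# Stand-in for a running dbf.sh PROD1: hold the PROD1 lock for 3 seconds.
run_locked PROD1 0 sleep 3 &
sleep 1   # give the background job time to take the lock

# Same SID, no waiting: refused while the lock is held.
same_sid=0
run_locked PROD1 0 true || same_sid=$?

# Different SID: separate lock file, runs immediately.
other_sid=0
run_locked PROD2 0 true || other_sid=$?

wait
rm -rf "$LOCK_DIR"
echo "PROD1 while locked: exit $same_sid, PROD2: exit $other_sid"
```

The second PROD1 call should come back with a non-zero exit status, while the PROD2 call succeeds, which is the VALID / NOT VALID rule from above.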
$ crontab -l
#22 20 * * * /media/sf_working/sh/ogg_lag_sec.sh hawklas 0 > /tmp/ogg_lag_sec.sh.log 2>&1
#04 11 * * 2 [ $(date +\%d) -ge 07 ] && /home/oracle/t.sh > /tmp/t.log
* * * * * [ $(date +\%d) -ge 07 ] && /usr/bin/flock -w 3600 /tmp/PROD1 /home/oracle/dbf.sh PROD1 >> /tmp/dbfPROD1.log 2>&1
* * * * * [ $(date +\%d) -ge 07 ] && /usr/bin/flock -w 3600 /tmp/PROD2 /home/oracle/dbf.sh PROD2 >> /tmp/dbfPROD2.log 2>&1
* * * * * [ $(date +\%d) -ge 07 ] && /usr/bin/flock -w 0 /tmp/PROD1 /home/oracle/arc.sh PROD1 >> /tmp/arcPROD1.log 2>&1
* * * * * [ $(date +\%d) -ge 07 ] && /usr/bin/flock -w 0 /tmp/PROD2 /home/oracle/arc.sh PROD2 >> /tmp/arcPROD2.log 2>&1
It looks like dbf.sh ran and arc.sh did not: the arc logs are empty.
$ ls -alrt /tmp/*PROD*
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:22 /tmp/PROD1
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:22 /tmp/PROD2
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:22 /tmp/arcPROD1.log
-rw-r--r--. 1 oracle oinstall 77 Jun 11 20:22 /tmp/dbfPROD1.log
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:22 /tmp/arcPROD2.log
-rw-r--r--. 1 oracle oinstall 77 Jun 11 20:22 /tmp/dbfPROD2.log
$ cat /tmp/dbfPROD1.log
Starting /home/oracle/dbf.sh PROD1 Thu Jun 11 20:22:01 PDT 2015
Sleeping 119
$ cat /tmp/dbfPROD2.log
Starting /home/oracle/dbf.sh PROD2 Thu Jun 11 20:22:01 PDT 2015
Sleeping 119
I kept monitoring, and arc.sh never ran because dbf.sh was always running and holding the lock.
$ ls -alrt /tmp/*PROD*
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:22 /tmp/PROD1
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:22 /tmp/PROD2
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:22 /tmp/arcPROD1.log
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:22 /tmp/arcPROD2.log
-rw-r--r--. 1 oracle oinstall 231 Jun 11 20:25 /tmp/dbfPROD2.log
-rw-r--r--. 1 oracle oinstall 231 Jun 11 20:25 /tmp/dbfPROD1.log
$ cat /tmp/dbfPROD1.log
Starting /home/oracle/dbf.sh PROD1 Thu Jun 11 20:22:01 PDT 2015
Sleeping 119
Starting /home/oracle/dbf.sh PROD1 Thu Jun 11 20:24:00 PDT 2015
Sleeping 119
Starting /home/oracle/dbf.sh PROD1 Thu Jun 11 20:25:59 PDT 2015
Sleeping 119
$ cat /tmp/dbfPROD2.log
Starting /home/oracle/dbf.sh PROD2 Thu Jun 11 20:22:01 PDT 2015
Sleeping 119
Starting /home/oracle/dbf.sh PROD2 Thu Jun 11 20:24:00 PDT 2015
Sleeping 119
Starting /home/oracle/dbf.sh PROD2 Thu Jun 11 20:25:59 PDT 2015
Sleeping 119
Looking good so far. But what happens when arc.sh is currently running and then dbf.sh is started?
It would be a shame for the dbf.sh backup to die just because arc.sh is running.
$ crontab -l
#22 20 * * * /media/sf_working/sh/ogg_lag_sec.sh hawklas 0 > /tmp/ogg_lag_sec.sh.log 2>&1
#04 11 * * 2 [ $(date +\%d) -ge 07 ] && /home/oracle/t.sh > /tmp/t.log
43 20 * * * [ $(date +\%d) -ge 07 ] && /usr/bin/flock -w 0 /tmp/PROD1 /home/oracle/arc.sh PROD1 >> /tmp/arcPROD1.log 2>&1
43 20 * * * [ $(date +\%d) -ge 07 ] && /usr/bin/flock -w 0 /tmp/PROD2 /home/oracle/arc.sh PROD2 >> /tmp/arcPROD2.log 2>&1
44 20 * * * [ $(date +\%d) -ge 07 ] && /usr/bin/flock -w 3600 /tmp/PROD1 /home/oracle/dbf.sh PROD1 >> /tmp/dbfPROD1.log 2>&1
44 20 * * * [ $(date +\%d) -ge 07 ] && /usr/bin/flock -w 3600 /tmp/PROD2 /home/oracle/dbf.sh PROD2 >> /tmp/dbfPROD2.log 2>&1
In /usr/bin/flock -w 3600, the -w 3600 means wait up to 3600 seconds for the lock; if it is still held after that, flock gives up and dbf.sh does not run.
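The difference between the -w values can be seen in isolation with a small sketch. The temporary lock file and the sleep/true commands are stand-ins I made up for this demo, and the timings are shortened from the crontab's 3600 seconds. It assumes the util-linux flock, where a timeout of 0 should behave like not waiting at all.

```shell
#!/bin/sh
# Demo lock file (hypothetical; the crontab uses /tmp/PROD1 and /tmp/PROD2).
lock=$(mktemp)

# Holder: stands in for a running backup, keeps the lock for about 4 seconds.
flock "$lock" sleep 4 &
sleep 1   # let the holder acquire the lock first

# -w 0: do not wait; flock exits non-zero and the command never runs.
nowait=0
flock -w 0 "$lock" true || nowait=$?

# -w 1: wait 1 second; the holder is still busy, so this times out too.
short=0
flock -w 1 "$lock" true || short=$?

# -w 10: wait up to 10 seconds; the holder finishes first, so the command runs.
long=0
flock -w 10 "$lock" true || long=$?

wait
rm -f "$lock"
echo "-w 0 exit: $nowait, -w 1 exit: $short, -w 10 exit: $long"
```

This is why arc.sh uses -w 0 (skip this run if a backup holds the lock) while dbf.sh uses a long timeout (queue up behind arc.sh instead of dying).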
Let’s test this.
$ date
Thu Jun 11 20:43:06 PDT 2015
$ ls -alrt /tmp/*PROD*
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:43 /tmp/PROD2
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:43 /tmp/PROD1
-rw-r--r--. 1 oracle oinstall 78 Jun 11 20:43 /tmp/arcPROD2.log
-rw-r--r--. 1 oracle oinstall 78 Jun 11 20:43 /tmp/arcPROD1.log
$ cat /tmp/arcPROD1.log
Starting /home/oracle/arc.sh PROD1 Thu Jun 11 20:43:01 PDT 2015
Sleeping 135s
$ cat /tmp/arcPROD2.log
Starting /home/oracle/arc.sh PROD2 Thu Jun 11 20:43:01 PDT 2015
Sleeping 135s
arc.sh started at 20:43 and sleeps for 135s, while dbf.sh is scheduled to run at 20:44.
20:43 + 135s takes us to roughly 20:45:15, which is well after the scheduled time for dbf.sh at 20:44.
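For the exact figure, the same sum can be done from the logged start time of 20:43:01 instead of the scheduled 20:43:00, which moves the result by one second. A quick sketch in shell arithmetic (the variable names are just for illustration):

```shell
#!/bin/sh
# Logged start of arc.sh (20:43:01) as seconds since midnight, plus its 135 s sleep.
start=$(( 20*3600 + 43*60 + 1 ))
end=$(( start + 135 ))

# Convert back to HH:MM:SS.
finish=$(printf '%02d:%02d:%02d' $(( end / 3600 )) $(( end % 3600 / 60 )) $(( end % 60 )))
echo "arc.sh should release the lock at about $finish"
```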
Let’s see if this works.
$ date
Thu Jun 11 20:45:33 PDT 2015
$ ls -alrt /tmp/*PROD*
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:43 /tmp/PROD2
-rw-r--r--. 1 oracle oinstall 0 Jun 11 20:43 /tmp/PROD1
-rw-r--r--. 1 oracle oinstall 78 Jun 11 20:43 /tmp/arcPROD2.log
-rw-r--r--. 1 oracle oinstall 78 Jun 11 20:43 /tmp/arcPROD1.log
-rw-r--r--. 1 oracle oinstall 75 Jun 11 20:45 /tmp/dbfPROD1.log
-rw-r--r--. 1 oracle oinstall 75 Jun 11 20:45 /tmp/dbfPROD2.log
$ cat /tmp/dbfPROD1.log
Starting /home/oracle/dbf.sh PROD1 Thu Jun 11 20:45:16 PDT 2015
Sleeping 1
$ cat /tmp/dbfPROD2.log
Starting /home/oracle/dbf.sh PROD2 Thu Jun 11 20:45:16 PDT 2015
Sleeping 1
dbf.sh started at 20:45:16, right after arc.sh completed and released the lock, instead of failing at its scheduled time of 20:44.
These are the simple scripts used for testing; you will need to adjust the sleep time for each test case.
$ cat dbf.sh
echo "Starting $0 $*" `date`
echo "Sleeping 1"
sleep 1
$ cat arc.sh
echo "Starting $0 $*" `date`
echo "Sleeping 135s"
sleep 135
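To avoid editing the scripts between test cases, the sleep time could be passed in as an argument instead. A rough sketch follows; the function name fake_backup and its arguments are hypothetical, not what I actually ran above.

```shell
#!/bin/sh
# Hypothetical stand-in for dbf.sh / arc.sh: the sleep time is a parameter
# rather than a value edited into the script for each test case.
fake_backup() {
    name=$1        # label to print, e.g. dbf.sh or arc.sh
    sid=$2         # database SID, e.g. PROD1
    secs=${3:-1}   # seconds to sleep; defaults to 1
    echo "Starting $name $sid $(date)"
    echo "Sleeping ${secs}s"
    sleep "$secs"
}

# Example: pretend to be dbf.sh for PROD1 with a 1-second "backup".
out=$(fake_backup dbf.sh PROD1 1)
echo "$out"
```

Saved as a script that calls the function with its own arguments, the same file could serve as both dbf.sh and arc.sh in the crontab tests.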
And there you have it.
Good Night.
Reference: https://ma.ttias.be/prevent-cronjobs-from-overlapping-in-linux/
