Thanks everyone for the feedback/ideas!
They reaffirm my initial thoughts that there was no single solution, but
collectively it appears that I can mitigate the risk.
Ultimately, as many have stated, the DBA team will need to be held accountable,
and they do insist they own their backups... until there is a problem :)
I will make a mental note to update the thread once a direction has been
determined in hopes others find it helpful.
~Rick
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Prather, Wanda
Sent: Sunday, March 08, 2015 5:19 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] DB2/Oracle backup reporting and scheduling
Rick,
It's a problem, everywhere, no matter how you do it.
* The simplistic answer to your question is yes; the external scheduler
is running a list of tasks; after the last task it could call a
perl/ksh/python/yourfavorite script that invokes dsmadmc and does a
delete/define schedule with a start time of "now". One drawback is that the
client has to be running in "prompted" mode; another is that, since the
schedule gets deleted and redefined, you would have to be able to notice the
*absence* of that schedule from your end. But see the gotcha in bullet 3 below.
* What might work better is to have the external scheduler's last task be
to fire a script that writes a checkpoint/log file. Have your TDP client
schedule kick off at the same time each night, and add a preschedule cmd script
that opens the checkpoint file and reads the timestamp. If it doesn't have
today's expected timestamp, close the file, sleep 10 minutes, rinse and repeat.
* However, even if you get one of those methods to work, it won't solve
the problem. I don't know about DB2, but unless something has changed recently
the only result you will get back from firing off the Oracle TDP with the TSM
scheduler is whether RMAN started or not; it won't tell you whether the backup
was successful or failed. The DBAs actually have to check the results of the
backup from RMAN, AFAIK. (Once years ago I got a Unix wizard to poke around
and write a script that parsed the actual RMAN output and sent email back to
me, but it's not something that's commonly done and would probably require
specific knowledge of that particular backup.)
* If your manager trusts your DBAs, there's nothing wrong with
distributed authority and making them responsible for their backup results;
most of my large customers do that. (Most good DBAs want that, anyway.)
* I have had customers where the DBAs were proved untrustworthy, and I
also resorted to perl/ksh/whatever scripts that did selects on the BACKUPS
table and started firing off emails to managers if backups were missing. (You
can usually get info from the backups table without too much pain if you
specify both the client node name and filespace name/id so the result table is
fairly small and TSM can use indexing to find the stuff.) I've even resorted
to perl scripts that query the activity log for messages from those clients,
and report those out via email.
* FWIW, in the new Operations Center, Tivoli development has approached
this by introducing the concept of "at risk". You specify in the TOC how often
the clients should back up (daily, weekly, etc.) and if even the TDP clients go
beyond that without a backup, they show up as "at risk". The 7.1.1 TOC has a
minimal email report that does show the at-risk TDPs. I haven't played with
it yet for TDPs to know whether it can distinguish between "time since last
contact" and "time since a successful RMAN backup". But you could look at it.
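
For what it's worth, the checkpoint-file idea in the second bullet could be
sketched roughly like this in Python. The file path, the one-line ISO date
format, and the six-hour cap are assumptions for illustration only, and the
sketch assumes the scheduler treats a nonzero exit from the preschedule cmd
as a reason not to start the backup. The external scheduler's last task
would write today's date into the same file.

```python
#!/usr/bin/env python3
"""Preschedule wait loop: block until the external scheduler's checkpoint is current."""
import sys
import time
from datetime import date

# Hypothetical values; adjust for your site.
CHECKPOINT_FILE = "/var/tmp/db_backup_ready.chk"
POLL_SECONDS = 600            # "sleep 10 minutes"
MAX_WAIT_SECONDS = 6 * 3600   # give up eventually so the failure is visible


def checkpoint_is_current(path, expected):
    """Return True if the first line of the file equals the expected stamp."""
    try:
        with open(path) as fh:
            return fh.readline().strip() == expected
    except OSError:
        # Missing or unreadable file: the external job has not finished yet.
        return False


def wait_for_checkpoint(path, expected, poll=POLL_SECONDS,
                        max_wait=MAX_WAIT_SECONDS, sleep=time.sleep):
    """Poll until the checkpoint is current; return False on timeout."""
    waited = 0
    while not checkpoint_is_current(path, expected):
        if waited >= max_wait:
            return False
        sleep(poll)
        waited += poll
    return True


def main():
    """Exit status 0 when the checkpoint shows today's date, 1 on timeout."""
    today = date.today().isoformat()   # e.g. "2015-03-08"
    return 0 if wait_for_checkpoint(CHECKPOINT_FILE, today) else 1

# When saved as the preschedule script, finish the file with:
#     sys.exit(main())
```

One thing to watch: if the schedule starts before midnight and the checkpoint
is written after it, "today" will not match, so the expected stamp may need to
be derived differently for backups that straddle midnight.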
Wanda Prather
TSM Consultant
ICF International Enterprise and Cybersecurity Systems Division
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Rick Adamson
Sent: Friday, March 06, 2015 12:12 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: [ADSM-L] DB2/Oracle backup reporting and scheduling
I assume someone has dealt with this; I would like to hear how they handled it.
The issue:
DB2 and/or Oracle database backups that are dependent on completion of external
processes.
Currently our DBAs utilize a variety of methods to initiate DB2 and Oracle
database backups (CRON, external schedulers, etc.), which makes it challenging
to confirm that they are being completed as expected. As a start, I proposed
creating a client schedule and using the TSM scheduler to trigger these events,
which would minimally provide a completed/missed/failed status. Complemented by
routine reporting of stored objects, it would give me some assurance that TSM
had what it needed to ensure their recovery.
The DBAs are pushing back (surprise!), claiming that "some" backups have
special requirements, such as not running during other tasks like payroll
processing, runstats, etc., so they use the external scheduler to set
"conditions" that are met before the backup is initiated.
The question posed to me is: can a TSM schedule be triggered by the external
scheduler once the conditions have been met?
I would be grateful to hear how others handle this, or if they use a different
approach altogether to ensure all DP database backups are completing on a
timely basis.
TIA
~Rick