Re: [ADSM-L] DB2/Oracle backup reporting and scheduling

2015-03-09 10:10:04
From: Rick Adamson <RickAdamson AT BILOHOLDINGS DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 9 Mar 2015 14:08:22 +0000
Thanks everyone for the feedback/ideas!
They reaffirm my initial thoughts that there is no single solution, but 
collectively it appears that I can mitigate the risk.
Ultimately, as many have stated, the DBA team will need to be held accountable, 
and they do insist they own their backups........ until there is a problem :)
I will make a mental note to update the thread once a direction has been 
determined in hopes others find it helpful.
  
~Rick   


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Prather, Wanda
Sent: Sunday, March 08, 2015 5:19 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] DB2/Oracle backup reporting and scheduling

Rick,
It's a problem, everywhere, no matter how you do it.

*       The simplistic answer to your question is yes. The external scheduler 
is running a list of tasks; after the last task it could call a 
perl/ksh/python/your-favorite script that invokes dsmadmc and does a 
delete/define schedule with a start time of "now".  One drawback is that the 
client has to be running in "prompted" mode. Another is that, since the 
schedule gets deleted and redefined, you would have to be able to notice the 
*absence* of that schedule from your end.  But see the gotcha in bullet 3 below.

*       What might work better is to have the external scheduler's last task 
fire a script that writes a checkpoint/log file.  Have your TDP client 
schedule kick off at the same time each night, and add a preschedule cmd script 
that opens the checkpoint file and reads the timestamp. If it doesn't have 
today's expected timestamp, close the file, sleep 10 minutes, rinse and repeat.

*       However, even if you get one of those methods to work, it won't solve 
the problem.  I don't know about DB2, but unless something has changed recently 
the only result you will get back from firing off the Oracle TDP with the TSM 
scheduler is whether RMAN started or not; it won't tell you whether the backup 
was successful or failed.  The DBAs actually have to check the results of the 
backup from RMAN, AFAIK.  (Once, years ago, I got a Unix wizard to poke around 
and write a script that parsed the actual RMAN output and sent email back to 
me, but it's not something that's commonly done and would probably require 
specific knowledge of that particular backup.)

*       If your manager trusts your DBAs, there's nothing wrong with 
distributed authority and making them responsible for their backup results; 
most of my large customers do that.  (Most good DBAs want that, anyway.)

*       I have had customers where the DBAs proved untrustworthy, and I 
also resorted to perl/ksh/whatever scripts that did selects on the BACKUPS 
table and started firing off emails to managers if backups were missing.  (You 
can usually get info from the backups table without too much pain if you 
specify both the client node name and filespace name/id so the result table is 
fairly small and TSM can use indexing to find the stuff.)  I've even resorted 
to perl scripts that query the activity log for messages from those clients, 
and report those out via email.

*       FWIW, in the new Operations Center, Tivoli development has approached 
this by introducing the concept of "at risk".  You specify in the TOC how often 
the clients should back up (daily, weekly, etc.), and if any client, even a TDP 
client, goes beyond that without a backup, it shows up as "at risk".  The 7.1.1 
TOC has a minimal email report that does show the at-risk TDPs.  I haven't 
played with it yet for TDPs, so I don't know whether it can distinguish between 
"time since last contact" and "time since a successful RMAN backup".  But you 
could look at it.
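
The checkpoint-file idea in the second bullet can be sketched roughly as 
follows. This is only an illustration, not a tested production script: it 
assumes the external scheduler's last task writes today's date in ISO form 
(e.g. 2015-03-09) as the first line of the checkpoint file, and the function 
names, the 10-minute poll and the 6-hour give-up limit are all made up for the 
example.

```python
import time


def checkpoint_is_current(path, expected):
    """Return True if the first line of the checkpoint file equals `expected`."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.readline().strip() == expected
    except FileNotFoundError:
        # The external scheduler has not written the file yet.
        return False


def wait_for_checkpoint(path, expected, poll_seconds=600,
                        max_wait_seconds=6 * 3600, sleep=time.sleep):
    """Poll the checkpoint file until it carries `expected` or time runs out.

    Returns True when the expected timestamp is found, False on timeout.
    `sleep` is injectable so the loop can be exercised without real waiting.
    """
    waited = 0
    while True:
        if checkpoint_is_current(path, expected):
            return True
        if waited >= max_wait_seconds:
            return False
        sleep(poll_seconds)      # "sleep 10 minutes, rinse and repeat"
        waited += poll_seconds
```

A preschedule script built on this would presumably call 
wait_for_checkpoint(path, datetime.date.today().isoformat()) and exit 0 on True 
and non-zero on False, so that a timeout surfaces as a problem on the TSM side 
instead of the backup silently starting early. The give-up limit is there so 
the scheduler session does not hang forever if the upstream job never finishes.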


Wanda Prather
TSM Consultant
ICF International Enterprise and Cybersecurity Systems Division





-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Rick Adamson
Sent: Friday, March 06, 2015 12:12 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: [ADSM-L] DB2/Oracle backup reporting and scheduling

I assume someone has dealt with this; I would like to hear how they handled it.

The issue:
DB2 and/or Oracle database backups that are dependent on completion of external 
processes.

Currently our DBAs use a variety of methods to initiate DB2 and Oracle 
database backups (cron, external schedulers, etc.), which makes it challenging 
to confirm that they are being completed as expected. As a start, I proposed 
creating a client schedule and using the TSM scheduler to trigger these events, 
which would at a minimum provide a completed/missed/failed status. Complemented 
by routine reporting of stored objects, it would give me some assurance that 
TSM had what it needed to assure their recovery.
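
For illustration, the completed/missed/failed reporting could look something 
like the sketch below. It assumes a TSM 6.x/7.x server where the EVENTS table 
exposes NODE_NAME, SCHEDULE_NAME, STATUS and SCHEDULED_START, and it uses the 
dsmadmc options -id, -password, -dataonly and -commadelimited. The SQL text and 
the helper names are my own assumptions and should be checked against the 
server level in use. It reports schedule status only, not whether the database 
backup itself succeeded.

```python
import csv
import io

# Assumed query: events from the last 24 hours that did not complete.
EVENTS_SQL = (
    "select node_name, schedule_name, status from events "
    "where scheduled_start > current_timestamp - 24 hours "
    "and status not in ('Completed', 'Future')"
)


def build_dsmadmc_argv(admin_id, password, sql=EVENTS_SQL):
    """Build the argument list for a one-shot, comma-delimited dsmadmc query."""
    return ["dsmadmc", f"-id={admin_id}", f"-password={password}",
            "-dataonly=yes", "-commadelimited", sql]


def parse_event_rows(output):
    """Turn comma-delimited dsmadmc output into a list of dicts.

    Lines that do not have exactly three fields (for example an ANR
    message saying nothing matched) are skipped.
    """
    rows = []
    for fields in csv.reader(io.StringIO(output)):
        if len(fields) != 3:
            continue
        node, schedule, status = (f.strip() for f in fields)
        rows.append({"node": node, "schedule": schedule, "status": status})
    return rows
```

The argument list would be handed to something like subprocess.run(..., 
capture_output=True, text=True), and the parsed rows mailed out or fed into 
whatever reporting is already in place.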

The DBAs are pushing back (surprise!), claiming that "some" backups have 
special requirements, such as not running during other tasks like payroll 
processing, runstats, etc., so they use the external scheduler to set 
"conditions" that must be met before the backup is initiated.

The question posed to me is: can a TSM schedule be triggered by the external 
scheduler once the conditions have been met?
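
As a rough sketch of what such a trigger might look like (the delete/define 
approach mentioned in the reply above), the external scheduler's last task 
could run something along these lines. The domain, schedule, node and script 
names are hypothetical, the exact DEFINE SCHEDULE parameters should be verified 
against the server's Administrator's Reference, and the client still has to be 
in "prompted" mode for the server to kick it off.

```python
import subprocess


def build_trigger_commands(domain, schedule, node, script_path):
    """Return admin commands that re-create a one-time schedule starting now."""
    return [
        # Drop the previous copy; this simply fails if none exists yet.
        f"delete schedule {domain} {schedule}",
        f"define schedule {domain} {schedule} action=command "
        f'objects="{script_path}" startdate=today starttime=now perunits=onetime',
        # Deleting the schedule also drops its node association, so re-add it.
        f"define association {domain} {schedule} {node}",
    ]


def run_admin_commands(commands, admin_id, password, runner=subprocess.run):
    """Run each command through dsmadmc and return the list of return codes."""
    codes = []
    for command in commands:
        argv = ["dsmadmc", f"-id={admin_id}", f"-password={password}",
                "-noconfirm", "-dataonly=yes", command]
        codes.append(runner(argv).returncode)
    return codes
```

Called as build_trigger_commands("DBDOM", "ORA_AFTER_PAYROLL", "ORAPROD1_TDP", 
"/opt/dba/run_rman.sh"), this yields three commands. The return code of the 
first one would normally be ignored on the very first run, when there is 
nothing to delete.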

I would be grateful to hear how others handle this, or if they use a different 
approach altogether to assure all DP database backups are completing on a 
timely basis.
TIA

~Rick