ADSM-L

Re: TSM 5.3.3 loaddb and audit problem

2006-05-17 10:49:33
Subject: Re: TSM 5.3.3 loaddb and audit problem
From: "Allen S. Rout" <asr AT UFL DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 17 May 2006 10:49:09 -0400
>> On Wed, 17 May 2006 10:14:27 -0400, "Scott, Brian" <bscott AT EDS DOT COM> said:

> All,

> My colleague found the following article from the University of Florida

Heh.  Well, I was staying out of this round; considering saying
something once Kelly posted, but since my article was invoked, I guess
I was too. :)

Richard and I have kicked this around in the past.  For the record, I
don't disagree with the direction of his concerns; I disagree
with their magnitude.  He feels that the issues he raises are
sufficiently strong that the procedure should never ever be done.  I
feel that it's sufficiently risky that you shouldn't do it unless
you're pretty certain you will get a win out of it.

So our risk profiles are different; given that by normal standards
most backup admins are tinfoil-covered paranoids, I guess small
absolute variations look huge from our perspective. :)

I've been convinced by Richard's past arguments about the rate of
fragmentation and the speed with which you'll lose the ground you got
back; so I'm not inclined to include an unload/reload as any kind of
routine maintenance.  Not so much for risk reasons, as for
waste-of-time reasons.

It is clear that the unload and reload are drastically less frequently
used than most other code paths in TSM.  But we've got DB backups for
just such reasons.  Your primary risk in this operation is the time
you spend doing the unload / reload, finding you've messed it up, and
then restoring your previous backup.  (Which, of course, you performed
_just_ before the unload.  -RIGHT?-)

The only additional risk you run is that something subtle is
corrupted, taking you an unacceptable number of days to figure out, so
that the last good DB backup is either gone or unacceptably distant.
(Redo last week's work, anyone?)

I will not dismiss this risk: it is finite and should make all our
guts churn.  But I consider it to be low.  More importantly, I
consider it to be indistinguishable from the other low-grade subtle
corruptions which it is possible to introduce.  NT Backup objects,
undeletable files, other oddness: we've seen several DB problems come
down the pike.


But as I said, I'm convinced by Richard's rate-of-decay points, and
consequently I'm not planning on unloading and reloading most of my
servers.  I -am- planning on doing this to two of my servers (actually
the Cyrus backup servers to which I alluded earlier in the week) which
have seen 100+% turnover in filespaces, and are heading towards being
43GB of database, 25% occupied, maximum reduction 1G.   I expect to
get back more than 10G of database volumes from each of them, and
since the filespaces which were occupying that DB space are gone, I
expect to keep that space win.
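
For anyone who wants to run the same back-of-envelope numbers on their
own server, here is a rough sketch of the arithmetic.  It assumes "25%
occupied" is the utilization figure and "maximum reduction" is the
amount the DB can currently be shrunk without a reload; the function
name and parameters are made up for illustration and are not part of
TSM.

```python
def reload_estimate(db_gb, pct_util, max_reduction_gb):
    """Rough space figures for a fragmented database (hypothetical helper)."""
    # Space actually holding data.
    used_gb = db_gb * pct_util / 100.0
    # Everything else is free, but not necessarily contiguous.
    free_gb = db_gb - used_gb
    # Free space a plain reduce cannot hand back right now.
    stranded_gb = free_gb - max_reduction_gb
    return used_gb, free_gb, stranded_gb


# Figures quoted above: 43GB database, 25% occupied, maximum reduction 1G.
used, free, stranded = reload_estimate(db_gb=43, pct_util=25, max_reduction_gb=1)
print(f"used {used:.2f}G, free {free:.2f}G, stranded {stranded:.2f}G")
```

The stranded figure is only an upper bound on what a reload could
recover; the "more than 10G" expectation above is deliberately well
under it.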




- Allen S. Rout
