ADSM-L

Re: TSM 5.3.3 loaddb and audit problem

2006-05-28 07:46:03
Subject: Re: TSM 5.3.3 loaddb and audit problem
From: Remco Post <r.post AT SARA DOT NL>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sun, 28 May 2006 13:47:44 +0200
Richard Sims wrote:
> Hi, Kelly -
>
> I was appalled when I first saw TSM manuals blithely enticing
> customers to reorganize their TSM databases as though it were some
> kind of risk-free, trivial undertaking. Nowhere in the documentation
> for this procedure are there the strong advisories which should be
> there regarding the prolonged unavailability which your site's data
> recovery facility will experience during the procedure, full
> perspective on why it might ever be warranted, the risks involved,
> what messages to expect, how to know whether or not the operation has
> succeeded, or what to do in case of a problem. Conspicuously missing
> is any mention that the utilities involved are not mainstream TSM
> software, but rather salvage utilities - which get little developer
> attention or testing (as is evident in the frightening APARs I've
> read on these utilities).

Ok, I read the manual carefully while we did our unload/load. If you do
read carefully you'll notice that ample warning is given when you either
use dumpdb (rather than unloaddb) or unloaddb with consistent=n.

So I would not recommend doing either of those unless some competent IBM
support person tells you to do so. The same is true for auditdb, btw.
(Note: IMNSHO not all IBM level 1 or even level 2 support is competent,
so use your own judgment.)

You all know how strongly I feel about this operation: unless you have a
very good reason to unload/load, just don't. In my case, we found our 220
GB database only 42.5% utilized and shrinking, and we needed the
disk space for other TSM servers badly, really badly.
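To put a number on what was at stake, here is a minimal sketch of the
arithmetic, using only the two figures quoted above (220 GB, 42.5%
utilized). The function name is purely illustrative, and the result is an
upper bound on the reclaimable space, not a prediction of what an
unload/load will actually give back.

```python
# Back-of-the-envelope arithmetic only; inputs are the figures quoted in
# the message (220 GB database, 42.5% utilized). Nothing here talks to TSM.
DB_SIZE_GB = 220.0
UTILIZATION_PCT = 42.5


def used_and_free_gb(size_gb: float, utilization_pct: float) -> tuple[float, float]:
    """Split an allocated database size into used and unused space (GB)."""
    used = size_gb * utilization_pct / 100.0
    return used, size_gb - used


used, free = used_and_free_gb(DB_SIZE_GB, UTILIZATION_PCT)
print(f"in use: {used:.1f} GB, not in use: {free:.1f} GB")
```

That works out to roughly 93.5 GB in use and 126.5 GB allocated but
unused, which is the space we were after.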

Doing the loadformat/loaddb was the scariest thing I've done in my life.
The only reason I even dared was that I knew I had a decent database
backup and knew how to restore it.

>
> To my experienced eye, this was an extraordinarily irresponsible
> thing for IBM to do, and a recipe for disaster. TSM novices in
> particular will see this in the manual, think it harmless because IBM
> offers it, and launch right into it. Unfortunately, the disaster
> potential has been borne out by customers writing to ADSM-L for help
> upon discovering the hard way that their TSM database is no longer
> viable after the operation. (And we don't know how many more
> customers have suffered silently.)
>
> It is high risk stuff, and almost always unwarranted, as customers
> are typically trying it expecting it to be some panacea for their
> system. Without an understanding of databases in general and the TSM
> db specifically, a customer is wandering through an unfamiliar house
> in the dark in such an undertaking, where the risk of getting hurt is
> high.
>

Agreed, both the Administrator's Guide and the Administrator's Reference
should be very clear on one thing.... unload/load is usually not what
you need. Only in very special cases should these operations be carried out.

> The fact is that IBM *DOES NOT* have suitable software for its TSM
> customers to use to reorganize the TSM database. Salvage utilities,
> by their nature, are VERY physical in their orientation and
> operation, with no customer-meaningful feedback during execution and
> no customer-oriented assurance summary at conclusion.  (I speak from
> experience in having run these utilities - and having seen no
> enduring performance or space benefit.)  And, again, these utilities
> are not part of the main product and, as "tributary" software,
> receive little developmental attention. Such software is wholly
> unsuitable for this purported usage. And the encouraging but
> unadvising documentation only makes the situation worse.
>

I would have been very happy if we could have just salvaged the free
blocks in our database. No such luck. The only way we could think of (all
three of us, with experience going back to ADSM v2.1) to free the space
was by doing an unload/load. Fragmentation is a fact of life. My guess is
that we'll see our db grow by as much as 50% in the next few weeks.


> Thankfully, we have this forum to try to keep customers from getting
> into trouble when someone suggests actions which we experienced
> technicians know are just plain bad.
>

Unfortunately, you, Kelly and I (and a lot of other people on this list)
are all very aware of one other fact: _if_ it is justifiable to salvage
your unused database pages, there is only one way, and it's scary
stuff.... :(

> To all the novice customers:  Get the whole story on a major
> procedure before considering undertaking it.
>

So, now my summary. We did do an unload/load this weekend. The unload of
660 million database entries (approx. 250 million files) took well over
11 hours. The database was on 16 mirrored 15k RPM disks, and still the
bottleneck was the disks. The target was a file device class. The load
took well over 5 hours; this time the bottleneck was the CPU. Since both
operations are single-threaded, don't expect to see much better
performance on any hardware.
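For anyone trying to size a similar outage, a rough sketch of the
throughput these figures imply. Two assumptions of mine: the load handled
the same 660 million entries as the unload, and the durations are taken as
exactly 11 and 5 hours. Since both actually took "well over" that, the
rates are upper bounds and the outage is a lower bound.

```python
# Rough throughput estimate from the figures in the message. The durations
# were "well over" 11 and 5 hours, so the real rates were lower than these
# and the real outage was longer.
ENTRIES = 660_000_000   # database entries unloaded
UNLOAD_HOURS = 11.0     # lower bound on the unload duration
LOAD_HOURS = 5.0        # lower bound on the load duration


def entries_per_second(entries: int, hours: float) -> float:
    """Average processing rate over the whole run."""
    return entries / (hours * 3600.0)


unload_rate = entries_per_second(ENTRIES, UNLOAD_HOURS)
load_rate = entries_per_second(ENTRIES, LOAD_HOURS)
min_outage_hours = UNLOAD_HOURS + LOAD_HOURS

print(f"unload: at most ~{unload_rate:,.0f} entries/s")
print(f"load:   at most ~{load_rate:,.0f} entries/s")
print(f"server offline for at least {min_outage_hours:.0f} hours")
```

That is on the order of 17,000 entries per second for the unload and
37,000 for the load, with the server unavailable for 16 hours at the very
least, not counting the loadformat or any preparation around it.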

Now, if I had had any other option available, would I have done this?
Hell no. Unfortunately, I needed the space yesterday, and any other
solution would have taken months.

>     Richard Sims
>
> On May 17, 2006, at 7:08 AM, Kelly Lipp wrote:
>
>> Richard,
>>
>> I could not agree more on your stance regarding Dump/Load. However,
>> I'm in Holland teaching a Level 2 class and have been surprised to learn
>> that a lot of my students perform this action as a matter of course on
>> their servers.  The objective is to reduce the size of aged TSM
>> databases.  In TSM 5.3 we have new functionality to determine if a db
>> reorg would reclaim a significant amount of space.  Then the Dump/load
>> is executed to get this space.  Do you suppose this new command is
>> encouraging us to do something that is high risk?  Alternatives?
>>
>> I guess they've decided the risk is worth the potential gain.
>>
>> I personally have not experienced the problem so have not attempted
>> this solution.


--
Met vriendelijke groeten,

Remco Post

SARA - Reken- en Netwerkdiensten                      http://www.sara.nl
High Performance Computing  Tel. +31 20 592 3000    Fax. +31 20 668 3167
PGP Key fingerprint = 6367 DFE9 5CBC 0737 7D16  B3F6 048A 02BF DC93 94EC

"I really didn't foresee the Internet. But then, neither did the
computer industry. Not that that tells us very much of course - the
computer industry didn't even foresee that the century was going to
end." -- Douglas Adams
