Sloooow disaster recovery from copy pool

jwhowell

TSM 5.1.1, server on OS/390 2.10, NT and W2K clients. We just got back from a disaster recovery test. While the TSM server restore went very smoothly and we only had a few bobbles adjusting to a different hardware environment, recovering our NT server data from tapes was painfully, agonizingly, excruciatingly slow. The tapes we were using were the offsite copy storage pool, and the pool uses collocation. We keep at most 3 versions of any file and were not trying to restore inactive versions of files. It seemed as if TSM was scanning the tapes repeatedly in some cases; a few servers seemed to restore very quickly, while others took hours to restore a few hundred megabytes of data. Interestingly, the worst offender was a server with a few huge (tens of gigabytes) files. That one was mounting tape after tape.



While doing restores on our test LAN at home we used the backup pool (still tape) rather than the copy storage pool and never saw problems like this. Clearly TSM is doing things differently from the copy pool than it does from the backup pool. Has anyone ever experienced a problem like this? Does anyone have any insight as to what might be going on and what we might do to avoid this in the future?
 
Hi!

I have seen the same thing at one customer site. The problem there was that they had not separated the directory information from the files. So when we were restoring some data (600 MB), it took over 3 hours.

But after we ran some more reclamations and put all the directory information in a disk pool, we got much better performance, and we restored the same data in 10 minutes.

So we went from 3 hours to 10 minutes.

So I think this can help you.

Good luck and Merry Christmas.
 
It's even worse than I originally thought! Collocation does not appear to be working. If I run a "select volume_name from volumeusage where node_name='node1'" and a like query against node2, comparing the result sets shows 1) way too many tapes being used by each node and 2) quite a few tape volumes in common. If I turn around and run "select node_name from volumeusage where volume_name='tapevol'", where tapevol is one of the volumes listed in the first query, I get back a string of different nodes that have data on that tape volume. I checked my tape pools and collocation is set to "yes". I think it's PMR time.
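For anyone who wants to repeat the comparison across all nodes at once, here is a rough sketch in Python. It assumes you have already exported node_name/volume_name pairs from volumeusage by whatever means you prefer; the function name shared_volumes and the sample rows are made up for illustration and are not real output from any server.

```python
from collections import defaultdict


def shared_volumes(rows):
    """Return {volume: sorted node list} for volumes holding data from more than one node.

    rows is an iterable of (node_name, volume_name) pairs, e.g. exported
    from the volumeusage table. Duplicate pairs are harmless.
    """
    nodes_by_volume = defaultdict(set)
    for node_name, volume_name in rows:
        nodes_by_volume[volume_name].add(node_name)
    # With collocation working, each volume should normally map to one node.
    return {
        volume: sorted(nodes)
        for volume, nodes in nodes_by_volume.items()
        if len(nodes) > 1
    }


# Hypothetical sample data, not taken from a real server.
rows = [
    ("NODE1", "VOL001"),
    ("NODE1", "VOL002"),
    ("NODE2", "VOL002"),
    ("NODE2", "VOL003"),
    ("NODE3", "VOL002"),
]

for volume, nodes in shared_volumes(rows).items():
    print(volume, "is shared by", ", ".join(nodes))
```

If collocation is doing its job, the list of shared volumes should be short or empty; a long list is the same symptom described above.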
 
For anyone who's interested - this appears to be a bug in TSM and is addressed by APAR IC33376. (I'm waiting on confirmation from IBM, but the writeup in the APAR is pretty explicit.) I've been told by IBM that they are recommending that 390 customers get to 5.1.6 - it's considered stable and also has the support for the new 3590H tape drives.
 