ADSM-L

Re: Strategies for DR recovery of large clients

2002-09-10 12:05:51
Subject: Re: Strategies for DR recovery of large clients
From: Robin Sharpe <Robin_Sharpe AT BERLEX DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 10 Sep 2002 12:02:46 -0400
Werner,

I feel your pain...  ;)

You have hit most of the major issues of disaster recovery with TSM
squarely on the head.  We have had similar experience in our testing... 4-6
hours to get the TSM server up, running through loads of tapes (even though
our storage pool is collocated with only three servers), 48-hour window, etc.

We have proved that we can get our three critical clients back within 24
hours, but they are not nearly as big as yours.  We use DLT8000 drives.

Probably the best way for you to get better restore throughput is to add
more drives and do concurrent restores.  TSM should only mount the tapes
that actually contain the file versions you will restore.  The problem is
that, even with collocation, after many months of backups on a relatively
active system, these files will get scattered across many tapes.
Conventional wisdom suggests using collocation by filespace to reduce this
effect... and also guarantee that concurrent restores of different file
systems will not compete for the same tape volume.  But the cost is of
course using a lot more tape.  Another approach might be to occasionally
(every three months maybe) do a "full" backup (by changing mode to
"absolute" to force even unchanged files to get backed up)... this should
effectively "defragment" the tape pool and put all active versions on one
tape (or a couple).  We did this once with an additional machine that we
DR'ed and it worked quite well.  Some people don't like this concept
because it defeats TSM's "progressive" backup methodology, but I think it's
an acceptable compromise.
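
To make the trade-off a bit more concrete, here is a rough back-of-the-envelope
sketch in Python.  It is not anything TSM provides; the function name, the
6 MB/s drive rate, and the 5-minute mount overhead are all assumptions for
illustration, and it assumes data and tape mounts split evenly across streams
that never wait on the same volume (roughly what collocation by filespace is
meant to give you).

```python
import math


def restore_hours(data_gb, tapes, streams, drive_mb_s=6.0, mount_min=5.0):
    """Rough wall-clock estimate (hours) for restoring one client.

    Hypothetical model, not a TSM calculation:
      - data_gb and tapes are split evenly across `streams` concurrent
        restores, each with its own drive;
      - streams never compete for the same tape volume;
      - drive_mb_s and mount_min are placeholder values to be replaced
        with numbers measured in your own DR test.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    per_stream_gb = data_gb / streams
    per_stream_mounts = math.ceil(tapes / streams)
    transfer_h = per_stream_gb * 1024 / drive_mb_s / 3600  # GB -> MB -> s -> h
    mount_h = per_stream_mounts * mount_min / 60            # mount/position time
    return transfer_h + mount_h


if __name__ == "__main__":
    # Made-up client: 500 GB of active data.
    for tapes, streams in [(40, 1), (40, 4), (8, 4)]:
        hours = restore_hours(500, tapes, streams)
        print(f"{tapes:3d} tapes, {streams} stream(s): {hours:5.1f} h")
```

Plugging in your own numbers should show the same two levers discussed above:
more concurrent streams cut the transfer time, and fewer tapes per client
(after a periodic "full" backup, say) cut the mount overhead.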

As you said, backup sets are not a good option for DR... for one thing,
creating the backupset will take as long as restoring the whole system, and
will read the same number of tapes.  You will suffer this on a regular
schedule since you'll have to make new backupsets probably every week or
two.  Secondly, restoring from backupsets effectively single-threads that
client because all of its data is on one or maybe two tapes.

Good luck, and please keep us posted on your results!

Robin Sharpe
Berlex Labs