Subject: Re: Selectively duplicating client data across servers
From: Stuart Lamble <adsm AT CAROUSEL.ITS.MONASH.EDU DOT AU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 26 Aug 2004 14:02:49 +1000
(Once more, this time with the _right_ From address. Sigh.)

On 26/08/2004, at 12:40 PM, Steven Pemberton wrote:
On Thursday 26 August 2004 09:19, Stuart Lamble wrote:
Hey ho. Here's the skinny. We will, eventually, have a number of
clients backing up to a TSM server on a regular basis (we're still
setting up the SAN and other ancillary things that are needed to
support the TSM server). Some of them will be filesystem backups;
others will be database backups (which, if I understand correctly, are
most likely to be seen as archives rather than backups as such). The
setup involves two sites, A and B, which have multiple gigabit fibre
connections between them (so they're effectively on the same LAN; the
only real difference is a very small amount of additional latency.)
Systems at site A will back up to a server at site B, and vice versa.

So, you have the following?

1/ Clients at "A" back up to the TSM server at "B".
2/ Clients at "B" back up to the TSM server at "A".

Yup, pretty much.

Where are you producing the copypool versions? Eg:

1/ Client A -> TSM B (primary) -> TSM A (copy) ?
2/ Client A -> TSM B (primary) -> TSM B (copy) ?
3/ What copy pools? :(

Oh, option three of course -- we want to save money on media! *ducks*
Seriously, though, my understanding is that it'll be option 2 -- if site A
goes down, and some of the media at site B is dead, we'd like to still
be able to recover the data. If we lost two sets of media at the same
time, well, we're obviously not meant to have that data any more. (Cue
the story I heard whilst doing Legato Networker training: a site with
several copies of key data. Key data goes poof. First copy on tape is
bad. Second copy on tape is bad. They send for the offsite copy.
Courier manages to have an accident, and the tapes are ruined...)

(The scary thing is, there were a couple of people advocating no copy
pools for some of the clients. Thank God _that_ got shot down in short
order.)

[verbose description of the basic plan snipped in favour of Steven's
summary]
Something like this?

1/ Client A -> TSM B (primary) -> TSM A (copy) (all)
                               -> TSM C (copy/export) (critical only)

A healthy paranoia. :)

That's pretty much it. A more accurate picture would be:

Client A -> TSM B (primary/copy) (all) -> TSM C (copy/export) (critical)
Client B -> TSM A (primary/copy) (all) -> TSM C (copy/export) (critical)

And remember: just because I'm paranoid, it doesn't mean they're _not_
out to get me... ;)

[copying data from the original backup server to the remote site]
The two solutions I've come up with involve data export, and copy pools
(via virtual volumes). The problem is, both of those operate at the
storage pool level; there's no way to specify "copy/export only the data
for _this_ client, and no others" that I can see.

Actually, you can "export node" for individual hosts, but I'm not sure
if it's the best way to do what you're planning. However, "export node"
can specify a client, time range, backup and/or archive data, active
files only, and export/import directly via a server-to-server connection.

Hm. Going to have to re-read the manual on that; I must have missed
that point. *flick flick flick* ... ok, I missed that point. Excuse me
whilst I carefully extract my foot from my mouth. :)
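
(For my own future reference, then -- and this is purely my guess at the
syntax, with made-up names, so the admin reference gets the final word --
exporting a single node's data straight to server C would look something
like:

  /* preview first, to see how many objects and bytes are involved */
  export node CLIENT_A filedata=all preview=yes

  /* then the real thing, sent directly over the server-to-server link */
  export node CLIENT_A filedata=all fromdate=08/01/2004 toserver=TSM_C

where CLIENT_A and TSM_C are stand-ins for the real node and server names,
and the fromdate is only there to show that the export can be limited to a
date range.)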

It's preferable that we not have to create separate storage pools (all
the way down to the tape level) for these systems just so we can do
this -- we'd prefer to have one disk pool and one tape pool for the
whole shebang if possible.

I'd normally recommend that you DO create multiple storage pools, so
that you can better control the physical location of the backup data.
This can improve recovery performance by separating "critical" clients
into their own storage pools and tapes. With only one huge disk/tape
storage pool hierarchy, each client's data will tend to "fragment"
across a large number of tapes (unless you use collocation, which may
greatly reduce tape efficiency instead).

Interesting point. Everybody here is an utter newbie when it comes to
TSM; we've done the initial training course (you should remember; IIRC,
you were the one taking the course I was on :) which is all fine and
dandy, but it doesn't really expose you to the little tricks of the
trade which come up when you're actually _using_ the product. :) (And
besides -- after too many months of not using the product because of
wrangling that's out of the hands of the techies, you tend to forget
the finer points that were covered on the course.) Still, I have a fair
amount of faith that TSM will do the job we need; it's more or less a
matter of what problems we run into along the way (and don't tell me we
won't run into problems -- we will; it's just a question of how severe
they are and how difficult to fix. With luck, they'll be less severe
than the ones we have with our current backup system.) We've already ruled out
collocation for the most part; I seem to recall an upcoming version of
TSM has a weaker form of collocation (along the lines of "group clients
A, B, and C on the same tapes; D, E, and F on another set of tapes;
etc.") which would be a bit more useful.

If you do create separate storage pools, then it's simply a matter of
running an additional "backup stgpool" command to produce the extra
off-site copy for site "C". This has another advantage in that it's a
completely incremental process, and you can probably afford to run it
every day (for the critical nodes/storage pools only).

*nodnod* I was thinking more in terms of trying not to waste too much
disk/tape space (in the sense of "how much do we need to allocate to
the critical systems, and how much to the non-critical servers?"),
rather than recovery speed and so forth. Of course, I haven't got the
information on how much disk pool space we'll be allocating, so that aspect
may be completely irrelevant ... it's always fun trying to predict how
things will hang together once they're up and running.
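
(Thinking out loud about how those pieces would hang together: the
following is only a sketch, every name in it is invented (TSM_C,
SITEC_SRV, SITEC_COPY, CRIT_TAPE, LTO_CLASS), and I'd want to check each
parameter against the admin reference before trusting it:

  /* on TSM B: make server C known for server-to-server traffic */
  define server TSM_C serverpassword=secret password=secret hladdress=tsmc.example.edu lladdress=1500

  /* a device class whose "volumes" are really objects stored on TSM C (virtual volumes) */
  define devclass SITEC_SRV devtype=server servername=TSM_C

  /* a copy pool sitting on that device class */
  define stgpool SITEC_COPY SITEC_SRV pooltype=copy maxscratch=100

  /* a dedicated primary tape pool for the critical clients' management class */
  define stgpool CRIT_TAPE LTO_CLASS maxscratch=50

  /* the daily, incremental, off-site copy of the critical data only */
  backup stgpool CRIT_TAPE SITEC_COPY maxprocess=2

plus, presumably, a matching "register node ... type=server" on TSM C so
the virtual volumes have somewhere to land.)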

Backup sets may be doable, but I'm a little uncertain about
how they'd go with virtual volumes, and also whether they'd cover
archive data sets.

Backup sets only encompass filesystem backup data. They cannot be used
for filesystem archives, nor for application TDP backups (even if the
TDP backups are really "backup" objects).

That's what I suspected, and given that there are several databases
involved in this whole thing, it makes them kinda useless for our needs
in this regard. Scratch that option, then.

So the question is: is there any way we can say to TSM "Copy this
client's data (backup and archive) to that server over there, ignoring
all other client data in that storage pool"? Or am I smoking crack
(again)?

Probably the easiest way is to use a separate storage pool (or pools)
for the critical clients. As I mentioned above, this may also help in
controlling tape "fragmentation" and improve recovery performance.

Looks like that's pretty much the consensus, then. All useful
information, and it's very much appreciated.

Oh, and as for the suggestion of shipping tapes -- one of the reasons
we want to do it this way is to minimise the amount of media we need to
ship from point A to point C (almost wrote point B there :) -- we have
a permanent link to the third site, and the cost of using it for this
purpose is negligible, even if it is relatively slow (well, compared to
gigabit fibre... :)

Many, many thanks for the responses; they've been very enlightening.

Cheers,

Stuart "now to hammer the next piton into the learning cliff..." Lamble.