ADSM-L

[ADSM-L] Meandering post on offsite reclamation including very large files.

2009-04-28 09:54:52
Subject: [ADSM-L] Meandering post on offsite reclamation including very large files.
From: "Allen S. Rout" <asr AT UFL DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 28 Apr 2009 09:54:37 -0400
So, I've got some DB2 database backups on my TSM server.  For those of
you who aren't familiar with the way that app backs up, you have a
relatively small number of large files; we have one or two sessions
running, so we have one or two files when we're done.  This means my
DB fulls are rattling around in 200-500G single files.

So we back them up to an offsite copy pool.  Nice and easy.

Now, we happen to have one big primary stgpool on this TSM server.
This means that, as the stgpool backups happen, every big file ends
up (most often) right next to a bunch of little files.  The boundary
between them falls at a random offset within a given offsite volume.

Consequently, if the _rest_ of the remote volume expires, I may have
to copy a 500G file in order to reclaim a 20G volume which had 200M of
the big file on it.  Ick.
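
A rough sketch of that arithmetic in Python, using the numbers above (a
500G file, 20G volumes, 200M of the file on the first volume).  The
function name and the MB units are made up for illustration, and the
model simply takes the description above at face value: reclaiming a
volume means re-copying every file that has a piece on it, in full.

```python
GB = 1024  # sizes below are in MB


def pieces_per_volume(start_offset, file_size, volume_cap):
    """Split one file across successive fixed-size volumes.

    start_offset is how much of the first volume is already full.
    Returns the amount of the file that lands on each volume, in order.
    """
    pieces = []
    room = volume_cap - start_offset
    remaining = file_size
    while remaining > 0:
        piece = min(room, remaining)
        if piece:
            pieces.append(piece)
        remaining -= piece
        room = volume_cap  # every later volume starts out empty
    return pieces


volume_cap = 20 * GB
big_file = 500 * GB

# The first volume is nearly full of small files: only 200M is left.
pieces = pieces_per_volume(volume_cap - 200, big_file, volume_cap)

on_first_volume = pieces[0]   # what actually lives on the volume
must_recopy = big_file        # what reclaiming that volume moves
print(len(pieces), on_first_volume, must_recopy // on_first_volume)
```

Under those assumptions the file touches 26 volumes, and reclaiming the
first one moves 2560 times as much data as the volume holds of it.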

It's especially irritating when I _know_ that I'm replacing (say) 20
20G volumes full of BIGFILE with a -different- 20 20G volumes full of
BIGFILE.  Reclaim, indeed.  Humph.


The cleanest way to handle these huge files, if I'd anticipated this
corner case when I started, would be to arrange for the big files to
be in their own separate storage hierarchy.  I'd never even reclaim
the copy stgpool volumes from there: it's not worth the extra motion.
I may still get around to doing that.


The perfect technical fix would be for TSM to do some navel gazing on,
say, files (aggregates?  are aggregates still involved with large
files?) larger than a configurable NAVELGAZESIZE.

While making offsite copy to a devclass of type SERVER:
 If next filesize > NAVELGAZESIZE
   close this volume, open a new one.

   copy copy copy copy

   when done, close this volume, open a new one.


It might even be rational to have NAVELGAZESIZE = devclass maxcap or
something.  But I like it settable.
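
To make the pseudocode above concrete, here is a toy simulation in
Python.  It is only a sketch of the proposed behaviour, not anything
TSM does today: pack(), mixed_volumes() and the navelgazesize argument
are all hypothetical names, sizes are in arbitrary units, and volumes
are modelled as simple fixed-capacity buckets filled in order.

```python
def pack(files, volume_cap, navelgazesize=None):
    """Write files sequentially onto fixed-capacity volumes.

    files is a list of (name, size).  If navelgazesize is set, any file
    larger than it gets a fresh volume before and after it.
    Returns a list of volumes, each a list of (name, amount) pieces.
    """
    volumes = [[]]
    free = volume_cap

    def new_volume():
        nonlocal free
        if volumes[-1]:        # never leave an empty volume behind
            volumes.append([])
        free = volume_cap

    for name, size in files:
        big = navelgazesize is not None and size > navelgazesize
        if big:
            new_volume()       # close this volume, open a new one
        remaining = size
        while remaining > 0:   # copy copy copy copy
            if free == 0:
                new_volume()
            piece = min(free, remaining)
            volumes[-1].append((name, piece))
            free -= piece
            remaining -= piece
        if big:
            new_volume()       # when done, close and open again

    if not volumes[-1]:
        volumes.pop()          # drop a trailing empty volume
    return volumes


def mixed_volumes(volumes, big_name):
    """Count volumes holding part of big_name alongside other files."""
    count = 0
    for vol in volumes:
        names = {name for name, _ in vol}
        if big_name in names and len(names) > 1:
            count += 1
    return count


files = [("small-a", 5000), ("BIGFILE", 50000), ("small-b", 3000)]
plain = pack(files, volume_cap=20000)
split = pack(files, volume_cap=20000, navelgazesize=10000)

print(len(plain), mixed_volumes(plain, "BIGFILE"))
print(len(split), mixed_volumes(split, "BIGFILE"))
```

In this toy case the threshold costs two extra volumes (5 instead of
3), and in exchange no volume mixes a piece of BIGFILE with anything
else, so expiring the small files never drags the big one along.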

I can't come up with a downside to this strategy: we already
proliferate offsite volumes for a bunch of bad reasons.  This isn't
much proliferation, and it's for a darn good reason.


I'm thinking of developing this into a formal feature request /
requirement / whatever.


- Allen S. Rout
