ADSM-L

Re: Consumption of new scratch tape increase abnormally

2002-09-13 10:32:11
From: "Wayne T. Smith" <ADSM AT MAINE DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 13 Sep 2002 10:18:40 -0400

Hi Kathy,

(More than) a few remarks ... maybe one or two will be useful ...

Scanning through a *SM "Concepts" manual (search ADSM.ORG for the web
address or just look at the IBM Redbooks site) will be a help.

Here's the 25-cent tour of this area (sorry if it's too simple or too
long or doesn't make sense!):

Data comes to *SM from your clients.  It could go directly to tape,
but most often gets stored on disk on your backup server first.  Data on
this disk will be copied to tape, either on your direction (an
administrative schedule or external command to lower the "HI" value of
the (storage pool) disk) or because the utilization of the disk exceeds
the currently-defined "HI" value. See the command "Q STG" to view these
and other numbers.  The movement of data from disk to tape is called
"migration".  *SM creates 1 or more migration processes to dump from
disk to tape.
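
As a rough illustration of the threshold trigger just described, here is
a minimal Python sketch.  It is not *SM code: the function name and the
sample numbers are made up, and only the "HI" value from the text is
modeled.

```python
def migration_needed(pct_utilized: float, hi_threshold: float) -> bool:
    """Return True when disk pool utilization exceeds the "HI" value."""
    return pct_utilized > hi_threshold

# Disk pool is 72% full with HI=90: nothing happens yet.
print(migration_needed(72.0, 90.0))  # False
# An administrator lowers HI to 0 to push the disk pool out to tape.
print(migration_needed(72.0, 0.0))   # True
```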

The selection of tape volume depends somewhat on your setup. For
example, if the tape storage pool is "collocated", a "filling" or new tape
will be chosen, but each *SM node or filespace will tend to have its own
tape volume.  Thus a collocated storage pool will tend to have more
filling tapes than a non-collocated pool. (Multiple clients can share a
tape volume in a collocated storage pool, depending on a setting.)
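
To make the volume selection above concrete, here is a simplified
Python sketch.  It is only a model under my own assumptions: the
function, the volume names and the node names are hypothetical, and the
real server considers more than this (for example, what it does when no
new tape is available).

```python
def pick_volume(node, filling, collocated):
    """Choose a tape for `node`.

    filling: dict mapping a "filling" volume name to the set of nodes
             that already have data on it.
    Returns a volume name, or None meaning "take a new tape".
    """
    for volume, nodes in filling.items():
        # Without collocation any filling tape will do; with it, only a
        # tape that already holds this node's data is acceptable.
        if not collocated or node in nodes:
            return volume
    return None

filling = {"VOL001": {"ALPHA"}}
print(pick_volume("BETA", filling, collocated=False))  # VOL001
print(pick_volume("BETA", filling, collocated=True))   # None
```

In the second call BETA gets a new tape of its own, which is why a
pool set up this way ends up with more filling tapes.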

A new tape is selected from a private pool, if available (it would have
a status of "empty"), but it sounds like you are using the "scratch"
procedure, so whenever a new volume is needed, it comes from your
scratch pool.

Once *SM writes to a tape volume, its status is set to "filling".
Whenever *SM has more data to migrate, it simply adds to the end of the
current set of data on tape.  So, you will normally have 1 or a few
filling tapes (more if we're talking about a collocated storage pool, more
if you have multiple migration processes or back up client data directly
to tape).

Once a tape volume fills, it is marked "Full".  It's written end to end,
and all of the data is "useful". Its utilization is 100% (unless it has
taken some time to fill the volume and the expiration described next has
already started).

Your *SM retention policies probably limit the data that is kept. (This
is the only way to ever reuse a tape!)  The useless (too old) data *is*
still kept at first, however.  You must run a process called "Expire Inventory"
(periodically) to cause *SM to discard old *SM DB entries.  This process
creates "logical" holes in each of your tapes. Over time, instead of a
tape volume having 100% useful data, only 95% is useful (the other 5%
"expired"). As time progresses, the 95% "utilization" continues to dwindle.

So as time passes, if nothing is done, it will take more and more tape
volumes to hold the same amount of data!
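
A little arithmetic shows the effect.  The Python sketch below uses
invented numbers (2000 GB of useful data, 100 GB per tape) purely for
illustration; the function name is mine, not anything from *SM.

```python
def tapes_needed(useful_gb, tape_capacity_gb, pct_utilized):
    """Tapes required to hold `useful_gb` when each full tape is only
    `pct_utilized` percent useful.  Integer math, rounded up."""
    useful_per_tape_x100 = tape_capacity_gb * pct_utilized
    return -(-(useful_gb * 100) // useful_per_tape_x100)  # ceiling division

for pct in (100, 95, 60, 40, 20):
    print(f"{pct:3d}% utilized -> {tapes_needed(2000, 100, pct)} tapes")
# 100% -> 20, 95% -> 22, 60% -> 34, 40% -> 50, 20% -> 100
```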

The "acceptable" utilization value for any tape is completely up to you.
"Accepting" tapes with 20% utilization means that 80% of each such full
tape contains uninteresting data, so you have wasted tape.  This also
means that major restores must traverse more tape and more tape volumes,
thus slowing the restore.  Typically a value of 40% might be used, but
factors may make you choose another value.  Reclamation is *slow*, so
this plays into your decision.  *SM reclamation is very good at wearing
out your tape drive read/write heads and very good at convincing you
that you need more tape drives.  (Those drives will be useful on your
next major restores!)

The *SM process that merges the poorly utilized tapes to a new filling
tape is called "reclamation" (a process called "MOVE DATA" will also do
this). A storage pool can have 1 reclamation process.  That process
starts whenever "Full" tape volumes have un-utilized space that is more
than the storage pool "reclamation" setting.  Once the reclamation
process copies off the 20% (or whatever) of good data, the tape volume
is placed in "pending" status, so it is not reused right away.  The
Pending period (#days) is a setting of the storage pool; typically you
set it to well after the next DB backup that you might ever want to
use. (You do take *SM DB backups?!)  After the Pending period, the
volume is returned to scratch (or set to "Empty" if using a private pool).

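The reclamation rule and the pending period described above can be
sketched in Python as follows.  This is only a model of my description,
not of the server itself: the function names, volume names and dates are
hypothetical.

```python
from datetime import date, timedelta

def volumes_to_reclaim(volumes, reclaim_threshold):
    """Names of "Full" volumes whose unused space exceeds the threshold.

    volumes: iterable of (name, status, pct_utilized) tuples.
    """
    return [name for name, status, pct_utilized in volumes
            if status == "Full" and (100 - pct_utilized) > reclaim_threshold]

def pending_period_over(pending_since, today, pending_days):
    """True once a "pending" volume may go back to scratch."""
    return today >= pending_since + timedelta(days=pending_days)

sample = [
    ("VOL001", "Full", 35),     # 65% reclaimable
    ("VOL002", "Full", 80),     # 20% reclaimable
    ("VOL003", "Filling", 10),  # not full yet, ignored in this sketch
]
# Accepting 40% utilization corresponds to a threshold of 60.
print(volumes_to_reclaim(sample, 60))  # ['VOL001']
print(pending_period_over(date(2002, 9, 1), date(2002, 9, 13), 5))  # True
```
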
I haven't mentioned copies of data in your backup server (such as used
to move client data "offsite").  In this case, data is copied from your
"primary" storage pools (the disk and tape referenced above) to your
"copypool" with a *SM "Backup" command.  Most people like to have a
large disk pool, so the backup to copypool can be made from disk to tape
instead of from tape to tape.  The population of your copypool is not
automatic ... you must periodically start (or have scheduled) backup
commands for the disk and tape pools. *SM determines which objects
haven't been moved in each originating pool, and moves them.
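
Conceptually, that last step is a set difference.  A toy Python sketch
(the function and the object names are invented for illustration; the
server tracks this in its database, not in anything like this):

```python
def objects_to_back_up(primary_objects, copypool_objects):
    """Objects present in a primary pool but not yet in the copypool."""
    return set(primary_objects) - set(copypool_objects)

disk_pool = {"a.doc", "b.xls", "c.txt"}
copypool = {"a.doc"}
print(sorted(objects_to_back_up(disk_pool, copypool)))  # ['b.xls', 'c.txt']
```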

(Leaving out lots of details here!) Your offsite tapes have a similar
utilization-thing happen to them. Over time, they have less useful data
on each volume.  The reclamation of offsite tapes is somewhat different,
however.  Because you don't want to bring back those mostly empty tapes
while they still contain some of the company's jewels, reclamation of
an offsite tape causes "primary" storage pool tapes to be mounted to
obtain the useful objects.  Although it might seem strange, if you
haven't been reclaiming your offsite copypool, I recommend lowering your
reclamation value down from 100% *very* slowly.  Otherwise the offsite
tapes may not be completely emptied at the rate you might expect.

I've also not discussed HSM and archiving, but I don't think they are
important to your question (and I don't "do" them). :-)

Hope some of this long-winded saga is useful!  Perhaps others will
correct this, add detail, and share their perspectives.  The *SM manuals
and redbooks are well written ... there are just a lot of pages to digest!

cheers, wayne
--

Wayne T. Smith -- ADSM AT Maine DOT edu -- University of Maine System -- UNET