This is a tough one. On the one hand we want Reclamation to use as many tape
drives as possible, but not consume them all. We also have multiple TSM
instances wanting library resources. The TSM instances are blind to each
other's needs. This _IS_ difficult to control.
The _current_ solution controls reclamation completely manually from a set of
scripts.
It works something like this:
- we run a library sharing environment across a bunch of TSM instances
- reclamation is set to never run automatically - every stgpool's
reclamation threshold is set so it never kicks in on its own
(reclamation pct = 100)
- define the max number of drives reclamation can use in a library
(reclamation can use up to this number)
- define the number of drives in a library that MUST remain unused before
another reclamation is started
(there are always some number of unused drives available for non-reclamation
jobs to start)
- define on each stgpool the number of reclamation processes allowed - we set it to 1
(one reclamation process uses 2 tape drives)
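On the TSM side, the per-pool settings above correspond to something like the
following administrative commands (pool name, threshold, and duration are
illustrative placeholders, not our actual values; the per-library drive limits
live in our script, not in TSM):

```
/* Disable automatic reclamation and cap processes for this pool */
UPDATE STGPOOL TAPEPOOL1 RECLAIM=100 RECLAIMPROCESS=1

/* The script then starts reclamation manually when conditions allow */
RECLAIM STGPOOL TAPEPOOL1 THRESHOLD=60 DURATION=240
```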
Late morning we kick off the script, which:
- crawls through all our TSM instances and gets a count of tapes per stgpool
that could be reclaimed (above some reclamation pct)
- sorts the list of stgpools/counts by the count
- loops.
On each loop it will start a new stgpool reclamation if:
- max number of drives allowed for reclamation hasn't been hit
- required number of unused drives are still unused
Later in the day we kill this script, letting running reclamation jobs run to
completion.
If by the next morning (when migrations want to run) we still have
reclamations running, they get nuked!
. . .repeat each day . . . .
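The daily loop might be sketched roughly like this (drive counts and pool
names are made up for illustration; a real version would pull the counts from
each TSM instance via dsmadmc and actually issue RECLAIM STGPOOL commands):

```python
# Hypothetical sketch of the daily reclamation-driver loop described above.
# All constants and pool names are illustrative assumptions.

MAX_RECLAIM_DRIVES = 6   # max drives reclamation may consume in the library
MIN_FREE_DRIVES = 2      # drives that must stay unused for other jobs
DRIVES_PER_RECLAIM = 2   # one reclamation process uses two tape drives

def plan_reclamations(pool_counts, total_drives, busy_drives):
    """Return the stgpools to start reclamation on, most-reclaimable-tapes
    first, respecting the drive limits described above."""
    reclaim_drives_used = 0
    started = []
    # Sort stgpools by reclaimable-tape count, descending, so the pool
    # with the most reclaimable tapes runs the longest
    for pool, count in sorted(pool_counts.items(), key=lambda kv: -kv[1]):
        if reclaim_drives_used + DRIVES_PER_RECLAIM > MAX_RECLAIM_DRIVES:
            break  # hit the reclamation drive cap
        free = total_drives - busy_drives - reclaim_drives_used
        if free - DRIVES_PER_RECLAIM < MIN_FREE_DRIVES:
            break  # would leave too few drives for non-reclamation jobs
        started.append(pool)
        reclaim_drives_used += DRIVES_PER_RECLAIM
    return started
```

With 10 drives all free, three pools would all start (6 of 10 drives used);
with only 3 drives free, nothing starts because the required unused drives
would be consumed.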
The result: at a gross level we keep some number of drives open for other
sessions/jobs to use, and yet allow reclamation to use up to the defined limit
of drives if no other processes are using them.
It has major flaws, but has really smoothed out our tape drive contention and
the resources used for reclamation. The one thing I really like is that it lets
the stgpool with the most reclaimable tapes, in whatever TSM instance, run the
longest.
One core overall issue - no amount of playing around like this can make up for
not having the resources you need to drive TSM. If you don't have the drives
to process the work, nothing will really help.
Rick
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Roger Deschner
Sent: Wednesday, February 17, 2016 8:43 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Migration should preempt reclamation
I was under the impression that higher priority tasks could preempt
lower priority tasks. That is, migration should be able to preempt
reclamation. But it doesn't. A very careful reading of the Administrator's
Guide tells me that it does not.
We're having a problem with a large client backup that fails, due to a
disk stgpool filling. (It's a new client, and this is its initial large
backup.) It fills up because the migration process can not get a tape
drive, due to their all being used for reclamation. This also prevents
the client backup from getting a tape drive directly. Does anybody have
a way for migration to get resources (drives, volumes, etc) when a
storage pool reaches its high migration threshold, and reclamation is
using those resources? "Careful scheduling" is the usual answer, but you
can't always schedule what client nodes do. Back on TSM 5.4 I built a
Unix cron job to look for this condition and cancel reclamation
processes, but it was a real Rube Goldberg contraption, so I'm reluctant
to revive it now in the TSM 6+ era. Anybody have a better way?
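A cron job like the one described would need to find the running reclamation
processes before cancelling them. One hypothetical piece of that: parse the
output of a dsmadmc "query process" and pick out the reclamation process
numbers (the sample output format below is illustrative):

```python
# Hypothetical helper for a reclamation-cancelling cron job: given the text
# output of "query process", return the process numbers whose description
# indicates space reclamation. The line format assumed here is illustrative.

import re

def reclamation_process_numbers(query_process_output):
    """Return process numbers for lines describing space reclamation."""
    numbers = []
    for line in query_process_output.splitlines():
        # Expect lines like: "  42  Space Reclamation  Volume ..."
        m = re.match(r"\s*(\d+)\s+Space Reclamation", line)
        if m:
            numbers.append(int(m.group(1)))
    return numbers

# A real cron job would then issue, for each number n:
#   dsmadmc -id=... -password=... "cancel process n"
```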
BTW, I thought I'd give the 7.1.4 Information center a try to answer
this. I searched on "preemption". 10 hits, none of which were the answer.
So I went to the PDF of the old Administrator's Guide and found it right
away. We need that book!
Roger Deschner University of Illinois at Chicago rogerd AT uic DOT edu
======I have not lost my mind -- it is backed up on tape somewhere.=====