A similar script could use MOVE DATA instead of RECLAIM. The advantage is
that as each MOVE ends, the script can check whether drive resources are
available and intelligently pick a volume before starting a new MOVE. It can
also check an external file for a "pause" or "halt" command, or parse the
actlog(s) for ANR1496I messages carrying similar "commands", and re-read its
configuration file so you can change its behavior without stopping/starting it.
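As a rough illustration of the external-file check described above, here is a
minimal Python sketch. The one-word-per-line file format, the "#" comment
convention, and the function names are all assumptions made for illustration;
no actual script is shown in this thread.

```python
from pathlib import Path

# Hypothetical command words, taken from the "pause"/"halt" idea above.
VALID_COMMANDS = {"pause", "halt"}


def parse_control_command(text):
    """Return the first 'pause' or 'halt' found in the text, else 'run'.

    Blank lines and lines starting with '#' are ignored (assumed format).
    """
    for line in text.splitlines():
        word = line.strip().lower()
        if not word or word.startswith("#"):
            continue
        if word in VALID_COMMANDS:
            return word
    return "run"


def read_control_command(path):
    """Read the control file; a missing file means 'keep running'."""
    control_file = Path(path)
    if not control_file.exists():
        return "run"
    return parse_control_command(control_file.read_text())
```

The script would call read_control_command() each time a MOVE ends, before
deciding whether to start the next one.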
Richard
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Rhodes, Richard L.
Sent: Thursday, February 18, 2016 9:34 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Migration should preempt reclamation
This is a tough one. On the one hand we want Reclamation to use as many tape
drives as possible, but not consume them all. We also have multiple TSM
instances wanting library resources. The TSM instances are blind to each
other's needs. This _IS_ difficult to control.
The _current_ solution controls reclamation completely manually from a set of
scripts.
It works something like this:
- we run a library sharing environment across a bunch of TSM instances
- reclamation never runs automatically - all stgpools are set
  to not run reclamation on their own (reclamation pct = 100)
- define the max number of drives reclamation can use in a library
  (reclamation can use up to this number)
- define the number of unused drives in a library that MUST be UNUSED before
  another reclamation is started
  (there are always some number of unused drives available for
  non-reclamation jobs to start)
- define on stgpools the number of reclamation processes allowed - we set it
  to 1 (one reclamation process equals 2 tape drives)
Late morning we kick off the script:
- Crawls through all our TSM instances and gets a count of tapes per stgpool
  that could be reclaimed (above some rec pct).
- Sorts the list of stgpools/counts by the count.
- The script loops.
  On each loop it will start a new stgpool reclamation if:
  - the max number of drives allowed for reclamation hasn't been hit
  - the required number of unused drives are still unused
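The ordering and the per-loop test above can be sketched roughly as follows.
This is not the actual script: the names (LibraryState,
can_start_reclamation, order_pools), the largest-count-first ordering, and
the way the drive counts would be gathered are assumptions. The only figure
taken from the description is "one reclamation process equals 2 tape drives".

```python
from dataclasses import dataclass

DRIVES_PER_RECLAIM = 2  # from the description: one process uses 2 drives


@dataclass
class LibraryState:
    """Hypothetical snapshot of one shared library."""
    total_drives: int
    busy_drives: int     # drives in use by any job, reclamation included
    reclaim_drives: int  # drives currently in use by reclamation


def can_start_reclamation(state, max_reclaim_drives, min_unused_drives):
    """Apply the two per-loop conditions before starting another process."""
    unused = state.total_drives - state.busy_drives
    # Assumption: a new process must fit entirely under the drive cap.
    under_cap = state.reclaim_drives + DRIVES_PER_RECLAIM <= max_reclaim_drives
    enough_unused = unused >= min_unused_drives
    return under_cap and enough_unused


def order_pools(reclaimable):
    """Sort (instance, stgpool) keys by reclaimable tape count, largest first."""
    return sorted(reclaimable, key=lambda key: reclaimable[key], reverse=True)
```

Starting the pool with the most reclaimable tapes first is one reading of
"sorts by the count"; it matches the remark further down that the pool with
the most reclaimable tapes gets to run the longest.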
Later in the day we kill this script, letting running reclamation jobs run to
completion.
If by the next morning (when migrations want to run) we still have
reclamations running, they get nuked!
. . .repeat each day . . . .
The result: at a gross level we keep some number of drives open for other
sessions/jobs to use, and yet allow reclamation to use up to the defined limit
of drives if no other processes are using them.
It has major flaws, but has really smoothed out our tape drive contention and
resources used for reclamation. The one thing I really like is that it lets
the stgpool with the most reclaimable tapes, in whatever TSM instance, run the
longest.
One core overall issue - no amount of playing around like this can make up for
not having the resources you need to drive TSM. If you don't have the drives
to process the work, nothing will really help.
Rick
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Roger Deschner
Sent: Wednesday, February 17, 2016 8:43 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Migration should preempt reclamation
I was under the impression that higher priority tasks could preempt lower
priority tasks. That is, migration should be able to preempt reclamation. But
it doesn't. A very careful reading of the Administrator's Guide tells me that
it does not.
We're having a problem with a large client backup that fails, due to a disk
stgpool filling. (It's a new client, and this is its initial large
backup.) It fills up because the migration process cannot get a tape drive,
because they are all being used for reclamation. This also prevents the client
backup from getting a tape drive directly. Does anybody have a way for
migration to get resources (drives, volumes, etc) when a storage pool reaches
its high migration threshold, and reclamation is using those resources?
"Careful scheduling" is the usual answer, but you can't always schedule what
client nodes do. Back on TSM 5.4 I built a Unix cron job to look for this
condition and cancel reclamation processes, but it was a real Rube Goldberg
contraption, so I'm reluctant to revive it now in the TSM 6+ era. Anybody have
a better way?
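For readers who have not seen such a job, the decision it has to make can be
sketched in a few lines of Python. This is not the original cron job: the
function name, the (process number, process name) input shape, and the match
on the word "reclamation" are assumptions for illustration. Gathering the
inputs and actually issuing CANCEL PROCESS for each returned number are left
out.

```python
def reclamations_to_cancel(pct_util, high_mig, processes):
    """Return process numbers of reclamations to cancel, or an empty list.

    pct_util  -- current utilization of the disk stgpool, in percent
    high_mig  -- the pool's high migration threshold, in percent
    processes -- list of (process_number, process_name) tuples
                 (hypothetical shape for the server's process list)
    """
    if pct_util < high_mig:
        # Pool is below its threshold, so leave reclamation alone.
        return []
    return [
        number
        for number, name in processes
        if "reclamation" in name.lower()
    ]
```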
BTW, I thought I'd give the 7.1.4 Information Center a try to answer this. I
searched on "preemption". 10 hits, none of which were the answer.
So I went to the PDF of the old Administrator's Guide and found it right away.
We need that book!
Roger Deschner University of Illinois at Chicago rogerd AT uic DOT edu
======I have not lost my mind -- it is backed up on tape somewhere.=====