ADSM-L

Re: [ADSM-L] Migration should preempt reclamation

2016-02-18 09:36:04
Subject: Re: [ADSM-L] Migration should preempt reclamation
From: "Rhodes, Richard L." <rrhodes AT FIRSTENERGYCORP DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 18 Feb 2016 14:34:10 +0000
This is a tough one.  On the one hand we want Reclamation to use as many tape 
drives as possible, but not consume them all.  We also have multiple TSM 
instances wanting library resources.  The TSM instances are blind to each 
other's needs.  This _IS_ difficult to control.

The _current_ solution controls reclamation completely manually from a set of 
scripts.  It works something like this:

- we run a library sharing environment across a bunch of TSM instances
- reclamation never runs automatically - all stgpools have their 
    reclamation threshold set to 100 (reclamation pct = 100)
- define the max number of drives reclamation can use in a library
   (reclamation can use up to this number)
- define the number of unused drives in a library that MUST be UNUSED before 
   another reclamation is started
   (so there are always some unused drives available for non-reclamation 
   jobs to start)
- define on stgpools the number of reclamation processes allowed - we set it 
   to 1 (one reclamation process uses 2 tape drives)

Late morning we kick off the script:

- Crawls through all our TSM instances and gets a count of tapes per stgpool
    that could be reclaimed (above some rec pct).
- Sorts the list of stgpools/counts by the count
- The script loops.  
    On each loop it will start a new stgpool reclamation if:
      - max number of drives allowed for reclamation hasn't been hit 
      - required number of unused drives are still unused
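
For anyone who wants to picture the loop, here is a minimal Python sketch of 
the start decision only.  It is not our actual script: the names (PoolCount, 
can_start, pick_next) are made up for illustration, and it assumes the drive 
counts and per-stgpool tape counts have already been gathered from the library 
and the TSM instances.  How you collect those numbers is left out.

```python
from dataclasses import dataclass


@dataclass
class PoolCount:
    """One row of the crawl result (hypothetical structure)."""
    instance: str      # TSM instance name
    stgpool: str       # storage pool name
    reclaimable: int   # tapes above the chosen reclaim pct


def can_start(drives_total, drives_in_use, reclaim_drives_in_use,
              max_reclaim_drives, min_unused_drives):
    """Return True if both conditions for starting a reclamation hold.

    drives_in_use counts every busy drive, including those used by
    reclamation. Both checks are made before the new process starts.
    """
    cap_not_hit = reclaim_drives_in_use < max_reclaim_drives
    unused = drives_total - drives_in_use
    enough_unused = unused >= min_unused_drives
    return cap_not_hit and enough_unused


def pick_next(pools, running):
    """Pick the pool with the most reclaimable tapes that is not running.

    running is a set of (instance, stgpool) pairs already being reclaimed.
    Returns None when nothing is left to start.
    """
    for pool in sorted(pools, key=lambda p: p.reclaimable, reverse=True):
        if pool.reclaimable > 0 and (pool.instance, pool.stgpool) not in running:
            return pool
    return None


if __name__ == "__main__":
    pools = [
        PoolCount("tsm1", "TAPE_A", 5),
        PoolCount("tsm2", "TAPE_B", 12),
        PoolCount("tsm1", "TAPE_C", 0),
    ]
    if can_start(drives_total=10, drives_in_use=4, reclaim_drives_in_use=2,
                 max_reclaim_drives=6, min_unused_drives=3):
        print(pick_next(pools, running=set()))
```

With a limit of 6 reclamation drives and 2 drives per process, this check 
allows at most three reclamations at once.  Because the list is sorted by 
count, the stgpool with the most reclaimable tapes is started first.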

Later in the day we kill this script, letting running reclamation jobs run to 
completion.
If by the next morning (when migrations want to run) we still have 
reclamations running, they get nuked!
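
The morning cleanup could be sketched like this in Python.  It assumes the 
output of "query process" has already been parsed into (process number, 
description) pairs, and that reclamation shows up with the description 
"Space Reclamation"; both are assumptions to check against your own server.  
The function name is made up, and it only builds "cancel process N" command 
strings.  Sending them to each instance is not shown.

```python
def cancel_commands(processes, description="Space Reclamation"):
    """Build one 'cancel process N' command per running reclamation.

    processes: iterable of (process_number, description) pairs, already
    parsed from the server's process list (parsing is not shown here).
    description: the text assumed to identify a reclamation process.
    """
    wanted = description.strip().lower()
    return [
        f"cancel process {number}"
        for number, desc in processes
        if desc.strip().lower() == wanted
    ]


if __name__ == "__main__":
    running = [
        (12, "Space Reclamation"),
        (13, "Migration"),
        (15, "Space Reclamation"),
    ]
    for command in cancel_commands(running):
        print(command)
```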

. . .repeat each day . . . .



The result: at a gross level we keep some number of drives open for other 
sessions/jobs to use, and yet allow reclamation to use up to the defined limit 
of drives if no other processes are using them.  

It has major flaws, but it has really smoothed out our tape drive contention 
and the resources used for reclamation.  The one thing I really like is that 
it lets the stgpool with the most reclaimable tapes, in whatever TSM instance, 
run the longest.

One core overall issue - no amount of playing around like this can make up for 
not having the resources you need to drive TSM.  If you don't have the drives 
to process the work, nothing will really help. 

Rick





-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Roger Deschner
Sent: Wednesday, February 17, 2016 8:43 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Migration should preempt reclamation

I was under the impression that higher priority tasks could preempt
lower priority tasks. That is, migration should be able to preempt
reclamation. But it doesn't. A very careful reading of the Administrator's
Guide tells me that it does not.

We're having a problem with a large client backup that fails, due to a
disk stgpool filling. (It's a new client, and this is its initial large
backup.) It fills up because the migration process cannot get a tape
drive, due to their all being used for reclamation. This also prevents
the client backup from getting a tape drive directly. Does anybody have
a way for migration to get resources (drives, volumes, etc) when a
storage pool reaches its high migration threshold, and reclamation is
using those resources? "Careful scheduling" is the usual answer, but you
can't always schedule what client nodes do. Back on TSM 5.4 I built a
Unix cron job to look for this condition and cancel reclamation
processes, but it was a real Rube Goldberg contraption, so I'm reluctant
to revive it now in the TSM 6+ era. Anybody have a better way?

BTW, I thought I'd give the 7.1.4 Information center a try to answer
this. I searched on "preemption". 10 hits, none of which were the answer.
So I went to the PDF of the old Administrator's Guide and found it right
away. We need that book!

Roger Deschner      University of Illinois at Chicago     rogerd AT uic DOT edu
======I have not lost my mind -- it is backed up on tape somewhere.=====

