Re: [ADSM-L] Migration should preempt reclamation

2016-02-18 09:58:59
Subject: Re: [ADSM-L] Migration should preempt reclamation
From: Richard Cowen <rcowen AT CPPASSOCIATES DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 18 Feb 2016 14:57:25 +0000
A similar script could use MOVE DATA instead of RECLAIM.  The advantage is that 
as each MOVE ends, the script can check whether drive resources are available 
and intelligently pick a volume before starting a new MOVE.  It can also 
check an external file for a "pause" or "halt" command, or parse the actlog(s) 
for ANR1496I messages carrying similar "commands", and re-read its configuration 
file so you can change its behavior without stopping and restarting it.
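
A minimal sketch of that decision step, in Python. Everything here is an 
assumption for illustration: the control-file format (a single word), the 
function names, the rule of picking the volume with the highest reclaimable 
percentage, and the need for two free drives per MOVE. It only builds the 
command string; running it through an admin client is left out.

```python
from pathlib import Path


def read_control(path):
    """Return 'pause', 'halt', or 'run' from a hypothetical one-word control file."""
    p = Path(path)
    if not p.exists():
        return "run"
    word = p.read_text().strip().lower()
    return word if word in ("pause", "halt") else "run"


def next_move(candidates, free_drives, control):
    """Pick the next volume to move, or None if nothing should start now.

    candidates  -- list of (volume_name, pct_reclaimable) tuples
    free_drives -- drives currently idle in the library
    control     -- value returned by read_control()
    Assumption: a tape-to-tape MOVE DATA needs two drives (read + write).
    """
    if control != "run" or free_drives < 2 or not candidates:
        return None
    # Assumed policy: take the volume with the most reclaimable space.
    volume, _pct = max(candidates, key=lambda c: c[1])
    return f"MOVE DATA {volume}"
```

The caller would invoke next_move() each time a MOVE finishes, re-reading the 
control file first, so a "pause" or "halt" takes effect between moves rather 
than in the middle of one.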
Richard



-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Rhodes, Richard L.
Sent: Thursday, February 18, 2016 9:34 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Migration should preempt reclamation

This is a tough one.  On the one hand we want Reclamation to use as many tape 
drives as possible, but not consume them all.  We also have multiple TSM 
instances wanting library resources.  The TSM instances are blind to each 
other's needs.  This _IS_ difficult to control.

The _current_ solution controls reclamation completely manually from a set of 
scripts. 
It works something like this:

- we run a library sharing environment across a bunch of TSM instances
- reclamation never runs automatically - all stgpools have their 
    reclamation threshold set to 100 (reclamation pct = 100)
- define the max number of drives reclamation can use in a library
   (reclamation can use up to this number)
- define the number of unused drives in a library that MUST be UNUSED before 
   another reclamation is started
   (there are always some number of unused drives available for non-reclamation 
   jobs to start)
- define on stgpools the number of reclamation processes allowed - we set it to 1 
   (one reclamation process equals 2 tape drives)
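
The per-library limits above can be sketched as a small check, in Python. 
The class and field names are hypothetical, and the exact comparison rules 
are an assumption: a new reclamation may start only if its two drives still 
fit under the reclamation cap, and only if the required number of drives is 
unused at the moment of the check.

```python
from dataclasses import dataclass


@dataclass
class LibraryLimits:
    total_drives: int          # drives in the shared library
    max_reclaim_drives: int    # most drives reclamation may hold at once
    min_unused_drives: int     # drives that must be idle before a new start
    drives_per_reclaim: int = 2  # one reclamation process equals 2 drives


def may_start_reclaim(limits, drives_in_use, reclaim_drives_in_use):
    """Return True if one more reclamation process is allowed to start."""
    unused = limits.total_drives - drives_in_use
    under_cap = (reclaim_drives_in_use + limits.drives_per_reclaim
                 <= limits.max_reclaim_drives)
    reserve_ok = unused >= limits.min_unused_drives
    return under_cap and reserve_ok
```

For example, with 12 drives, a cap of 6 for reclamation, and 3 required 
unused, a library with 10 drives busy refuses a new reclamation even if 
reclamation itself holds only 4 of them.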

Late morning we kick in the script

- Crawls through all our TSM instances and gets a count of tapes per stgpool
    that could be reclaimed (above some rec pct).
- Sorts the list of stgpools/counts by the count
- The script loops.  
    On each loop it will start a new stgpool reclamation if:
      - max number of drives allowed for reclamation hasn't been hit 
      - required number of unused drives are still unused
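
The ranking and pick-next step of that loop might look like the following 
Python sketch. The data shapes and function names are hypothetical. The two 
drive conditions are collapsed into one boolean argument, drives_ok, assumed 
to be computed elsewhere. The threshold value is a caller-supplied 
assumption, and the function only returns the command text for the chosen 
instance instead of issuing it.

```python
def rank_pools(counts):
    """Order (instance, stgpool) keys by reclaimable tape count, highest first.

    counts -- dict mapping (instance, stgpool) to number of reclaimable tapes
    Pools with nothing to reclaim are dropped.
    """
    eligible = [key for key, n in counts.items() if n > 0]
    return sorted(eligible, key=lambda key: counts[key], reverse=True)


def next_reclaim(counts, running, drives_ok, threshold):
    """Return (instance, command) for the next reclamation, or None.

    running   -- set of (instance, stgpool) keys already being reclaimed
    drives_ok -- True when both drive conditions from the list above hold
    threshold -- reclaim percentage to pass on the command (assumed value)
    """
    if not drives_ok:
        return None
    for instance, pool in rank_pools(counts):
        if (instance, pool) not in running:
            return instance, f"RECLAIM STGPOOL {pool} THRESHOLD={threshold}"
    return None
```

Because the pool with the most reclaimable tapes is started first and the 
script is killed later in the day, that pool ends up with the longest run 
time.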

Later in the day we kill this script, letting running reclamation jobs run to 
completion.
If by the next morning (when migrations want to run) we still have 
reclamations running, they get nuked!

. . .repeat each day . . . .



The result: at a gross level we keep some number of drives open for other 
sessions/jobs to use, and yet allow reclamation to use up to the defined limit 
of drives if no other processes are using them.  

It has major flaws, but has really smoothed out our tape drive contention and 
resources used for reclamation.  The one thing I really like is that it lets 
the stgpool with the most reclaimable tapes, in whatever TSM instance, run the 
longest.

One core overall issue - no amount of playing around like this can make up for 
not having the resources you need to drive TSM.  If you don't have the drives 
to process the work, nothing will really help. 

Rick





-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Roger Deschner
Sent: Wednesday, February 17, 2016 8:43 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Migration should preempt reclamation

I was under the impression that higher priority tasks could preempt lower 
priority tasks. That is, migration should be able to preempt reclamation. But 
it doesn't. A very careful reading of the Administrator's Guide tells me that it 
does not.

We're having a problem with a large client backup that fails, due to a disk 
stgpool filling. (It's a new client, and this is its initial large
backup.) It fills up because the migration process cannot get a tape drive, 
because they are all being used for reclamation. This also prevents the client 
backup from getting a tape drive directly. Does anybody have a way for 
migration to get resources (drives, volumes, etc) when a storage pool reaches 
its high migration threshold, and reclamation is using those resources? 
"Careful scheduling" is the usual answer, but you can't always schedule what 
client nodes do. Back on TSM 5.4 I built a Unix cron job to look for this 
condition and cancel reclamation processes, but it was a real Rube Goldberg 
contraption, so I'm reluctant to revive it now in the TSM 6+ era. Anybody have 
a better way?
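
For reference, the core test of such a watchdog can be sketched in Python as 
below. This is not the original cron job. The dictionary keys, the function 
names, the match on the word "reclamation" in the process description, and 
the one-cancel-per-pass limit are all assumptions for illustration. The 
sketch only produces CANCEL PROCESS command text from data gathered 
elsewhere.

```python
def pools_over_highmig(pools):
    """Names of disk pools at or above their high migration threshold.

    pools -- list of dicts with keys 'name', 'pct_utilized', 'highmig'
    """
    return [p["name"] for p in pools if p["pct_utilized"] >= p["highmig"]]


def cancel_commands(pools, processes, max_cancel=1):
    """Build cancel commands for reclamation while a disk pool is over threshold.

    processes  -- list of dicts with keys 'number' and 'description'
    max_cancel -- cancel at most this many per pass, to free drives gradually
    """
    if not pools_over_highmig(pools):
        return []
    # Assumed: reclamation processes can be recognized by their description.
    reclaims = [p for p in processes
                if "reclamation" in p["description"].lower()]
    return [f"CANCEL PROCESS {p['number']}" for p in reclaims[:max_cancel]]
```

Cancelling one process per pass frees two drives at a time, which may be 
enough for migration to start without throwing away all reclamation work.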

BTW, I thought I'd give the 7.1.4 Information center a try to answer this. I 
searched on "preemption". 10 hits, none of which were the answer.
So I went to the PDF of the old Administrator's Guide and found it right away. 
We need that book!

Roger Deschner      University of Illinois at Chicago     rogerd AT uic DOT edu
======I have not lost my mind -- it is backed up on tape somewhere.=====

