Struggling with the number of scratch tapes

nabildz

ADSM.ORG Member
Joined
May 28, 2008
Help, TSMers!

We are having a problem with a lack of scratch tapes, as the library is full, especially over the weekend.

I have found that all the tapes in the library belong to 2 different storage pools.

The library has 60 slots.

STGP1 with MAXSCRATCH = 600
STGP2 with MAXSCRATCH = 30

I was thinking of how to always keep scratch tapes available in the library for DB and COPY backups: set MAXSCRATCH so that STGP1 + STGP2 < 56, which always leaves 4 scratch tapes available, and also expand the primary disk storage (DEFINE VOLUME) to allow enough space.

Would that work to sort out this problem?

Over to you

Thanks
 
What is currently inside the library? I would presume that it holds the primary tape backup pool. Are you off loading the offsite (DR) tapes daily? If you are not, then you may want to see if these tapes can significantly free up slots.

By decreasing the number of scratch tapes per pool, you may end up in a scenario where TSM stops processing data because the disk pool is FULL and no tapes are available for offloading data. Personally, I wouldn't go this route. Instead, I would move (MOVE MEDIA) old online backup tapes to an overflow location.
 
What is currently inside the library? I would presume that it holds the primary tape backup pool. Are you off loading the offsite (DR) tapes daily? If you are not, then you may want to see if these tapes can significantly free up slots.

Yes, DR tapes get offloaded daily to go offsite.



By decreasing the number of scratch tapes per pool, you may end up in a scenario where TSM stops processing data because the disk pool is FULL and no tapes are available for offloading data. Personally, I wouldn't go this route. Instead, I would move (MOVE MEDIA) old online backup tapes to an overflow location.

MAXSCRATCH is set to 600 and the library only holds 58.

That's why I had the idea of changing MAXSCRATCH to less than 68, expanding the primary storage pools, and moving data from tape to disk within the same primary storage pool, which would free up more tapes.
 
Yes, limiting MAXSCRATCH to force the ability to do DB backups is a good safety measure, as long as you also have something to alert you to impending space exhaustion. But how are you running out of tapes? Are you collocating a large number of small nodes, so that all your tapes are assigned but each one is very little used? Are the tapes actually highly utilized, and thus you're undersized for your load and retention?
Are you running ridiculous retention, or not expiring and reclaiming to work with the reasonable retention you have set?

If you're collocating, perhaps you should do some collocation grouping. If you just have to take care of more data than fits in the library, it's time to upgrade, or limp along with overflow.
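
As a rough illustration of the MAXSCRATCH cap discussed above, here is a minimal sketch of the arithmetic. The 60 slots and the reserve of 4 come from the original post; the 3:1 split between the two pools is purely an assumption for illustration, and the sketch only prints the `update stgpool` commands rather than running them. It also assumes every volume of both pools lives in the library (offsite volumes would count against MAXSCRATCH too).

```shell
#!/bin/sh
# Sketch: how many scratch tapes the pools may claim in total when some
# library slots are held back for DB backups and copy-pool work.

# scratch_budget SLOTS RESERVE -> slots left for the pools to share
scratch_budget() {
    echo $(( $1 - $2 ))
}

# pool_share BUDGET WEIGHT TOTAL_WEIGHT -> integer share of the budget
pool_share() {
    echo $(( $1 * $2 / $3 ))
}

budget=$(scratch_budget 60 4)
stgp1=$(pool_share "$budget" 3 4)   # hypothetical 3:1 split, not from the thread
stgp2=$(( budget - stgp1 ))

# Print (do not run) the admin commands that would apply the caps.
echo "update stgpool STGP1 maxscratch=$stgp1"
echo "update stgpool STGP2 maxscratch=$stgp2"
```

With these numbers the two caps add up to 56, so 4 slots stay free for scratch. The right split depends on how much data each pool really holds.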
 
Thank you for your reply. Collocation is off for all groups except for 4 nodes (Exchange servers); the other groups, about 110 nodes, are not collocated.
Retention is set to no limit for Exchange and 30 days for the others (client requirements).

Can the primary disk pool be extended to free up space on the primary tape pool, or do we have to upgrade the library?

What commands can I run to check what is needed in terms of space?

hi,

What about reclamation? Are you running it on a scheduled basis? What is the threshold?
You might just need to tune it.
Run the following query
select volume_name, pct_reclaim, stgpool_name from volumes where pct_reclaim>=60 order by 3,2 asc
to see how much you'd reclaim with a threshold of 60.
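
To turn the query result into a quick per-pool count, something like the following sketch might help. It assumes the rows arrive tab-delimited as volume, pct_reclaim, pool (for example from dsmadmc with -tabdelimited); the function name `count_reclaimable` and the sample volume names are made up for illustration.

```shell
#!/bin/sh
# count_reclaimable: read tab-delimited "volume<TAB>pct_reclaim<TAB>pool"
# lines on stdin and print "pool count" per storage pool, sorted by pool.
count_reclaimable() {
    awk -F '\t' 'NF == 3 { n[$3]++ } END { for (p in n) print p, n[p] }' | sort
}

# Hypothetical sample rows standing in for real query output.
printf 'A00001\t62.5\tSTGP1\nA00007\t91.0\tSTGP1\nA00012\t75.3\tSTGP2\n' |
    count_reclaimable
```

Comparing the count per pool with the number of slots the pool occupies gives a first idea of how many tapes a reclamation at that threshold could touch.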

cheers
max
 
Thanks for your reply, Max.

We have scheduled reclamations running for the different storage pools on a regular (weekly) basis. I have run the query and found many tapes with pct_reclaim >= 60.

The reclamations we run use thresholds of 70 and 80, not 60. Do we need to change this to 60 to improve the space reclaimed?
 
hi,

You should tune the threshold to suit your environment (there is no rule of thumb, except that it must be greater than 50). I usually run reclamation at 80 or even 90, but that really depends on your environment.
Since you say you have many tapes eligible for reclamation, try running reclamation more often, if your environment allows it.

cheers
max
 
Obviously, with the resources you have available, a threshold of 80 is far too high. What that means is that you can have every tape full and still be using only 20% of your total space. Max gave 60 because that's kind of the standard value: the first multiple of ten at which a reclaimed tape is less than 50% used.

If you're swimming in extra tape capacity, running high is nice for two reasons: less data is shuffled around, and the load is spread across more volumes, which is better for overall tape life. Most of us, though, are not so lucky, and we need to run things tighter than that. It's a matter of getting everything required from your library. If your reclamations can't be completed in the window allotted, you may have to ease up and run a higher threshold. If you don't have enough space, you must crank the threshold down and move more data to get more space. Of the two, the space requirement generally wins: put up with some performance hit from overlapping reclamation to make enough room to get all the backups done.

Fortunately, reclamation doesn't scale linearly, at least for tapes. What I mean is that if you double the amount moved (100% minus the reclamation threshold), you don't double your time. Fuller tapes have fewer gaps, so there's less time spent waiting for the tape to move to the next file, and the writing drive spends less of its time spinning up, stopping, and backing up. So generally, a tape with 90% utilization (10% reclaimable) takes only about 4 times as long as one with 10% utilization (90% reclaimable). In other words, don't think moving your threshold from 80 to 60 is going to double your reclamation time. It'll probably be more like a 30% increase.
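
To put a number on the "every tape full and still only 20% used" case, here is a small sketch of the worst case. The function name `worst_case_data` and the count of 56 in-library tapes are assumptions for illustration; the model is simply that every tape is FULL and sits right at the threshold without exceeding it, so nothing gets reclaimed.

```shell
#!/bin/sh
# worst_case_data TAPES THRESHOLD
# If every tape is FULL and none exceeds the threshold, each one holds
# only (100 - THRESHOLD)% live data. Prints the live data in tapes' worth,
# to one decimal place.
worst_case_data() {
    awk -v t="$1" -v th="$2" 'BEGIN { printf "%.1f\n", t * (100 - th) / 100 }'
}

worst_case_data 56 80   # 56 tapes is a hypothetical in-library count
worst_case_data 56 60
```

Under this model, 56 tapes at threshold 80 can end up holding as little as 11.2 tapes' worth of live data, against 22.4 at threshold 60.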

By the way, you mention increasing disk space to alleviate the load on tape. Let me point out that you can greatly cut the need for reclamation if you can muster enough disk space (put most of it in FILE-class) to run a MIGD (MIGDelay) on the stgpool at least as big as your default RETE and RETO (RETExtra and RETOnly)... If you can, most of what goes to tape is active or long-term. Of course, if you're at one of those "keep every version of everything forever" places, it won't make any difference...
 
a threshold of 80 is far too high. What that means is that you can have every tape full and still be using only 20% of your total space


Can you please elaborate on this? I don't quite understand it.

Many Thanks
 
hi,
The bigger the threshold, the less space you gain after reclamation (which is a shame), but the less time you stress the drives (which you like). As usual, you need to analyze the tradeoffs for your own environment.

cheers
max
 
The threshold is the level which pct_reclaim must exceed for a volume to be reclaimed. It's a little confusing since it's a lot like the opposite of pct_utilized. For a FULL volume, it's exactly that: a volume that's FULL but only 20% utilized will be at 80 pct_reclaim. A volume that you filled to 50% and on which all of that data has expired can be 0% utilized and 50% reclaimable. That's why there's a kind of funny numeric relationship, since the reclamation value really is a separate concept.
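
The two examples above can be sketched with a deliberately simplified model: reclaimable space is the part of the tape that was written minus the part that is still live. That model, and the function names `est_pct_reclaim` and `eligible`, are assumptions for illustration and not the server's exact accounting.

```shell
#!/bin/sh
# est_pct_reclaim PCT_WRITTEN PCT_UTILIZED
# Simplified model: reclaimable = written portion - still-live portion.
est_pct_reclaim() {
    echo $(( $1 - $2 ))
}

# eligible PCT_RECLAIM THRESHOLD
# Succeeds when pct_reclaim exceeds the threshold (whole numbers only).
eligible() {
    [ "$1" -gt "$2" ]
}

full=$(est_pct_reclaim 100 20)    # FULL volume, 20% utilized
filling=$(est_pct_reclaim 50 0)   # half written, everything expired

for v in "$full" "$filling"; do
    if eligible "$v" 60; then
        echo "pct_reclaim=$v: reclaimed at threshold 60"
    else
        echo "pct_reclaim=$v: left alone at threshold 60"
    fi
done
```

So the FULL volume at 80 gets picked up at threshold 60, while the half-written, fully expired one at 50 does not, even though it holds no live data at all.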

Here's a shell script I use to see what my reclamation situation is like:

bash-3.00$ cat `which estimatereclaim`
#!/bin/sh
# Usage: estimatereclaim [stgpool [threshold ...]]
# For each pool and threshold, print: pool, threshold, number of volumes
# above the threshold, and the sum of their PCT_UTILIZED.

stgps=`echo $1 | tr a-z A-Z`
if [ "$stgps" ]
then
    shift
    threshes=$*
else
    # No pool given: check every storage pool
    stgps=`tsmout q stg | cut -f1`
fi
[ "$threshes" ] || threshes="50 55 60 65 70 75 80 85 90 95"
for stgp in $stgps
do
    for thresh in $threshes
    do
        tsmout "select '$stgp','$thresh',count(1),sum(PCT_UTILIZED) from VOLUMES where STGPOOL_NAME='$stgp' and PCT_RECLAIM \> $thresh and STATUS != 'EMPTY'"
    done
done
bash-3.00$
tsmout is just a wrapper that does: dsmadmc -id=xxx -pass=xxx -tabdelimited $* | grep "<TAB>"

So, for instance:
bash-3.00$ estimatereclaim offsitetape 80
OFFSITETAPE 80 19 111.7
bash-3.00$ estimatereclaim offsitetape 60
OFFSITETAPE 60 23 211.4
bash-3.00$
That tells me that if I reclaim at threshold=80, I'll empty 19 tapes and it'll take 1 whole tape and 11.7 percent of another to hold the consolidated data. If I go to 60, I'll get 23 tapes back for 2 tapes and 11.4% of a third. I am currently running a reclamation at thresh=55, so not surprisingly:
bash-3.00$ estimatereclaim offsitetape 55
OFFSITETAPE 55 23 209.5
bash-3.00$ estimatereclaim offsitetape 70
OFFSITETAPE 70 23 209.5
It's already gotten the more-full tapes.
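
For reading those result lines at a glance, here is a small sketch that turns each one into a net tape gain. The helper name `net_gain` is made up, and it assumes the consolidated data lands on fresh tapes; if a partly filled tape can absorb it, the real gain is higher.

```shell
#!/bin/sh
# net_gain: read estimatereclaim-style lines "POOL THRESH COUNT SUM_UTIL"
# and print "THRESH emptied needed net", where
#   needed = SUM_UTIL / 100 rounded up (tapes to hold the consolidated data)
#   net    = emptied - needed
net_gain() {
    awk 'NF == 4 {
        needed = int($4 / 100)
        if (needed * 100 < $4) needed++
        print $2, $3, needed, $3 - needed
    }'
}

# The two sample results quoted earlier in this post.
printf 'OFFSITETAPE 80 19 111.7\nOFFSITETAPE 60 23 211.4\n' | net_gain
```

On those two lines it reports a net gain of 17 tapes at threshold 80 and 20 tapes at threshold 60.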
That particular script is also handy for picking a threshold...
bash-3.00$ estimatereclaim offsitetape
OFFSITETAPE 50 26 352.7
OFFSITETAPE 55 23 209.2
OFFSITETAPE 60 23 209.2
OFFSITETAPE 65 23 209.2
OFFSITETAPE 70 23 209.2
OFFSITETAPE 75 21 155.0
OFFSITETAPE 80 19 111.2
OFFSITETAPE 85 17 75.9
OFFSITETAPE 90 15 52.6
OFFSITETAPE 95 11 21.8
bash-3.00$
If I were desperate to get some tapes back quickly, I could get 11 back quick with thresh=95.
bash-3.00$ estimatereclaim offsitetape 95 96 97 98 99 100
OFFSITETAPE 95 11 21.8
OFFSITETAPE 96 9 12.7
OFFSITETAPE 97 7 6.4
OFFSITETAPE 98 7 6.4
OFFSITETAPE 99 4 1.9
OFFSITETAPE 100 0
bash-3.00$
Look! 4 tapes back for 1.9% of a tape to hold the data.
This is also handy when brushing up for an offsiting. I can look at how much space is left on my mountable DR volumes and pick a reclamation threshold that will maximize utilization. Let's say I'm offsiting tomorrow (the new owners have gotten cheap and cut my DR budget), and after ba stg, I've got one mountable DR volume 85% full. Obviously I'd pick a threshold of 96, which would reclaim me 9 volumes without adding another tape to offsite.
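
That pick can be sketched as a tiny filter over the estimatereclaim output: take the lowest threshold whose consolidated data still fits in the free space on the mountable volume. The helper name `pick_threshold` is hypothetical; the figures are the ones from the transcript above, with 15 being the free percentage on an 85%-full volume.

```shell
#!/bin/sh
# pick_threshold FREE_PCT
# Read "POOL THRESH COUNT SUM_UTIL" lines on stdin and print "THRESH COUNT"
# for the lowest threshold whose consolidated data (SUM_UTIL, in percent of
# one tape) fits in FREE_PCT. Prints nothing if no threshold fits.
pick_threshold() {
    awk -v free="$1" '
        NF == 4 && $4 + 0 <= free + 0 && (best == "" || $2 + 0 < best + 0) {
            best = $2; count = $3
        }
        END { if (best != "") print best, count }
    '
}

pick_threshold 15 <<'EOF'
OFFSITETAPE 95 11 21.8
OFFSITETAPE 96 9 12.7
OFFSITETAPE 97 7 6.4
OFFSITETAPE 98 7 6.4
OFFSITETAPE 99 4 1.9
EOF
```

With 15% free it settles on threshold 96 and 9 volumes, matching the choice described above.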
I generally just let things run on fixed thresholds, but sometimes I do stuff like that.
 