ADSM-L

Re: [ADSM-L] DEVCLASS=FILE - what am I missing

2015-02-13 13:44:25
Subject: Re: [ADSM-L] DEVCLASS=FILE - what am I missing
From: "Prather, Wanda" <Wanda.Prather AT ICFI DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 13 Feb 2015 18:39:00 +0000
Probably just bad luck….

When I set up FILE pools for customers, I usually have to tweak them a couple 
of times to get the sizing right; it depends on the load, the number of 
concurrent sessions, etc.  Been there, done that, got the scars.

Assumptions you should change:

•       Unlike a disk pool, if there is no space available in a TYPE=FILE pool, 
backups don't fail over to the NEXT stgpool.  That's WAD (working as designed).  
I don't know why it's that way.  I think some RFE pressure is indicated; it 
causes me grief.

•       In a seq pool on disk, you need to be much more aggressive about 
reclaims.  If you have reclaim set at 59, you are saying you are willing to 
live with 59% of your disk space dead/expired and unusable!  That means you 
need to size your disk pool so that 41% is big enough to hold the entire 
night's backup.  I set reclaim on my disk pools to 20%, or 15% if the disk 
throughput is sufficient to tolerate the I/O.

•       Migration from a sequential pool may not be working like you think; 
read the DEFINE STGPOOL HIGHMIG option definition in the admin ref for your 
version.  I always set MAXSCRATCH to 0 for a sequential file pool and use 
pre-defined volumes instead of scratch so I have better control over what 
happens.

•       You have mountlimit set to 40 in the devclass; how many concurrent 
client sessions do you have writing to that pool?

•       Also check server option NUMOPENVOLSALLOWED to make sure you can have 
enough volumes in use at once to do concurrent backups plus reclaims plus 
backup stgpool plus migration etc etc etc.

•       If you are going to fill this pool and empty it out via migration every 
night, best to force the migration yourself with a MIGRATE STGPOOL command 
rather than relying on the threshold.  And if reclaims don't kick in on their 
own regularly, set up a RECLAIM STGPOOL schedule to fire daily anyway.  Won't 
hurt.

•       The usual problem I see is that people don't have enough volumes 
defined in the pool to account for all the concurrent sessions, plus  some 
empty volumes to allow for reclaims, plus a high enough NUMOPENVOLSALLOWED.  
You've defined your volumes at 50G, so you should have enough.  One of these 
other issues is probably your problem.

•       While things are working well, check daily to see what is a "normal" 
value of the number of "empty" volumes in that pool.  Then set yourself an 
alert to let you know when the number of "empty" volumes drops below the 
"normal" value so you can investigate before disaster sets in.

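To put rough numbers on the reclaim and empty-volume points above, here is a 
minimal Python sketch.  It only does the arithmetic; it does not talk to the 
TSM server.  The pool figures (7,106 G capacity, reclaim 59, MAXSCRATCH 143, 
137 scratch volumes used) come from the q stg output quoted below.  The 
function names, the 3,000 G nightly backup, and the "normal" count of 20 empty 
volumes are made-up values for illustration, so substitute your own.

```python
def live_fraction(reclaim_threshold_pct):
    """Share of the pool still usable for live data in the worst case,
    i.e. when reclaim_threshold_pct of the space is dead but not yet reclaimed."""
    if not 0 <= reclaim_threshold_pct < 100:
        raise ValueError("reclaim threshold must be in the range 0-99")
    return 1.0 - reclaim_threshold_pct / 100.0


def usable_capacity_gb(pool_capacity_gb, reclaim_threshold_pct):
    """Worst-case space available for live data at a given reclaim threshold."""
    return pool_capacity_gb * live_fraction(reclaim_threshold_pct)


def min_pool_size_gb(nightly_backup_gb, reclaim_threshold_pct):
    """Pool size needed so the usable share still holds one night's backup."""
    return nightly_backup_gb / live_fraction(reclaim_threshold_pct)


def scratch_volumes_left(max_scratch, scratch_used):
    """Scratch volumes the pool may still allocate before it is out of room."""
    return max_scratch - scratch_used


def below_normal(empty_now, normal_empty):
    """True when the count of empty volumes has dropped under the usual value."""
    return empty_now < normal_empty


if __name__ == "__main__":
    capacity_gb = 7106            # Estimated Capacity from the quoted q stg output
    nightly_backup_gb = 3000      # hypothetical nightly backup size

    for threshold in (59, 20, 15):
        usable = usable_capacity_gb(capacity_gb, threshold)
        needed = min_pool_size_gb(nightly_backup_gb, threshold)
        print(f"reclaim={threshold}: worst-case usable {usable:.0f} G, "
              f"pool needed for {nightly_backup_gb} G/night: {needed:.0f} G")

    left = scratch_volumes_left(143, 137)   # MAXSCRATCH and scratch used, as quoted
    print(f"scratch volumes left: {left}")
    print(f"alert: {below_normal(left, normal_empty=20)}")   # 20 is hypothetical
```

The empty-volume count itself would have to come from the server (for example 
from a query against the pool's volumes); how you collect it and where you send 
the alert is up to your own monitoring setup.
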
Good luck!

Wanda Prather
TSM Consultant
ICF International Enterprise and Cybersecurity Systems Division





-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Zoltan Forray
Sent: Friday, February 13, 2015 12:13 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: [ADSM-L] DEVCLASS=FILE - what am I missing

Up until recently, I have always used DEVCLASS=DISK for disk storage and always 
preformatted/allocated the disk volumes into multiple chunks to allow for 
multi-I/O benefits.

When I recently stood-up a new server, I decided to try DEVCLASS=FILE for 
disk-based storage/incoming backups.

I thought I understood that FILE type storage was basically "tape/sequential 
files on disk" and would act accordingly and things like reclamation now 
applied so when the file chunks (I defined 50GB file sizes) got below the 
reclaim value, it would reclaim such files, create new ones and delete the old 
ones automagically.

Well, last night became a disaster.  Backups failing all over because it 
couldn't allocate any more files and also would not automatically shift to use 
the "nextpool" which is defined as a tape pool.

So, what am I doing wrong?  What assumptions are wrong?  Here are the devclass 
values, with the empty values left out:

             Device Class Name: TSMFS
        Device Access Strategy: Sequential
            Storage Pool Count: 1
                   Device Type: FILE
                        Format: DRIVE
         Est/Max Capacity (MB): 51,200.0
                   Mount Limit: 40
                     Directory: /tsmpool

Here is the lone stgpool that used this devclass:

12:06:21 PM   GALAXY : q stg backuppool f=d
                    Storage Pool Name: BACKUPPOOL
                    Storage Pool Type: Primary
                    Device Class Name: TSMFS
                   Estimated Capacity: 7,106 G
                   Space Trigger Util: 84.5
                             Pct Util: 80.9
                             Pct Migr: 80.9
                          Pct Logical: 99.2
                         High Mig Pct: 85
                          Low Mig Pct: 75
                      Migration Delay: 0
                   Migration Continue: Yes
                  Migration Processes: 1
                Reclamation Processes: 1
                    Next Storage Pool: PRIMARY-ONSITE
                 Reclaim Storage Pool:
               Maximum Size Threshold: No Limit
                               Access: Read/Write
                          Description:
                    Overflow Location:
                Cache Migrated Files?:
                           Collocate?: No
                Reclamation Threshold: 59
            Offsite Reclamation Limit:
      Maximum Scratch Volumes Allowed: 143
       Number of Scratch Volumes Used: 137
        Delay Period for Volume Reuse: 0 Day(s)
               Migration in Progress?: No
                 Amount Migrated (MB): 0.00
     Elapsed Migration Time (seconds): 1,009
             Reclamation in Progress?: No
       Last Update by (administrator): ZFORRAY
                Last Update Date/Time: 02/13/2015 11:44:23
             Storage Pool Data Format: Native
                 Copy Storage Pool(s):
                  Active Data Pool(s):
              Continue Copy on Error?: Yes
                             CRC Data: No
                     Reclamation Type: Threshold
          Overwrite Data when Deleted:
                    Deduplicate Data?: No
 Processes For Identifying Duplicates:
            Duplicate Data Not Stored:
                       Auto-copy Mode: Client
Contains Data Deduplicated by Client?: No

I calculated the "Max Scratch Volumes" value based on having ~7.6TB filesystem 
so 50GB * 143 = 7.1TB

This morning when I checked, there were plenty of volumes with <40% utilized.  
So why didn't reclaim kick in?  Or am I totally off on this assumption?  I 
manually performed MOVE DATA on them and it freed things up.
--
*Zoltan Forray*
TSM Software & Hardware Administrator
BigBro / Hobbit / Xymon Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
zforray AT vcu DOT edu<mailto:zforray AT vcu DOT edu> - 804-828-4807