Thanks for the detailed explanations, experiences, and suggestions. I
greatly appreciate them and will store them away in case I ever try this
again.
Yes, we do have lots of clients backing up at once. We easily hit
40 simultaneous sessions, hence the high mount limit.
NUMOPENVOLSALLOWED is set to 10 for this server.
I had not planned to empty it every night. This server doesn't have that
many incoming backups. It has been running for a month without needing to
force migration to tape.
On Fri, Feb 13, 2015 at 1:39 PM, Prather, Wanda <Wanda.Prather AT icfi DOT com>
wrote:
> Probably just bad luck….
>
> When I set up FILE pools for customers, I usually have to tweak them a
> couple of times to get the sizing right, depends on the load, the number of
> concurrent sessions, etc. Been there, done that, got the scars.
>
> Assumptions you should change:
>
> • Unlike a disk pool, if there is no space available in a TYPE=FILE
> pool, backups don't fail over to the NEXT stgpool. WAD (working as
> designed). I don't know why it's that way. I think some RFE pressure is
> indicated; it causes me grief.
>
> • In a seq pool on disk, you need to be much more aggressive about
> reclaims. If you have reclaim set at 59, you are saying you are willing to
> live with 59% of your disk space dead/expired and unusable! That means you
> need to size your disk pool so that 41% is big enough to hold the entire
> night's backup. I set reclaim on my disk pools to 20%, or 15% if the disk
> throughput is sufficient to tolerate the I/O.
>
> • Migration from a sequential pool may not be working like you
> think; read the DEFINE STGPOOL HIGHMIG option definition in the admin ref
> for your version. I always set MAXSCRATCH to 0 for a sequential file pool
> and use pre-defined volumes instead of scratch so I have better control
> over what happens.
>
> • You have mountlimit set to 40 in the devclass; how many concurrent
> client sessions do you have writing to that pool?
>
> • Also check server option NUMOPENVOLSALLOWED to make sure you can
> have enough volumes in use at once to do concurrent backups plus reclaims
> plus backup stgpool plus migration etc etc etc.
>
> • If you are going to fill this pool and empty it out via migration
> every night, best to force the migration yourself with a MIGRATE STGPOOL
> command rather than relying on the threshold. And if reclaims don't kick
> in on their own regularly, set up a RECLAIM STGPOOL schedule to fire daily
> anyway. Won't hurt.
>
> • The usual problem I see is that people don't have enough volumes
> defined in the pool to account for all the concurrent sessions, plus some
> empty volumes to allow for reclaims, plus a high enough
> NUMOPENVOLSALLOWED. You've defined your volumes at 50G, so you should have
> enough. One of these other issues is probably your problem.
>
> • While things are working well, check daily to see what is a
> "normal" value of the number of "empty" volumes in that pool. Then set
> yourself an alert to let you know when the number of "empty" volumes drops
> below the "normal" value so you can investigate before disaster sets in.
>
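A rough sketch of the reclamation arithmetic above, using the 143 x 50 GB
volumes from the storage pool output quoted further down. The function name
and the worst-case model (every volume sitting just under the reclamation
threshold) are illustrative assumptions, not something TSM computes itself.

```python
def guaranteed_usable_gb(volumes: int, volume_gb: float, reclaim_pct: float) -> float:
    """Worst-case space holding live data if every volume sits just under
    the reclamation threshold (an illustrative model, not a TSM formula)."""
    if not 0 <= reclaim_pct <= 100:
        raise ValueError("reclaim_pct must be between 0 and 100")
    total_gb = volumes * volume_gb
    return total_gb * (100 - reclaim_pct) / 100


# Compare the current threshold (59) with the suggested 20.
for pct in (59, 20):
    usable = guaranteed_usable_gb(143, 50, pct)
    print(f"reclaim={pct}: about {usable:,.0f} GB guaranteed usable of {143 * 50:,} GB")
```

Under that assumption, a threshold of 59 only guarantees about 41% of the
pool for live data, while a threshold of 20 guarantees about 80%.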
> Good luck!
>
> Wanda Prather
> TSM Consultant
> ICF International Enterprise and Cybersecurity Systems Division
>
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf
> Of Zoltan Forray
> Sent: Friday, February 13, 2015 12:13 PM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: [ADSM-L] DEVCLASS=FILE - what am I missing
>
> Up until recently, I have always used DEVCLASS=DISK for disk storage and
> always preformatted/allocated the disk volumes into multiple chunks to allow
> for multi-I/O benefits.
>
> When I recently stood-up a new server, I decided to try DEVCLASS=FILE for
> disk-based storage/incoming backups.
>
> I thought I understood that FILE type storage was basically
> "tape/sequential files on disk" and would act accordingly and things like
> reclamation now applied so when the file chunks (I defined 50GB file sizes)
> got below the reclaim value, it would reclaim such files, create new ones
> and delete the old ones automagically.
>
> Well, last night became a disaster. Backups failing all over because it
> couldn't allocate any more files and also would not automatically shift to
> use the "nextpool" which is defined as a tape pool.
>
> So, what am I doing wrong? What assumptions are wrong? Here are the
> devclass values with the empty values left out:
>
> Device Class Name: TSMFS
> Device Access Strategy: Sequential
> Storage Pool Count: 1
> Device Type: FILE
> Format: DRIVE
> Est/Max Capacity (MB): 51,200.0
> Mount Limit: 40
> Directory: /tsmpool
>
> Here is the lone stgpool that used this devclass:
>
> 12:06:21 PM GALAXY : q stg backuppool f=d
> Storage Pool Name: BACKUPPOOL
> Storage Pool Type: Primary
> Device Class Name: TSMFS
> Estimated Capacity: 7,106 G
> Space Trigger Util: 84.5
> Pct Util: 80.9
> Pct Migr: 80.9
> Pct Logical: 99.2
> High Mig Pct: 85
> Low Mig Pct: 75
> Migration Delay: 0
> Migration Continue: Yes
> Migration Processes: 1
> Reclamation Processes: 1
> Next Storage Pool: PRIMARY-ONSITE
> Reclaim Storage Pool:
> Maximum Size Threshold: No Limit
> Access: Read/Write
> Description:
> Overflow Location:
> Cache Migrated Files?:
> Collocate?: No
> Reclamation Threshold: 59
> Offsite Reclamation Limit:
> Maximum Scratch Volumes Allowed: 143
> Number of Scratch Volumes Used: 137
> Delay Period for Volume Reuse: 0 Day(s)
> Migration in Progress?: No
> Amount Migrated (MB): 0.00
> Elapsed Migration Time (seconds): 1,009
> Reclamation in Progress?: No
> Last Update by (administrator): ZFORRAY
> Last Update Date/Time: 02/13/2015 11:44:23
> Storage Pool Data Format: Native
> Copy Storage Pool(s):
> Active Data Pool(s):
> Continue Copy on Error?: Yes
> CRC Data: No
> Reclamation Type: Threshold
> Overwrite Data when Deleted:
> Deduplicate Data?: No
> Processes For Identifying Duplicates:
> Duplicate Data Not Stored:
> Auto-copy Mode: Client
> Contains Data Deduplicated by Client?: No
>
> I calculated the "Max Scratch Volumes" value based on having ~7.6TB
> filesystem so 50GB * 143 = 7.1TB
>
> This morning when I checked, there were plenty of volumes at <40%
> utilization. So why didn't reclaim kick in? Or am I totally off on this
> assumption? I manually performed move data on them and it freed things
> up.
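The pool output above also shows how little scratch headroom was left. A
minimal sketch, using the 143 allowed / 137 used figures and the mount limit
of 40 from the listings; the helper name is made up for illustration, and
comparing headroom against the mount limit is only a rough rule of thumb
(it assumes each concurrent writer might need a fresh volume).

```python
def scratch_headroom(max_scratch: int, scratch_used: int) -> int:
    """Scratch volumes the pool may still create before hitting MAXSCRATCH."""
    return max(max_scratch - scratch_used, 0)


# Values taken from the q stg output: 143 allowed, 137 in use.
remaining = scratch_headroom(143, 137)
mount_limit = 40  # Mount Limit from the TSMFS device class

print(f"scratch volumes left: {remaining}")
if remaining < mount_limit:
    print(f"fewer free scratch volumes than the mount limit of {mount_limit}")
```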
> --
> *Zoltan Forray*
> TSM Software & Hardware Administrator
> BigBro / Hobbit / Xymon Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> zforray AT vcu DOT edu<mailto:zforray AT vcu DOT edu> - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://infosecurity.vcu.edu/phishing.html
>
>
--
*Zoltan Forray*
TSM Software & Hardware Administrator
BigBro / Hobbit / Xymon Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
zforray AT vcu DOT edu - 804-828-4807