Dirpool sizing and overflow

marcinek

ADSM.ORG Member
Joined: Sep 14, 2004 · Messages: 52
Location: Warsaw, Poland
Hi,
According to the IBM Blueprints, it looks like I'm running an "ultra tiny" TSM setup. I have only a 1 TiB directory-container pool, and I have to back up only two SAP HANA instances: one of 760 GiB and a second of around 360 GiB. Even with more than 900 GiB free, TSM refused to accept the backup of the first one, claiming it "ran out of storage space". I had to define another pool, this time FILE-based, as an overflow, and it used that pool! But this will be a subject of further investigation, since the data stored in the overflow FILE pool could easily have fit into the dirpool.
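
(For reference, I'm judging the free space from output along these lines; "dirpool1" is just my pool name and these are the standard queries:)

  /* overall pool utilization and the individual pool directories */
  query stgpool dirpool1 format=detailed
  query stgpooldirectory dirpool1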
My question regarding capacity planning is: imagine having X free space in the dirpool and a backup of size Y, slightly bigger than X, that deduplicates well. Does that mean I have to assure at least Y of storage space in my pools, just to make sure it fits in case of no deduplication? How much more space do I need?
Now, thinking of a 6 TiB HANA, that means assuring at least 6+ TiB of unused space! Kinda expensive...

And another question, perhaps wrong thread: how do I move data from overflow pool back to container pool?
 
And another question, perhaps wrong thread: how do I move data from overflow pool back to container pool?
Right now, there is no mechanism for it. The overflow pool is meant as a safety net, but you really don't want to use it unless you're forced to. You probably would have been better off adding that space as an additional directory in your container pool, rather than as an overflow.
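
Adding a directory to an existing container pool is a single command; the pool name and path below are just placeholders:

  /* placeholder pool name and path */
  define stgpooldirectory dirpool1 /tsmdata/dirpool1/dir02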

There are some creative and risky ways. Using combinations of PROTECT STGPOOL and REPLICATE NODE between the source and target and back, you could in theory move the data back into the source container pool. Neither I nor my customers have tested that yet, so I don't have more details than that.
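
Purely as an untested sketch of that idea (server, pool and node names below are made up):

  /* on the source server: protect the container pool and replicate the node to the target server */
  protect stgpool dirpool1
  replicate node hana_node1
  /* then replicate back from the target so the data lands in the source container pool;
     untested, and the intermediate repointing steps are omitted */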
 
First of all, HANA backups do not deduplicate well. Especially so if there is only ever going to be one backup in the pool.

When your HANA backup starts, it estimates how much data it is going to send before it sends it. As it doesn't exactly know how much it will send, it takes a conservative guess. Your server is responding with "Whoa, Dude! I don't have space for all of that". Then the backup fails with "out of server storage space".

You can easily get your overflow pool data back in your container pool by running a storagepool conversion on your overflow pool since it is a FILE pool (convert stg <overflowpoolname> <containerpoolname>). It will lock your overflow pool and make it unusable for backups, but I take it that this is what you want.
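
For example, with placeholder pool names (and keep in mind the conversion is one-way):

  convert stgpool overflowpool dirpool1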

Curious to know why you want a storage pool only slightly larger than one backup. Is this data going to be migrated elsewhere, or are you only keeping one backup? I guess you could be replicating to another server with a longer retention. Still, even if you were doing that, I'd think it would be unworkable with two days' worth of storage.
 
First of all, HANA backups do not deduplicate well. Especially so if there is only ever going to be one backup in the pool.

When your HANA backup starts, it estimates how much data it is going to send before it sends it. As it doesn't exactly know how much it will send, it takes a conservative guess. Your server is responding with "Whoa, Dude! I don't have space for all of that". Then the backup fails with "out of server storage space".

That is exactly what is happening :-\
Regarding deduplication of HANA data: indeed, it reports that deduplication saved no data after the session is closed, but when I generate dedupstats for my nodes I get quite nice results; the total saving percentage is reported as between 88 and 97%. It will probably change, since my HANAs are not in production yet, but after 3 weeks of backups (two fulls per week, logs via backint) I have about 48% used. The policy is to keep data for 30 days, so indeed the pool is too small; doubling it should suffice.
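
(I'm pulling those numbers with something like the following; the pool and node names are mine:)

  generate dedupstats dirpool1 hana_node1
  query dedupstats dirpool1 hana_node1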

You can easily get your overflow pool data back in your container pool by running a storagepool conversion on your overflow pool since it is a FILE pool (convert stg <overflowpoolname> <containerpoolname>). It will lock your overflow pool and make it unusable for backups, but I take it that this is what you want.

That is a tempting idea. I assume I should update my "dirpool1" with additional storage, remove the "next" pointing to my safety-net FILE pool, and run the convert. I'll try it today.
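
Roughly, the plan in commands; the pool names and directory path are placeholders, and I haven't run any of it yet:

  /* add the extra storage to the container pool */
  define stgpooldirectory dirpool1 /tsmdata/dirpool1/dir02
  /* drop the "next" chain to the safety-net FILE pool (however it was defined originally) */
  update stgpool dirpool1 nextstgpool=""
  /* fold the overflow FILE pool back into the container pool */
  convert stgpool overflowpool dirpool1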

Curious to know why you want a storage pool only slightly larger than one backup. Is this data going to be migrated elsewhere, or are you only keeping one backup? I guess you could be replicating to another server with a longer retention. Still, even if you were doing that, I'd think it would be unworkable with two days' worth of storage.

A mixture of politics (not so much disk space) and no experience with dirpools. Politics aside, I'm a little old-fashioned in designing TSM, and with the traditional approach it's usually good enough to have disks large enough to hold a daily backup and then migrate it to tapes. In this case we were supposed to have tapes, but so far we have none (did I mention "politics"?), so I decided to use dirpools. I started with 1 TiB; now I see I'm going to need at least 2, maybe more. But what I found frustrating is what you called the "Whoa, Dude" effect. That means I'll end up with a 2 TiB pool of which only 1.5 is really usable (or whatever the final figures turn out to be).

Anyway, thank you and Marclant for your opinions and advice.
 
That is exactly what is happening :-\
Regarding deduplication of HANA data: indeed, it reports that deduplication saved no data after the session is closed, but when I generate dedupstats for my nodes I get quite nice results; the total saving percentage is reported as between 88 and 97%. It will probably change, since my HANAs are not in production yet, but after 3 weeks of backups (two fulls per week, logs via backint) I have about 48% used. The policy is to keep data for 30 days, so indeed the pool is too small; doubling it should suffice.

What version are you on? We had exactly the same thing on one of our servers. The directory pool was getting good dedup; then we started putting HANA data into it, and the dedup stats for the pool just got worse and worse until it was getting 0%. This was very odd, as there was non-HANA data that had been getting good results previously. Also, the HANA logs dedup well, so the result seemed nonsensical. We logged a call with IBM, and after working through the first level of support it was passed to the development team, who recommended an upgrade from 7.1.6 to 7.1.7. This fixed the problem and our pool usage dropped by about 80-90%! We can now hold over 30 days of backups instead of the 2 days we could just fit before.
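
If it helps, a quick way to confirm the server level before and after the upgrade (the admin ID and password are placeholders):

  dsmadmc -id=admin -password=secret "query status"

The session banner shows the Server Version, Release and Level.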
 