How to measure capacity of disk backup pool

jimlane

Hi, All

I'm a capacity planner, not a TSM expert, so forgive me if my terminology is a bit off. I have a TSM infrastructure which I need to analyze to determine when it's about to fill up. One aspect is a disk storage pool where backup versions go initially before being migrated to tape. Management wants to ensure that backups remain in the disk pool for at least 3 days before going to tape. So for my purposes, TSM would be out of capacity as and when that is no longer possible.

My understanding (and correct me if I'm wrong) is that TSM will fill up the disk pool and then periodically free up space in it by migrating backup versions to tape in descending order by age. As my workload builds up, the oldest backup version that TSM is able to keep within a fixed-size disk pool will tend to get less and less old, eventually converging on my 3-day threshold. My question is: how can I measure this process? Is there a query I can run that will tell me the oldest backup version in the disk pool? I would like to gather this information periodically and put it into a SAS dataset so that I can do a trend analysis on it. That would let me tell management that TSM will be "out of disk" some months before it happens.
If I'm on the wrong track here is there a different or better way of doing this?

TIA

Jim Lane
 
Good question, Jim. I don't think I have the definitive answer, but the only thing I can think of, as it applies to my environment, would be to use mytsmreports. It's an older program, but it works for me. I would gather how much data is moved and backed up every night and average your numbers from that. I back up anywhere from 1 TB to 3 TB a night, depending on the night; I have a DBA who likes to do full weekly backups of 3 of his Oracle databases on the weekends.

So, if management here wanted me to implement your task, I would go (3 TB x 3 days) = 9 TB, plus 20% to 30% growth, call it a 10-12 TB disk addition.

Sitting back with my coffee and wondering what other way people would suggest.

:) Yeah! It's Friday!!
 
I'm afraid you ARE on the wrong track. TSM does not migrate the oldest files first from a disk pool; it migrates the files belonging to the node that has the most data in the pool when migration starts. In order to size the "3 day" requirement properly, you'll have to have at least the amount of your daily backups x 3, plus some spare capacity to play with. In addition to that, you will have to set an age limit on the migration (the MIGDELAY setting). If you're not running it like this yet, be prepared for some performance impact. And usually one can find better ways to accomplish fast disk-to-disk restores, but that would lead to a totally new discussion about what makes sense and what doesn't.
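If you end up doing it that way, a minimal sketch from the admin command line; DISKPOOL is a placeholder for your own pool name:

/* hold files in the pool at least 3 days before they are eligible
   to migrate; MIGCONTINUE=YES lets the server break the delay
   rather than fill the pool when it runs out of space */
update stgpool DISKPOOL migdelay=3 migcontinue=yes

/* verify: look for "Migration Delay" and "Migration Continue" */
query stgpool DISKPOOL format=detailed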

PJ
 
PJ, maybe because it is Friday, but after reading your post I do not understand why I am on the wrong track. We both recommend multiplying the daily backups by three, plus some growth. And looking at my ISC, disk storage pools have an option under Migration, "Minimum amount of time to retain a file in the storage pool before migrating it." Am I not correct that if this were set to 3 days, it would only migrate data off the disk pool once it is at least 3 days old?

Setting aside performance and what makes sense and what doesn't.

Clay
 
PJ: thanks for this. As I said, I'm not a TSM guru. Would it be valid to solve this by adding up the total amount backed up in a day and comparing that total to the size of my disk pool? For my purposes, would I be in trouble once (total backed up per day) x 3 >= (size of disk pool)?
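Just to make sure I have the arithmetic straight, with made-up numbers: if I take in 20 TB of backups a day, the 3-day residency alone needs 20 TB x 3 = 60 TB of disk pool, so any pool smaller than that would already put me in trouble.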

Regards,
Jim Lane
 
CWILLOUG: I think PJ was saying that I'm on the wrong track not that you are. You and he seem to be saying the same thing. Perhaps I should be comparing the total amount backed up per day to the size of my disk pool. Would that be a valid measure of capacity?

BTW what is mytsmreports? And where is it?

-Jim
 
mytsmreports is a third-party tool (http://mytsmreport.sourceforge.net) installed on a Unix box to gather data.

I don't think just comparing the daily total to your disk pool is a good idea. For example, my largest disk pool is only 410 GB, so it is being migrated while I'm doing backups, which slows my performance.

You need to find out what your LARGEST daily backup is, and figure from there.
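If you want to pull that straight out of the server, something like this should get you close; note the SUMMARY table only reaches back as far as your SET SUMMARYRETENTION setting, so check that first:

/* daily backup totals for the last 30 days, biggest days first */
select date(start_time) as backup_date, sum(bytes) as bytes_backed_up from summary where activity='BACKUP' and start_time>current_timestamp - 30 days group by date(start_time) order by 2 desc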

Clay
 
Clay: I'm not sure I take your point here. I'm concerned with capacity rather than performance. As it stands, my disk pool is about 80 TB, which is much larger than it needs to be for 3 days' worth of backups. However, as time goes on I will be backing up more and more data every day, to the point where the disk pool won't hold 3 days' worth anymore. I want to be able to see that point coming a few months in advance. How does my largest daily backup figure into that process?
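For the trending piece, what I had in mind is a daily cron job along these lines. The admin ID, password, and output path are all made up; the point is just to append one row per day to a CSV that SAS can read:

# date plus total bytes backed up in the last 24 hours, comma-delimited
dsmadmc -id=reporter -password=secret -dataonly=yes -commadelimited \
  "select current_date, sum(bytes) from summary where activity='BACKUP' and start_time>current_timestamp - 24 hours" \
  >> /reports/tsm_daily_backup_bytes.csv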

-Jim

PS: thanks for the link to your program.
 
Jim -

Yes, I get it: capacity rather than performance. I didn't know you already had an 80 TB pool. So I think mytsmreports will help you forecast when to add additional disk space. I was going on the incorrect assumption that you currently needed to add disk. In the future, watching your largest daily backup will cue you in on when you need upgrades.

I think that may make sense.

Clay
 
Here's the best way to go. Cut your diskpool size down to maybe 50% more than your daily backup volume, then turn the rest of the space into filesystems, define a FILE device class using those filesystems, and define a storage pool on that device class with a 3-day migration delay. This will let you completely clear your diskpool every day (improving performance) and make it easy for you to see where your data is and how big it is. You may very well be able to offer to increase the migration delay, or make the case to decrease it if that's needed.
On another track: this smells like a requirement for quick access to restores. Over time, in most cases, almost all of the active data will be living on tape anyway. This isn't ARCserve, NetBackup, or some other lame package. Perhaps what they really need is a big active-data pool: receive backups, backup stgpool, copy activedata, migrate. If you have relatively few simultaneous clients, you can even have the clients populate the active-data pool themselves during the night.
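Roughly, with every name and size below a placeholder you'd tune for your own environment, the definitions would look something like this:

/* FILE device class backed by the filesystems carved out of the old diskpool */
define devclass bigfile devtype=file directory=/tsm/fs1,/tsm/fs2 maxcapacity=50g mountlimit=64

/* sequential pool that holds everything 3 days, then migrates to tape */
define stgpool filepool bigfile maxscratch=999 migdelay=3 nextstgpool=tapepool

/* the shrunken diskpool drains into the FILE pool once it passes 50% */
update stgpool diskpool nextstgpool=filepool highmig=50 lowmig=0

/* optional active-data pool for fast restores of current versions;
   the policy domain's ACTIVEDESTINATION has to include it first */
define stgpool adpool bigfile pooltype=activedata maxscratch=200
copy activedata filepool adpool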
 
Sorry for the confusion, Clay. I did in fact respond to Jim's original post. The requirement (last 3 days on disk) would probably strike anybody working with TSM as a bit odd. Usually restore performance is a problem when you have to restore large amounts of data that is somehow collocated on the client (a directory tree or a filesystem) but not necessarily backed up within a given timeframe like 3 days. Without knowing a lot more about what motivated the requirement and what the data structure looks like (DB backups, filesystems, etc.), there is no way any of us could come up with a strategy that makes best use of the existing disk capacity.

PJ
 
I'm not in a position to change anything about the way TSM is configured. I'm a capacity planner (in my present incarnation) so I have to take TSM as I find it and try to measure how "effectively" it's doing its job and warn management in advance when more hardware will be required to keep up. The 3-day requirement AFAIK was plucked out of the air by consultants called in to clean up a previous mess, probably based on how much money they thought that management would hold still for.

I'm not sure how many clients would qualify as "relatively few". I believe I have something like 20-30K spread over 16 TSM instances.

-Jim
 
OK, so in that case you'd best plan for "daily capacity x 4" plus whatever reserve you're considering for growth. You'll need the extra day's capacity to leave room for new backups coming in before the oldest ones swap out; TSM needs some time for that and can't effectively migrate the oldest versions out at the same moment new data comes in.
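As a made-up illustration of that arithmetic:

peak daily backup volume:          15 TB
3 days' residency + 1 day buffer:  15 TB x 4 = 60 TB
25% growth reserve:                60 TB x 1.25 = 75 TB pool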

PJ
 
PJ: thanks for this. For planning purposes, based on what you suggest, would the "total amount backed up in a day" (which I know how to find) equal "daily capacity" in your reply?

TIA

-Jim
 