Subject: Re: [ADSM-L] Best use of disk storage
From: Kelly Lipp <lipp AT STORSERVER DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sat, 15 Nov 2008 13:24:15 -0700
My comments to the original poster's questions are below...

At the 2005 Oxford TSM Symposium, Dave Cannon and I gave two presentations on
this topic:

http://tsm-symposium.oucs.ox.ac.uk/2005/papers/Understanding%20Disk%20Storage%20in%20TSM%20(Dave%20Cannon).pdf

is Dave's talk, and

http://tsm-symposium.oucs.ox.ac.uk/2005/papers/TSM%20and%20D2D2T%20-%20Making%20a%20Bigger%20D%20(Kelly%20Lipp).pdf

is mine.  While perhaps a bit dated, the information is still reasonably good.
Dave and I had a chance to compare notes again last spring at Share in Orlando
and our thoughts were still similar on this topic.  We had learned a good bit
about performance and configuration since then, and my original response to
this post draws on that knowledge.

Kelly Lipp
CTO
STORServer, Inc.
485-B Elkton Drive
Colorado Springs, CO 80907
719-266-8777
www.storserver.com

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Dwight Cook
Sent: Saturday, November 15, 2008 11:33 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Best use of disk storage

Internally, TSM takes database locks against keys such as ~volume~.  To help
avoid lock contention, I like to have as many volumes in a storage pool as the
maximum number of concurrent inbound client sessions I expect to write to that
pool.  So if you have a BACKUPPOOL that accepts the inbound nightly backups,
and you see 500 GB nightly from 10 nodes that push 5 concurrent sessions each,
I would allocate/assign 50 volumes at [500/(10*5)] = 10 GB each (or 25 at
20 GB each).
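
Pre-defining those volumes is mechanical.  A rough sketch, assuming a DISK
pool named BACKUPPOOL and volume files under /tsm/disk (both placeholders);
FORMATSIZE is in megabytes, so 10 GB is 10240:

   /* one pre-formatted volume per expected concurrent session */
   define volume backuppool /tsm/disk/bkvol01.dsm formatsize=10240
   define volume backuppool /tsm/disk/bkvol02.dsm formatsize=10240
   /* ...and so on through bkvol50.dsm */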

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Michael Green
Sent: Saturday, November 15, 2008 11:40 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Best use of disk storage

On Sat, Nov 15, 2008 at 4:48 AM, Kelly Lipp <lipp AT storserver DOT com> wrote:

> I would choose 2.  Using JBOD disks build a cachepool with enough space
> for incrementals.

I'm surprised you suggest using JBOD for diskpools (cachepools as
you call them), i.e. diskpool volumes spread across
(sda,sdb,sdc|hdisk1,hdisk2,hdisk3) devices, if I get it right...
Does it provide any measurable performance edge over the same
number of disks in a RAID5 configuration?  Doesn't it make the setup more
prone to disk failures?  I.e. one morning you may discover that one of
the disks has died and you've just lost a significant amount of data
from last night's incremental?

Kelly's Comments: Sure, it could, but generally it won't: you will have moved
the data to copy storage pools and migrated it downward.  You are always at
some risk of losing data in the process.  It only really hurts you if you lose
the original data too.  How safe is safe?  Yes, write performance to RAID5
volumes can be a problem unless you have huge write-back caches in your RAID
controllers (which most of us do), but for large numbers of simultaneous
writes, RAID5 will break down at some point.  And actually, RAID5 on SATA
drives can have a negative impact on their MTBF numbers as they are being
driven harder than in a JBOD configuration.  Already somewhat MTBF challenged,
RAID5 banging them to death makes them worse.  JBOD drives, since they aren't
hammered as much, will actually last longer.  That's an engineering opinion.

> With the rest of it built on RAID5, create an onlinefile pool with devclass
> FILE.  Choose a volume size of 25GB and create all the volumes manually
> rather than having TSM create them for you on the fly.  This way you avoid
> fragmentation.

Why do you choose 25G? Why not 50G? Is this because it makes
reclamation of FILE volumes easier? If so, then why not 10G?

Kelly's Comments: Why not make it 100GB?  The optimal size is tricky to
choose.  Dwight Cook has a nice way of deciding.  I don't think it matters
much.  We've tried all sorts of sizes.
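
For the record, the manual pre-allocation looks something like this.  A rough
sketch, assuming a device class named FILEDEV, a pool named ONLINEFILE and
volumes under /tsm/file (all placeholder names); FORMATSIZE is in megabytes,
so 25GB is 25600:

   define devclass filedev devtype=file maxcapacity=25G mountlimit=64 directory=/tsm/file
   define stgpool onlinefile filedev maxscratch=0
   define volume onlinefile /tsm/file/file0001.dsm formatsize=25600
   /* repeat define volume for as many 25GB volumes as the RAID5 space holds */

Setting maxscratch=0 keeps TSM from creating scratch volumes on the fly,
which is the point of pre-allocating.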


>
> One last thing: you really need to use devclass file on the bulk of the
> storage, as the disk device class does not provide space reclamation while
> the file device class does.

Well, I was thinking about emptying the diskpools completely while
having CACHE=y enabled.  That allows you to migrate all the data down
the hierarchy while still having it available for restores.

Kelly's Comments: Sure, but you still won't reclaim any of that space.  I
thought the goal was to leave it behind for use later.  If you use file then
you don't have to do the migration to tape: leave as much data behind as
possible, only migrating when the pool gets very full.  We typically set
highmig=98, lowmig=95 and migdelay=15 to keep as much data as possible in the
file-based disk pool.  We only migrate when absolutely necessary.
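
Those are ordinary storage pool parameters, so on a pool named ONLINEFILE
(again a placeholder) that is simply:

   update stgpool onlinefile highmig=98 lowmig=95 migdelay=15

With migdelay=15, a file is not even eligible to migrate until it has sat in
the pool for 15 days, which is what keeps recent backups on disk for fast
restores.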

There is one last problem with large disk class pools: whenever a backup
stgpool operation occurs, TSM has to look at every file in the pool to
determine if it needs to be backed up.  With the file device class, TSM simply
looks at the volumes rather than the files, since it knows (because they're
sequential) where it read from last.  This significantly speeds up the backup
stgpool operation, especially in your case where you will likely have millions
of files in the 20TB pool.  Actually, as I write this, this is the most
significant problem you will face with your approach.  My only question is:
does TSM have to look at moved and cached files in that pool, or can it tell
the difference?  I don't know.
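
The operation in question is the ordinary nightly copy, along the lines of
(pool names are placeholders):

   backup stgpool onlinefile copypool maxprocess=4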

Basically my original question boils down to this: why would I want
to spend time on additional migration stages when using FILE devc
(i.e. disk>file>tape instead of just disk>tape) and on reclamation of
the FILE stgpool, when I can just create huge disk pools and
completely empty them every morning and still have 20T worth of data
on them with cache=y?

Kelly's Comments: You're already hypothesizing migration: from disk to tape.
I'm suggesting that stays the same, but instead of migrating to tape you
migrate to file disk.  You can even avoid that step altogether if you write
data from clients directly to the file disk pool.  As I already mentioned, if
you have small numbers of simultaneous backups, this won't thrash the RAID5
set unduly and will probably work out just fine.  However, for large numbers
of simultaneous backups watch the overall performance.  It may degrade on file
volumes on a RAID5 set.
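
Either variant is a small policy change.  A sketch, with BACKUPPOOL,
ONLINEFILE and the STANDARD policy objects standing in for your own names:

   /* keep a disk>file hierarchy: point the disk pool at the file pool */
   update stgpool backuppool nextstgpool=onlinefile

   /* or skip the disk stage: send client backups straight to the file pool */
   update copygroup standard standard standard type=backup destination=onlinefile
   activate policyset standard standard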