Subject: Re: Trouble when TSM-database reaches 100Gb?
From: Paul Zarnowski <psz1 AT CORNELL DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 2 Sep 2005 15:46:38 -0400
We have our database at 370GB currently, only 61% full.  It has been
larger, and more troublesome, in the past.  We are on a roadmap to get
it smaller by continuing to split our servers.  I will tell you that
it is easier to start a 2nd server before you need it than it is to
move nodes to it later.  We have experienced the pains of an overly
large database.  In addition to the two issues that Roger mentioned
(db backup and expiration), other symptoms you may see are longer
query times (e.g., when listing files to restore from a large client)
and longer restore times (again, because the query at the beginning
of the restore takes longer with a larger database).  We would like
to unload/reload our database (it is over 10 years old now, and very
fragmented), but the time to do this for such a large database is
prohibitive, which is another reason you might want to limit your
database size.
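
For what it's worth, a script along these lines could watch that
utilization number so a too-large database doesn't sneak up on you.
This is only a sketch, not our production monitoring: it assumes the
dsmadmc options (-id, -password, -dataonly) and the PCT_UTILIZED
column of the DB table as I remember them on 5.x (check
"select * from db" on your own server), and the admin credentials
are obviously placeholders.

    import subprocess

    # Placeholder credentials -- substitute a real admin id/password or
    # whatever credential handling you already use.
    DSMADMC = ["dsmadmc", "-id=admin", "-password=secret", "-dataonly=yes"]

    def db_pct_utilized():
        """Ask the server how full the database is (PCT_UTILIZED column,
        assumed from the 5.x DB table)."""
        out = subprocess.run(DSMADMC + ["select pct_utilized from db"],
                             capture_output=True, text=True, check=True).stdout
        return float(out.split()[0])

    if __name__ == "__main__":
        pct = db_pct_utilized()
        print("TSM database is %.1f%% utilized" % pct)
        if pct > 80:
            print("Time to start thinking about splitting off another server.")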

I will offer a dissenting opinion from Roger's on using striping or
RAID for database volumes.  I will start by saying that I once shared
Roger's view on this.  Striping or RAIDing in and of itself does not
necessitate multiple spindle access for database reads.  If your
stripe size is set appropriately, you should only have to access 1
spindle to read a database page.  Writing database pages is another
story.  However, if you have a large write cache in front of your
dbvols, then write I/O should not be a problem.  TSM, IMHO, does not
do a good job of spreading I/O across dbvols.  For example, it does
not do round-robin allocation; rather, it allocates new pages all on
one volume, and only after that volume fills up does it start
allocating on the second volume.  By using striping or RAID, you
will spread the
I/O across spindles more effectively.  Even if eventually your
database grows and spreads itself across multiple spindles, pages
from a particular user are likely to be grouped onto one spindle
(with JBOD), causing queries for that user to incur head
contention.  I would be interested in hearing what others have to say
on this matter.  When we moved from 10k rpm SSA drives (non-RAID,
with cache on the SSA adapter) to 15k rpm FAStT RAID5 arrays, we saw
a world of improvement on our server.  I attribute this to a
combination of faster drives (15k rpm) and spreading the I/O across
spindles using RAID5.  The ease and simplicity of replacing FAStT
drives when they fail is another strong reason to use RAID5 instead
of JBOD.  The FAStT will automatically fall back to a hot spare
drive, and you don't need to do anything other than replace the bad
drive with a new one.  No fuss, no muss.
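
To make the stripe-size point concrete, here is a toy illustration in
Python (the 4KB page size is what I recall for the TSM database, and
the array geometry is invented): as long as the stripe unit is a
multiple of the page size and the volume is aligned, a single page
read lands on exactly one spindle, while groups of consecutive pages
still rotate across all of the disks.

    PAGE_SIZE = 4096          # bytes per TSM db page (assumed)
    STRIPE_UNIT = 64 * 1024   # bytes on one disk before moving to the next
    SPINDLES = 4              # data disks in the hypothetical array

    def spindles_touched(page_number):
        """Return the set of disks a read of this one page would touch."""
        start = page_number * PAGE_SIZE
        end = start + PAGE_SIZE - 1
        return {(start // STRIPE_UNIT) % SPINDLES,
                (end // STRIPE_UNIT) % SPINDLES}

    # Every page read touches exactly one spindle when STRIPE_UNIT is a
    # multiple of PAGE_SIZE; each group of 16 pages sits on one disk and
    # the groups rotate around the array.
    assert all(len(spindles_touched(p)) == 1 for p in range(10000))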

I will echo Roger's comments on how long it takes to catch up on
expiration.  It can take weeks or months to catch up if you don't
notice how far behind it has gotten.  We have since added some
metrics to our Servergraph monitoring tool to help us monitor
expiration performance and cache-hit ratios.
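
If you want to roll something yourself rather than use Servergraph,
a small script in the same vein can pull expiration throughput out
of the server.  Again just a sketch: it assumes the SUMMARY table
keeps an EXPIRATION row per run with EXAMINED/AFFECTED counts and
that dsmadmc has a -tabdelimited option, which is how I remember
5.x; verify the column names on your own server before trusting the
numbers.

    import subprocess

    DSMADMC = ["dsmadmc", "-id=admin", "-password=secret",
               "-dataonly=yes", "-tabdelimited"]

    # Assumed 5.x SUMMARY-table layout: one EXPIRATION row per run with
    # start/end times and the number of objects examined and deleted.
    QUERY = ("select start_time, end_time, examined, affected "
             "from summary where activity='EXPIRATION'")

    def expiration_runs():
        out = subprocess.run(DSMADMC + [QUERY],
                             capture_output=True, text=True, check=True).stdout
        for line in out.splitlines():
            if line.strip():
                start, end, examined, affected = line.split("\t")
                yield start, end, int(examined), int(affected)

    if __name__ == "__main__":
        for start, end, examined, affected in expiration_runs():
            print("%s -> %s: examined %s, expired %s"
                  % (start, end, examined, affected))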

We also run multiple TSM images per AIX server.  You don't need
multiple boxes to run multiple servers.

I will say that managing one server is conceptually simpler than
managing multiple servers, but only until you start running into
these problems.  IMHO, IBM needs to address this issue either by
making a single large server more feasible, or by providing
additional tools to manage a collection of servers.  Specifically, it
would be great if we could migrate a node from one server to another
without having to change the client option file to point it at the
new server.

..Paul



--
Paul Zarnowski                            Ph: 607-255-4757
Manager, Storage Systems                  Fx: 607-255-8521
719 Rhodes Hall, Ithaca, NY 14853-3801    Em: psz1 AT cornell DOT edu