Re: TSM Database Disk Layout Recommendations

Storage Pools are JFS.  I got a pretty good bump on Storage Pool backups and
migrations when I increased the page read ahead to 256K from the default.
The reason for 256K is that we are striping 4 ESS LUNs together, on 4
different arrays in the ESS and 4 different ESS loops, for our TSM Database.
The ESS will read into its internal cache roughly 64K from each LUN with
36GB drives.  So, when reading sequentially the ESS stages 256K and
transfers all of it to satisfy the read ahead.  The stripe size is 128K in
the file system, so essentially 2 buffers are read sequentially.  There is
some round-off error, but the point is fewer operations: use 5-gallon
buckets instead of 8-ounce cups to fill up the tank.
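
The sizing above can be sketched as a quick calculation.  This is only an
illustration, not anything taken from ESS or AIX documentation: the function
name read_ahead_plan is made up, and the 4 KB page size is an assumption
(it is the usual AIX base page size, which matters if the read-ahead
tunable is entered as a page count rather than in KB).

```python
# Rough read-ahead sizing for a striped set of LUNs.
# Assumption: 4 KB pages; adjust PAGE_SIZE_KB if your system differs.
PAGE_SIZE_KB = 4


def read_ahead_plan(luns, stage_kb_per_lun, stripe_kb, page_kb=PAGE_SIZE_KB):
    """Return the read-ahead size implied by the striping layout."""
    # Total data the array stages when every LUN contributes one chunk.
    total_kb = luns * stage_kb_per_lun
    return {
        "read_ahead_kb": total_kb,
        # Same size expressed in pages.
        "read_ahead_pages": total_kb // page_kb,
        # How many file system stripes one read-ahead covers.
        "stripes_per_read_ahead": total_kb // stripe_kb,
    }


# Figures from this message: 4 LUNs, roughly 64K staged per LUN, 128K stripe.
plan = read_ahead_plan(luns=4, stage_kb_per_lun=64, stripe_kb=128)
print(plan)
```

With the numbers from this message that works out to 256K, or 64 pages of
4 KB, covering 2 stripes per read ahead.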

Note that I have not fixed all the storage pools yet so when we do that I am
expecting some improved performance on the storage pool movement as well.

Paul D. Seay, Jr.
Technical Specialist
Naptheon Inc.
757-688-8180


-----Original Message-----
From: Andrew Carlson [mailto:andyc AT ANDYC.CARENET DOT ORG]
Sent: Tuesday, October 08, 2002 9:24 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: TSM Database Disk Layout Recommendations


Are your storage pools JFS or Raw?  Did you try both?  When I first created
this server (AIX 4.3.3 on an S7A), I used all raw, but did not see that the
memory was being utilized.  Then I changed the storage pools to JFS for the
readahead functionality (using straight SSA disk for those).  While I have
a lot of page steals, I think the JFS has helped the daily processes that
read the storage pools (backup stgpool and migration).


Andy Carlson                                    |\      _,,,---,,_
Senior Technical Specialist               ZZZzz /,`.-'`'    -.  ;-;;,_
BJC Health Care                                |,4-  ) )-,_. ,\ (  `'-'
St. Louis, Missouri                           '---''(_/--'  `-'\_)
Cat Pics: http://andyc.dyndns.org/animal.html


On Mon, 7 Oct 2002, Seay, Paul wrote:

> The reason is that everyone is saying RAID-5 is bad in general, rather
> than a particular implementation of it.  The Enterprise Storage Server
> (SHARK) is RAID-5 SSA under the covers.  It flies because it has
> controllers on the front end that essentially eliminate the RAID-5
> effect and can actually blow away RAID-1 solutions under high
> sequential write applications.  The HDS 9900 series is the same.
>
> It is all a balance.  RAID-5 works great for some things, bad for
> others. If your RAID-5 solution does any kind of parity calculation in
> the array it will perform well on sequential write.  Why? Because they
> generally change from RAID-5 to RAID-3, which is the fastest on write.
>
> Now, considering this, what does high sequential write?
>
>         Generally, Storage Pools and the LOG.
>
> What does high sequential read?
>
>         Storage Pools and the DB during backup.
>
> So, to generally say RAID-5 is bad is totally incorrect.  It depends
> on your hardware.  The number of simultaneous write operations you can
> perform, the speed of your disk, etc.
>
> In the case of TSM it even depends on how your environment is set up.
> Would I use software RAID-5?  Heck no.  The CPU overhead is astronomical
> and the read back penalty on something like Windows will just kill
> you because it is a dumb RAID-5 implementation.
>
> I hope everyone will look at what they are saying and give specific
> complete configuration information in the future.
>
> We use the ESS.  We do striping in the AIX file system, not RAID-5.
> Protection is performed in the ESS.  We had some serious performance
> problems in relation to other ESS applications because we did not
> implement our striping correctly and our AIX system needed some
> serious tuning.
>
> If you are running default AIX vmtune parameters, you are probably
> experiencing bad performance, not because of the RAID-5
> implementation, but because of the stress RAID-5 puts on the
> filesystem buffers in the non-comp space, which causes astronomical
> paging on your system.  You change to raw and magically the problem
> goes away.  Why?  Because the file system usage drops dramatically and
> the paging stops.
>
>
> By changing to the recommendations folks kindly suggested over the
> past weeks, my database backup time went from about 3 hours down to 1
> hour for an 85GB database.  My storage pools have dramatically
> improved as well and I have not corrected their striping yet.  How did
> I get the performance?
>
>         maxperm set to 40
>         minperm set to 10
>         max page read ahead set to 256K
>         bufferpool set to 256MB (memory on the machine is 2GB)
>         sufficient free pages to support the max read ahead (there are
>         rules about this number)
>
> Our machine is a P660-6H1,
>         (4) 450MHz processors,
>         2GB memory,
>         2 Fibre Channel cards for the disk, 4 for the tape (1 Gbit)
>         640GB of ESS disk
>         14 Magstar Drives in use so far, eventually 32.
>         2 Gbit Ethernet Cards.
>
> Yes, my environment may be unique, but at least I am telling you why
> what I have works well so that a generalization is not made that has
> no point of reference.
>
> Thanks Mark, you probably saved us about 55K by keeping us from buying
> a much larger TSM server.  We will probably change to (6) 750MHz
> processors and 4GB of memory, add another Gbit card, and 2 more FC
> Cards to a new IO frame for the 6H1.  Your methodical approach was
> exactly what we needed to understand the issues and what to do.  Our
> machine purrs like a kitten now.
>
>
>
> Paul D. Seay, Jr.
> Technical Specialist
> Naptheon Inc.
> 757-688-8180
>
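
As a rough check on the figures quoted above, here is a small sketch of the
arithmetic.  It is illustrative only.  The function names are made up, a GB
is taken as 1024 MB, and the free-list check encodes the commonly cited AIX
guideline that maxfree minus minfree should be at least the max page read
ahead (all in pages); treat that rule as an assumption and confirm it
against the tuning documentation for your AIX level.  The minfree/maxfree
values passed in are hypothetical, not the ones used on the server
described here.

```python
def throughput_mb_per_s(size_gb, hours):
    """Average throughput for moving size_gb in the given number of hours."""
    return size_gb * 1024 / (hours * 3600)


def free_list_ok(minfree, maxfree, maxpgahead):
    """Assumed guideline: the free-list window must hold one full read ahead."""
    return maxfree - minfree >= maxpgahead


# 85GB database backup: about 3 hours before tuning, about 1 hour after.
before = throughput_mb_per_s(85, 3)
after = throughput_mb_per_s(85, 1)
print(f"before: {before:.1f} MB/s  after: {after:.1f} MB/s  "
      f"speedup: {after / before:.1f}x")

# Hypothetical free-list settings with a 64-page (256K) read ahead.
print(free_list_ok(minfree=120, maxfree=184, maxpgahead=64))  # window of 64
print(free_list_ok(minfree=120, maxfree=128, maxpgahead=64))  # window of 8
```

That puts the database backup at roughly 8 MB/s before the changes and
roughly 24 MB/s after.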