[Veritas-bu] WEBCT server VERY slow / strange throughput (4.5 Datacenter, Solaris)

2004-02-25 19:22:59
Subject: [Veritas-bu] WEBCT server VERY slow / strange throughput (4.5 Datacenter, Solaris)
From: larry.kingery AT veritas DOT com (Larry Kingery)
Date: Wed, 25 Feb 2004 19:22:59 -0500 (EST)
> Hey Everybody !  We have the following situation:
> 
> NB45 FP5 running on Solaris 8
> 
> 1x STKL11000 Master / Media (16 drives)
> 1x STKL700 Media (10 Drives)
> 
> Backups of our campus WEBCT server are ridiculously slow. Job streams
> were historically multiplexed, but even when not multiplexed, we are
> getting only 258kbps from the /sync mountpoint. WEBCT uses millions of
> little files, but what we can't explain is why this one mountpoint is so
> much slower vs. the rest ?

You just did: millions of little files.

Stuff to look into:

1) FlashBackup.  You have to pay for it, but it's designed for this
   problem and usually makes a world of difference with lots of little
   files.

2) Raw partition backup.  Free, but lacks many of FlashBackup's
   benefits (such as incrementals, single-file restore, and the
   ability to back up reliably with the filesystem mounted).

3) When NBU backs up a file, the access time changes, so NBU sets it
   back to the original value.  You can disable this if you want (on a
   per-client basis).  You'll save a little time, but probably not a
   lot.  DO NOT do this for HSM-managed filesystems (if you have
   any).

4) "Bad" exclude lists.  Exclude lists, especially with wildcards, can
   lead to every file being compared against every entry in the list.

5) Filesystem layout.  If you put, say, 15,000+ files in a single
   directory, you're going to see a performance hit on backup (as
   well as on other operations).  Break those 15,000 files into 150
   directories with 100 files each and things will be much better.

6) Multiple streams.  This is one of a few cases where it makes sense
   to run multiple streams on the same filesystem (or set of disks).
   Related to items 5 and 7.

7) Directory Name Lookup Cache.  This is a Solaris kernel cache whose
   size is tunable (the ncsize parameter), and tuning it may help in
   certain circumstances (I'm thinking long filenames or directory
   paths, but I haven't done any testing on this myself).  See the
   O'Reilly System Performance Tuning book.
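
For item 5, it helps to know which directories are actually overloaded
before reorganizing anything.  Here is a minimal sketch in Python (not
an NBU tool; the function name and the default threshold are just
assumptions for illustration, with 15,000 taken from the figure above)
that walks a tree and reports directories holding more than a given
number of files directly:

```python
import os

def crowded_dirs(root, threshold=15000):
    """Return (directory, file_count) pairs for every directory under
    `root` that directly holds more than `threshold` non-directory
    entries, largest first.  Hypothetical helper, for illustration."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # `filenames` lists only the entries directly in `dirpath`;
        # subdirectories are visited on their own iteration.
        if len(filenames) > threshold:
            hits.append((dirpath, len(filenames)))
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits
```

Calling crowded_dirs("/sync") on the client would list the candidates
for splitting up.  Be aware that the walk itself will be slow on a
tree like this, for the same reasons the backup is.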

>  Could this be a possible RAID problem on the client side ?

Maybe, but I think you'll find your best answer in the above.  I
remember something from the O'Reilly System Performance Tuning book
about how RAID-5 could cause an issue here (see item 3: when the
access time is reset, that write hits both the "real" disk AND the
parity disk, and it seemed to me the book implied even more than
that).  Again, I haven't investigated this one very far, but it's
worth a mention.
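
To make the access-time point concrete, here is a rough Python sketch
of the read-then-restore pattern.  This is NOT how NBU implements it
(I'm only illustrating the mechanism, and the function name is made
up).  The thing to notice is that putting the access time back is
itself a metadata write, which on POSIX systems also updates the
file's ctime, so a read-only backup ends up issuing a write per file:

```python
import os

def read_and_restore_atime(path):
    """Read a file, then restore its original access time.
    Hypothetical helper, for illustration only.
    Returns the stat results from before and after."""
    before = os.stat(path)
    with open(path, "rb") as f:
        f.read()  # reading may update the access time
    # Writing the old timestamps back is a metadata update on disk;
    # POSIX marks ctime for update when timestamps are set this way.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    after = os.stat(path)
    return before, after
```

With millions of small files, that is millions of extra small writes,
which is where the RAID-5 parity cost would come in.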

> 
> We have tried using both robots with the same outcome on this
> particular mountpoint.
> 
> The other 3 mount points average 1200kbps each for a total of about
> 5500kbps when all 4 streams are running. If we could get this one
> mountpoint up to that speed, that would be acceptable.
> 
> Client info:
> 
> Sun V880
> 4 Cpu
> RAID
> 
> 

-- 
Larry Kingery 
    Eagles may soar, but weasels don't get sucked into jet engines
