Subject: Re: [Veritas-bu] Destaging going slow
From: "Martin, Jonathan" <JMARTI05 AT intersil DOT com>
To: "WALLEBROEK Bart" <Bart.WALLEBROEK AT swift DOT com>, <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Tue, 29 Jun 2010 02:42:47 -0400
I'm with Mark Phillips on this one.  We use the direct attached storage on our 
media servers because I can get 3x the storage for the price of a SAN shelf. I 
just deployed a new Dell R710 Master / Media with 2 x Dell MD1220s for less 
than $30,000. (Each MD1220 is driving an LTO3 drive at 100+ MB/sec as I write 
this.) I think the last SAN shelf we purchased for our Hitachi was $20,000. The 
good news about running a SAN is that you're likely to have more disk metrics 
available (from the SAN, not Windows) to troubleshoot your issues. We run 
Hitachi ourselves, but I'm not familiar with whatever modifications HP makes.

Generally speaking, I do not change the default NetBackup settings unless I'm 
having a performance issue. The Dell R710 I deployed last week (a 6.5.6 
Master/Media on Windows 2003 Std R2 x86) is stock / out of the box with zero 
buffer / touch files configured. It drives LTO3 just fine.

My two largest Media/Masters (not the one above) only have the following touch 
files, but they are the exception.
NUMBER_DATA_BUFFERS             64
NUMBER_DATA_BUFFERS_DISK        64
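
(For reference, the touch files on a Windows master/media server are just 
plain text files named for the setting, each containing the value on a single 
line. A minimal sketch, assuming the default install path - adjust for your 
install drive:

    >"C:\Program Files\Veritas\NetBackup\db\config\NUMBER_DATA_BUFFERS" echo 64
    >"C:\Program Files\Veritas\NetBackup\db\config\NUMBER_DATA_BUFFERS_DISK" echo 64

Putting the redirection first keeps cmd from treating the trailing digit of 
the value as a file handle. With the default 64K SIZE_DATA_BUFFERS, 64 buffers 
works out to roughly 4MB of shared memory per tape drive.)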

From a storage perspective, I've got all disks in a Dell MD1000 enclosure 
configured as a single 15-disk RAID-5. My RAID stripe size is generally 64K, 
although I've played with 128K. My read policy is set to Adaptive Read Ahead, 
my write policy is Write Back, and my disk cache policy is disabled.
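
(If you're on a Dell PERC like mine, you can check and set those policies from 
Windows with the OpenManage Server Administrator CLI instead of rebooting into 
the controller BIOS. A sketch, assuming controller 0 / virtual disk 0 - your 
IDs will differ, so list them first:

    omreport storage vdisk controller=0
    omconfig storage vdisk action=changepolicy controller=0 vdisk=0 readpolicy=ara writepolicy=wb diskcachepolicy=disabled

Here ara is Adaptive Read Ahead and wb is Write Back, matching the settings 
above.)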

From a Windows perspective, my disks are GPT formatted with 64K block sizes 
(matching the stripe size above). You may want to consider partition alignment 
based on your SAN manufacturer's specifications. 64K is the most common in my 
experience, but Microsoft also recommends a 1024K offset, which accounts for 
32K, 64K, 256K and 512K offset requirements.
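
(If you need to set that alignment by hand, diskpart can do it at partition 
creation time; it can't realign an existing partition. A sketch, assuming the 
staging volume is disk 2 and gets drive E: - substitute your own disk number 
and letter:

    diskpart
    DISKPART> select disk 2
    DISKPART> create partition primary align=1024
    DISKPART> assign letter=E
    DISKPART> exit
    format E: /FS:NTFS /A:64K /Q

align= is in KB, so align=1024 gives the 1024K offset mentioned above, and 
/A:64K sets a 64K allocation unit to match the stripe size.)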

The SAN is going to add a layer of complexity to this.  Whoever manages your 
SAN will create RAID groups, then assign you LUNs from those RAID groups. Much 
like a normal RAID array, performance is "capped" at the RAID group. The 
difference between a RAID array and a RAID group is that your SAN guy can 
carve up that RAID group and assign it to 4 different servers, essentially 
spreading your performance around. If you are using SATA disks, you definitely 
want a single server with a single LUN on a single RAID group, or performance 
will suffer. You might also have the SAN guy disable options like LUSE LUNs.

To troubleshoot, fire up perfmon and add the Physical Disk \ Avg. Disk Sec/Read 
counter for the specific disk you are testing. If you are seeing large spikes > 
.050 (50ms), then you are seeing your SATA disks struggle to find and return 
the data you are looking for. (For reference, my 100 MB/sec SATA arrays show 
<10ms spikes; my 35 MB/sec SATA arrays have spikes > 100ms.) You can also look 
at the individual disk queues if you have access to the SAN's metrics. If you 
are testing single-stream write and single-stream read to tape, then I am 
guessing that SAN congestion is your bottleneck.
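
(If you'd rather capture this from the command line than click through 
perfmon, typeperf can log the same counters. A sketch, assuming the staging 
disk shows up as PhysicalDisk instance "1 D:" - check the instance name in 
perfmon first:

    typeperf "\PhysicalDisk(1 D:)\Avg. Disk sec/Read" "\PhysicalDisk(1 D:)\Current Disk Queue Length" -si 5

-si 5 samples every 5 seconds; watch for sec/Read spikes over 0.050 while the 
destage is running.)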

Good luck!

-Jonathan

-----Original Message-----
From: WALLEBROEK Bart [mailto:Bart.WALLEBROEK AT swift DOT com] 
Sent: Monday, June 28, 2010 4:58 AM
To: Martin, Jonathan; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Destaging going slow

Martin,

To be very honest, I have no idea what the detailed specs are behind the disks 
(SAN).  This is an HP XP20000 disk system (Hitachi, to be exact) filled with 
SATA disks.  I know that HP constructed this disk system (which is used by 
many other systems as well, next to all our DSSU media servers) as one large 
volume, from which they give pieces to users who request some SAN disk space.  
This is not how it works on our other HP disk system (an XP12000 filled with 
Fibre Channel disks), but that system is only used for highly critical 
applications.  And as you know, not many companies consider the backup 
environment a critical application :-(

The 35-40 MB/sec is dramatic for us, as we back up to these same disks at 
speeds up to 125 MB/sec.  So the disks are filling up very rapidly.

Can you tell me what parameters you used to get your DSSUs up and running 
properly (disk block size, number of buffers and size of buffers on the NBU 
side, ...)?

Best Regards,
Bart WALLEBROEK
Backup Admin & Systems & Applications Management & Support Specialist 
Enterprise Applications Delivery - Infrastructure Management

-----Original Message-----
From: Martin, Jonathan [mailto:JMARTI05 AT intersil DOT com] 
Sent: Friday, June 25, 2010 5:49 PM
To: WALLEBROEK Bart; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Destaging going slow

Great.  I would say that is your "baseline", and that is just about what I get 
out of 15 x 1TB SATA disks in a RAID-5 at 8 streams. What kind of RAID 
controller are you using?  What block sizes are on that RAID volume? What 
kind of physical disks, and how many, are in your RAID set? Is this a SAN or 
DAS?  If it is a SAN, do you have other things on the same RAID group? Are you 
using the touch files, and if so, what are your values? Have you looked at 
your disk counters while the destaging is running? (Logical and Physical Disk 
counters for individual disks.) Are you seeing high levels of disk queuing? 
Does your throughput to tape match the reads/sec? 

35-40 MB/sec is not ideal, but I would consider it acceptable if you were 
using huge SATA disks. If you think you are not getting the performance you 
deserve from the hardware, then I would suggest digging.  It took me two weeks 
to find the optimal settings that I run now.  In the meantime, don't listen to 
Ed.  There are plenty of users on this forum who have DSSUs running just fine. 
I, for one, have something like 12 sites running DSSUs without issue.

-J

_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu