Subject: Re: [Networker] Best way to evaluate drive speed?
From: David Gold-news <dave2 AT CAMBRIDGECOMPUTER DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Wed, 12 Jul 2006 12:37:17 -0400

George,

Vendors typically report the maximum physical write speed, which is determined by a set of standard tests. The physical testing of the tape drives doesn't have as much to do with real-world use as it does with what the drive is designed to do. So while the SDLT 600 can physically write 35.4MB/second natively (around 70MB/second with 2:1 compression), that has very little to do with the actual backup speeds you'll see. Or rather, except in rare circumstances, you won't see that performance from a single client, so some tuning has to be done.

So, to answer your question, you bet the vendors are testing with data sets and hardware designed to show off what the drives can do. Just like the road tests in Car and Driver show the best times for a Porsche--they use a professional driver, the best tire pressure, and a smooth track. (I doubt that I could get the car from 0 to 60 in 3 seconds :)

BTW, see a nice review of the SDLT 600 versus Ultrium at www.open-mag.com/features/Vol_90/sdlt/sdlt.htm

In terms of testing the drive, there are a couple of ways to do it. The best way is to use a reliable data set; in most cases this loosely translates to "take the disk drive out of the equation" (a 10K RPM disk does somewhere in the range of 40MB/second sequentially, and 800K/second randomly). One way is to follow the uasm instructions in the NetWorker performance tuning guide; also, each vendor usually has a tape drive utility that includes some testing. There are a number of other utilities to test drive throughput, from bst5 for Windows to Tapewise. On a *nix platform, I'd probably use dd, since it can easily remove any filesystem (as well as CPU) restrictions on writing to tape, giving you a better pure speed test, even if the data is all zeros.
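For a quick dd run on Linux, a minimal sketch would look like this (the non-rewinding device name /dev/nst0 is an assumption; substitute your own):

   # Write 4 GB of zeros straight to tape with a large block size,
   # bypassing the filesystem entirely:
   time dd if=/dev/zero of=/dev/nst0 bs=256k count=16384
   # 16384 blocks x 256 KB = 4096 MB; divide 4096 by the elapsed
   # seconds to get MB/second. Keep in mind that zeros compress
   # almost perfectly, so this exercises the compressed data path.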

In terms of the last question--which I'll rephrase as "how many concurrent streams do I want to use for my tape drive"--the answer has more to do with the clients pushing data than anything else. The method you noted is a good one--do a single backup of each client, see what it gets, then do some math to figure out how many to run at once. Try them all, and see what the results are. Thierry's reporter would help with speeds and throughput, but I would normally just look at the savegrp report, possibly using nss to parse it if I'm doing a lot of them (rather than doing it by hand). The one gotcha is that linear drives become their own bottleneck at the low end if you underfeed them. So if you have a 35MB/second native speed drive, and it bottlenecks at 35-40% of capacity (say 15MB/second), then a client stream sending 11MB/second might actually get more like 3-5MB/second, due to a tape drive buffer underrun condition. That makes a methodical test much more difficult for the SDLT than it would be for a helical scan drive, unfortunately.
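A hypothetical worked example of that math (every number here is assumed, not measured):

   # minimum streaming rate:  35 MB/s native x 40%  ~= 14 MB/s
   # measured single-client throughput              ~=  8 MB/s
   # streams per drive: 14 / 8 = 1.75, round up to 2
   # so try 2, 3, and 4 concurrent sessions and compare the savegrp
   # reports; a single 8 MB/s stream risks the underrun behavior above.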

Hope this helps, or at least opens up more questions!

Dave


Date:    Tue, 11 Jul 2006 20:16:24 -0400
From:    George Sinclair <George.Sinclair AT NOAA DOT GOV>
Subject: Best way to evaluate drive speed?

I think I'm asking too many questions here, but it's related so ...

How do we best determine if we're getting reasonable write speed on our
tape drives?

We have 4 SDLT 600 drives in an LVD Quantum library attached to a Linux
storage node. We're using a SCSI interface with dual channel host adapter
cards (LSI 22320 Ultra320), which supplies 160 MB/channel. Drives 1-2 are
daisy chained to channel A. Drives 3-4 are daisy chained to channel B.
Perhaps each drive should be on its own channel, but given the speed of
the channel it seemed OK to have two drives share one. We were not going
to put more than 2 drives per channel, however. But the burst transfer
speed is listed as 160 MB/second max, so with two drives that would be
320, so maybe they should each be on separate channels. Anyway, the
product information indicates that these drives have a speed of 35 MB/sec
native, 70 MB/sec with compression. I assume this is a best case
scenario, and I doubt we'll be able to match those numbers in practice;
it might also make a difference whether the data is local to the snode or
coming over the network. But how do we determine if we're getting good
results?

This brings up 3 questions:
1. What are the vendors doing to claim these numbers? How are they
writing the data during the tests so as to optimize the speed and claim
the best possible results?

2. What is a good way to determine if the drives in your library are
functioning at their proper speeds?

3. Does sending more save sets to a tape device really increase the
speed, and if so, why? I think I've seen this behavior before, wherein
the speed increases (to a point) when more sessions are running, but
maybe that was coincidental. I mean, why wouldn't one large save set
(one large enough that a full takes a while, so there's no shoe-shining
effect and the drive can just stream along) do the same as multiple
sessions?

I was thinking of first running some non-Legato tests by tarring a 2 GB
directory on the storage node directly to one of the devices, timing the
operation, and then doing the math, followed by a tar extract to ensure
it worked correctly. I don't think I can run multiple concurrent tar
sessions to the same device, though, like multiplexing in NetWorker.
Will this still yield a good idea of the drive's speed? I'm not sure
this would fill the buffer fast enough or generate as good a burst speed
as running multiple save sets in NetWorker would, but again, this gets
back to question 2 above.

If sending more sessions to the device does increase the speed, then
when using NetWorker to send multiple sessions to the device, so as to
better fill the buffer and increase drive speed, how can I best capture
the results? Is it enough to simply create a group, place the snode in
the group, specify 'All' for the client's save sets, launch a full, and
then look at the start and completion times for the group and how much
total was backed up, and do the math to get an average drive speed?
There are 5 file systems on the snode; 4 are very small (under 300 MB),
and /usr is about 6 GB, so we all know which one will still be cranking
on a full long after the others are done. Will this be a fair test?
Maybe it would be better to create, say, 4 separate named paths of 1 GB
each and list those as the save sets? Again, this gets back to question
2 above.
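The math I have in mind would be something like this (numbers made up
for illustration):

   # total backed up by the group:  6.9 GB  ~= 7066 MB
   # group start 21:00:00, finish 21:08:00  =  480 seconds
   # average speed: 7066 / 480 ~= 14.7 MB/second across all streams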

Thanks.

George


===================================
David Gold
Sr. Technical Consultant
Cambridge Computer Services, Inc.
Artists in Data Storage
Tel: 781-250-3000
Tel (Direct): 781-250-3260
Fax: 781-250-3360
dave AT cambridgecomputer DOT com
www.cambridgecomputer.com

===================================