Amanda-Users

Re: using disk instead of tape

2006-09-05 10:27:58
Subject: Re: using disk instead of tape
From: Gene Heskett <gene.heskett AT verizon DOT net>
To: amanda-users AT amanda DOT org
Date: Tue, 05 Sep 2006 10:21:36 -0400
On Tuesday 05 September 2006 05:21, Phil Howard wrote:
>On Mon, Sep 04, 2006 at 11:01:20PM -0400, Ian Turner wrote:
>| On Saturday 02 September 2006 16:21, Phil Howard wrote:
>| > It would not need to be separate for each OS.  The idea of using a
>| > partition table isn't even the only approach.
>|
>| The tradeoff here is that if you don't use real partitions, then you
>| (again) need this tool for restore. At present the only thing you need
>| for restore is gzip and tar or dump. Even with raw partitions, that
>| would continue to be the case, but as soon as you introduce an
>| Amanda-specific blocking format, that would no longer be the case.
>| Performance advantages might make that worthwhile, but then again the
>| same effort applied elsewhere could probably yield equal improvements
>| without the sacrifice.
>
>If all that is written is tar format, nothing more needs to be added.
>The tar format can be handled as a stream, disregarding blocks (though
>I don't know if Amanda preserves that).  I do periodically write tar
>directly to disk partitions (and read it back).  I've also done this
>with DV format video, but that's another matter.
>
>| > FYI, I was benchmarking some disk writing for an unrelated purpose
>| > yesterday and found that in Linux 2.6 using the O_DIRECT option when
>| > opening a device to write on a disk raw (even a partition) results in
>| > much faster writing. Writing raw already beats writing through a
>| > filesystem. Raw with O_DIRECT is much faster than raw without.  If
>| > someone does decide to write a driver for raw disk support, I suggest
>| > having its implementation test for support for the O_DIRECT option,
>| > and use it where possible.  It does have some size, offset, and
>| > alignment requirements that vary by OS.
>|
>| This is an interesting idea, and certainly worth pursuing. I'd be
>| interested in seeing your data.
>
>I didn't keep any stats, or really do it scientifically.  Someone that
>wants to should probably control for a lot of the variables that
>influence it. But I do recall the speed improvement is about 25% to 30%.
>I suspect much of that is OS work bypassed with O_DIRECT.
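
Phil's suggestion to test for O_DIRECT support and use it where possible could look roughly like the sketch below. It is an illustration only, not Amanda code. The helper names open_for_write and write_blocks are hypothetical, the block size is assumed to be the page size (the real size, offset, and alignment requirements vary by OS and device), and an EINVAL from open() or write() is assumed to mean the flag is not supported.

```python
import errno
import mmap
import os

BLOCK = mmap.PAGESIZE  # assumed alignment unit; real rules vary by OS/device


def open_for_write(path):
    """Open path for writing, with O_DIRECT if the platform accepts it.

    Returns (fd, direct); direct is True when O_DIRECT is in effect.
    """
    flags = os.O_WRONLY | os.O_CREAT
    o_direct = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    if o_direct:
        try:
            return os.open(path, flags | o_direct, 0o600), True
        except OSError as exc:
            if exc.errno != errno.EINVAL:
                raise
    return os.open(path, flags, 0o600), False


def write_blocks(path, data):
    """Write data, zero-padded to a multiple of BLOCK, from an aligned buffer.

    Returns True if the write went through O_DIRECT, False otherwise.
    A production version would also loop on short writes.
    """
    padded = max(BLOCK, -(-len(data) // BLOCK) * BLOCK)
    buf = mmap.mmap(-1, padded)  # anonymous mmap is page-aligned, zero-filled
    try:
        buf[:len(data)] = data
        fd, direct = open_for_write(path)
        try:
            os.write(fd, buf)
            return direct
        except OSError as exc:
            if not (direct and exc.errno == errno.EINVAL):
                raise
        finally:
            os.close(fd)
        # Reached only when the O_DIRECT write was rejected: retry buffered.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            os.write(fd, buf)
        finally:
            os.close(fd)
        return False
    finally:
        buf.close()
```

The fallback path matters because some filesystems accept the flag at open() and only reject it on the first write, so probing at open time alone is not enough.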

And any perceived time savings are swamped, by a factor of 20 or so, 
when software compression is in use.  Normal backups here are written at 
20-50 megabytes/second, but 'compress client best' on a 500 MHz K6 will 
slow the compression phase to about 50 kilobytes/second or less.  Once the 
compression is done and the image is on the holding disk, the actual write 
runs at 20-50 megabytes/second.  The speed of the disk is never more than 
a very minor factor in the time it takes to do the backup here.
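
To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes a hypothetical 1 GB DLE, the low end of the 20-50 MB/s write rate, compression at about 50 KB/s, strictly sequential compress and write phases, and the 25% to 30% O_DIRECT gain quoted earlier in the thread. None of these are measurements; they only show the shape of the arithmetic.

```python
def backup_seconds(size_mb, compress_mb_s, write_mb_s):
    """Seconds for the compression phase followed by the write phase."""
    return size_mb / compress_mb_s + size_mb / write_mb_s


size_mb = 1000.0       # assumed DLE size: 1 GB
compress_mb_s = 0.05   # about 50 KB/s for 'compress client best' on the K6
write_mb_s = 20.0      # low end of the 20-50 MB/s write rate
speedup = 1.30         # upper end of the quoted 25-30% O_DIRECT gain

baseline = backup_seconds(size_mb, compress_mb_s, write_mb_s)
faster = backup_seconds(size_mb, compress_mb_s, write_mb_s * speedup)
saved_fraction = (baseline - faster) / baseline

print(f"baseline: {baseline:.0f} s, with faster writes: {faster:.0f} s")
print(f"share of total time saved: {saved_fraction:.4%}")
```

Under these assumptions the compression phase takes hours while the write phase takes under a minute, so a 30% faster write changes the total by well under a tenth of a percent.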

I personally fail to see the point of bypassing the filesystem as if it 
were a speed bottleneck; it's only a percent or three of the total time 
spent doing the backups here.  Estimates and compression are the two places 
to look when configuring for speed.  If the storage capacity is there, 
leave the compression out.  However, I selectively use it here on some 
DLEs, particularly those that will compress to less than 10% of the 
original size, and that does take time when /usr/src on either machine is 
several gigabytes.
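
One way to judge whether a DLE clears that 10% bar before turning compression on is to compress a sample of its data and look at the ratio. The sketch below is only an illustration: the function names and the sample are hypothetical, and zlib level 9 is used as a stand-in on the assumption that it approximates what the client's best-compression setting would achieve.

```python
import zlib


def compression_ratio(sample: bytes, level: int = 9) -> float:
    """Compressed size divided by original size for a sample of data."""
    if not sample:
        return 1.0  # nothing to gain on empty input
    return len(zlib.compress(sample, level)) / len(sample)


def worth_compressing(sample: bytes, threshold: float = 0.10) -> bool:
    """True when the sample shrinks to less than `threshold` of its size."""
    return compression_ratio(sample) < threshold


# Hypothetical sample: highly repetitive, source-like text.
sample = b"int main(void) { return 0; }\n" * 4000
print(worth_compressing(sample))
```

A real check would sample actual files from the DLE, since the ratio on a synthetic sample says little about the data being backed up.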

YMMV: I have maybe 50 gigs at any one time, whereas some may have a 
terabyte or more.  But that's my take on how relatively pointless (and 
crippling to the basic premise of amanda) the proposed changes would be at 
the end of the day.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2006 by Maurice Eugene Heskett, all rights reserved.