Subject: Re: [Bacula-users] Idea/suggestion for dedicated disk-based sd
From: Phil Stracchino <alaric AT metrocast DOT net>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 06 Apr 2010 12:28:40 -0400
On 04/06/10 12:06, Josh Fisher wrote:
> On 4/6/2010 8:42 AM, Phil Stracchino wrote:
>> On 04/06/10 02:37, Craig Ringer wrote:
>> Well, just off the top of my head, the first thing that comes to mind is
>> that the only ways such a scheme is not going to result in massive disk
>> fragmentation are:
>>
>>   (a) it's built on top of a custom filesystem with custom device drivers
>> to allow pre-positioning of volumes spaced across the disk surface, in
>> which case it's going to be horribly slow because it's going to spend
>> almost all its time seeking track-to-track; or
> 
> I disagree. A filesystem making use of extents and multi-block 
> allocation, such as ext4, is designed for large file efficiency by 
> keeping files mostly contiguous on disk. Also, filesystems with delayed 
> allocation, such as ext4/XFS/ZFS, are much better at concurrent i/o than 
> non-delayed allocation filesystems like ext2/3, reiser3, etc. The 
> thrashing you mentioned is substantially reduced on writes, and for 
> restores, the files (volumes) remain mostly contiguous. So with a modern 
> filesystem, concurrent jobs writing to separate volume files will be 
> pretty much as efficient as concurrent jobs writing to the same volume 
> file, and restores will be much faster with no job interleaving.


I think you're missing the point, though perhaps that's because I didn't
make it clear enough.

Let me try restating it this way:

When you are writing large volumes of data from multiple sources onto
the same set of disks, you have two choices.  Either you accept
fragmentation, or you use a space allocation algorithm that keeps the
distinct file targets self-contiguous, in which case you must accept
hammering the disks as you constantly seek back and forth between the
different areas you're writing your data streams to.
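To make the tradeoff concrete, here's a toy model (my own sketch, not
anything from Bacula's sd) of three concurrent streams writing round-robin
to one disk under the two policies.  The block counts and stream count are
arbitrary illustrative numbers; "seek" here just means any head movement of
more than one block between consecutive writes.

```python
# Toy model: 3 concurrent backup streams writing round-robin to one
# disk, under two allocation policies.  Illustrative only.

STREAMS = 3
BLOCKS_PER_STREAM = 1000
REGION = BLOCKS_PER_STREAM  # contiguous reservation per stream (policy B)

def interleaved():
    """Policy A: hand the next free block to whichever stream writes.
    Files end up interleaved (fragmented), but the head never seeks."""
    placement = {s: [] for s in range(STREAMS)}
    lba = 0
    for _blk in range(BLOCKS_PER_STREAM):
        for s in range(STREAMS):
            placement[s].append(lba)
            lba += 1
    return placement

def contiguous():
    """Policy B: each stream writes into its own reserved region.
    Files stay self-contiguous, but the head seeks between regions."""
    placement = {s: [] for s in range(STREAMS)}
    for blk in range(BLOCKS_PER_STREAM):
        for s in range(STREAMS):
            placement[s].append(s * REGION + blk)
    return placement

def fragments(placement):
    """Total contiguous runs across all files (1 per file is ideal)."""
    return sum(1 + sum(1 for a, b in zip(lbas, lbas[1:]) if b != a + 1)
               for lbas in placement.values())

def seeks(placement):
    """Head moves of more than 1 block during the round-robin writes."""
    order = [lba
             for rnd in zip(*(placement[s] for s in range(STREAMS)))
             for lba in rnd]
    return sum(1 for a, b in zip(order, order[1:]) if abs(b - a) > 1)

for name, p in (("interleaved", interleaved()),
                ("contiguous", contiguous())):
    print(name, "fragments:", fragments(p), "seeks:", seeks(p))
```

With these numbers, policy A yields 3000 fragments and zero seeks, while
policy B yields 3 fragments but a seek on nearly every single write --
which is the "hammering" I mean.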

Yes, aggressive write caching can help somewhat with this.  But by the
time the data sizes are large enough for this to realistically matter on
modern hardware, they have long since passed the point where it's
reasonable to cache them in memory before writing.  Delayed allocation
can only do so much when you're talking about multiple half-terabyte
backup data streams.
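Back-of-the-envelope, with numbers I'm picking purely for illustration
(three 500 GB streams, 8 GiB of RAM given over to write caching):

```python
# Illustrative arithmetic, not a measurement: what fraction of the
# in-flight backup data can a generous write cache actually hold?

streams = 3
stream_bytes = 500 * 10**9       # 500 GB per backup stream (assumed)
cache_bytes = 8 * 2**30          # 8 GiB of RAM for write cache (assumed)

fraction = cache_bytes / (streams * stream_bytes)
print(f"cache covers {fraction:.2%} of the in-flight data")
```

The cache covers well under 1% of the data, so almost every block still
has to go to disk in roughly the order and placement the allocator chose.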



-- 
  Phil Stracchino, CDK#2     DoD#299792458     ICBM: 43.5607, -71.355
  alaric AT caerllewys DOT net   alaric AT metrocast DOT net   phil AT co.ordinate DOT org
         Renaissance Man, Unix ronin, Perl hacker, Free Stater
                 It's not the years, it's the mileage.
