Re: [Bacula-users] Idea/suggestion for dedicated disk-based sd

2010-04-06 08:44:20
From: Phil Stracchino <alaric AT metrocast DOT net>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 06 Apr 2010 08:42:02 -0400
On 04/06/10 02:37, Craig Ringer wrote:
> Is this insane? Or a viable approach to tackling some of the
> complexities of faking tape backup on disk as Bacula currently tries to do?

Well, just off the top of my head, the first thing that comes to mind is
that the only ways such a scheme is not going to result in massive disk
fragmentation are:

 (a) it's built on top of a custom filesystem with custom device drivers
to allow pre-positioning of volumes spaced across the disk surface, in
which case it's going to be horribly slow because it's going to spend
almost all its time seeking track-to-track; or

 (b) it writes to raw devices and one volume is one spindle, in which
case you pretty much lose all the flexibility of using disk storage, and
you need large numbers of spindles for the large numbers of concurrent
volumes you want.  For all practical purposes, you would be replacing
"simulating tape on disk" with using disks as though they were tapes.
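
To put a rough shape on the seeking cost in (a), here is a toy model in
Python.  It is only a sketch under stated assumptions: block addresses are
plain integers, head travel is the sum of address differences, and the
names head_travel, prepositioned_writes and appended_writes are made up
for this illustration.  It says nothing about any real filesystem or
about Bacula itself.

```python
def head_travel(addresses):
    """Total distance the head moves when visiting addresses in order."""
    return sum(abs(b - a) for a, b in zip(addresses, addresses[1:]))


def prepositioned_writes(num_volumes, blocks_each, spacing):
    """Addresses written when volumes take turns writing one block each.

    Assumption: volume v starts at address v * spacing and grows upward.
    """
    return [v * spacing + i
            for i in range(blocks_each)
            for v in range(num_volumes)]


def appended_writes(num_volumes, blocks_each):
    """Addresses written when every block is simply appended in arrival order."""
    return list(range(num_volumes * blocks_each))


spaced = head_travel(prepositioned_writes(4, 3, 1000))
appended = head_travel(appended_writes(4, 3))
print("pre-positioned volumes:", spaced)
print("single appended stream:", appended)
```

In this model the same twelve blocks cost 14998 units of head travel when
the volumes are spaced out and 11 when they are appended, which is the
"spends almost all its time seeking" problem in miniature.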

You could possibly simplify some of the issues involved in (a) by making
it a FUSE userspace filesystem, but then you add the two drawbacks that
(1) it's probably going to be slow, because userspace filesystems
usually are, and (2) it'll only be workable on Linux.

Now, all you're going to gain from this is non-interleaved disk volumes,
and that's basically going to help you only during restores.  So you're
sacrificing the common case to optimize for the rare case.  You mention
spool files, but the obvious question there is, if you're backing up to
disk anyway, why use spooling at all?  The purpose of disk spooling is
to buffer between clients and tape devices, so that the tape drive can
be kept streaming.  When backing up to disk, spooling buys you very
little.  What you really want is de-interleaving.  Correct?
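
To make the restore-side effect of interleaving concrete, here is a small
Python sketch.  It is a toy model, not Bacula's volume format: a volume is
just a list of (job, block number) pairs, concurrent jobs are assumed to
take strict turns, and write_interleaved, write_sequential and
restore_span are names invented for this example.

```python
from itertools import zip_longest


def write_interleaved(jobs):
    """Round-robin one block per job onto a single volume."""
    streams = [[(name, i) for i in range(count)] for name, count in jobs.items()]
    volume = []
    for stripe in zip_longest(*streams):
        # Shorter jobs run out first; skip their padding.
        volume.extend(block for block in stripe if block is not None)
    return volume


def write_sequential(jobs):
    """Write each job's blocks back to back, one job after another."""
    volume = []
    for name, count in jobs.items():
        volume.extend((name, i) for i in range(count))
    return volume


def restore_span(volume, job):
    """Blocks between the job's first and last block, inclusive."""
    positions = [pos for pos, (name, _) in enumerate(volume) if name == job]
    return positions[-1] - positions[0] + 1


jobs = {"a": 4, "b": 4, "c": 4}
print("interleaved span for a:", restore_span(write_interleaved(jobs), "a"))
print("sequential span for a: ", restore_span(write_sequential(jobs), "a"))
```

With three four-block jobs, restoring job "a" has to pass over 10 blocks
of the interleaved volume but only 4 of the sequential one.  The write
side does the same amount of work either way, which is why the gain shows
up only at restore time.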

As has already been discussed, you can achieve this end by creating
multiple storage devices on the same disk pool and assigning one storage
device per client, but this will result in massive disk fragmentation -
and, honestly, you'll be no better off.

If what you want is to de-interleave your backups, then look into the
Migration function.  You can allow backups to run normally, then Migrate
one job at a time to a new device, which will give you non-interleaved
jobs on the output volume.  But you're still not guaranteed that the
output volume will be unfragmented, because you don't have control over
the disk space allocation scheme; and you're still sacrificing the
common case to optimize for the rare case.
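
As a rough picture of what migrating one job at a time does to the layout,
here is a toy Python model.  It is not how Bacula's Migration code works
internally; volumes are again just lists of (job, block number) pairs, and
migrate_one_job_at_a_time and is_contiguous are hypothetical helper names
for this sketch.

```python
def migrate_one_job_at_a_time(volume):
    """Copy each job's blocks, in order, to a new volume, job after job."""
    # Jobs in order of first appearance on the source volume.
    job_order = list(dict.fromkeys(name for name, _ in volume))
    output = []
    for job in job_order:
        output.extend(block for block in volume if block[0] == job)
    return output


def is_contiguous(volume, job):
    """True if the job's blocks sit next to each other on the volume."""
    positions = [pos for pos, (name, _) in enumerate(volume) if name == job]
    return positions == list(range(positions[0], positions[0] + len(positions)))


source = [("a", 0), ("b", 0), ("a", 1), ("b", 1), ("a", 2)]
target = migrate_one_job_at_a_time(source)
print(target)
print("a contiguous on source:", is_contiguous(source, "a"))
print("a contiguous on target:", is_contiguous(target, "a"))
```

Note that this only shows contiguity within the volume's logical contents.
Where the filesystem actually places those bytes on disk is outside the
model, which is exactly the fragmentation caveat above.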


-- 
  Phil Stracchino, CDK#2     DoD#299792458     ICBM: 43.5607, -71.355
  alaric AT caerllewys DOT net   alaric AT metrocast DOT net   phil AT co.ordinate DOT org
         Renaissance Man, Unix ronin, Perl hacker, Free Stater
                 It's not the years, it's the mileage.
