Re: [Bacula-users] Idea/suggestion for dedicated disk-based sd

2010-04-07 01:19:08
From: Craig Ringer <craig AT postnewspapers.com DOT au>
To: Phil Stracchino <alaric AT metrocast DOT net>
Date: Wed, 07 Apr 2010 13:13:55 +0800
Phil Stracchino wrote:
> On 04/06/10 02:37, Craig Ringer wrote:
>> Is this insane? Or a viable approach to tackling some of the
>> complexities of faking tape backup on disk as Bacula currently tries to do?
> 
> Well, just off the top of my head, the first thing that comes to mind is
> that the only ways such a scheme is not going to result in massive disk
> fragmentation are:
> 
>  (a) it's built on top of a custom filesystem with custom device drivers
> to allow pre-positioning of volumes spaced across the disk surface, in
> which case it's going to be horribly slow because it's going to spend
> almost all its time seeking track-to-track; or

You appear to be assuming that "disk backup" == "single disk backup" or
"set of simple disks".

Most practical disk backup setups will involve large RAID-5, RAID-6 or
RAID-10 arrays. These tend to be striped across the spindles anyway, and
the file system is rarely properly aware of how this striping occurs.
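To illustrate the striping point, here is a minimal sketch of how a
logical byte offset maps to a spindle under plain striping. It assumes
a simple RAID-0 style layout and ignores parity rotation (RAID-5/6) and
mirroring (RAID-10); the function names, the 64 KiB chunk size and the
4-disk array are illustrative assumptions, not taken from any real
array.

```python
def stripe_location(offset, chunk_size, n_disks):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes plain striping: chunks are dealt out to the disks round-robin.
    """
    chunk = offset // chunk_size           # which chunk of the logical volume
    disk = chunk % n_disks                 # round-robin across the spindles
    disk_offset = (chunk // n_disks) * chunk_size + offset % chunk_size
    return disk, disk_offset


def disks_touched(start, length, chunk_size, n_disks):
    """Return the sorted disk indices that a contiguous logical range lands on."""
    first = start // chunk_size
    last = (start + length - 1) // chunk_size
    return sorted({c % n_disks for c in range(first, last + 1)})


if __name__ == "__main__":
    # A logically contiguous 1 MiB write, 64 KiB chunks, 4 disks (assumed values).
    print(disks_touched(0, 1 << 20, 64 * 1024, 4))
```

Under these assumptions even a perfectly contiguous file is spread over
every spindle, which is why contiguity at the file system level says
little about physical layout on such an array.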

For what it's worth, a quick check on my volumes does reveal significant
(25% or so) fragmentation. I'm going to try extending the sd with
posix_fallocate(...) support to see whether that reduces it.
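As a rough sketch of the preallocation idea (not the sd's actual code):
the file's full size is reserved before any data is written, so the
file system has the chance to pick one contiguous extent. In the sd
this would be the C call posix_fallocate(fd, offset, len); the example
below uses Python's os.posix_fallocate wrapper, which is Unix-only. The
helper name preallocate_volume and the file mode are assumptions made
for illustration.

```python
import os


def preallocate_volume(path, size_bytes):
    """Create (or open) a volume file and reserve size_bytes of space for it.

    Returns the resulting file size in bytes.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o640)
    try:
        # Reserve the whole range up front. If the file is shorter than
        # size_bytes, this also extends it to that length.
        os.posix_fallocate(fd, 0, size_bytes)
    finally:
        os.close(fd)
    return os.stat(path).st_size
```

Whether this actually reduces fragmentation depends on the file system
and on how much contiguous free space it has left; the call guarantees
only that the space is allocated, not that it is contiguous.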

> Now, all you're going to gain from this is non-interleaved disk volumes,
> and that's basically going to help you only during restores.

Interleaved disk volumes complicate management of retention periods and
lifetimes. They also make it harder to see what's using what space,
where. That's why I want to avoid them, not for performance reasons.

> So you're
> sacrificing the common case to optimize for the rare case.  You mention
> spool files, but the obvious question there is, if you're backing up to
> disk anyway, why use spooling at all?

I'd prefer not to. I'm only spooling in one place, and only as an
alternative to volume interleaving. In truth I probably don't need to,
since what I really want to avoid is not interleaving as such but
combining multiple things on one volume.

> The purpose of disk spooling was
> to buffer between clients and tape devices.  When backing up to disk,
> there's really not a lot of point in spooling at all.  What you really
> want is de-interleaving.  Correct?

Rather than de-interleaving, what I really want is to write to more than
one backup volume, each tracked separately in the catalog.

> As has already been discussed, you can achieve this end by creating
> multiple storage devices on the same disk pool and assigning one storage
> device per client, but this will result in massive disk fragmentation -
> and, honestly, you'll be no better off.
> 
> If what you want is to de-interleave your backups, then look into the
> Migration function.  You can allow backups to run normally, then Migrate
> one job at a time to a new device, which will give you non-interleaved
> jobs on the output volume.  But you're still not guaranteed that the
> output volume will be unfragmented, because you don't have control over
> the disk space allocation scheme; and you're still sacrificing the
> common case to optimize for the rare case.

I'm wondering why I should worry too much about fragmentation, actually.
The array performs quite well even when significantly fragmented; I
haven't noticed any meaningful drop in write performance over time.

It may slow restores a little, but again with a many-spindle array I'm
not sure how much practical effect it'll have. Is fragmentation
avoidance worth all this complexity?

--
Craig Ringer
