Re: [Bacula-users] Idea/suggestion for dedicated disk-based sd

2010-04-06 09:18:55
Subject: Re: [Bacula-users] Idea/suggestion for dedicated disk-based sd
From: Henrik Johansen <henrik AT scannet DOT dk>
To: <bacula-users AT lists.sourceforge DOT net>
Date: Tue, 6 Apr 2010 15:16:19 +0200
On 04/06/10 02:42 PM, Phil Stracchino wrote:
> On 04/06/10 02:37, Craig Ringer wrote:
>> Is this insane? Or a viable approach to tackling some of the
>> complexities of faking tape backup on disk as Bacula currently tries to do?
>
> Well, just off the top of my head, the first thing that comes to mind is
> that the only ways such a scheme is not going to result in massive disk
> fragmentation are:
>
>   (a) it's built on top of a custom filesystem with custom device drivers
> to allow pre-positioning of volumes spaced across the disk surface, in
> which case it's going to be horribly slow because it's going to spend
> almost all its time seeking track-to-track; or
>
>   (b) it writes to raw devices and one volume is one spindle, in which
> case you pretty much lose all the flexibility of using disk storage, and
> you need large numbers of spindles for the large numbers of concurrent
> volumes you want.  To all practical purposes, you would be replacing
> "simulating tape on disk" with using disks as though they were tapes.
>
> You could possibly simplify some of the issues involved in (a) by making
> it a FUSE userspace filesystem, but then you add the two drawbacks that
> (1) it's probably going to be slow, because userspace filesystems
> usually are, and (2) it'll only be workable on Linux.
>
> Now, all you're going to gain from this is non-interleaved disk volumes,
> and that's basically going to help you only during restores.  So you're
> sacrificing the common case to optimize for the rare case.

That depends on what you need, actually. Some people are fine with 
slower backups as long as they get fast restores.

There are a number of reasons why you might want to segregate backups
into a one-volume-per-client or a one-volume-per-job relationship:

1. Keeping the size of a volume down for manageability.

2. The ability to migrate certain client data WITHOUT relying on Bacula
to do it for you (think zfs send/receive, rsync, etc.).

3. A hard quota for limiting the disk consumption of a given client.

Some other aspects involve performance and/or deduplication, but those
are highly dependent on the underlying infrastructure.
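
As a rough illustration of points 2 and 3, assuming each client's
volumes live in their own ZFS dataset, the commands involved could be
put together along these lines. The dataset names, the snapshot name
and the target host are made up for the example, and the sketch only
builds the commands, it does not run them:

```python
import shlex
from typing import List


def quota_command(dataset: str, quota: str) -> List[str]:
    """Build the command that puts a hard quota on one client's dataset."""
    return ["zfs", "set", f"quota={quota}", dataset]


def migration_commands(dataset: str, snapshot: str,
                       target_host: str, target_dataset: str) -> List[str]:
    """Build the shell commands that snapshot a client's dataset and
    stream it to another host, without involving Bacula."""
    source = shlex.quote(f"{dataset}@{snapshot}")
    return [
        # zfs send works on snapshots, so take one first.
        f"zfs snapshot {source}",
        f"zfs send {source} | ssh {shlex.quote(target_host)} "
        f"zfs receive {shlex.quote(target_dataset)}",
    ]


if __name__ == "__main__":
    # Hypothetical layout: one dataset per client under backup/clients.
    print(" ".join(quota_command("backup/clients/web01", "500G")))
    for line in migration_commands("backup/clients/web01", "migrate",
                                   "newhost", "tank/clients/web01"):
        print(line)
```

None of this is possible per client when the jobs of many clients are
interleaved in the same volumes.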


> You mention
> spool files, but the obvious question there is, if you're backing up to
> disk anyway, why use spooling at all?  The purpose of disk spooling was
> to buffer between clients and tape devices.  When backing up to disk,
> there's really not a lot of point in spooling at all.  What you really
> want is de-interleaving.  Correct?

Spooling to a sufficiently large RAM disk is plausible and would serve
the same purpose as spooling does for tape devices.

>
> As has already been discussed, you can achieve this end by creating
> multiple storage devices on the same disk pool and assigning one storage
> device per client, but this will result in massive disk fragmentation -
> and, honestly, you'll be no better off.

That depends largely on the underlying filesystem, so it should not be
generalized like that.
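
For reference, the one-storage-device-per-client setup being discussed
could be generated with something like the sketch below. The client
names, the /backup base directory and the FileStorage-/File- naming
scheme are assumptions for the example, not anything Bacula
prescribes. Each device gets its own Media Type so that the Storage
daemon does not look for a volume in another client's directory:

```python
from typing import Iterable


def device_resource(client: str, base_dir: str = "/backup") -> str:
    """Render one bacula-sd.conf Device resource for a single client."""
    return "\n".join([
        "Device {",
        f"  Name = FileStorage-{client}",
        f"  Media Type = File-{client}",
        f"  Archive Device = {base_dir}/{client}",
        "  LabelMedia = yes",
        "  Random Access = yes",
        "  AutomaticMount = yes",
        "  RemovableMedia = no",
        "  AlwaysOpen = no",
        "}",
    ])


def all_devices(clients: Iterable[str], base_dir: str = "/backup") -> str:
    """Render one Device resource per client, separated by blank lines."""
    return "\n\n".join(device_resource(c, base_dir) for c in clients)


if __name__ == "__main__":
    # Hypothetical client names.
    print(all_devices(["web01", "db01"]))
```

The Director side would still need matching Storage and Pool
definitions for each client; those are left out here.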

> If what you want is to de-interleave your backups, then look into the
> Migration function.  You can allow backups to run normally, then Migrate
> one job at a time to a new device, which will give you non-interleaved
> jobs on the output volume.  But you're still not guaranteed that the
> output volume will be unfragmented, because you don't have control over
> the disk space allocation scheme; and you're still sacrificing the
> common case to optimize for the rare case.
>


-- 
Med venlig hilsen / Best Regards

Henrik Johansen
henrik AT scannet DOT dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet
