Re: [Bacula-users] External "recylable" hard drives as Bacula storage -how?

2009-01-11 09:46:14
From: Kevin Keane <subscription AT kkeane DOT com>
Date: Sun, 11 Jan 2009 06:41:15 -0800
I don't think Bacula supports your scenario very well. The main problem
is that it treats each individual file, rather than the hard disk, as a
volume. I'm trying to get the same type of setup working with two hard
disks.

The closest I have come so far is this: I set up the two hard disks as
two different storage devices on the same SD, something like this
(/misc/BACKUP2 is set up as an automount):

Device {
  Name = USBDisk2
  Media Type = File
  Device Type = File
  Archive Device = /misc/BACKUP2
  LabelMedia = yes
  Random Access = Yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen = no
  RequiresMount = No
}

Hint: Bacula does not remember which device it stored a volume file on.
So you may want to give each disk its own Media Type so that Bacula can
tell them apart (the Device Type has to stay File for disk storage).
There is a section in the manual about it.

Then I created two sets of three pools: for each hard disk, one for full
backups, one for differential backups, and one for incremental backups.
Hint: DO NOT use the client or the job name in the label format. I
originally did this to identify which file held which backup, but as
soon as Bacula starts recycling volumes, the name no longer matches what
the file contains. Adjust the volume retention to your needs, of course.
I do two full backups per month (one to each hard disk), so a volume
retention of 35 days means that I will always have two full backups
available.

Pool {
  Name = Full-1-Pool
  Pool Type = Backup
  Storage = Disk1
  Maximum Volume Jobs = 1
# Bacula can automatically recycle Volumes
  Recycle = yes
# Prune expired volumes
  AutoPrune = yes
  Volume Retention = 35 days
  Label Format = "${Pool}_${NumVols}.bacula"
}
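
The retention reasoning can be checked with a few lines of Python. This
is only a rough sketch: it assumes full backups land exactly on the 1st
and the 16th (as in the schedule further down) and that a volume expires
35 days after the day it was written, which ignores job run times and
any manual pruning. The function names are made up for this example.

```python
from datetime import date, timedelta

RETENTION_DAYS = 35   # "Volume Retention = 35 days" from the Pool resource
FULL_DAYS = (1, 16)   # days of the month that run Level=Full in the schedule


def full_backup_dates(first_year, last_year):
    """All scheduled full-backup dates in the given (inclusive) year range."""
    return [
        date(year, month, day)
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
        for day in FULL_DAYS
    ]


def unexpired_fulls(today, fulls, retention_days=RETENTION_DAYS):
    """Full backups taken on or before `today` that are still inside retention."""
    window = timedelta(days=retention_days)
    return [d for d in fulls if timedelta(0) <= today - d < window]


fulls = full_backup_dates(2008, 2009)
start = date(2009, 1, 1)
worst_case = min(
    len(unexpired_fulls(start + timedelta(days=n), fulls)) for n in range(365)
)
print("fewest unexpired full backups on any day of 2009:", worst_case)
```

Under these assumptions the count never drops below two, because the
longest gap between two full backups is 16 days and two such gaps still
fit inside the 35-day window.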

For the incremental pools (the incremental backups run almost every
day), I use Maximum Volume Jobs = 5 to keep the number of files somewhat
under control.

Now to switch between the two disks, I use the following schedule:

Schedule {
  Name = "WeeklyCycle1"
  Run = Level=Full FullPool=Full-1-Pool DifferentialPool=Diff-1-Pool IncrementalPool=Inc-1-Pool on 1 at 19:05
  Run = Level=Full FullPool=Full-2-Pool DifferentialPool=Diff-2-Pool IncrementalPool=Inc-2-Pool on 16 at 19:05
  Run = Level=Differential FullPool=Full-1-Pool DifferentialPool=Diff-1-Pool IncrementalPool=Inc-1-Pool on 7 at 19:05
  Run = Level=Differential FullPool=Full-2-Pool DifferentialPool=Diff-2-Pool IncrementalPool=Inc-2-Pool on 22 at 19:05
  Run = Level=Incremental FullPool=Full-1-Pool DifferentialPool=Diff-1-Pool IncrementalPool=Inc-1-Pool on 3-6,8-14 at 19:05
  Run = Level=Incremental FullPool=Full-2-Pool DifferentialPool=Diff-2-Pool IncrementalPool=Inc-2-Pool on 18-21,23-31 at 19:05
}

So with this setup, I simply have to remember to keep the first disk
connected to the server from the 1st to the 15th, and the second disk
from the 16th to the end of the month.

There still are a number of shortcomings with this setup:

- There is no reminder from Bacula; Bacula will think that both hard
disks are always connected.
- Bacula has no idea that the hard disks CAN be removed, much less that
one of them might not be present at any given point in time. That could
be an issue if you need to restore files spread out over multiple
incremental, differential and full backups.
- Bacula does not know which disk a file actually resides on, only what 
file name to look for, and what type of media. That is because file 
names are the equivalent of volume names.
- Bacula will not delete files on the hard disk when recycling, or even
when you manually delete a volume from the database. It will truncate
and reuse the files. That can be a problem if the file name includes
things such as client or job name.
- On the 1st and the 16th of the month, you will have a very high load 
because all backup jobs will be full backups. I get around that by 
assigning different schedules to different jobs; some jobs do a full 
backup on the 3rd or 5th of the month. But this wreaks havoc with taking 
disks off site.
- You have multiple pools that are identical except for the disk they
connect to. A duplicate setup is always error-prone. You can to some
extent get around that by using Bacula's ability to include files and
even use the output of scripts (look at the @| operator).
- If you manually run a backup, you have to remember to change the pool
to the correct one. Worse: if you run an incremental backup, change the
pool to Inc-1-Pool, and then Bacula decides to upgrade to a full backup
(which can happen for various legitimate reasons), the full backup will
still end up in your incremental pool.
- Because I really had to fight bacula and to some extent force it to do 
what it wasn't designed to do, a number of the built-in features don't 
work very well.
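
As an illustration of the point about duplicated pools, here is a rough
sketch of a Python script that prints all six Pool resources from one
template, so that its output could be pulled into the Director
configuration with the @| operator. Several values are assumptions, not
taken from the setup described here: the Storage name Disk2, the
settings for the differential pools, and the retention of the
incremental pools are placeholders to adjust.

```python
# One template for every pool; doubled braces are literal braces for str.format.
POOL_TEMPLATE = """Pool {{
  Name = {level}-{disk}-Pool
  Pool Type = Backup
  Storage = Disk{disk}
  Maximum Volume Jobs = {max_jobs}
  Recycle = yes
  AutoPrune = yes
  Volume Retention = {retention_days} days
  Label Format = "${{Pool}}_${{NumVols}}.bacula"
}}
"""

# Per-level settings. Only Full (1 job, 35 days) and the 5 jobs per volume
# for Inc come from the post; every other number is an assumed placeholder.
LEVELS = {
    "Full": {"max_jobs": 1, "retention_days": 35},
    "Diff": {"max_jobs": 1, "retention_days": 35},  # assumption
    "Inc": {"max_jobs": 5, "retention_days": 35},   # retention is an assumption
}


def render_pools(disks=(1, 2)):
    """Return the text of one Pool resource per (disk, level) combination."""
    blocks = []
    for disk in disks:
        for level, settings in LEVELS.items():
            blocks.append(POOL_TEMPLATE.format(level=level, disk=disk, **settings))
    return "\n".join(blocks)


if __name__ == "__main__":
    print(render_pools())
```

The Director configuration would then contain a line along the lines of
@|"python3 /etc/bacula/gen_pools.py" (a hypothetical path; check the
manual for the exact quoting rules), so a change to the template
reaches both disks' pools at once.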

Overall, Bacula is a great tool, and a lot of kudos to the development
team. But backing up to hard disk files is something of a stepchild; it
was grafted onto Bacula and is clearly not an organic part of the
design. That's really not so much a criticism of Bacula as it is a
statement that it may not be the right tool for the job. It will work 
exceedingly well if you have a NAS device or something of that nature 
permanently available for backup. But for taking disks offsite, well, it 
works well enough for my own office - where I can manually intervene 
easily - but I don't think I would roll it out to my customers for hard 
disk backup. I would not hesitate to use it for tape backups.


Timo Neuvonen wrote:
> I'm considering the use of reasonably-priced external hard drives to replace
> my oldish tape drive. In the beginning this would apply to my home system.
> Drives like this:
> http://www.wdc.com/en/products/products.asp?driveid=563
>
> AFAIK Bacula has been very much designed with tapes (volumes) and tape pools
> in mind, though it is possible to create volumes on hard disks too.
>
> I haven't tried to plan this very much yet, so there certainly are some
> holes in my scenario. But let's suppose I had (for example) 3 pcs of
> one-terabyte drives like the one in the link above, I would keep at least
> one of them in a "safehouse", and one or two (at a time) were attached to
> computer running Bacula SD. This 3-disk system is a minimized scenario,
> there would soon be a need for 4-5 disks, I guess.
>
> The goal could be writing all the backups to one drive for, say, one month.
> Then, this drive would be considered "used" (compare to "volume use
> duration" in Bacula) and the next one were taken into daily use, though both
> the disks were attached to the server at this time. Then, some day soon
> thereafter I would take the "used" one to safehouse, and some day bring the
> oldest one from the safehouse back to use, and attach it to the server
> before it actually is needed. Very much like I move tapes between a
> safehouse and an autoloader, making sure they are there when needed but not
> forced to do it at certain preset day.
>
> Basically, this is recycling the disk drives, like Bacula recycles volumes.
> Both full and incremental jobs were written to same disk, and I would
> probably like the system configured in a way that after a new disk is taken
> into use (after the previous one was considered "used") the first jobs were
> run as full backups.
>
> How should this (or something even close to this) be set up? When the disks
> were attached to the server, they would get different device names at
> different times, and I wouldn't like to manually edit config files each
> time. I'm sure someone has at least been thinking about this already, maybe
> also implemented this. So, can Bacula currently handle this kind of setup in
> a reasonable way?
>
>
> Regards,
> Timo
>   

-- 
Kevin Keane
Owner
The NetTech