Re: [Bacula-users] External "recylable" hard drives as Bacula storage -how?

2009-01-12 10:06:22
Subject: Re: [Bacula-users] External "recylable" hard drives as Bacula storage -how?
From: Martin Schmid <scm AT apsag DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Mon, 12 Jan 2009 16:03:50 +0100
Kevin is right that there's some room for improvement. I'm using
Bacula for both tape and disk backups, and I'm still searching for 'the
best' way to implement swappable drives.

Bacula needs to know its devices up front, and that is difficult for
drives that are hot-swapped.

My workaround uses udev. Udev can be taught to always create my
Maxtor USB drive as /dev/icybox1, while any disk found on my SATA
controller shows up as /dev/icybox5. That gives Bacula fixed device
names it can handle. It's not native to Bacula, though.
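
To make the udev idea concrete, here is a rough sketch of what such rules could look like, generated by a small Python helper so the pattern is explicit. The post does not say which attributes the real rules match on. Matching the USB drive by ID_SERIAL and the SATA port by ID_PATH is an assumption, the serial and PCI path are placeholders, and the rule adds a symlink (SYMLINK+=) instead of renaming the device node.

```python
def udev_symlink_rule(match_key: str, match_value: str, symlink: str) -> str:
    """Return one udev rule line that adds /dev/<symlink> for a matching whole disk."""
    return (
        'SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", '
        f'ENV{{{match_key}}}=="{match_value}", SYMLINK+="{symlink}"'
    )


# Placeholder values: replace with what your own drive and controller report.
rules = [
    udev_symlink_rule("ID_SERIAL", "Maxtor_EXAMPLE_SERIAL", "icybox1"),
    udev_symlink_rule("ID_PATH", "pci-0000:00:00.0-ata-1", "icybox5"),
]
print("\n".join(rules))
```

The printed lines would go into a rules file under /etc/udev/rules.d/. Running `udevadm info --query=property --name=/dev/sdX` shows the properties a given drive actually reports.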

The other problem with disks is that the operator gets no clear
instruction. How should an operator know that volume BOX_1723 is
located on physical disk #3? And what should they do when Bacula asks
for it?

The only way is to disable automatic labelling. The admin then needs to
set up a known range of volumes for each drive and put a sticker on each
drive listing the volumes it holds.
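
As a small illustration of the "known range of volumes per drive" idea, the following Python sketch assigns a fixed block of volume names to each physical disk and prints the text for the stickers. The disk names, the four-volumes-per-disk figure and the four-digit numbering are made up for the example. Only the BOX_ prefix is borrowed from the volume name above.

```python
def volume_plan(disks, volumes_per_disk, prefix="BOX_"):
    """Assign a fixed, consecutive block of volume names to each physical disk."""
    plan = {}
    next_number = 1
    for disk in disks:
        plan[disk] = [
            f"{prefix}{n:04d}"
            for n in range(next_number, next_number + volumes_per_disk)
        ]
        next_number += volumes_per_disk
    return plan


def disk_for(plan, volume):
    """Look up which physical disk holds a volume; None if it is not in the plan."""
    for disk, volumes in plan.items():
        if volume in volumes:
            return disk
    return None


plan = volume_plan(["disk #1", "disk #2", "disk #3"], volumes_per_disk=4)
for disk, volumes in plan.items():
    # One sticker line per drive: first and last volume it holds.
    print(f"{disk}: {volumes[0]} .. {volumes[-1]}")
```

With a table like this, an operator who is asked for BOX_0010 can call disk_for(plan, "BOX_0010") or read the stickers to find the right drive. The volumes themselves would still have to be labelled by hand in Bacula under exactly these names.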

It would be great if Bacula could use a disk just like it uses a
single tape: with no file system on it, filled up to its capacity, and
being the volume itself.

Regards

Martin


Kevin Keane wrote:
> I don't think Bacula supports your scenario very well. The main problem
> is that it treats the individual file, not the hard disk, as a
> volume. I'm trying to get the same type of setup working with two hard
> disks.
>
> The closest I came so far is this: I set up two hard disks as two 
> different storage devices on the same SD, something like this 
> (/misc/BACKUP2 is set up as an automount):
>
> Device {
>   Name = USBDisk2
>   Media Type = File
>   Device Type = File
>   Archive Device = /misc/BACKUP2
>   LabelMedia = yes
>   Random Access = Yes
>   AutomaticMount = yes
>   RemovableMedia = no
>   AlwaysOpen = no
>   RequiresMount = No
> }
>
> Hint: Bacula does not remember which device it stored a file on. So you
> may want to give each device its own Media Type to identify this. There
> is a section in the manual about it.
>
> Then I created two sets of three pools: for each hard disk, one for full 
> backups, one for diff backups, and one for incremental backups. Hint: DO 
> NOT use the client or the job name in the label format. I originally did 
> this to identify which file held which backup, but as soon as bacula 
> starts recycling volumes, the name no longer matches what the file 
> contains. Adjust the volume retention to your needs, of course. I do two 
> full backups per month (one to each hard disk), so a volume retention of 
> 35 days means that I will always have two full backups available.
>
> Pool {
>   Name = Full-1-Pool
>   Pool Type = Backup
>   Storage = Disk1
>   Maximum Volume Jobs = 1
> # Bacula can automatically recycle Volumes
>   Recycle = yes
> # Prune expired volumes
>   AutoPrune = yes
>   Volume Retention = 35 days
>   Label Format = "${Pool}_${NumVols}.bacula"
> }
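
The retention arithmetic above can be checked with a short Python sketch. It assumes full backups run on the 1st and 16th of every month (as in the schedule further down) and simply counts, for each day of 2009, how many full backups are younger than 35 days. This is a simplification: Bacula measures Volume Retention from the last write to the volume and only prunes when it needs a volume, so the sketch models the worst case, not Bacula's actual pruning.

```python
from datetime import date, timedelta

RETENTION = timedelta(days=35)  # matches "Volume Retention = 35 days"


def full_backup_dates(first_year, last_year):
    """Every 1st and 16th of each month in the given year range (inclusive)."""
    return [
        date(year, month, day)
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
        for day in (1, 16)
    ]


def unexpired_fulls(today, fulls):
    """Full backups already taken whose age is still below the retention period."""
    return [f for f in fulls if f <= today and today - f < RETENTION]


fulls = full_backup_dates(2008, 2009)  # start in 2008 so January 2009 has history
counts = []
day = date(2009, 1, 1)
while day <= date(2009, 12, 31):
    counts.append(len(unexpired_fulls(day, fulls)))
    day += timedelta(days=1)

print(min(counts), max(counts))
```

Under these assumptions the count never drops below two, which matches the claim that two full backups are always available.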
>
> For the incremental pool (which I run every day), I use Maximum Volume 
> Jobs=5 to keep the number of files somewhat under control.
>
> Now to switch between the two disks, I use the following schedule:
>
> Schedule {
>   Name = "WeeklyCycle1"
>   Run = Level=Full FullPool=Full-1-Pool DifferentialPool=Diff-1-Pool IncrementalPool=Inc-1-Pool on 1 at 19:05
>   Run = Level=Full FullPool=Full-2-Pool DifferentialPool=Diff-2-Pool IncrementalPool=Inc-2-Pool on 16 at 19:05
>   Run = Level=Differential FullPool=Full-1-Pool DifferentialPool=Diff-1-Pool IncrementalPool=Inc-1-Pool on 7 at 19:05
>   Run = Level=Differential FullPool=Full-2-Pool DifferentialPool=Diff-2-Pool IncrementalPool=Inc-2-Pool on 22 at 19:05
>   Run = Level=Incremental FullPool=Full-1-Pool DifferentialPool=Diff-1-Pool IncrementalPool=Inc-1-Pool on 3-6,8-14 at 19:05
>   Run = Level=Incremental FullPool=Full-2-Pool DifferentialPool=Diff-2-Pool IncrementalPool=Inc-2-Pool on 18-21,23-31 at 19:05
> }
>
> So with this setup, I simply have to remember to make sure the
> first disk is connected to the server from the 1st to the 15th, and the
> second disk from the 16th to the end of the month.
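
For anyone who wants a quick reminder of which disk has to be plugged in, the schedule above can be modelled in a few lines of Python. The day ranges are copied from the Run lines. The function names and the table layout are just for this sketch and have nothing to do with Bacula itself.

```python
def expand(spec):
    """Turn a day-of-month spec such as "3-6,8-14" into a set of ints."""
    days = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            days.update(range(int(low), int(high) + 1))
        else:
            days.add(int(part))
    return days


# (backup level, disk number, days of month), taken from the Run lines.
RUNS = [
    ("Full", 1, "1"),
    ("Full", 2, "16"),
    ("Differential", 1, "7"),
    ("Differential", 2, "22"),
    ("Incremental", 1, "3-6,8-14"),
    ("Incremental", 2, "18-21,23-31"),
]


def plan_for(day_of_month):
    """Return (level, disk) for a day of the month, or None if no job is scheduled."""
    for level, disk, spec in RUNS:
        if day_of_month in expand(spec):
            return level, disk
    return None


unscheduled = [d for d in range(1, 32) if plan_for(d) is None]
print(plan_for(1), plan_for(22), unscheduled)
# prints: ('Full', 1) ('Differential', 2) [2, 15, 17]
```

Note that, as quoted, no Run line covers days 2, 15 and 17. Whether that is intentional is not stated.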
>
> There still are a number of shortcomings with this setup:
>
> - There is no reminder from bacula; bacula will think that both hard 
> disks are always connected.
> - Bacula has no idea that the hard disks CAN even be removed, much
> less that one might not be present at any given point in time. That
> could be an issue if you need to restore files spread out over multiple
> incremental, differential and full backups.
> - Bacula does not know which disk a file actually resides on, only what 
> file name to look for, and what type of media. That is because file 
> names are the equivalent of volume names.
> - Bacula will not delete files on the hard disk when recycling, or even
> when you manually delete a volume from the database. It will truncate
> and reuse the files. That can be a problem if the file name includes
> things such as client or job name.
> - On the 1st and the 16th of the month, you will have a very high load 
> because all backup jobs will be full backups. I get around that by 
> assigning different schedules to different jobs; some jobs do a full 
> backup on the 3rd or 5th of the month. But this wreaks havoc with taking 
> disks off site.
> - You have multiple pools that are identical except for the disk they
> connect to. A duplicated setup is always error-prone. You can to some
> extent get around that by using Bacula's ability to include files and
> even use the output of scripts (look at the @| operator).
> - If you manually run a backup, you have to remember to change the pool 
> to the correct one. Worse: if you run an incremental backup, change the 
> pool to Inc-1-Pool, and then bacula decides to upgrade to a full backup 
> (which can happen for various legitimate reasons), it will still end up 
> in your incremental pool.
> - Because I really had to fight bacula and to some extent force it to do 
> what it wasn't designed to do, a number of the built-in features don't 
> work very well.
>
> Overall, bacula is a great tool, and a lot of kudos to the development 
> team. But backing up to hard disk files is something of a stepchild; it
> was grafted onto bacula and is clearly not an organic part of the 
> design. That's really not so much a criticism of bacula as it is a 
> statement that it may not be the right tool for the job. It will work 
> exceedingly well if you have a NAS device or something of that nature 
> permanently available for backup. But for taking disks offsite, well, it 
> works well enough for my own office - where I can manually intervene 
> easily - but I don't think I would roll it out to my customers for hard 
> disk backup. I would not hesitate to use it for tape backups.
>
>
> Timo Neuvonen wrote:
>   
>> I'm considering using reasonably priced external hard drives to replace
>> my oldish tape drive. In the beginning this would apply to my home system.
>> Drives like this:
>> http://www.wdc.com/en/products/products.asp?driveid=563
>>
>> AFAIK Bacula has been very much designed with tapes (volumes) and tape pools
>> in mind, though it is possible to create volumes on hard disks too.
>>
>> I haven't tried to plan this very much yet, so there certainly are some
>> holes in my scenario. But let's suppose I had (for example) three
>> one-terabyte drives like the one in the link above. I would keep at least
>> one of them in a "safehouse", and one or two (at a time) would be attached
>> to the computer running the Bacula SD. This 3-disk system is a minimal
>> scenario; there would soon be a need for 4-5 disks, I guess.
>>
>> The goal could be writing all the backups to one drive for, say, one month.
>> Then this drive would be considered "used" (compare to "volume use
>> duration" in Bacula) and the next one would be taken into daily use, though
>> both disks would be attached to the server at this time. Then, some day soon
>> thereafter, I would take the "used" one to the safehouse, and some day bring
>> the oldest one from the safehouse back into use, attaching it to the server
>> before it is actually needed. Very much like I move tapes between a
>> safehouse and an autoloader, making sure they are there when needed but not
>> being forced to do it on a certain preset day.
>>
>> Basically, this is recycling the disk drives, like Bacula recycles volumes.
>> Both full and incremental jobs would be written to the same disk, and I would
>> probably like the system configured in such a way that after a new disk is
>> taken into use (after the previous one was considered "used") the first jobs
>> would be run as full backups.
>>
>> How should this (or something even close to this) be set up? When the disks
>> are attached to the server, they would get different device names at
>> different times, and I wouldn't like to manually edit config files each
>> time. I'm sure someone has at least been thinking about this already, and
>> maybe also implemented it. So, can Bacula currently handle this kind of
>> setup in a reasonable way?
>>
>>
>> Regards,
>> Timo


-- 
Martin Schmid
APS systems AG, Neumatt 4, CH-4626 Niederbuchsiten
Tel direkt: +41 62 389 8891, Fax: +41 62 389 8880, Tel: +41 62 389 8888
www.aps-systems.ch



