[initially posted to the wrong address, sorry if it reaches you anyway]
Hi all
I'm a rather happy Bacula user and have been following the lists quietly
for a while. I'm piping up with some ideas and comments based on using
Bacula for a couple of years for my work's backup needs.
During my own use of Bacula for an off-site HDD-based backup setup, I've
noticed an increasing number of difficulties that all stem from a single
basic origin, and thought it worth raising here.
Bacula was designed as a network backup system for tape storage. The
director and especially sd design reflect this. Yet more and more people
are using off-site disk as their primary backup medium, not just as a
staging point for tape backups.
Increasingly, I'm coming to think that it'd be desirable to have a
dedicated storage daemon for HDD storage. This sd would eliminate much
of the complexity of managing disk volumes for fast, concurrent backups,
especially if the director were aware of disk-based storage daemons and
their capabilities.
Issues I struggle with right now:
- I need to define a lot of different devices on the disk-backed sd,
so that various backups may concurrently write to the SD. Each
"device" is really just a subdirectory of the main backup store,
with its own media type.
The only alternative is interleaving onto one big volume, which is
a nightmare if the different backups have different retention periods
and disk storage isn't infinite.
My backup setup isn't huge (6TB storage for backups) yet I have:
$ grep ^Device /etc/bacula/bacula-sd.conf | wc -l
10
... Device entries and matching director Storage entries.
- The need for all these different storage device definitions bloats the
sd config, as each device needs a whole bunch of redundant and
repetitive config in its definition.
- Each Device{} entry for the SD needs a corresponding Storage{} entry
on the director, bloating the director config too.
- Effective volume lifetime management requires the definition of MANY
pools, and association of those pools with storage devices. It's
harder than it could be to reliably predict disk storage requirements
so the backup device doesn't fill up, and to ensure that volumes are
retained as long as they need to be. Doing it well requires lots and
lots of pools, usually three per job or class of job.
If storage is known to be disk based and one volume per job is forced,
it could be simpler to configure disk-based pools and storage.
To address these issues, a disk-only SD might:
- Have exactly one storage root. The admin can mount volumes under it,
use symlinks, or use bind mounts if some storage devices need to use
different file systems/partitions/LVs. So everything might live under
/storage/root (for the sake of this example).
- Treat any requested device name as a subdirectory of that storage
root. So if the director requests the device "Archival" then volumes
will be created/accessed in the directory /storage/root/Archival,
where the target directory will be created if it does not already
exist. The sd wouldn't require configuration of devices; the fact that
the director requested it would be considered enough configuration,
since all devices would have the following implicit config:
Device {
    Name = $DEVICENAME
    Media Type = File_$DEVICENAME
    Archive Device = $STORAGEROOT/$DEVICENAME
    SpoolDirectory = $STORAGEROOT/$DEVICENAME/spool
    LabelMedia = yes;
    Random Access = Yes;
    AutomaticMount = yes;
    RemovableMedia = no;
    AlwaysOpen = no;
}
- Assume one volume per job, and expect the director to force this for
disk-based SDs. This would simplify volume management.
- Allow a device to be open multiple times with different volumes. So,
the "Archival" device might be writing to "Archival-002" and
"Archival-003" at the same time, while another job has "Archival-001"
mounted for read-verify.
This would eliminate the need to define lots of storage devices
that aren't actually any different, just so that many backups
may be in progress on the sd at once without volume interleaving
and without the need for huge spool files.
I'm not sure this is possible without an extension of the dir<->sd
protocol, but as that protocol isn't stable from release to release,
that shouldn't be a big issue.
- (An alternative to the above) write spool files as valid volumes
in their proper target locations but with a temporary file name.
When they're written, rather than despooling them, simply mv()
them into place. This could only work with one volume per job, but
that should be forced for disk-based SDs anyway.
- Maybe implement par2 for damaged volume repair & recovery, since
one volume per job means that volumes will never be appended to,
only truncated or deleted.
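To make the SD-side ideas above a bit more concrete, here is a minimal
Python sketch of two of them: treating a requested device name as a
subdirectory of the single storage root, and writing a job's volume under
a temporary name before renaming it into place. This is not Bacula code.
The function names, the ".partial" suffix and the name check are all
assumptions made up for illustration; only the /storage/root layout and
the implicit Device settings come from the proposal itself.

```python
import os
import tempfile
from pathlib import Path

STORAGE_ROOT = Path("/storage/root")  # example root used in the proposal


def device_dir(storage_root: Path, device_name: str) -> Path:
    """Map a requested device name to a directory under the storage root."""
    # Hypothetical safety rule: a device name must be a single path component.
    if device_name in ("", ".", "..") or "/" in device_name or os.sep in device_name:
        raise ValueError(f"invalid device name: {device_name!r}")
    path = storage_root / device_name
    path.mkdir(parents=True, exist_ok=True)  # created on first request
    return path


def implicit_device_config(storage_root: Path, device_name: str) -> dict:
    """Return the implicit per-device settings listed in the proposal."""
    base = storage_root / device_name
    return {
        "Name": device_name,
        "Media Type": f"File_{device_name}",
        "Archive Device": str(base),
        "SpoolDirectory": str(base / "spool"),
    }


def write_job_volume(storage_root: Path, device_name: str,
                     volume_name: str, chunks) -> Path:
    """Write one job's data as one volume: temp file first, then rename."""
    target_dir = device_dir(storage_root, device_name)
    final_path = target_dir / volume_name
    if final_path.exists():
        # One volume per job: an existing volume is never appended to.
        raise FileExistsError(final_path)
    fd, tmp_name = tempfile.mkstemp(prefix=f".{volume_name}.",
                                    suffix=".partial", dir=str(target_dir))
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Same directory, so the rename stays on one file system.
        os.rename(tmp_name, final_path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return final_path
```

Because each job writes its own file, several calls to write_job_volume()
for the same device can run at once without interleaving, which is the
property the multiple-open idea is after.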
The director, when using a disk-based sd, would:
- Require that disk-based SDs be declared as such in their Storage {}
entries, and refuse to talk to a disk-based sd not declared as such.
- Know that it can open a device on a disk-based sd multiple times
with *different* volumes without the need for spooling. Currently
the director can let multiple jobs use a device, but only
if they share the same volume. It's expected that the sd will
spool the jobs or will interleave the data on the volume. Neither
is necessary or desirable for disk storage; the sd can just write
to multiple volume files within a directory at once.
- Send a "delete volume file" message to the disk sd when a volume
is deleted from the catalog. Similarly, when a volume is purged,
send a "truncate volume file" message to the disk sd.
- Support an alternative form of Storage {} definition for disk based
storage, where multiple device names may be listed. So instead of:
Storage {
    # Max concurrent jobs = 1, no spooling required
    Name = File_Archival
    Address = backup
    SDPort = 9103
    Password = "XXXXXXXXXXX"
    Device = FileStorage_Archival
    Media Type = File_Archival
    Maximum Concurrent Jobs = 1
}
Storage {
    # Max concurrent jobs = 1, no spooling required
    Name = File_CyrusMail
    Address = backup
    SDPort = 9103
    Password = "XXXXXXXXXXX"
    Device = FileStorage_CyrusMail
    Media Type = File_CyrusMail
    Maximum Concurrent Jobs = 1
}
Storage {
    # Jobs using this device must spool!
    Name = File_HomeDir
    Address = backup
    SDPort = 9103
    Password = "XXXXXXXXXXX"
    Device = FileStorage_HomeDir
    Media Type = File_HomeDir
    Maximum Concurrent Jobs = 4
}
.... etc ....
one could write:
DiskStorage {
    Name = DiskSD
    Devices = "File_Archival", "File_CyrusMail", "File_HomeDir"
    Address = backup
    SDPort = 9103
    Password = "XXXXXXXXXXX"
}
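One way to picture what the director might do with such an entry is to
expand it internally into one Storage-like record per listed device. The
Python below is only a rough sketch of that expansion. DiskStorage is the
hypothetical resource from this proposal, not an existing Bacula
directive, and the sketch assumes each listed name serves as the Storage
name, the device name and the media type alike, which the proposal does
not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiskStorage:
    """Fields of the proposed (hypothetical) DiskStorage resource."""
    name: str
    devices: List[str]
    address: str
    sd_port: int
    password: str


def expand(ds: DiskStorage) -> str:
    """Render one Storage stanza per device listed in the DiskStorage."""
    stanzas = []
    for device in ds.devices:
        stanzas.append(
            "Storage {\n"
            f"    Name = {device}\n"
            f"    Address = {ds.address}\n"
            f"    SDPort = {ds.sd_port}\n"
            f'    Password = "{ds.password}"\n'
            # Assumption: device and media type reuse the listed name.
            f"    Device = {device}\n"
            f"    Media Type = {device}\n"
            "}\n"
        )
    return "\n".join(stanzas)


disk_sd = DiskStorage(
    name="DiskSD",
    devices=["File_Archival", "File_CyrusMail", "File_HomeDir"],
    address="backup",
    sd_port=9103,
    password="XXXXXXXXXXX",
)
expanded = expand(disk_sd)
```

No Maximum Concurrent Jobs line is emitted, since the point of the
disk-based sd is that each job gets its own volume file and the per-device
limit stops being needed.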
Is this insane? Or a viable approach to tackling some of the
complexities of faking tape backup on disk as Bacula currently tries to do?
--
Craig Ringer
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users