[Bacula-users] Idea/suggestion for dedicated disk-based sd

2010-04-06 03:00:32
Subject: [Bacula-users] Idea/suggestion for dedicated disk-based sd
From: Craig Ringer <craig AT postnewspapers.com DOT au>
To: bacula-users <bacula-users AT lists.sourceforge DOT net>
Date: Tue, 06 Apr 2010 14:37:43 +0800
[initially posted to the wrong address, sorry if it reaches you anyway]

Hi all

I'm a rather happy Bacula user and have been following the lists quietly
for a while. I'm piping up with some ideas and comments based on using
Bacula for a couple of years for my work's backup needs.

During my own use of Bacula for an off-site HDD-based backup setup, I've
noticed an increasing number of difficulties that all stem from a single
basic origin, and thought it worth raising here.

Bacula was designed as a network backup system for tape storage. The
director and especially sd design reflect this. Yet more and more people
are using off-site disk as their primary backup medium, not just as a
staging point for tape backups.

Increasingly, I'm coming to think that it'd be desirable to have a
dedicated storage daemon for HDD storage. This sd would eliminate much
of the complexity of managing disk volumes for fast, concurrent backups,
especially if the director were aware of disk-based storage daemons and
their capabilities.

Issues I struggle with right now:


- I need to define a lot of different devices on the disk-backed sd,
  so that various backups may concurrently write to the SD. Each
  "device" is really just a subdirectory of the main backup store,
  with its own media type.

  The only alternative is interleaving onto one big volume, which is
  a nightmare if the different backups have different retention periods
  and disk storage isn't infinite.

  My backup setup isn't huge (6TB storage for backups) yet I have:

  $ grep ^Device /etc/bacula/bacula-sd.conf  | wc -l
  10

  ... Device entries and matching director Storage entries.


- The need for all these different storage device definitions bloats the
  sd config, as each device needs a whole bunch of redundant and
  repetitive config in its definition.

- Each Device{} entry for the SD needs a corresponding Storage{} entry
  on the director, bloating the director config too.

- Effective volume lifetime management requires the definition of MANY
  pools, and association of those pools with storage devices. It's
  harder than it could be to reliably predict disk storage requirements
  so the backup device doesn't fill up, and to ensure that volumes are
  retained as long as they need to be. Doing it well requires lots and
  lots of pools, usually three per job or class of job.

  If storage is known to be disk based and one volume per job is forced,
  it could be simpler to configure disk-based pools and storage.



To address these issues, a disk-only SD might:

- Have exactly one storage root. The admin can mount volumes under it,
  use symlinks, or use bind mounts if some storage devices need to use
  different file systems/partitions/LVs. So everything might live under
  /storage/root (for the sake of this example).

- Treat any requested device name as a subdirectory of that storage
  root. So if the director requests the device "Archival" then volumes
  will be created/accessed in the directory /storage/root/Archival,
  where the target directory will be created if it does not already
  exist. The sd wouldn't require configuration of devices; the fact that
  the director requested it would be considered enough configuration,
  since all devices would have the following implicit config:

  Device {
    Name = $DEVICENAME
    Media Type = File_$DEVICENAME
    Archive Device = $STORAGEROOT/$DEVICENAME
    Spool Directory = $STORAGEROOT/$DEVICENAME/spool
    Label Media = yes
    Random Access = yes
    Automatic Mount = yes
    Removable Media = no
    Always Open = no
  }

- Assume one volume per job, and expect the director to force this for
  disk-based SDs. This would simplify volume management.

- Allow a device to be open multiple times with different volumes. So,
  the "Archival" device might be writing to "Archival-002" and
  "Archival-003" at the same time, while another job has "Archival-001"
  mounted for read-verify.

  This would eliminate the need to define lots of storage devices
  that aren't actually any different, just so that many backups
  may be in progress on the sd at once without volume interleaving
  and without the need for huge spool files.

  I'm not sure this is possible without an extension of the dir<->sd
  protocol, but as that's not stable release-to-release that shouldn't
  be a big issue.

- (An alternative to the above) write spool files as valid volumes
  in their proper target locations but with a temporary file name.
  When they're written, rather than despooling them, simply mv()
  them into place. This could only work with one volume per job, but
  that should be forced for disk-based SDs anyway.

- Maybe implement par2 for damaged volume repair & recovery, since
  one volume per job means that volumes will never be appended
  to, only truncated or deleted.
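
A minimal Python sketch of two of the ideas above: mapping a requested device name to a subdirectory of the single storage root, and writing a one-job volume under a temporary name before renaming it into place. This is only an illustration, not Bacula code. The function names (device_dir, implicit_device_config, write_volume), the ".part" suffix and the name-validation policy are all assumptions made for the example, and real volume labelling is ignored.

```python
import os
import tempfile
from pathlib import Path


def _check_name(name: str) -> None:
    # Assumed policy: names must not be able to escape the storage root.
    if not name or name in (".", "..") or "/" in name or "\\" in name:
        raise ValueError(f"invalid name: {name!r}")


def device_dir(storage_root: Path, device_name: str) -> Path:
    """Treat a requested device name as a subdirectory of the storage root."""
    _check_name(device_name)
    path = storage_root / device_name
    path.mkdir(parents=True, exist_ok=True)  # created on first request
    return path


def implicit_device_config(storage_root: Path, device_name: str) -> dict:
    """Build the per-device settings that would be implied rather than configured."""
    _check_name(device_name)
    directory = storage_root / device_name
    return {
        "Name": device_name,
        "Media Type": f"File_{device_name}",
        "Archive Device": str(directory),
        "Spool Directory": str(directory / "spool"),
    }


def write_volume(storage_root: Path, device_name: str, volume_name: str, chunks) -> Path:
    """Write one job's data as one volume file, then rename it into place."""
    _check_name(volume_name)
    target_dir = device_dir(storage_root, device_name)
    final_path = target_dir / volume_name
    # The temporary file lives in the target directory, so the final rename
    # stays on one file system and no despooling copy is needed.
    fd, tmp_name = tempfile.mkstemp(
        prefix=f".{volume_name}.", suffix=".part", dir=target_dir
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, final_path)
    except BaseException:
        # Do not leave a half-written volume behind.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return final_path
```

Because every job writes its own file, several calls to write_volume for the same device can run at once without interleaving, which is the point of allowing a device to be open multiple times with different volumes.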


The director, when using a disk-based sd, would:

- Require that disk-based SDs be declared as such in their Storage {}
  entries, and refuse to talk to a disk-based sd not declared as such.

- Know that it can open a device on a disk-based sd multiple times
  with *different* volumes without the need for spooling. Currently
  the director can let multiple jobs use a device, but only
  if they share the same volume. It's expected that the sd will
  spool the jobs or will interleave the data on the volume. Neither
  is necessary or desirable for disk storage; the sd can just write
  to multiple volume files within a directory at once.

- Send a "delete volume file" message to the disk sd when a volume
  is deleted from the catalog. Similarly, when a volume is purged,
  send a "truncate volume file" message to the disk sd.

- Support an alternative form of Storage {} definition for disk based
  storage, where multiple device names may be listed. So instead of:


Storage {
  # Max concurrent jobs = 1, no spooling required
  Name = File_Archival
  Address = backup
  SDPort = 9103
  Password = "XXXXXXXXXXX"
  Device = FileStorage_Archival
  Media Type = File_Archival
  Maximum Concurrent Jobs = 1
}
Storage {
  # Max concurrent jobs = 1, no spooling required
  Name = File_CyrusMail
  Address = backup
  SDPort = 9103
  Password = "XXXXXXXXXXX"
  Device = FileStorage_CyrusMail
  Media Type = File_CyrusMail
  Maximum Concurrent Jobs = 1
}
Storage {
  # Jobs using this device must spool!
  Name = File_HomeDir
  Address = backup
  SDPort = 9103
  Password = "XXXXXXXXXXX"
  Device = FileStorage_HomeDir
  Media Type = File_HomeDir
  Maximum Concurrent Jobs = 4
}

.... etc ....

   one could write:


DiskStorage {
  Name = DiskSD
  Devices = "File_Archival", "File_CyrusMail", "File_HomeDir"
  Address = backup
  SDPort = 9103
  Password = "XXXXXXXXXXX"
}
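
As a rough illustration of the director side, here is a Python sketch of how a single DiskStorage definition could be expanded into per-device entries, and how catalog events could map to the "delete volume file" and "truncate volume file" messages. DiskStorage is the hypothetical resource proposed here, not an existing Bacula resource. The expand and sd_message_for helpers, the message strings, and the choice to reuse each listed name as the Name, Device and Media Type are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskStorage:
    """Hypothetical DiskStorage {} resource from the proposal."""
    name: str
    devices: tuple
    address: str
    sd_port: int
    password: str


def expand(ds: DiskStorage) -> list:
    """Return one Storage-like entry per listed device.

    Assumption: each listed name is reused as Name, Device and Media Type.
    """
    return [
        {
            "Name": device,
            "Address": ds.address,
            "SDPort": ds.sd_port,
            "Password": ds.password,
            "Device": device,
            "Media Type": device,
        }
        for device in ds.devices
    ]


def sd_message_for(event: str, volume: str) -> str:
    """Map a catalog event to the message sent to the disk sd.

    The message text is made up for illustration; no such messages
    exist in the current dir<->sd protocol.
    """
    actions = {
        "deleted": "delete volume file",
        "purged": "truncate volume file",
    }
    if event not in actions:
        raise ValueError(f"no sd message for event: {event!r}")
    return f"{actions[event]} {volume}"


disk_sd = DiskStorage(
    name="DiskSD",
    devices=("File_Archival", "File_CyrusMail", "File_HomeDir"),
    address="backup",
    sd_port=9103,
    password="XXXXXXXXXXX",
)
entries = expand(disk_sd)
```

The three repetitive Storage {} blocks collapse into one definition; the per-device Maximum Concurrent Jobs setting disappears because a disk sd would accept concurrent jobs on different volumes.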



Is this insane? Or a viable approach to tackling some of the
complexities of faking tape backup on disk as Bacula currently tries to do?

--
Craig Ringer

