Bacula-users

Re: [Bacula-users] [Bacula-devel] Vbackup feature

2008-07-10 10:33:44
From: Kern Sibbald <kern AT sibbald DOT com>
To: bacula-devel AT lists.sourceforge DOT net
Date: Thu, 10 Jul 2008 16:33:33 +0200
On Thursday 10 July 2008 16:23:15 Josh Fisher wrote:
> Kern Sibbald wrote:
> > Hello,
> >
> > I'm a bit burned out from intensive bug fixing over the last couple of
> > months, so decided to do something totally new yesterday.  I started
> > implementing what I call Virtual Backup or Vbackup, which is essentially
> > project #3 "Merge multiple backups (Synthetic Backup or Consolidation)".
> >
> > In attempting to implement it, I've realized a few things:
> >
> > 1. It is probably better to implement it as a new "level" under the
> > normal Backup code, for example "level=vbackup".  The resulting output
> > will be recorded in the catalog as a "Full".
> >
> > 2. In almost all respects it must behave much like a Migration job in that
> > it does not use a FD, it reads an existing set of backups, and writes
> > them to a new Volume.
> >
> > 3. One difference from a Migration job is that all the old jobs remain
> > unchanged (i.e. like a Copy).
> >
> > 4. Another difference is that it has far fewer features in that it
> > simply finds all the current backup records and copies them.  There are
> > no complicated selection criteria.
> >
> > 5. Like the Migration and Copy jobs, the input Pool (from where it reads
> > the currently backed up data) and the output Pool (where it writes the
> > merged data) must be different.  This ensures that the job does not
> > attempt to read and write to the same device, which just will not work.
> >
> > The problem with the above, principally item #5, shows up in the
> > following scenario:
> >
> > You have a job J1, which does a Full, one or more Diff backups, then any
> > number of Inc backups all going to Pool P1.  At some point in time
> > (possibly via the Schedule), you run a vbackup level, so it finds all the
> > current backup files (Full, last Diff, and all later Inc) and copies the
> > data from the input Pool (P1) to the output Pool (P2).
> >
> > Now, if you then redo a normal Full backup and restart with Diff and Inc
> > jobs again, all will work.
> >
> > However, it is much more likely that you will then continue doing
> > incremental backups (no more Full or Diffs).  At some point later, you
> > want to do another vbackup to "consolidate" all the Inc backups, and now
> > the process fails, because you are going to need to read from Pools P2
> > (Full produced by the vbackup) and P1 (new Incs), and you will attempt to
> > write to P2, which will not work.
> >
> > Thus without some other mechanism to move Volumes from Pool to Pool, a
> > setup like the one described above won't work, and I suspect this is the
> > setup that will be used most frequently (i.e. do only one Full and
> > thereafter vbackups when there are enough Incs to warrant a
> > consolidation).
> >
> > Any comments?
>
> Why not always write vbackup jobs to a volume in a "special" pool first?

Well, that is exactly what happens.  It is just called "Next Pool" rather than 
special pool.

However, the Next Pool cannot be the same as the Pool from which the jobs are 
going to be read.
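
To make the constraint concrete, here is a rough sketch in Python of the scenario from the original message. It is a toy model, not Bacula code: the names JobRecord, select_current_jobs and check_vbackup are made up for illustration, and the selection rule (latest Full, last Diff after it, all later Incs) is only my reading of the description quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """Hypothetical stand-in for a catalog job record."""
    job_id: int
    level: str  # "Full", "Diff", or "Inc"
    pool: str


def select_current_jobs(jobs):
    """Pick the latest Full, the last Diff after it, and all later Incs."""
    ordered = sorted(jobs, key=lambda j: j.job_id)
    full_positions = [i for i, j in enumerate(ordered) if j.level == "Full"]
    if not full_positions:
        raise ValueError("no Full backup to consolidate from")
    start = full_positions[-1]
    selected = [ordered[start]]
    tail = ordered[start + 1:]
    diff_positions = [i for i, j in enumerate(tail) if j.level == "Diff"]
    if diff_positions:
        last_diff = diff_positions[-1]
        selected.append(tail[last_diff])
        tail = tail[last_diff + 1:]
    selected.extend(j for j in tail if j.level == "Inc")
    return selected


def check_vbackup(jobs, next_pool):
    """Return (allowed, read_pools); allowed is False if next_pool is also read."""
    read_pools = {j.pool for j in select_current_jobs(jobs)}
    return next_pool not in read_pools, read_pools


# First vbackup: everything lives in P1, output goes to P2.
history = [
    JobRecord(1, "Full", "P1"),
    JobRecord(2, "Diff", "P1"),
    JobRecord(3, "Inc", "P1"),
    JobRecord(4, "Inc", "P1"),
]
first_ok, first_pools = check_vbackup(history, "P2")

# The vbackup result is recorded as a Full in P2; more Incs then land in P1.
history += [
    JobRecord(5, "Full", "P2"),
    JobRecord(6, "Inc", "P1"),
    JobRecord(7, "Inc", "P1"),
]
second_ok, second_pools = check_vbackup(history, "P2")

print(first_ok, sorted(first_pools))
print(second_ok, sorted(second_pools))
```

The first run only reads from P1, so writing to P2 is fine. The second run has to read the consolidated Full from P2 as well as the new Incs from P1, so P2 is no longer usable as the Next Pool.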

> When writing to the volume(s) in the special pool is completed, either
> the volume(s) could be moved from the special pool to the destination
> pool specified by the job, or the entire vbackup job could be migrated
> to the destination pool. 

It sounds like you are more or less re-inventing the Scratch pool, but with a 
slight twist.  It probably would work, but sounds a bit complicated to me.  I 
suspect we would get a lot of support requests :-(

> The former would be faster, but would likely 
> waste space due to fragmentation if the destination pool was only used
> for vbackup jobs. However, the destination pool could be the same pool
> used for normal backups, and remaining volume space would be usable by
> normal jobs. In either case, all normal pools (except the special pool
> and the Scratch pool) could then be used for input, including the
> destination pools of previous vbackup jobs. I would envision putting
> vbackup jobs into the same pool I put normal full backup jobs.

For the moment, I don't see any clean solution to this problem unless we 
program logic to implement the concept of a "different" volume, or the user 
does some manual intervention (moving volumes between pools -- which rules 
out writing more vbackups to them).
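
For reference, the "write to a special pool, then move the volume" idea can be sketched as follows. Again this is only a toy model in Python under my own assumptions, not anything that exists in Bacula: Volume, read_pools, run_vbackup and the pool name "Staging" are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Hypothetical volume: a name, its current pool, and the job ids on it."""
    name: str
    pool: str
    jobs: list = field(default_factory=list)


def read_pools(volumes, job_ids):
    """Pools that hold at least one of the jobs to be consolidated."""
    return {v.pool for v in volumes if any(j in v.jobs for j in job_ids)}


def run_vbackup(volumes, job_ids, new_job_id, staging_pool, dest_pool):
    """Write the merged job to a fresh staging volume, then move that volume."""
    if staging_pool in read_pools(volumes, job_ids):
        raise ValueError("staging pool also holds input data")
    out = Volume(name=f"Vol-{new_job_id}", pool=staging_pool, jobs=[new_job_id])
    volumes.append(out)
    # Assumption: the move happens only after the write has completed.
    out.pool = dest_pool
    return out


# Full, Diff and Inc (jobs 1-3) sit on one volume in P1.
volumes = [Volume("Vol-A", "P1", [1, 2, 3])]
first = run_vbackup(volumes, [1, 2, 3], 4, "Staging", "P1")

# Later Incs (jobs 5-6) also go to P1; consolidate them with job 4.
volumes.append(Volume("Vol-B", "P1", [5, 6]))
second = run_vbackup(volumes, [4, 5, 6], 7, "Staging", "P1")

print(first.name, first.pool)
print(second.name, second.pool)
```

In this model the second consolidation succeeds because the staging pool is never an input pool. The cost is that each consolidated volume is written once and then moved, so no later vbackup appends to it, which matches the point about moved volumes above.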
