Re: [Bacula-users] Vbackup feature

2008-07-10 08:59:10
Subject: Re: [Bacula-users] Vbackup feature
From: Kern Sibbald <kern AT sibbald DOT com>
To: Dan Langille <dan AT langille DOT org>
Date: Thu, 10 Jul 2008 14:58:56 +0200
On Thursday 10 July 2008 14:47:15 Dan Langille wrote:
> On Jul 10, 2008, at 7:45 AM, Kern Sibbald wrote:
> > Hello,
> >
> > I'm a bit burned out from intensive bug fixing over the last couple of
> > months, so I decided to do something totally new yesterday.  I started
> > implementing what I call Virtual Backup or Vbackup, which is essentially
> > project #3, "Merge multiple backups (Synthetic Backup or Consolidation)".
> >
> > In attempting to implement it, I've realized a few things:
> >
> > 1. It is probably better to implement it as a new "level" under the normal
> > Backup code, for example "level=vbackup".  The resulting output will be
> > recorded in the catalog as a "Full".
> >
> > 2. In almost all respects it must behave much like a Migration job, in that
> > it does not use a FD, it reads an existing set of backups, and writes them
> > to a new Volume.
> >
> > 3. One difference from a Migration job is that all the old jobs remain
> > unchanged (i.e. like a Copy).
> >
> > 4. Another difference is that it has far fewer features, in that it simply
> > finds all the current backup records and copies them.  There are no
> > complicated selection criteria.
> >
> > 5. Like the Migration and Copy jobs, the input Pool (from where it reads the
> > currently backed up data) and the output Pool (where it writes the merged
> > data) must be different.  This ensures that the job does not attempt to read
> > and write to the same device, which just will not work.
> >
> > Well, the problem with the above (principally item #5) shows up when you
> > consider the following:
> >
> > You have a job J1, which does a Full, one or more Diff backups, then any
> > number of Inc backups all going to Pool P1.  At some point in time (possibly
> > via the Schedule), you run a vbackup level, so it finds all the current
> > backup files (Full, last Diff, and all later Inc) and copies the data from
> > the input Pool (P1) to the output Pool (P2).
> >
> > Now, if you then redo a normal Full backup and restart with Diff and Inc
> > jobs again, all will work.
> >
> > However, it is much more likely that you will then continue doing incremental
> > backups (no more Full or Diffs).  At some point later, you want to do another
> > vbackup to "consolidate" all the Inc backups, and now the process fails,
> > because you are going to need to read from Pools P2 (Full produced by the
> > vbackup) and P1 (new Incs), and you will attempt to write to P2, which will
> > not work.
> >
> > Thus without some other mechanism to move Volumes from Pool to Pool, a setup
> > like the one described above won't work, and I suspect this is what will be
> > done the most frequently (i.e. do only one Full and thereafter vbackups when
> > there are enough Incs to warrant a consolidation).
>
> How does the above strategy cope with deleted files?  For example, foo.bar is
> included in the Full backup, but removed from disk before the next backup job.
>
> How does this strategy deal with the above if:
> - foo.bar is not explicitly listed in the FileSet
> - foo.bar is explicitly listed
>
> e.g. File = /usr
> versus File = /usr/foo.bar
>
> My thoughts: In the first case, the Vbackup job would not include foo.bar.
> In the second case, it should.  In both cases, the file could be retrieved
> from the original Full backup.
>
> Critics who think it should always be included should not be using Vbackup.

If you have done all your prior backups using Accurate = yes in the Job 
resource, then deleted files will be handled correctly.  We have discussed 
whether or not it would be a good idea to make Accurate the default, but it 
requires additional overhead on both the server and the client machines, so 
for the moment it must be explicitly turned on.  Note: Accurate is available 
only in the current development trunk code, and is not in the 2.4.x code.

Once you have done backups with Accurate enabled, the appropriate delete 
records are stored in the database, and all restores, Migrations, Copies, and 
Vbackups after that point will handle the delete records properly (i.e. that 
handling is always turned on).  However, it takes "Accurate = yes" to get the 
records into the database, and to enable the code in the FD that finds newly 
inserted files (even those with old dates).
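
As a rough illustration of why the delete records matter for a consolidation, 
here is a toy Python model.  It is only a sketch under stated assumptions: the 
BackupJob class, its fields, and the consolidate() function are hypothetical 
names made up for this example, and none of it reflects the actual Bacula code 
or catalog schema.

```python
from dataclasses import dataclass, field


@dataclass
class BackupJob:
    """Hypothetical stand-in for one backup job's catalog entries."""
    level: str                                 # "Full", "Diff", or "Inc"
    files: dict                                # path -> version saved by this job
    deleted: set = field(default_factory=set)  # delete records (Accurate jobs only)


def consolidate(jobs):
    """Merge jobs (oldest first) into the file list of one virtual Full."""
    merged = {}
    for job in jobs:
        # Newer copies of a file replace older ones.
        merged.update(job.files)
        # Delete records drop files that no longer exist on the client.
        for path in job.deleted:
            merged.pop(path, None)
    return merged


full = BackupJob("Full", {"/usr/foo.bar": 1, "/usr/keep": 1})
inc_accurate = BackupJob("Inc", {"/usr/keep": 2}, deleted={"/usr/foo.bar"})
inc_plain = BackupJob("Inc", {"/usr/keep": 2})

print(consolidate([full, inc_accurate]))  # {'/usr/keep': 2}
print(consolidate([full, inc_plain]))     # {'/usr/foo.bar': 1, '/usr/keep': 2}
```

In this model the Inc job that carries a delete record removes foo.bar from the 
merged result, while the Inc job without one leaves the stale copy from the 
Full in place.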
