Bacula-users

Re: [Bacula-users] Vbackup feature

2008-07-10 11:17:38
From: mark.bergman AT uphs.upenn DOT edu
To: Kern Sibbald <kern AT sibbald DOT com>
Date: Thu, 10 Jul 2008 11:17:25 -0400

In the message dated: Thu, 10 Jul 2008 13:45:02 +0200,
The pithy ruminations from Kern Sibbald on 
<[Bacula-users] Vbackup feature> were:
=> Hello,
=> 
=> I'm a bit burned out from intensive bug fixing over the last couple of months,
=> so decided to do something totally new yesterday.  I started implementing 
=> what I call Virtual Backup or Vbackup, which is essentially project #3 "Merge
=> multiple backups (Synthetic Backup or Consolidation)".


Yeah!


=> 
=> In attempting to implement it, I've realized a few things: 
=> 
=> 
=> 2. In most all respects it must behave much like a Migration job in that it 
=> does not use a FD, it reads an existing set of backups, and writes them to a 
=> new Volume.

Huh?

=> 
=> 3. One difference from a Migration job is that all the old jobs remain 
=> unchanged (i.e. like a Copy).  

OK...if the old jobs remain unchanged, then why copy all those millions of 
bits from physical media to physical media?

=> 
=> 4. Another difference is that it has many fewer features in that it simply 
=> finds all the current backup records and copies them.  There are no 
=> complicated selection criteria.

Sounds good.

=> 
=> 5. Like the Migration and Copy jobs, the input Pool (from where it reads the 
=> currently backed up data) and the output Pool (where it writes the merged 
=> data) must be different.  This ensures that the job does not attempt to read 
=> and write to the same device, which just will not work.
=> 
=> Well the problem with the above -- principally item #5 is consider the 
=> following:
=> 
=> You have a job J1, which does a Full, one or more Diff backups, then any 
=> number of Inc backups all going to Pool P1.  At some point in time (possibly 
=> via the Schedule), you run a vbackup level, so it finds all the current 
=> backup files (Full, last Diff, and all later Inc) and copies the data from
                                                         ^^^^^^^^^^^^^^^

I haven't been following the development list closely, so please forgive me if 
what I'm asking has already been discussed...perhaps I'm completely off-track...

Why copy the data at all? I see vbackups as being most useful for sites with a 
lot of data. If the data needs to be copied from one set of physical media to 
another, that would:

        be very slow -- the vbackup may take longer than a real backup (i.e.,
        read multiple tapes, compute the virtual backup, then write to multiple
        tapes)

        require significant server resources (how will multi-TB backups be
        "virtualized" without requiring tremendous spool space or RAM, if the
        actual backup data is being copied)

        tie up physical resources (tape drives, slots in a tape changer, etc)

        require human resources (i.e., inserting tapes, mounting external
        drives, etc)

What happens if the Full or last Diff that bacula wants to use in the vbackup 
isn't physically available (i.e., they're not in the tape changer, those USB 
drives or DVDs aren't mounted, etc.)?

I had thought that a virtual backup would operate at the database level as much
as possible. In other words, when a vbackup is run, the current backup
files (Full, last Diff, and all later Inc) are "tagged" within the database as
being used for both the physical backup where they were created and for the
virtual backup.
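
To make the "tagging" idea concrete, here is a rough sketch in Python with
sqlite3. It is not Bacula's catalog schema and not how Bacula works today; the
table vbackup_member and both function names are made up purely for
illustration. The point is only that a vbackup could be a set of rows pointing
at existing jobs, with no backup data moved.

```python
import sqlite3


def make_demo_catalog():
    """Create an in-memory stand-in for a catalog (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vbackup_member ("
        " vbackup_id INTEGER NOT NULL,"
        " job_id INTEGER NOT NULL,"
        " seq INTEGER NOT NULL)"
    )
    return conn


def tag_jobs_for_vbackup(conn, vbackup_id, job_ids):
    """Record, in restore order, which existing jobs make up a vbackup."""
    conn.executemany(
        "INSERT INTO vbackup_member (vbackup_id, job_id, seq) VALUES (?, ?, ?)",
        [(vbackup_id, job_id, seq) for seq, job_id in enumerate(job_ids)],
    )
    conn.commit()


def jobs_in_vbackup(conn, vbackup_id):
    """Return the tagged job ids for a vbackup, in restore order."""
    rows = conn.execute(
        "SELECT job_id FROM vbackup_member WHERE vbackup_id = ? ORDER BY seq",
        (vbackup_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

With this layout the original job records are untouched; the same job id can
appear both as a physical backup and as a member of one or more vbackups.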

I can understand writing records to physical media that represent the "diff"
between the current state and the combined backups that were already made. In
other words, the results of a vbackup would be something like a ".bsr" file,
specifying which existing physical backups would be used (in order), then
specifying a new backup that would be (virtually) superimposed on that
result--the new backup being the "diff" between the combined existing backups
and the current state of the client filesystem. The only things written to media
would be that "diff" and database updates that flag existing jobs as being
associated with the new vbackup.
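
As a sketch of what "specifying which existing physical backups would be used
(in order)" might look like, the Python below picks the most recent Full, the
last Diff after it, and every Inc after that. The Job class, the one-letter
level codes, and the function name are assumptions for the example, not
Bacula's actual data structures.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Job:
    job_id: int
    level: str          # assumed codes: "F" = Full, "D" = Diff, "I" = Inc
    end_time: datetime


def current_backup_chain(jobs):
    """Return the jobs a vbackup would reference, in restore order."""
    ordered = sorted(jobs, key=lambda j: j.end_time)
    fulls = [j for j in ordered if j.level == "F"]
    if not fulls:
        return []                      # nothing to build on without a Full
    full = fulls[-1]
    after_full = [j for j in ordered if j.end_time > full.end_time]

    chain = [full]
    base_time = full.end_time
    diffs = [j for j in after_full if j.level == "D"]
    if diffs:
        # Only the last Diff matters; it supersedes earlier Diffs and Incs.
        chain.append(diffs[-1])
        base_time = diffs[-1].end_time

    chain.extend(
        j for j in after_full if j.level == "I" and j.end_time > base_time
    )
    return chain
```

The resulting list is all that needs to be stored for the "virtual" part; the
new "diff" against the client filesystem would be the only fresh data.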

This would have the side effect of extending the expiration period on the
physical media (i.e., if a tape contains a 5-month old Full backup and full
backups (and the media) have an expiration period of 6 months, and that tape is
then flagged as part of a vbackup, the existing Full job record would expire as
normal, but the media and data records from the old Full backup would not be
purged until 6 months after the vbackup is made).
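
The date arithmetic in that example can be written out as a small Python
sketch. The function name is hypothetical, and "6 months" is approximated as
180 days just to keep the example simple.

```python
from datetime import date, timedelta


def media_purge_date(job_date, retention, vbackup_dates=()):
    """Earliest date the media could be purged.

    The retention clock restarts from the most recent vbackup that
    references the job; with no vbackups it runs from the job itself.
    """
    latest_reference = max([job_date, *vbackup_dates])
    return latest_reference + retention


# A Full made 2008-02-10 with ~6 months retention, referenced by a
# vbackup on 2008-07-10 (the Full is then 5 months old).
retention = timedelta(days=180)
without_vbackup = media_purge_date(date(2008, 2, 10), retention)
with_vbackup = media_purge_date(
    date(2008, 2, 10), retention, [date(2008, 7, 10)]
)
```

Here without_vbackup comes out a month or so after the vbackup date, while
with_vbackup lands roughly six months after it, which is the side effect
described above.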

If there's a need for the vbackup to be on its own media (for disaster
recovery, archiving, consolidating physical media, etc.), the process would be:

        run regular backups (Full, Incremental, Differential) over time

        run periodic vbackups

        run a copy job from the vbackup to new media (a new pool)

                
        [SNIP!]

=> 
=> Any comments?

Thanks for all your work!

Mark


----
Mark Bergman                              voice: 215-662-7310
mark.bergman AT uphs.upenn DOT edu                 fax: 215-614-0266
System Administrator     Section of Biomedical Image Analysis
Department of Radiology            University of Pennsylvania
      PGP Key: https://www.rad.upenn.edu/sbia/bergman 



