Re: [Bacula-users] How to avoid backing up restored files?

2010-01-15 02:23:54
From: Ralf Gross <Ralf-Lists AT ralfgross DOT de>
To: bacula-users AT lists.sourceforge DOT net
Date: Fri, 15 Jan 2010 08:21:11 +0100
Steve Costaras wrote:
> 
> 
> Some history:
> 
> On 01/14/2010 16:24, Dan Langille wrote:
> > Steve Costaras wrote:
> >> On 01/14/2010 15:59, Dan Langille wrote:
> >>> Steve Costaras wrote:
> >>>> I see the mtimeonly flag in the fileset options but there are many
> >>>> caveats about using it as you will miss other files that may have been
> >>>> copied over that have retained mtimes from before the last backup.
> >>>> Since bacula does an MD5/SHA1 hash of all files I assumed (wrongly it
> >>>> seems) that it would be smart enough to not back up files that it
> >>>> already had backed up and are on tape.
> >>> Smart enough?  Sheesh.  ;)
> >>>
> >>> That hash is to ensure the file is restored properly.  And for 
> >>> verification.  To do what you want is not easy.
> >> :)  Well, I figured it would be relatively easy as an option.  The
> >> hash is in the database, and the file has to be read from disk for
> >> backup anyway, so if the file name & hash match those that are in the
> >> database the file could be skipped.  (For 'real' completeness, and to
> >> keep in line w/ the accurate option, perhaps update the database with
> >> permissions et al. on the inode; since the content matches, that
> >> would save a lot of tapes.)
> >
> > In the database... not on the client.  That's the issue.  Let us not 
> > discuss this here.  It is not a trivial problem to do correctly.
> 
> I see your point.  The fd would need to get this data (or send the hash
> of the file to the director) 1) after reading and hashing the entire
> file and 2) before sending it to the director, to save not only time but
> network bandwidth.  Not to mention 3) it would need to buffer the file
> data in memory, or do another file system lookup and re-read of the same
> data.  Unless handled properly, that re-read would open more of a window
> for a race condition if the file was modified between the two reads and
> the time the data was stored in the catalogue.
> 
> That would be assuming you wanted to save both network bandwidth and 
> tape.   Otherwise, what if the fd acted normally and the decision was
> made by the director?  Does the FD talk directly to the SD, or does it
> need to go through the director as well?
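
To make the idea discussed above concrete, here is a minimal sketch of the
"skip it if name and hash already match" decision. It is not how Bacula
works and uses none of its APIs: the `catalog` dict is a hypothetical
stand-in for the hashes stored in the catalog database, and `file_digest`
and `files_to_back_up` are illustrative names. Note that every file still
has to be read in full before the decision can be made, which is the cost
raised in the thread.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def files_to_back_up(paths, catalog):
    """Select the files whose name and hash are not already in the catalog.

    `catalog` is a hypothetical dict mapping path -> hex digest; it stands
    in for a lookup against the real catalog database.
    """
    selected = []
    for path in paths:
        digest = file_digest(path)  # the whole file is read here
        if catalog.get(path) == digest:
            continue  # same name, same content: already backed up
        selected.append(path)
    return selected
```

In this sketch the comparison runs where the file is read, so the known
hashes would have to be made available on the client side first; that is
the part described above as not trivial.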

You might want to look into the new Accurate Backup feature.

http://www.bacula.org/manuals/en/concepts/concepts/New_Features.html#SECTION00310000000000000000

But I think it won't help you here. What might be more interesting,
especially with your amount of data, is the upcoming Base Job feature.

http://sourceforge.net/apps/wordpress/bacula/2009/09/30/new-basejob-feature/

Ralf
