[Bacula-users] How to avoid backing up restored files?

2010-01-14 17:39:45
Subject: [Bacula-users] How to avoid backing up restored files?
From: Steve Costaras <stevecs AT chaven DOT com>
To: Dan Langille <dan AT langille DOT org>
Date: Thu, 14 Jan 2010 16:36:49 -0600

Some history:

On 01/14/2010 16:24, Dan Langille wrote:
> Steve Costaras wrote:
>> On 01/14/2010 15:59, Dan Langille wrote:
>>> Steve Costaras wrote:
>>>> I see the mtimeonly flag in the fileset options but there are many
>>>> caveats about using it as you will miss other files that may have been
>>>> copied over that have retained mtimes from before the last backup.
>>>> Since bacula does an MD5/SHA1 hash of all files I assumed (wrongly it
>>>> seems) that it would be smart enough to not back up files that it
>>>> already had backed up and are on tape.
>>> Smart enough?  Sheesh.  ;)
>>>
>>> That hash is to ensure the file is restored properly.  And for 
>>> verification.  To do what you want is not easy.
>> :)  Well, I figured it would be relatively easy as an option: the
>> hash is in the database, and the file has to be read from disk for
>> backup anyway, so if the file name & hash match those that are in the
>> database the file could be skipped.  (For 'real' completeness, and to
>> keep in line w/ the accurate option, perhaps update the database with
>> permissions et al. on the inode; since the content matches, that
>> would still save a lot of tapes.)
>
> In the database... not on the client.  That's the issue.  Let us not 
> discuss this here.  It is not a trivial problem to do correctly.

I see your point. The fd would need to get this data (or send the hash
of the file to the director) 1) after reading the entire file and
calculating its hash, and 2) before sending the file to the director, to
save not only time but network bandwidth. Not to mention 3) it would
need either to buffer the file data in memory or to do another file
system lookup and re-read the same data. A re-read opens more of a
window for a race condition, unless handled properly, if the file is
modified between the two reads or before the data is stored in the
catalogue.
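
To make the sequence concrete, here is a rough sketch in Python. It is
not Bacula code, and catalog, send and backup_file are names I made up
for illustration. It takes the "re-read" route from point 3: hash
first, skip on a match, otherwise read the file a second time while
sending it, and hash again so a change between the two reads is at
least noticed.

```python
import hashlib

CHUNK = 64 * 1024  # read size; an arbitrary choice for this sketch


def hash_file(path):
    """First pass: read the whole file only to compute its SHA-1."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(path, catalog, send):
    """Return 'skipped', 'sent' or 'changed' for one file.

    catalog: hypothetical dict {path: sha1 hex} of files already on tape.
    send:    hypothetical callable that ships one chunk over the network.
    """
    first = hash_file(path)
    if catalog.get(path) == first:
        return "skipped"  # name and hash match, so nothing is sent

    # Second pass: re-read and send, hashing again to catch a file
    # that was modified between the two reads.
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(CHUNK), b""):
            digest.update(chunk)
            send(chunk)
    second = digest.hexdigest()
    if second != first:
        return "changed"  # data went out, but the catalog is not updated
    catalog[path] = second
    return "sent"
```

A skipped file costs one full read but no network traffic. A file that
is backed up costs two reads, which is the trade-off against buffering
the data in memory.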

That assumes you want to save both network bandwidth and tape.
Otherwise, what if the fd acted normally and the decision was made by
the director?  Does the FD talk directly to the SD, or does it need to
go through the director as well?

Hmmm.  The onion has a couple layers.  ;)


