Subject: Re: [BackupPC-users] Tar method - deleted files workaround
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Mon, 7 Oct 2013 15:54:55 -0500
On Mon, Oct 7, 2013 at 2:30 PM, Holger Parplies <wbppc AT parplies DOT de> 
wrote:
>
>> http://www.gnu.org/software/tar/manual/html_node/Incremental-Dumps.html
>
> from what I read there, surprisingly, tar files seem to be able to contain
> file deletions (i.e. extracting the archive will *delete* a file in the file
> system). This would mean it could actually work, at least in theory.
>
> On second reading, the documentation somewhat contradicts itself, so it's not
> really clear whether this is true. *Without* deletions being represented in
> the *tar file*, the whole exercise is somewhat pointless.

It doesn't 'represent deletions'; it stores the current full directory
listing for each directory, even in an incremental run, with each entry
marked as either included in this archive or not.  During the restore,
you have the option to delete anything that was not present when the
backup was taken.
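
In gnutar terms it looks roughly like this (names made up, from memory,
so check the manual page linked above):

  # level 0 - the per-directory listings go into the archive, and the
  # state gnutar needs for the next level goes into the snapshot file
  tar --create --file=full.tar --listed-incremental=backup.snar /home

  # level 1 against a copy of the snapshot, so the level-0 state is
  # still there untouched for the next level-1 run
  cp backup.snar backup-1.snar
  tar --create --file=incr1.tar --listed-incremental=backup-1.snar /home

  # restore in order; --incremental turns on the handling that removes
  # files found on disk but missing from the archived directory listings
  tar --extract --incremental --file=full.tar
  tar --extract --incremental --file=incr1.tar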

> The one problem I see is that you have a file with metadata ("snapshot file")
> in addition to the tar stream. While you *could* just keep that file at the
> remote end (on the backup client), there would need to be some preprocessing,
> i.e. copying the file for independent incrementals. This would also mean that
> BackupPC would be keeping part of its state on the client machine, which would
> be new (and probably undesired). Alternatively, the file could be copied
> between BackupPC server and client, perhaps in DumpPre/PostUserCmd. All of
> this means that the administrator of BackupPC needs to know much more about
> the backup process and the client machines (where may we put the snapshot
> file?).

Amanda has done this more or less forever and the admin doesn't need
to know anything about it.  It does use a client agent to do some of
the grunt work, but root ssh can do anything a local agent can do.
The main job of that independent agent is to let all of the targets
send size estimates (done with dump, or with the gnutar trick where,
if its output device is /dev/null, it doesn't bother doing the work of
writing the archive, so getting --totals is pretty cheap) so amanda
can schedule the right mix of fulls and incrementals to fill your
tape - something we don't need to worry much about.
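
Roughly (untested, share name made up, but it's the same trick amanda uses):

  # writing to /dev/null makes gnutar skip the work of producing the
  # archive, so --totals gives a cheap estimate of what a full would be
  tar --create --file=/dev/null --totals /some/share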

> That doesn't mean it can't be done. It just means part of the process would
> need to be implemented by the (expert) BackupPC administrator. And for *local*
> backups (where BackupPC server == client), native support would be possible.

No, you'd just need a writable space to hold the files.
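
For a remote client, something along these lines run from DumpPreUserCmd
would probably cover it (host and paths made up, untested):

  # stage a work copy of the level-0 snapshot before each incremental,
  # so the level-0 state never gets overwritten
  ssh -q -x -l root client \
      cp -f /var/cache/backuppc/level0.snar /var/cache/backuppc/work.snar

and then TarIncrArgs would add
--listed-incremental=/var/cache/backuppc/work.snar to the gnutar command.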

> Aside from that, we'd probably need support for file deletions in the BackupPC
> code. The rsync XferMethod already has that capability, so it shouldn't be too
> hard, I suppose. Providing this capability should be transparent for anyone
> not wanting to use --listed-incremental.

That part becomes a little more complicated - unless there is already
a perl module that understands the gnutar incremental dump format -
and even then it would have to be modified to apply the deletions
recorded in the archive to BackupPC's stored tree instead of to a real
filesystem.
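
If I remember the manual right, you can see what such a module would
have to chew on with something like:

  # with --incremental and a doubled -v, the listing shows the dumpdir
  # records: 'Y' = dumped into this archive, 'N' = exists but not dumped,
  # 'D' = subdirectory; anything on disk not listed is deleted on restore
  tar --list --incremental --verbose --verbose --file=incr1.tar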

> Some new variables might also be needed both in the *Pre/PostUserCmds and,
> perhaps, TarClientCmd and/or TarFullArgs/TarIncrArgs, for instance the number
> of the baseline backup and the incremental level.

I don't think those concepts would change.

> Hmm. How do we *store* the snapshot file(s) in our pool FS? If the UserCmds
> need to access them, we'd either need some kind of hook, or they could just
> access $TopDir/pc/... directly (which is sort of ugly).

I've forgotten the exact details of how amanda handles these files
(there will be overlapping sets so you can choose the level of your
next run).  It's not technically necessary to have the listing files
on the central server at all, but it might make management easier.

> Is anyone actually interested in experimenting with this option?

I don't currently have anything where gnutar would be a better option
than rsync - but the handling did seem sensible back when people
actually put stuff on media other than hard drives.

-- 
    Les Mikesell
      lesmikesell AT gmail DOT com
