Subject: Re: [BackupPC-users] Backing up a BackupPC server
From: Les Mikesell <les AT futuresource DOT com>
To: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
Date: Tue, 02 Jun 2009 17:06:29 -0500
Jeffrey J. Kosowsky wrote:
>  > 
>  > One simple thing that I've sometimes thought would be useful would 
>  > be a way to re-create the pool links down any pc tree.  That way 
>  > you could rsync individual pc directories to an offsite location 
>  > (which usually works OK even with -H, though I suppose there is 
>  > some limit), then reconstruct the pooling to reclaim the duplicate 
>  > space, repeating for each host or new backup without ever having 
>  > to deal with the size of the combined pool/pc tree.
>  > 
> 
> This would be a lot easier if the pool files themselves recorded the
> md5 hash that gives each pool file its name.  For compressed pool
> files, this would just mean extending the footer by 16 bytes.
> Actually, for speed it would probably be better to store this
> information in the header, right after the initial magic byte.
> Better still, add another couple of bytes to identify which element
> of the chain is being referenced when multiple files share the same
> pool hash (note: this would have to be adjusted when BackupPC_nightly
> runs and re-arranges the chain numbering).
> 
> Then you could pretty easily find the corresponding pool file from
> any of the hard links, without the usual reverse-lookup problem of
> trying to identify a pool file from the inode of a hard-linked file
> in the pc tree.
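
If I follow, the layout you're proposing would be something like this
(a minimal sketch in Python; the "<B16sH" packing and the 2-byte chain
index are just illustration, not anything BackupPC writes today):

    import struct

    # Hypothetical extended pool-file header: the existing magic byte,
    # then the 16-byte MD5 pool digest, then a 2-byte index saying
    # which member of a collision chain this file is.  "<" packs the
    # fields with no padding.
    HEADER = struct.Struct("<B16sH")

    def read_pool_ref(path):
        """Return (digest_hex, chain_index) from a pool file."""
        with open(path, "rb") as f:
            _magic, digest, chain = HEADER.unpack(f.read(HEADER.size))
        return digest.hex(), chain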

I'm not sure that would help much.  In the scenario I mentioned, the 
pool file may not exist at all, or the matching content may have been 
re-linked under a different name due to differences in hash collisions. 
I'd just like to be able to go through more or less the same motions 
the original server does when adding new entries, but on a backup copy 
or after an initial copy to a replacement server.  If the copy is just 
a backup, it would then also need to clean the pool periodically, the 
way BackupPC_nightly does.
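
Roughly, I'm picturing a pass like this over a copied pc tree (a
sketch only: pool_hash() is a stand-in for BackupPC's real hash and
collision handling, and the pool's directory fan-out is ignored):

    import os

    def repool(pc_dir, pool_dir, pool_hash):
        # Re-create pooling the way the server does for new files:
        # hash each file, then either hard-link it to an existing
        # pool entry or add it to the pool as a new entry.
        for root, _dirs, files in os.walk(pc_dir):
            for name in files:
                path = os.path.join(root, name)
                if os.stat(path).st_nlink > 1:
                    continue                  # already pooled
                pool_path = os.path.join(pool_dir, pool_hash(path))
                if os.path.exists(pool_path):
                    # Duplicate content: reclaim the space.  A real
                    # version would compare contents first to handle
                    # hash collisions.
                    os.unlink(path)
                    os.link(pool_path, path)
                else:
                    os.link(path, pool_path)  # new pool entry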

> Backing up the BackupPC data would then be as simple as the following:
> 1. Shut down BackupPC.
> 2. Copy the pool to the new destination (no hard links).
> 3. Recurse through the pc directories as follows:
>       - Copy directory entries to the new destination (i.e. recreate
>         directories using something like mkdir).
>       - Copy regular files with nlinks=1 to the new destination.
>       - For hard-linked files, use the header (or footer) to find the
>         cpool pathname (reconstructed from the hash and the chain
>         number), then create the corresponding link on the new
>         destination.
> 4. Restart BackupPC.
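
For what it's worth, step 3 might come out something like this if the
digest and chain number really were in the header (read_pool_ref() is
the hypothetical reader sketched above; the cpool fan-out and the
collision suffix are from memory, so check MD52Path before trusting
them):

    import os
    import shutil

    def copy_pc_file(src, dest, new_cpool, read_pool_ref):
        # Step 3, for one file in the pc tree.
        if os.stat(src).st_nlink == 1:
            shutil.copy2(src, dest)        # ordinary file: just copy
            return
        digest, chain = read_pool_ref(src)
        # cpool paths are fanned out by the first hash characters,
        # e.g. cpool/a/b/c/abc123...; here chain index 0 means the
        # bare hash name and higher values mean a numeric suffix.
        name = digest if chain == 0 else "%s_%d" % (digest, chain - 1)
        pool_path = os.path.join(new_cpool, digest[0], digest[1],
                                 digest[2], name)
        os.link(pool_path, dest)           # pooled file: just link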

Something close to that can be done now with BackupPC_tarPCCopy, as 
long as nothing changes between copying the pool and making the pc tree 
copy to go with it.  But I'd like something that would work for only a 
subset of the hosts, or that would let the remote copy expire old 
backups at a slower or faster pace than the master.

> If you don't add the pool hash information to the cpool file
> header/footer, then you could still do a similar process by adding an
> intermediate step (say, 2.5) that builds a lookup table: recurse
> through the pool and associate each inode with its cpool entry.  Then
> in step 3 you would use the inode number of each hard-linked file in
> the pc directory to look up the corresponding link that needs to be
> created.  This would require some cleverness to make the lookup fast
> for large pools where the entire table might not fit into memory.  My
> only concern is that this may require O(n^2) or O(n log n) operations
> vs. O(n) for the first method.
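
Your step 2.5 is simple enough while the table fits in memory; a
sketch:

    import os

    def build_inode_table(cpool_dir):
        # Step 2.5: map each cpool inode to its pool-relative path.
        # For a pool too big for RAM, this dict would have to become
        # an on-disk store (dbm, sqlite, or a sorted file).
        table = {}
        for root, _dirs, files in os.walk(cpool_dir):
            for name in files:
                path = os.path.join(root, name)
                table[os.stat(path).st_ino] = \
                    os.path.relpath(path, cpool_dir)
        return table

Step 3 then just looks up each pc file's st_ino in that table.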

But that requires the pool copy to be in sync with the master.  I'd 
rather recompute a hash and deal with collisions the way the server 
normally does.  The structure of the pool is designed to make this 
reasonably fast, although I think the hash is based on the uncompressed 
content, and you'd have a compressed copy at this point.
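
As I remember it, the pool hash is an MD5 over the uncompressed size
plus part of the uncompressed data, so the copy would have to inflate
each file to recompute it.  Something like this, assuming a single
plain zlib stream with no cached rsync checksums appended:

    import hashlib
    import zlib

    def pool_hash_from_compressed(path):
        # Recompute a pool-style hash from a compressed pool file.
        # Simplified: as I recall, the real hash covers only the
        # size plus the first (and, for big files, last) 128KB of
        # uncompressed data, not the whole thing.
        with open(path, "rb") as f:
            data = zlib.decompress(f.read())
        md5 = hashlib.md5()
        md5.update(str(len(data)).encode())
        md5.update(data)
        return md5.hexdigest()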

-- 
   Les Mikesell
    lesmikesell AT gmail DOT com


