Subject: Re: [BackupPC-users] more efficient: dump archives over the internet or copy the whole pool?
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: backuppc-users AT lists.sourceforge DOT net
Date: Tue, 02 Nov 2010 15:35:58 -0500
On 11/2/2010 2:42 PM, Frank J. Gómez wrote:
> A little background:
> ==============
> I've been hacking on a copy of BackupPC_archiveHost to run archives
> through GPG before saving them to disk.  The reason for this is that,
> when I say "saving to disk," I mean saving to an Amazon S3 share mounted
> locally via s3fs <http://code.google.com/p/s3fs/wiki/FuseOverAmazon>.
>   Apparently the policy makers here don't trust Amazon sysadmins to stay
> out of their data.  Needless to say, we're preparing for a disaster
> recovery scenario in case the building, with all our machines and
> backuppc server inside, burns down to the ground.
>
> Anyway, I thought I had it all figured out, but when I decrypt, gunzip,
> and untar the resulting file, I get some "tar: Skipping to next header"
> messages in the output, and, although I do get some files out of the
> archive, eventually tar just hangs.  I was bouncing some ideas off a
> colleague of mine, and he suggested I was going about this all wrong:
> "So you have a nice non-redundant repo, and you want to make it
> redundant before you push it over the net??? Talk sense man!"
>
> The main question:
> ==============
> He thinks it would be more bandwidth-efficient to tar up and encrypt the
> pool, which accounts for duplicate files and so forth, and send that
> over to S3.  I counter that the pool will contain data concerning the
> last 2 weeks or so of changes, which I'm not interested in for the
> purposes of disaster recovery, and that transferring over that extra
> data is less efficient.  Who's right?  And if it's my colleague, which
> folders should I be interested in?  It looks to me like cpool, log, and
> pc, as pool is empty for me (I use compression on all backups).

Your problem is that the de-duplication BackupPC maintains (the 
"non-redundant repo" your colleague mentions) depends on hardlinks in the 
filesystem.  Those aren't going to work on s3fs, and handling them any 
other way would need atomic operations.
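
If you want to see how much sharing is actually there, a quick link-count 
check under your topdir will show it.  Rough sketch below (untested; 
/var/lib/backuppc is just the Debian default topdir, adjust for your 
install):

#!/usr/bin/perl
# Rough, untested sketch: count how many files under pc/ are really just
# extra hardlinks into the (c)pool.  That sharing is exactly what gets
# lost when you copy the tree onto a filesystem like s3fs that has no
# hardlinks.
use strict;
use warnings;
use File::Find;

my $topdir = '/var/lib/backuppc';    # assumption: default Debian topdir
my ($files, $pooled) = (0, 0);

find(sub {
    my @st = lstat($_) or return;
    return unless -f _;              # plain files only
    $files++;
    $pooled++ if $st[3] > 1;         # nlink > 1 => shared with the pool
}, "$topdir/pc");

printf "%d of %d files under pc/ are hardlinked into the pool\n",
       $pooled, $files;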

I just ran across http://code.google.com/p/brackup/ which appears to do 
exactly what you want, but as a command-line tool run from a single host.  
If you have enough of a backup window to run it independently, that might 
work better than pulling the copy back out of BackupPC.  Or, since it is 
Perl, maybe you can glue it to something like BackupPC_tarCreate to 
pull the source data from BackupPC but store it chunked, de-duped, 
compressed, and encrypted in the cloud.
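
If you do go the BackupPC_tarCreate route, the glue doesn't have to be 
fancy.  Something along these lines would stream a host's latest backup 
straight through gzip and gpg onto the s3fs mount (untested sketch; the 
tarCreate path, host, share and key id are placeholders for whatever your 
setup uses):

#!/usr/bin/perl
# Untested sketch: pipe the most recent backup of one host through gzip
# and gpg onto the s3fs mount, so no clear-text archive ever lands on
# local disk.  All paths, the host/share names and the key id below are
# placeholders.
use strict;
use warnings;

my $tarCreate = '/usr/share/backuppc/bin/BackupPC_tarCreate';
my $host      = 'somehost';
my $share     = '/';
my $dest      = "/mnt/s3/$host-latest.tar.gz.gpg";

# -n -1 means "the most recent backup"; "." archives the whole share
my $pipeline  = "$tarCreate -h $host -n -1 -s $share . "
              . "| gzip "
              . "| gpg --batch --encrypt --recipient backup\@example.com "
              . "> $dest";

system($pipeline) == 0
    or die "archive pipeline failed (exit " . ($? >> 8) . ")\n";

That's more or less what your modified BackupPC_archiveHost already does, 
but pointing $dest at a local directory first makes it easy to verify the 
decrypt/gunzip/untar round trip before s3fs gets involved.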

-- 
   Les Mikesell
    lesmikesell AT gmail DOT com


_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/