Subject: Re: [BackupPC-users] BackupPC Pool synchronization?
From: <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Thu, 28 Feb 2013 21:43:05 -0500
Mark Campbell wrote at about 14:10:13 -0700 on Thursday, February 28, 2013:
 > So I'm trying to get a BackupPC pool synced on a daily basis from a 1TB MD 
 > RAID1 array to an external fireproof drive (with plans to also sync to a 
 > remote server at our colo).  I found the script BackupPC_CopyPcPool.pl by 
 > Jeffrey, but the syntax and the few examples I've seen online have indicated 
 > to me that this isn't quite what I'm looking for, since it appears to write 
 > its output in a different layout.  I initially tried the rsync method with 
 > -H, but my server would end up choking at 350GB.  Any suggestions on how to 
 > do this?

The bottom line is that, other than doing a block-level file system
copy, there is no "free lunch" that gets around the hard problem of
copying over densely hard-linked files.

As many like yourself have noted, rsync bogs down with the -H (hard
links) flag, in part because rsync knows nothing about the special
structure of the pool and pc trees, so it has to keep full track of
all possible hard links.

One solution is BackupPC_tarPCCopy, which uses a tar-like Perl script
to track and copy over the structure.

My script BackupPC_copyPcPool tries to combine the best of both
worlds. It allows you to use rsync or even "cp -r" to copy over the
pool while disregarding any hard links. The pc tree, with its links to
the pool, is re-created from a flat file that lists all the links,
directories, and zero-size files that make up the pc tree. Building
that flat file is done with the help of a hash that caches the inode
number of each pool entry. The pc tree is then recreated by
sequentially (re)creating the directories, zero-size files, and links
to the pool.
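
Stripped to its essence, the idea looks something like this (an
illustrative sketch only -- the paths and record format are made up for
the example, and the real script does quite a bit more):

#!/usr/bin/perl
# Illustration only -- not the actual BackupPC_copyPcPool code.
use strict;
use warnings;
use File::Find;

my $topdir = "/var/lib/backuppc";   # hypothetical TopDir
my %poolinode;                      # inode -> pool file path

# Phase 1a: cache the inode number of every pool/cpool file
find(sub {
    $poolinode{(lstat($_))[1]} = $File::Find::name if -f $_;
}, "$topdir/pool", "$topdir/cpool");

# Phase 1b: walk the pc tree and write one flat-file record per
# directory, pool link, or zero-size file
open(my $out, '>', "pctree.list") or die $!;
find(sub {
    my ($ino, $nlink, $size) = (lstat($_))[1, 3, 7];
    if (-d _) {
        print $out "D\t$File::Find::name\n";
    } elsif ($nlink > 1 && exists $poolinode{$ino}) {
        print $out "L\t$poolinode{$ino}\t$File::Find::name\n";
    } elsif ($size == 0) {
        print $out "Z\t$File::Find::name\n";
    }
}, "$topdir/pc");
close $out;

# Phase 2: on the target (after the pool itself has been copied with
# rsync or "cp -r"), replay the flat file to recreate the pc tree.
# Assumes the same TopDir layout on the target.
open(my $in, '<', "pctree.list") or die $!;
while (<$in>) {
    chomp;
    my ($type, @a) = split /\t/;
    if    ($type eq 'D') { mkdir $a[0] }
    elsif ($type eq 'Z') { open my $z, '>', $a[0]; close $z }
    elsif ($type eq 'L') { link $a[0], $a[1] }   # pool file -> pc path
}
close $in;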

I have substantially re-written my original script to make it orders
of magnitude faster by substituting a packed in-memory hash for the
file-system inode tree I used in the previous version. Several other
improvements have been added, including the ability to record
full-file md5sums and to fix broken/missing links.
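
Conceptually, the packing looks something like the following (again,
just to illustrate the general technique, not the actual code):

# Sketch of a packed inode cache (concept only).
# All (inode, pool-index) pairs live as fixed-width binary records in
# one big string, so each entry costs roughly the record width (12
# bytes here) instead of a full Perl hash entry; lookups are done by
# binary search after sorting.  (pack "Q" needs a 64-bit Perl.)
my $RECLEN = 12;              # 8-byte inode + 4-byte pool-file index
my $cache  = '';

sub cache_add {
    my ($inode, $poolidx) = @_;
    $cache .= pack("Q L", $inode, $poolidx);
}

sub cache_finalize {          # sort the records by inode number
    $cache = join '',
        sort { unpack("Q", $a) <=> unpack("Q", $b) }
        unpack("(a$RECLEN)*", $cache);
}

sub cache_lookup {            # binary search; returns pool-file index
    my ($inode) = @_;
    my ($lo, $hi) = (0, length($cache) / $RECLEN - 1);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        my ($ino, $idx) = unpack("Q L",
                                 substr($cache, $mid * $RECLEN, $RECLEN));
        return $idx if $ino == $inode;
        if ($ino < $inode) { $lo = $mid + 1 } else { $hi = $mid - 1 }
    }
    return undef;
}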

I was able to copy over a BackupPC tree consisting of 1.3 million pool
files (180 GB) and 24 million pc tree entries (4 million directories, 20
million links, 300 thousand zero-length files) in the following times:

~4 hours to copy over the pool
~5 hours to create the flat file mapping out the pc tree directories,
  hard links, and zero-length files
~7 hours to convert the flat file into a new pc tree on the target filesystem

These numbers are approximate since I didn't time it precisely. But it
was all done on a low-end AMD dual-core laptop with a single USB3
drive.

For this case, the flat file of links/directories/zero-length files is
660 MB compressed (about 3.5 GB uncompressed). The inode caching
requires about 250 MB of RAM (mostly due to Perl overhead) for the 1.3
million pool files.

Note that before I release the revised script, I also hope to add a
feature that allows copying one or more backups from the pc tree on
one machine to the pc tree on another machine (with a different
pool). This feature is not available in any other backup scheme... and
will effectively allow "incremental-like" backups.

I also plan to add an option to pack the inode cache even more tightly,
saving memory at the expense of some speed. I should be able to fit
10 million pool nodes in a 300 MB cache.
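(For comparison, the current cache works out to about 250 MB / 1.3
million ≈ 190 bytes per pool entry, while 300 MB / 10 million is only
about 30 bytes per node.)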

I would like to benchmark my revised routine against
BackupPC_tarPCCopy in terms of speed, memory requirement, and
generated file size...

_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/