> * For most people, rsync does not work to replicate a backup server
> effectively. Period. I think *no* one would suggest this as a reliable
> ongoing method of replicating a BackupPC server. Ever.
>
> * The best methods for this boil down to two camps:
>   1) Run two BackupPC servers and have both back up the hosts
>      directly
>      No replication at all: it just works.
>   2) Use some sort of block-based method of replicating the data
>
> * Block-based replication boils down to two methods
>   1) Use md or dm to create a RAID-1 array and rotate members of
>      this array in and out
>   2) Use LVM to create snapshots of partitions and dd the partition
>      to a different drive
>   (I guess 3) Stop BackupPC long enough to do a dd of the partition
>   *without* LVM)
I think there is a 3rd camp:

3) Scripts that understand the special structure of the pool and pc
   trees and efficiently create lists of all hard links in the pc
   directory.

   a] BackupPC_tarPCCopy
      Included in standard BackupPC installations. It uses a perl
      script to recurse through the pc directory, calculate (and cache,
      if you have enough memory) the file name md5sums, and then uses
      those to create a tar-formatted file of the hard links that need
      to be created. This routine has been well tested, at least on
      smaller systems.

   b] BackupPC_copyPcPool
      A perl script that I recently wrote that should be significantly
      faster than [a], particularly on machines with low memory and/or
      slower CPUs. This script creates a new temporary
      inode-number-indexed pool to allow direct lookup of links,
      avoiding the need to calculate and check file name md5sums. The
      pool is then rsynced (without hard links -- i.e. no -H flag),
      and the restore script is run to recreate the hard links. I
      recently used this to successfully copy a pool of almost 1
      million files and a pc tree of about 10 million files. See the
      recent archives to retrieve a copy.
Some tape backup systems aren't smart about hard links. If you back up
the BackupPC pool to tape you need to make sure that the tape backup
system is smart about hard links. For example, if you simply try to tar
the BackupPC pool to tape you will back up a lot more data than is
necessary.

Using the example at the start of the installation section, 65 hosts are
backed up with each full backup averaging 3.2GB. Storing one full backup
and two incremental backups per laptop is around 240GB of raw data. But
because of the pooling of identical files, only 87GB is used (with
compression the total is lower). If you run du or tar on the data
directory, there will appear to be 240GB of data, plus the size of the
pool (around 87GB), or 327GB total.

If your tape backup system is not smart about hard links, an alternative
is to periodically back up just the last successful backup for each host
to tape. Another alternative is to do a low-level dump of the pool
file system (ie: /dev/hda1 or similar) using dump(1).

Supporting more efficient tape backup is an area for further
development.
I think that this answers your questions about tape requirements if directly tar'ing the pool. You might be better off just scheduling host archives periodically.
I don't know if Jeffrey Kosowsky still monitors the list, but somebody might have a copy of his scripts (3b, above). Unfortunately, these were part of the original BackupPC Wiki, which is no longer available.