BackupPC-users

Re: [BackupPC-users] backup the backuppc pool with bacula

2009-06-10 16:50:00
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: backuppc-users AT lists.sourceforge DOT net
Date: Wed, 10 Jun 2009 15:45:22 -0500
jhaglund wrote:
> There are several implied references here to likely problems with rsync and 
> how they are all deal breakers.  I've been trying to find a solution to this 
> problem for weeks and have not found any direct documentation or evidence to 
> support what is being said here.  I'm not skeptical, though, I just need to 
> understand what's going on.

It boils down to how much RAM rsync needs to handle all the directory 
entries and hardlinks and the amount of time it takes to wade through 
them.

> Rsync is the only option for me, and I'm rather confused by the other 
> solutions floated in this and other threads.  On-site backup is precarious 
> and viable only in a datacenter type situation imho.  What about the fire 
> scenario?  Getting the data somewhere else is crucial, and in my case I am 
> limited to rsync through rsh.  I'm running rsync 3.0.6 but the server is 
> 2.6.x.  I have ~ 1.9 files found by rsync and it always fails on some level.  
> I use -aH but it randomly exits with an unknown error during remote 
> comparison or the initial transfers.  During the transfer phase it says it's 
> sending data, but nothing shows up on the server.  The server admins are not 
> aware of any incompatibility with their filesystem and the internet does not 
> seem to deal with this problem, which brings me back to the initial question.

Running 3.x on both ends might help.  rsync 3.x claims not to need the 
whole directory tree in memory at once - but it still has to build a 
table mapping every inode with more than one link (in a BackupPC pool, 
essentially everything) in order to re-create the hardlinks, so you have 
to throw a lot of RAM at it anyway.  You shouldn't actually crash unless 
you run out of both RAM and swap, but once you push the system into swap 
you might as well quit.
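
To make the memory cost concrete, here is a minimal sketch of a copy that 
preserves hardlinks.  It is not rsync's actual implementation; the 
function name copy_preserving_hardlinks and the dict called seen are 
made up for illustration, and it assumes the tree holds only 
directories, regular files and symlinks.  The point is that the table 
needs one entry per multiply-linked inode and must stay in memory until 
the whole tree has been walked.

```python
import os
import shutil


def copy_preserving_hardlinks(src, dst):
    """Copy the tree at src to dst, re-creating hardlinks between files.

    Returns the number of entries left in the inode table, i.e. what had
    to be held in memory for the whole run.
    """
    seen = {}  # (st_dev, st_ino) -> path of the first copy made under dst
    for dirpath, dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            source = os.path.join(dirpath, name)
            target = os.path.join(target_dir, name)
            st = os.lstat(source)
            key = (st.st_dev, st.st_ino)
            if st.st_nlink > 1 and key in seen:
                # Another name for an inode already copied: link, don't copy.
                os.link(seen[key], target)
                continue
            shutil.copy2(source, target, follow_symlinks=False)
            if st.st_nlink > 1:
                # Remember where this inode landed so later names can link to it.
                seen[key] = target
    return len(seen)
```

In a tree where nearly every file has more than one link, the table ends 
up with roughly one entry per file, which is why the total size in GB 
matters less than the number of files.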

Note that if you can do rsync over ssh initiated from the other site, 
you could just run the BackupPC server there instead, or run a second, 
independent BackupPC server there.  Unless you have a lot of duplication 
among the on-site servers, there wouldn't be a huge difference in 
traffic after the initial copy, and you wouldn't have a single point of 
failure.

> What does one use if not rsync?

The main alternative is some form of image-copy of the archive 
partition.  This is only practical if you have physical access to the 
server or very fast network connections.

> There's no way to justify or implement backing up the entire pool every time 
> without a lot of bandwidth, which I don't have.  What exactly is rsync's 
> problem?  Do I really need to shut down backuppc every time I want to attempt 
> a sync or would syncing to a local disk and rsync'ing from that be 
> sufficient?  I'd really like to know the specifics of the hardlink and inode 
> problem talked about in this thread like how to find out how many I have and 
> what the threshold is for Trouble and how the rest of the community deals 
> with getting pools of 100+GB offsite in less than a week of transfer time.

100 GB might be feasible - though it depends more on the file sizes and 
on how many directory entries you have than on the total size.  And you 
might have to make the first copy on-site so that subsequently you only 
have to transfer the changes.
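
The raw bandwidth side of the question is easy to estimate.  The sketch 
below (required_mbps is a made-up helper, and it assumes decimal 
gigabytes and a link that stays busy the whole time) gives the sustained 
rate needed to move a given amount of data in a given number of days.  
It deliberately ignores the per-file cost of scanning directory entries 
and rebuilding hardlinks, which is the part that usually hurts.

```python
def required_mbps(size_gb, days):
    """Sustained rate in Mbit/s needed to move size_gb (decimal GB) in days."""
    bits = size_gb * 1e9 * 8
    seconds = days * 24 * 60 * 60
    return bits / seconds / 1e6
```

For example, required_mbps(100, 7) works out to roughly 1.3 Mbit/s, so 
a 100 GB first copy in a week is within reach of a modest uplink as far 
as bandwidth alone goes.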

> Lots of info requests, I know, but I really appreciate the help.  My ISP and 
> all the experts I've tapped are completely stumped on this one.

The root of the problem is that rsync has to include the entire archive 
in one pass to map the matching hardlinks - and it has to be able to 
hold the directory and inode table in RAM to do it at a usable speed. 
The other limiting issue is that the disk heads have to move around a 
lot to read and re-create all those directory entries and update the 
inode link counts.
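
To find out how many directory entries and hardlinked inodes a tree 
holds, something along these lines can be run against it.  This is only 
a sketch: survey is a made-up name, the script is not part of BackupPC 
or rsync, and the sets it builds use memory in much the same way the 
hardlink table does, so expect it to be slow and hungry on a large 
pool.  It gives no threshold for trouble; it only gives you the counts 
to compare against the RAM you have.

```python
import os
import stat


def survey(root):
    """Count directory entries and inodes under root without following symlinks."""
    entries = 0
    inodes = set()        # every distinct (device, inode) pair seen
    multi_linked = set()  # non-directory inodes with more than one link
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            entries += 1
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            inodes.add(key)
            # Directories normally report a link count above 1, so skip them.
            if st.st_nlink > 1 and not stat.S_ISDIR(st.st_mode):
                multi_linked.add(key)
    return {
        "directory_entries": entries,
        "unique_inodes": len(inodes),
        "multi_linked_inodes": len(multi_linked),
    }
```

Calling survey() on the top of the archive directory returns the three 
counts as a dict.  The gap between directory_entries and unique_inodes 
shows how much of the tree consists of extra names for inodes that 
already exist.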

-- 
   Les Mikesell
    lesmikesell AT gmail DOT com
