Subject: Re: [BackupPC-users] backup the backuppc pool with bacula
From: Holger Parplies <wbppc AT parplies DOT de>
To: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
Date: Thu, 11 Jun 2009 14:31:02 +0200
Hi,

Jeffrey J. Kosowsky wrote on 2009-06-11 00:25:37 -0400 [Re: [BackupPC-users] backup the backuppc pool with bacula]:
> Holger Parplies wrote at about 04:22:03 +0200 on Thursday, June 11, 2009:
>  > Les Mikesell wrote on 2009-06-10 15:45:22 -0500 [Re: [BackupPC-users] backup the backuppc pool with bacula]:
>  > [...]
>  > the file list [...] can and has been [optimized] in 3.0 (probably meaning
>  > protocol version 30, i.e. rsync 3.x on both sides).
> 
> Holger, I may be wrong here, but I think that you get the more
> efficient memory usage as long as both client & server are version >=3.0 
> even if protocol version is set to < 30 (which is true for BackupPC
> where it defaults back to version 28). 

Firstly, it's *not* true. BackupPC (as the client-side rsync) is not
version >= 3.0. It's not even really rsync at all, and I doubt File::RsyncP
is more memory-efficient than rsync, even though its core code is in C and
copied from rsync.

Secondly, I'm *guessing* that an incremental file list would need a
protocol modification. My understanding is that instead of one big file list
comparison done before the transfer, 3.0 does partial file list comparisons during
the transfer (otherwise it would need to traverse the file tree at least twice,
which is something you'd normally avoid). That would clearly require a
protocol change, wouldn't it?

Actually, I would think that rsync < 3.0 *does* need to traverse the file tree
twice, so the change might even have been made because of the wish to speed up
the transfer rather than to decrease the file list size (it does both, of
course, as well as better utilize network bandwidth by starting the transfer
earlier and allowing more parallelism between network I/O and disk I/O -
presuming my assumptions are correct).
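
To illustrate what I mean (this is only a conceptual sketch in Python, not
rsync's actual implementation, and the transfer step and path are made-up
placeholders): pre-3.0 behaviour corresponds to building the complete file list
in memory before any comparison starts, while 3.0 behaviour corresponds to
producing entries while the tree is still being walked, so the transfer can
start early and memory stays bounded.

import os

def full_file_list(top):
    # Pre-3.0 behaviour, conceptually: the complete list is built (and kept
    # in memory) before any comparison or transfer starts.
    entries = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            entries.append(os.path.join(dirpath, name))
    return entries

def incremental_file_list(top):
    # 3.0 behaviour, conceptually: entries are produced while the tree is
    # still being walked, so transferring early entries overlaps with
    # scanning later ones and only a small window is held in memory.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            yield os.path.join(dirpath, name)

# for path in incremental_file_list(pool_top):   # pool_top: placeholder path
#     transfer_if_changed(path)                  # placeholder transfer step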

> But I'm not an expert and my understanding is that the protocols themselves
> are not well documented other than looking through the source code.

Neither am I. I admit that I haven't even looked for documentation (or at the
source code). It just seems logical to implement it that way.

I can't rule out that the optimization could be possible with the older
protocol versions, but then, why wouldn't rsync have always operated that way?

>  > > > and how the rest of the community deals with getting pools of
>  > > > 100+GB offsite in less than a week of transfer time.
>  > > 
>  > > 100 Gigs might be feasible - it depends more on the file sizes and how 
>  > > many directory entries you have, though.  And you might have to make the 
>  > > first copy on-site so subsequently you only have to transfer the changes.
>  > 
>  > Does anyone actually have experience with rsyncing an existing pool to an
>  > existing copy (as in: verification of obtaining a correct result)? I'm kind of
>  > sceptical that pool chain renumbering will be handled correctly. At least, it
>  > seems extremely complicated to get right.
> 
> Why wouldn't rsync -H handle this correctly? 

I'm not saying it doesn't. I'm saying it's complicated. I'm asking whether
anyone has actually verified that it does. I'm asking because it's an
extremely rare corner case that the developers may not have had in mind and
thus may not have tested. The massive usage of hardlinks in a BackupPC pool
clearly is something they did not anticipate (or, at least, feel the need to
implement a solution for). There might be problems that appear only in
conjunction with massive counts of inodes with nlinks > 1.

In another thread, an issue was described that *could* have been caused by
this *not* working as expected (maybe crashing rather than doing something
wrong, not sure). It's unclear at the moment, and I'd like to be able to rule
it out on the basis of something more than "it should work, so it probably
does".

I'm also saying that pool backups are important enough that we should verify
the contents, looking closely at the corner cases we are aware of.

> And the renumbering will change the timestamps which should alert rsync to
> all the changes even without the --checksum flag.

This part I'm not sure on. Is it actually *guaranteed* that a rename(2) must
be implemented in terms of unlink(2) and link(2) (but atomically), i.e. that
it must modify the inode change time? The inode is not really changed, except
for the side effect of (atomically) decrementing and re-incrementing the link
count. By virtue of the operation being atomic, the link count is
*guaranteed* not to change, so I, were I to implement a file system, would
feel free to optimize the inode change away (or simply not implement it in
terms of unlink() and link()), unless it is documented somewhere that updating
the inode change time is mandatory (though it really is *not* an inode change,
so I don't see why it should be).
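
This is at least easy to test on a given file system. A quick sketch (the file
names are throwaway placeholders); whether the printed ctimes differ answers
the question for that particular file system and OS, nothing more:

import os, time

path_old, path_new = "ctime_test_a", "ctime_test_b"   # throwaway test files
with open(path_old, "w") as f:
    f.write("test\n")
ctime_before = os.stat(path_old).st_ctime_ns

time.sleep(1.1)                    # make a ctime update observable
os.rename(path_old, path_new)      # the rename(2) in question
ctime_after = os.stat(path_new).st_ctime_ns

print("ctime changed by rename:", ctime_after != ctime_before)
os.unlink(path_new)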

Does rsync even act on the inode change time? File modification time will be
unchanged, obviously. rsync's focus is on the file contents and optionally
keeping the attributes in sync (as far as it can). ctime is an indication that
attributes have been changed (which may mask a content change), but attributes
are compared "in full" anyway (if requested), aren't they?

Either way, if rsync is aware of the change, it will work (rsync should simply
need to delete the target and re-link according to its inode map, just as if
the link had not been there in the first place). If not, rsync would need to
keep and check a mapping {source inode number -> dest inode number} (for all
files with nlinks > 1) to find out if all links still reference the same inode.
That is a closer examination than is done for single-link files without
--checksum, and a rather expensive one. I'm not saying this doesn't happen. I
didn't check the source code. It would make sense to make '-H' add this check.
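
To make that concrete, the kind of bookkeeping I mean would look roughly like
this (just a sketch of the idea, not rsync's code, and I have no idea whether
'-H' really does anything of the sort):

import os

def find_relink_mismatches(src_top, dst_top):
    # For every multiply-linked source file, remember which destination inode
    # its relative path points to. If two source paths share an inode but map
    # to different destination inodes, the link structure has diverged (e.g.
    # after pool chain renumbering) even though size/mtime may be unchanged.
    ino_map = {}        # source inode -> destination inode
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(src_top):
        for name in filenames:
            src_path = os.path.join(dirpath, name)
            src_st = os.lstat(src_path)
            if src_st.st_nlink < 2:
                continue
            rel = os.path.relpath(src_path, src_top)
            try:
                dst_ino = os.lstat(os.path.join(dst_top, rel)).st_ino
            except FileNotFoundError:
                mismatches.append((rel, "missing on destination"))
                continue
            expected = ino_map.setdefault(src_st.st_ino, dst_ino)
            if expected != dst_ino:
                mismatches.append((rel, "linked to a different inode than its peers"))
    return mismatches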

> Or are you saying it would be difficult to do this manually with a
> special purpose algorithm that tries to just track changes to the pool
> and pc files?

I haven't given that topic much thought. The advantage in a special purpose
algorithm is that we can make assumptions about the data we are dealing with.
We shouldn't do this unnecessarily, but if it has notable advantages, then why
not? "Difficult" isn't really a point. The question is whether it can be done
efficiently.

> More generally, I think we really need to find a guinea pig to spend
> some time testing the methods that you and I have discussed of
> creating a sorted inode database of the pool.

Yes, and we need to think about how to *verify* such a copy. A verification
tool would also answer my question above. The algorithm for creating the
initial copy is not complex, so testing some sample cases might be sufficient.
I expect incremental updates to make the situation far more difficult. It
could be difficult to even imagine which cases could go wrong, so it would be
nice to have a tool that fully verifies that content and hardlink relationships
in a pool copy match the original.
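
As a starting point, the "sorted inode database" and the verification could
both be built on something as simple as this (a sketch only; the function
names are mine, and a real verifier would of course also have to compare file
contents, which I leave out here):

import os

def hardlink_groups(top):
    # {inode: [relative paths...]} - all names that share an inode, i.e. one
    # hardlink group (a pool file and its pc/ tree links), end up together.
    groups = {}
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            groups.setdefault(os.lstat(path).st_ino, []).append(
                os.path.relpath(path, top))
    return groups

def same_link_structure(src_top, dst_top):
    # Inode numbers themselves will differ between original and copy, so
    # compare only the *partition* of paths into hardlink groups.
    src = set(frozenset(v) for v in hardlink_groups(src_top).values())
    dst = set(frozenset(v) for v in hardlink_groups(dst_top).values())
    return src == dst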

> Then it would be
> instructive to compare execution times vs the straight rsync -H method
> and vs. the tar method. For small pools, I imagine rsync -H would be
> faster, but at some point the database would presumably be
> faster. Presumably the tar method would be slowest of all. The devil
> of course is in the details.

I agree. But the important point is scalability rather than speed. We need
something that will continue to work regardless of pool size. You can still
use rsync on small pools and switch at an appropriate time (i.e. before a
failing rsync update breaks your copy, even if the "database version", as you
call it, is still somewhat slower).

> Either way, this issue seems to be becoming a true FAQ for this list --

Always has been ;-).

> so we should probably agree on some definitive answer (or set of
> answers) so that we can put this one to rest.

Definitely. Somehow I still see people giving different answers and restarting
the discussion all over again ;-).

> My personal belief is that while disk images or ZFS may be the "ideal"
> answer, there still is a need for an alternative even if slower method
> for reliably backing up (and ideally incrementally synching) just
> $topdir for those who don't/can't back up the whole partition or who
> can't run ZFS. My understanding is that the simple answer of "rsync -H"
> seems to not be reliable enough on large pools at least for some
> people.

In addition, there are cases where the "copy" is to be stored on something
that doesn't support hardlinks. As long as the "copy" doesn't have to be
functional (but rather allow re-creating a functional pool), that is no
problem. It is not difficult to accommodate this case - at least for the
initial copy - if we have it in mind from the start. We just need to split
up the copy operation into a "send" and a "receive" part (like 'tar -c' and
'tar -x') which can be plugged together for a straight copy or generate an
easily storable intermediate result. Incrementals might be harder, but we
should at least look into it.
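
Roughly what I have in mind (the names and the record format are invented for
illustration, and the actual file data is elided): the "send" side emits each
path together with a hardlink-group id, so the intermediate stream or storage
needs no hardlink support at all; the "receive" side re-creates the links from
the group ids.

import json, os

def send(top, stream):
    # One JSON line per path: relative path, hardlink-group id, and a flag
    # marking the first member of each group (the one whose data a real tool
    # would transmit; the data itself is omitted in this sketch).
    group_of = {}                   # inode -> group id
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            ino = os.lstat(path).st_ino
            first = ino not in group_of
            gid = group_of.setdefault(ino, len(group_of))
            stream.write(json.dumps({"path": os.path.relpath(path, top),
                                     "group": gid, "first": first}) + "\n")

def receive(top, stream):
    # Re-create the tree: store content for the first member of each group,
    # hardlink every later member to it.
    first_path = {}                 # group id -> path of the first member
    for line in stream:
        rec = json.loads(line)
        dst = os.path.join(top, rec["path"])
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        if rec["first"]:
            open(dst, "w").close()  # placeholder: a real tool writes the data
            first_path[rec["group"]] = dst
        else:
            os.link(first_path[rec["group"]], dst)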

Furthermore, I'd like to keep pool merging in mind. If we had a way to copy a
pool into a pre-existing *different* pool, that would be great. And it really
doesn't seem hard either, if we use PoolWrite() instead of File::Copy (well,
there might be some details to figure out, and it might be easier to make use
of the already known BackupPC hash and simply handle collisions like
PoolWrite() would). It may completely conflict with incremental updates
though. Or incrementals might be new pc/ file trees (based on timestamps) that
are merged into a pre-existing pool copy? Hmm ... there's potential there.
Generate a list of pool files and *some* pc/ directories, based on timestamp,
instead of attempting to handle the whole structure. That would miss changes
of existing backups (like deleting individual files), but BackupPC doesn't
really endorse changes of existing backups, does it? ;-)
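
Handwaving over all the details (chain renumbering, attrib files, compression),
merging a single file into an existing pool by its BackupPC hash might look
roughly like this; the directory layout, the chain suffix convention and
pool_hash are stand-ins from memory, not BackupPC's real API:

import filecmp, os, shutil

def merge_pool_file(src_file, pool_hash, dest_pool):
    # Walk the destination chain for this hash: identical content means we can
    # simply reuse that pool file, different content means a collision, so we
    # move on to the next chain suffix; if we fall off the end, add a new one.
    base = os.path.join(dest_pool, pool_hash[0], pool_hash[1], pool_hash[2],
                        pool_hash)
    n = 0
    while True:
        candidate = base if n == 0 else "%s_%d" % (base, n - 1)
        if not os.path.exists(candidate):
            os.makedirs(os.path.dirname(candidate), exist_ok=True)
            shutil.copy2(src_file, candidate)       # new pool entry
            return candidate
        if filecmp.cmp(src_file, candidate, shallow=False):
            return candidate                        # already pooled, reuse
        n += 1

The pc/ tree files of the copied backups would then simply be hardlinked to
whatever path this returns.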

Regards,
Holger

P.S.: I won't find any more time until at least Sunday, so please excuse me
      for not responding until then.
