BackupPC-users

Re: [BackupPC-users] How does BackupPC_tarPCCopy getting around hard link issue?

2011-01-21 12:10:34
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: Craig Barratt <cbarratt AT users.sourceforge DOT net>
Date: Fri, 21 Jan 2011 12:07:36 -0500
AHHHH OK - so no magic.
I just coded up a new way that should in general be significantly
faster.

Basically, I create a new inode-centered pool, which I call 'ipool',
laid out as a decimal-based tree (rather than the hexadecimal-based
pool/cpool trees). You can set how many levels you want. Then I
recurse through the pool/cpool and, for every entry, store a
corresponding file in the ipool based on the pool/cpool *inode*
number. The file's contents are set to the *name* of the pool/cpool
file (actually the path relative to TopDir). Note that the ipool is
indexed by the least significant digits of the inode number to ensure
a more uniform distribution across the tree.
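
The layout described above can be sketched roughly as follows. This is
an illustration in Python, not the code mentioned in this message
(BackupPC itself is written in Perl). The function names, the default
of two levels, one decimal digit per level, and naming each entry by
its full inode number are all assumptions made for the example.

```python
import os

def ipool_path(ipool_root, inode, levels=2, digits_per_level=1):
    """Map an inode number to an entry path in a decimal tree.

    Hypothetical layout: directory levels come from the least
    significant decimal digits of the inode number.
    """
    s = str(inode).zfill(levels * digits_per_level)
    parts = []
    for i in range(levels):
        end = len(s) - i * digits_per_level
        parts.append(s[end - digits_per_level:end])
    return os.path.join(ipool_root, *parts, str(inode))

def build_ipool(top_dir, ipool_root, levels=2):
    """Walk pool/ and cpool/ under top_dir and write one ipool entry
    per file. Each entry holds the pool file's path relative to
    top_dir. Returns the number of entries written."""
    count = 0
    for pool in ("pool", "cpool"):
        for dirpath, _dirs, files in os.walk(os.path.join(top_dir, pool)):
            for name in files:
                full = os.path.join(dirpath, name)
                inode = os.lstat(full).st_ino
                entry = ipool_path(ipool_root, inode, levels)
                os.makedirs(os.path.dirname(entry), exist_ok=True)
                with open(entry, "w") as f:
                    f.write(os.path.relpath(full, top_dir))
                count += 1
    return count
```

With these assumed defaults, inode 123456 would land at
<ipool>/6/5/123456, so consecutive inode numbers spread across
different leaf directories.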

Then you can recurse through the pc tree and quickly look up each
inode to find its pool/cpool location via my ipool construct.
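
The lookup step might look something like the sketch below, again in
Python and again only an illustration. It assumes the same
hypothetical ipool layout as before (levels taken from the least
significant decimal digits, entry named by the inode number); the
function names are made up for the example.

```python
import os
import stat

def ipool_path(ipool_root, inode, levels=2, digits_per_level=1):
    """Hypothetical layout: levels come from the least significant
    decimal digits of the inode number."""
    s = str(inode).zfill(levels * digits_per_level)
    parts = []
    for i in range(levels):
        end = len(s) - i * digits_per_level
        parts.append(s[end - digits_per_level:end])
    return os.path.join(ipool_root, *parts, str(inode))

def lookup_pool_file(ipool_root, inode, levels=2):
    """Return the stored pool/cpool path for an inode, or None if the
    ipool has no entry for it."""
    try:
        with open(ipool_path(ipool_root, inode, levels)) as f:
            return f.read()
    except FileNotFoundError:
        return None

def map_pc_tree(pc_dir, ipool_root, levels=2):
    """Yield (pc_file, pool_path_or_None) for each regular file under
    pc_dir. Only one lstat and one small file read per pc file; no
    file contents are hashed and no inode table is held in memory."""
    for dirpath, _dirs, files in os.walk(pc_dir):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if not stat.S_ISREG(st.st_mode):
                continue
            yield full, lookup_pool_file(ipool_root, st.st_ino, levels)
```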

I haven't benchmarked it, but I expect this to be significantly
faster in general than (re)computing the partial file md5sum for
each file in the pc tree (though caching does help, of course). Also,
my method requires constant memory, so it scales nicely.

Finally, I'm not sure if you implement it in BackupPC_tarPCCopy, but
if for some reason a pc tree entry (other than backupInfo) does not
have its inode in the ipool then I flag it and optionally correct it
by linking the file back into the pool/cpool. By the way, this alone
could be used as a much faster approach to solving Robin's question
earlier, where she needed to check and fix a large pc tree in which a
number of files had nlinks > 1 but *none* of them were in the
pool/cpool.
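
The flagging half of that check could be sketched along these lines
(Python, illustrative only). The name find_unpooled and the is_pooled
callable are assumptions for the example; is_pooled stands in for
whatever inode lookup is available, such as an ipool lookup. The
repair step (linking the file back into the pool/cpool) is left out
because it needs BackupPC's own pool digest.

```python
import os
import stat

def find_unpooled(pc_dir, is_pooled, skip_names=("backupInfo",)):
    """Return [(path, nlink), ...] for regular files under pc_dir
    whose inode is not known to the pool.

    is_pooled is a caller-supplied callable taking an inode number
    and returning True if that inode belongs to a pool/cpool file.
    """
    flagged = []
    for dirpath, _dirs, files in os.walk(pc_dir):
        for name in files:
            if name in skip_names:
                continue  # backupInfo is not expected to be pooled
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if not stat.S_ISREG(st.st_mode):
                continue
            if not is_pooled(st.st_ino):
                # nlink > 1 here means the file is hard-linked to
                # something, just not to any pool/cpool entry.
                flagged.append((full, st.st_nlink))
    return flagged
```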


Craig Barratt wrote at about 01:56:31 -0800 on Friday, January 21, 2011:
 > Jeffrey,
 > 
 > > I am trying to understand how BackupPC_tarPCCopy figures out all the
 > > hard links from the PC directory to the pool without doing a lot of
 > > work and/or without caching lots of pool inodes.
 > 
 > It just opens the file to compute the pool digest.  If there are
 > multiple files in the pool with the same digest it compares inode
 > numbers to determine the match.
 > 
 > Craig
