BackupPC-users

Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS

2008-10-30 06:14:48
From: Tino Schwarze <backuppc.lists AT tisc DOT de>
To: backuppc-users AT lists.sourceforge DOT net
Date: Thu, 30 Oct 2008 11:13:27 +0100

Hi Jeffrey,

On Thu, Oct 30, 2008 at 03:55:16AM -0400, Jeffrey J. Kosowsky wrote:

> I have found a number of files in my pool that have the same checksum
> (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy
> has a few links to it by the way.

That's intentional - what are the link counts for the files? 
If you look at BackupPC's status page, there is a line like:

* Pool hashing gives 649 repeated files with longest chain 28, 

> Why is this happening? 
>   Isn't this against the whole theory of pooling.  It also doesn't seem
>   to get cleaned up by BackupPC_nightly since that has run several times
>   and the pool files are now several days old.

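Because there is a file-system-dependent limit on the number of hard
links a file may have. Once a pool file has reached that limit, a new
pool file with a _N suffix is created so that further links can be made.
Look at $Conf{HardLinkMax} in config.pl.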
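
A rough sketch of how the link counts per chain could be checked, in case
it helps. It assumes only what is described in this thread: chain members
share a base name and the extra ones carry a _0, _1, ... suffix. The
function names and the hard_link_max parameter are made up for this
example and are not part of BackupPC; pass in whatever $Conf{HardLinkMax}
is set to in your config.pl.

```python
import os
import re
from collections import defaultdict

# Extra chain members are assumed to be named <base>_0, <base>_1, ...
CHAIN_SUFFIX = re.compile(r"_(\d+)$")


def split_name(name):
    """Return (base name, chain index); the unsuffixed file gets index -1."""
    match = CHAIN_SUFFIX.search(name)
    if match:
        return name[:match.start()], int(match.group(1))
    return name, -1


def chain_link_counts(directory):
    """Map each base name to its members as (file name, hard-link count),
    ordered by chain index."""
    chains = defaultdict(list)
    for entry in os.scandir(directory):
        if not entry.is_file(follow_symlinks=False):
            continue
        base, index = split_name(entry.name)
        nlink = os.lstat(entry.path).st_nlink
        chains[base].append((index, entry.name, nlink))
    return {
        base: [(name, nlink) for _, name, nlink in sorted(members)]
        for base, members in chains.items()
    }


def underfilled_chains(chains, hard_link_max):
    """Base names with more than one member where a member other than the
    last one is currently below hard_link_max links."""
    return sorted(
        base
        for base, members in chains.items()
        if len(members) > 1
        and any(nlink < hard_link_max for _, nlink in members[:-1])
    )
```

Calling chain_link_counts() on one pool subdirectory and passing the
result to underfilled_chains() lists the chains worth a closer look. A
chain showing up there is not necessarily broken: links expire along with
old backups, so a member may have been at the limit when the next one was
created.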

Hm. I just took a look in my cpool and found some files which are
nowhere near the hard link limit, but still have a _0 and a _1:
.../cpool/0/0 # ls -l c/00cd83be1ea3c1ffa3c6af2f4e310206*
-rw-r----- 4371 backuppc users 34 2005-01-14 17:01 c/00cd83be1ea3c1ffa3c6af2f4e310206
-rw-r----- 3536 backuppc users 34 2005-03-02 02:22 c/00cd83be1ea3c1ffa3c6af2f4e310206_0
-rw-r-----  439 backuppc users 34 2006-03-11 02:04 c/00cd83be1ea3c1ffa3c6af2f4e310206_1

The MD5 sums are not the same for all three files, so maybe something got
corrupted (or I upgraded BackupPC in the meantime - the files are rather old!):
.../cpool/0/0 # md5sum c/00cd83be1ea3c1ffa3c6af2f4e310206*
51ef559d1d7fa02c05fa032729c85804  c/00cd83be1ea3c1ffa3c6af2f4e310206
51ef559d1d7fa02c05fa032729c85804  c/00cd83be1ea3c1ffa3c6af2f4e310206_0
7e2276750fc478fa30142aa808df2a1f  c/00cd83be1ea3c1ffa3c6af2f4e310206_1
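
For a whole chain, the same comparison could be scripted roughly like
this. It is only a sketch: it hashes the bytes as stored on disk, exactly
as md5sum does, and for cpool those are the compressed files, so the
digests would not be expected to match the pool file name. The function
names are invented for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """MD5 hex digest of the stored bytes, read in chunks (like md5sum)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_content(paths):
    """Map each digest to the list of paths whose stored bytes hash to it."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_md5(path)].append(path)
    return dict(groups)


def duplicate_members(paths):
    """Groups of two or more paths with identical stored bytes."""
    return [
        sorted(group)
        for group in group_by_content(paths).values()
        if len(group) > 1
    ]
```

Fed the three files above, duplicate_members() would return one group
holding the unsuffixed file and the _0 file, with the _1 file left out,
which is what the md5sum output shows.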

AFAIK, I started with $Conf{HardLinkMax} set to 32000. As the files are
very old, a lot of links might have expired already.

I'm not sure, though, how the file name is derived. I found another pair
of files with the same base name but different MD5 sums:
.../cpool/0/0 # md5sum 8/0084734e7242df0fbc186ba6741d1bab*
db224998946bac7859f2448f41c58f88  8/0084734e7242df0fbc186ba6741d1bab
d1d8f3a86ae5492de0bf11f5cfb45860  8/0084734e7242df0fbc186ba6741d1bab_0

IIRC, BackupPC_nightly should perform chain cleaning.

Tino.

-- 
"What we nourish flourishes." - "Was wir nähren erblüht."

www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de
