Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-10-31 11:09:52
Subject: Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Fri, 31 Oct 2008 11:07:46 -0400
Tino Schwarze wrote at about 12:20:50 +0100 on Friday, October 31, 2008:
 > On Thu, Oct 30, 2008 at 11:16:05PM -0400, Jeffrey J. Kosowsky wrote:
 > 
 > > I must be missing something on this whole compression, pooling, and
 > > checksum matter.
 > > 
 > > I found 2 files in my cpool that have the same checksum (one is _0)
 > > but 'cmp' to different values. However, when I zcat them, they have
 > > the same value. I thought that (lossless) compression was a 1-1
 > > mapping?
 > 
 > Please post the output of the following:
 > ls -l yourfile*
 > md5sum yourfile*
 > BackupPC_zcat yourfile | md5sum
 > BackupPC_zcat yourfile_0 | md5sum

-rw-r----- 5 backuppc backuppc 7870 Oct 27 16:28 /var/lib/BackupPC/cpool/5/f/8/5f87fe62e8254679c582097314f97fe3
-rw-r--r-- 2 backuppc backuppc 7681 Oct 28 09:17 /var/lib/BackupPC/cpool/5/f/8/5f87fe62e8254679c582097314f97fe3_1

08be9e936c80024809fde108f6df9bb1  /var/lib/BackupPC/cpool/5/f/8/5f87fe62e8254679c582097314f97fe3
ce46e80af4a086e29ae17b0f800362e1  /var/lib/BackupPC/cpool/5/f/8/5f87fe62e8254679c582097314f97fe3_1

4266be808f85826aedf3c64c1e240203
4266be808f85826aedf3c64c1e240203
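
The output above shows two pool files whose compressed bytes differ but
whose decompressed content is identical. As a minimal sketch of how that
can happen in general: decompression maps each valid stream to exactly
one result, but compression is free to produce different streams for the
same input. The example uses Python's zlib with two compression levels
purely as a stand-in; it does not reproduce BackupPC's own file format
or the rsync caching seeds mentioned in this thread, and the payload is
made up.

```python
import hashlib
import zlib

# Stand-in payload (hypothetical); any bytes would do.
original = b"BackupPC pool example\n" * 500

# Compress the same bytes with two different settings (zlib levels 1 and 9).
fast = zlib.compress(original, 1)
best = zlib.compress(original, 9)


def md5(data: bytes) -> str:
    """Hex MD5 of a byte string, comparable to md5sum output."""
    return hashlib.md5(data).hexdigest()


# The compressed streams differ, so 'cmp' and md5sum on them would differ too.
print("compressed identical:  ", fast == best)
print("compressed md5:        ", md5(fast), md5(best))

# Decompressing either stream gives back the same original bytes.
print("decompressed identical:", zlib.decompress(fast) == zlib.decompress(best))
print("decompressed md5:      ", md5(zlib.decompress(fast)), md5(zlib.decompress(best)))
```

So lossless compression guarantees only that each compressed file
decompresses to one original, not that each original has a single
compressed form.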

 > 
 > > But here we seem to have two files that are identical (and thus have
 > > the same checksum) but compress to 2 *different* results?
 > 
 > Again: The file name is NOT the checksum of the whole file's contents!
 > It's just an MD5 sum which incorporates the first 256k of the file and
 > the file's original length (as I learned this week from this list).

Yes, but this is the OPPOSITE of a hash collision. A hash collision is
when 2 *different* (uncompressed) files have the *same* checksum.
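
For contrast, here is a rough sketch of what such a collision looks
like. It assumes, going only by Tino's description above, that the pool
name is an MD5 over the file's length plus its first 256k. The function
name partial_digest and the sample data are hypothetical, and the real
BackupPC routine may differ in its details.

```python
import hashlib

CHUNK = 256 * 1024  # assumption: only the first 256 KiB of content is hashed


def partial_digest(data: bytes) -> str:
    """Hypothetical pool-style digest: MD5 over the length plus the first 256 KiB."""
    h = hashlib.md5()
    h.update(str(len(data)).encode("ascii"))  # the file length is part of the digest
    h.update(data[:CHUNK])                    # anything past the first chunk is ignored
    return h.hexdigest()


# Two files of equal length that differ only after the first 256 KiB.
head = b"A" * CHUNK
file_a = head + b"tail one"
file_b = head + b"tail two"

print("contents differ:      ", file_a != file_b)
print("partial digests match:", partial_digest(file_a) == partial_digest(file_b))
print("full MD5s match:      ",
      hashlib.md5(file_a).hexdigest() == hashlib.md5(file_b).hexdigest())
```

Under that assumption, two different files can legitimately land on the
same pool name, which is why suffixes such as _0 and _1 exist.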

Here 2 *identical* (uncompressed) files have *different* checksums once
compressed.

As Craig and Holger explained, this is probably also attributable to
corruption where some backups had the rsync caching seeds included and
others not.

 > 
 > > This would seem to be going against the grain of pooling where two
 > > identical files share the same pool entry.
 > > 
 > > What am I missing?
 > 
 > That hash collisions are expected.
Yes, but that is not the case here.
 > 
 > Tino.
 > 
 > -- 
 > "What we nourish flourishes." - "Was wir nähren erblüht."
 > 
 > www.lichtkreis-chemnitz.de
 > www.craniosacralzentrum.de
 > 
 > -------------------------------------------------------------------------
 > This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
 > Build the coolest Linux based applications with Moblin SDK & win great prizes
 > Grand prize is a trip for two to an Open Source event anywhere in the world
 > http://moblin-contest.org/redirect.php?banner_id=100&url=/
 > _______________________________________________
 > BackupPC-users mailing list
 > BackupPC-users AT lists.sourceforge DOT net
 > List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
 > Wiki:    http://backuppc.wiki.sourceforge.net
 > Project: http://backuppc.sourceforge.net/
