Re: [BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG

2011-10-06 22:11:21
Subject: Re: [BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: Holger Parplies <wbppc AT parplies DOT de>
Date: Thu, 06 Oct 2011 22:09:52 -0400
Holger Parplies wrote at about 02:45:56 +0200 on Friday, October 7, 2011:
 > Hi,
 > 
 > Jeffrey J. Kosowsky wrote on 2011-10-06 19:28:38 -0400 [Re: [BackupPC-users] 
 > Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG]:
 > > Holger Parplies wrote at about 17:54:05 +0200 on Thursday, October 6, 2011:
 > > [...]
 > >  > Actually, what I would propose [...] would be to
 > >  > test for pool files that decompress to zero length. [...]
 > > 
 > > Actually this could be made even faster since there seem to be 2
 > > cases:
 > > 1. Files of length 8 bytes with first byte = 78 [no rsync checksums]
 > > 2. Files of length 57 bytes with first byte = d7 [rsync checksums]
 > > 
 > > So, all you need to do is to stat the size and then test the
 > > first-byte
 > 
 > I'm surprised that that isn't faster by orders of magnitude. Running both
 > BackupPC_verifyPool and the modified version which does exactly this in
 > parallel, it's only about 3 times as fast (faster, though, when traversing
 > directories currently in cache). An additionally running 'find' does report
 > some 57-byte files, but they don't seem to decompress to "". Let's see how
 > this continues. I still haven't found a single zero-length file in my pool
 > so far (BackupPC_verifyPool at 3/6/*, above check at 2/0/*).
 > 

Do those 57 byte files have rsync checksums or are they just
compressed files that happen to be 57 bytes long?
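
For reference, the size-plus-first-byte test quoted above could be
sketched roughly as follows. This is only an illustration: the function
name classify_candidate and its return labels are made up here, and the
size/first-byte pairs are simply the two cases observed in this thread.
A match marks a file as a candidate only; a 57-byte file would still
need to be decompressed to confirm that it is really empty.

```python
import os

# (size in bytes, first byte) pairs reported in this thread
EMPTY_PLAIN = (8, 0x78)    # empty file, no rsync checksums
EMPTY_RSYNC = (57, 0xD7)   # empty file, rsync checksums appended


def classify_candidate(path):
    """Return a label if the file matches one of the two patterns, else None."""
    size = os.stat(path).st_size
    if size not in (EMPTY_PLAIN[0], EMPTY_RSYNC[0]):
        return None            # cheap rejection: no need to open the file
    with open(path, "rb") as f:
        first = f.read(1)
    if not first:
        return None
    key = (size, first[0])
    if key == EMPTY_PLAIN:
        return "empty-plain"
    if key == EMPTY_RSYNC:
        return "empty-rsync"
    return None                # right size, wrong first byte
```

A 57-byte file whose first byte is 0x78 falls through to None, which is
the "just happens to be 57 bytes long" case.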

Given that the rsync checksums have both block and file checksums,
it's hard to believe that a 57 byte file including rsync checksums
would have much if any data. Even with no blocks of data, you have:
- 0xb3 separator (1 byte)
- File digest which is 2 copies of the full 16 byte MD4 digest (32 bytes)
- Digest info consisting of block size, checksum seed, length of the block
  digest, and the magic number (16 bytes)

The above totals 49 bytes, which is exactly the delta between a 57-byte
empty compressed file with rsync checksums and an 8-byte empty
compressed file without rsync checksums. The common 8 bytes are
presumably the zlib stream for empty input: a 2-byte header, a 2-byte
empty deflate block, and a 4-byte Adler-32 trailer.
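
A quick way to sanity-check this arithmetic is sketched below. The field
sizes are the ones listed above; the constant names are mine, and the
8-byte figure assumes zlib's default compression level (a stored block
at level 0 would come out longer).

```python
import zlib

SEPARATOR = 1          # 0xb3 separator
FILE_DIGEST = 2 * 16   # two copies of the 16-byte MD4 file digest
DIGEST_INFO = 16       # block size, checksum seed, digest length, magic number

# Size of the appended checksum data when there are no data blocks
trailer = SEPARATOR + FILE_DIGEST + DIGEST_INFO

# zlib stream produced for empty input
empty_stream = zlib.compress(b"")

print("checksum trailer:", trailer)
print("empty zlib stream:", len(empty_stream), empty_stream.hex())
print("total:", len(empty_stream) + trailer)
```

The hex dump of the empty stream also shows where the first byte 78
comes from.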

Note: If you have any data, then you would have 20 bytes (consisting of a
4-byte Adler32 and a 16-byte MD4 digest) for each block of data.
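
Put as a formula, the appended checksum data would be 49 + 20 * n bytes
for n blocks. A minimal sketch, assuming the layout described above is
complete (checksum_trailer_size is a name invented for this example):

```python
SEPARATOR = 1          # 0xb3 separator
PER_BLOCK = 4 + 16     # Adler32 + MD4 digest for each block of data
FILE_DIGEST = 2 * 16   # two copies of the 16-byte MD4 file digest
DIGEST_INFO = 16       # block size, checksum seed, digest length, magic number


def checksum_trailer_size(n_blocks):
    """Bytes of rsync checksum data appended for a file with n_blocks blocks."""
    if n_blocks < 0:
        raise ValueError("n_blocks must be non-negative")
    return SEPARATOR + PER_BLOCK * n_blocks + FILE_DIGEST + DIGEST_INFO


for n in (0, 1, 2):
    print(n, checksum_trailer_size(n))
```

With even one block the checksum data alone would be 69 bytes, so under
these assumptions a 57-byte file that really carries rsync checksums
has no room for any block digests.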

