[BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG

2011-10-04 19:00:25
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: General list for user discussion <backuppc-users AT lists.sourceforge DOT net>
Date: Tue, 04 Oct 2011 18:58:51 -0400
After the recent thread on bad md5sum file names, I ran a check on all
1.1 million of my cpool files to verify that the md5sum file names are
correct.

I got a total of 71 errors out of 1.1 million files:
- 3 had data in them (though each file was only a few hundred bytes
  long)

- 68 of the 71 were *zero* sized when decompressed:
         29 were 8 bytes long, corresponding to zlib compression of a
         zero-length file

         39 were 57 bytes long, corresponding to a zero-length file with
         an rsync checksum
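
The 8-byte figure can be checked independently of BackupPC. What follows
is a minimal Python sketch, not BackupPC code. The function name is made
up for illustration, and treating a leading 0xd6/0xd7 byte as a stand-in
for the usual zlib 0x78 byte on files with cached rsync checksums is an
assumption about the cpool format that should be verified against the
BackupPC source. It only inspects the first zlib stream in the data.

```python
import zlib

# Assumption: these marker bytes replace the zlib 0x78 header byte when
# rsync checksums are appended to a cpool file.
CHECKSUM_MARKERS = (0xD6, 0xD7)


def payload_and_trailer_sizes(raw):
    """Return (decompressed length, bytes left over after the zlib stream)."""
    if raw and raw[0] in CHECKSUM_MARKERS:
        # Restore a normal zlib header byte so the stream can be inflated.
        raw = b"\x78" + raw[1:]
    inflater = zlib.decompressobj()
    payload = inflater.decompress(raw)
    # unused_data holds whatever follows the end of the compressed stream.
    return len(payload), len(inflater.unused_data)


empty = zlib.compress(b"")
print(len(empty))                        # 8
print(payload_and_trailer_sizes(empty))  # (0, 0)
```

If the zlib portion of the 57-byte files is also 8 bytes, the remaining
49 bytes would be the appended checksum data; the second value returned
by the function reports that trailer size directly.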

Each such cpool file has anywhere from 2 to several thousand links.

The 68 *zero* length files should *not* be in the pool since zero
length files are not pooled. So, something is really messed up here.

It turns out, though, that none of the cpool files that decompress to
zero length were originally zero length. Somehow they were stored in the
pool as zero length under an md5sum file name that is correct for the
original non-zero-length file.

Some are attrib files and some are regular files.

Now it seems unlikely that the files were corrupted after the backups
were completed since the header and trailers are correct and there is
no way that the filesystem would just happen to zero out the data
while leaving the header and trailers intact (including checksums).

Also, it's not the rsync checksum caching causing the problem since
some of the zero length files are without checksums.

Now the fact that the md5sum file names are correct relative to the
original data means that the file was originally read correctly by
BackupPC.
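
To make the name check concrete, here is a hedged Python sketch of a
partial-file digest of the kind discussed here. It is not BackupPC's
code. It assumes the digest covers the decimal file length followed by
the whole content for small files, or by two 128 KiB windows taken from
the first 1 MiB for larger ones. The constants and function names are
assumptions for illustration and should be checked against the BackupPC
source before relying on them.

```python
import hashlib

CHUNK = 131072        # 128 KiB window size (assumed)
SMALL_LIMIT = 262144  # files up to this size are hashed whole (assumed)
WINDOW = 1048576      # only the first 1 MiB is sampled (assumed)


def partial_file_md5(data):
    """Digest of the length plus either all of the data or two windows."""
    size = len(data)
    md5 = hashlib.md5()
    md5.update(str(size).encode("ascii"))
    if size > SMALL_LIMIT:
        tail_start = min(size, WINDOW) - CHUNK
        md5.update(data[:CHUNK])
        md5.update(data[tail_start:tail_start + CHUNK])
    else:
        md5.update(data)
    return md5.hexdigest()


def name_matches(pool_name, data):
    """Compare a pool file name (ignoring any _N suffix) with the digest."""
    base = pool_name.split("_", 1)[0]
    return base == partial_file_md5(data)
```

Under these assumptions the length is part of the digest, so a name
computed from the original non-empty data cannot match a payload that
decompresses to nothing unless the hashes collide.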

So it seems that for some reason the data was truncated when
compressing and writing the cpool/pc file, but after the partial file
md5sum was calculated. And it seems to have happened multiple times
for some of these files, since there are multiple pc files linked to
the same pool file (and before linking to a cpool file, the actual
content of the files is compared, since the partial file md5sum is not
unique).

Also, on my latest full backup, a spot check shows that the files are
backed up correctly to the right non-zero-length cpool file, which of
course has the same (now correct) partial file md5sum. As you would
expect, though, that cpool file has a _0 suffix, since the earlier
zero-length file is already stored (incorrectly) as the base of the
chain.
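
The chain layout described here (a base file named by the digest, then
the same name with _0, _1, ... suffixes) can be listed with a short
Python sketch. The function name and the digest in the demo are made up,
and the sketch assumes the suffixes are numbered contiguously from 0.

```python
import os
import tempfile


def pool_chain(directory, digest):
    """List existing chain members in order: digest, digest_0, digest_1, ..."""
    members = []
    base = os.path.join(directory, digest)
    if os.path.exists(base):
        members.append(base)
    index = 0
    # Stop at the first missing suffix (assumes no gaps in the numbering).
    while os.path.exists(f"{base}_{index}"):
        members.append(f"{base}_{index}")
        index += 1
    return members


# Demo on a throwaway directory with a made-up digest.
with tempfile.TemporaryDirectory() as tmp:
    digest = "0123456789abcdef0123456789abcdef"
    for name in (digest, digest + "_0"):
        open(os.path.join(tmp, name), "wb").close()
    demo_chain = [os.path.basename(p) for p in pool_chain(tmp, digest)]

print(demo_chain)
```

Decompressing each member of such a chain would show which entry is the
zero-length base and which holds the real data.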

I am not sure what is going on with the other 3 files, since I have yet
to find them in the pc tree (my 'find' routine is still running).
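
One way to locate the pc-tree entries that share a pool file is to match
on device and inode number, since hard links to the same file share
both. The sketch below is a generic Python illustration, not the routine
mentioned here; the function name and the demo paths are hypothetical.

```python
import os
import tempfile


def find_links(pool_file, tree_root):
    """Return paths under tree_root that are hard links to pool_file."""
    target = os.stat(pool_file)
    key = (target.st_dev, target.st_ino)
    matches = []
    for dirpath, _dirs, files in os.walk(tree_root):
        for name in files:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            if (info.st_dev, info.st_ino) == key:
                matches.append(path)
    return sorted(matches)


# Demo: one hard link and one unrelated file in a throwaway tree.
with tempfile.TemporaryDirectory() as tmp:
    pool = os.path.join(tmp, "pool_file")
    open(pool, "wb").close()
    pc_dir = os.path.join(tmp, "pc")
    os.mkdir(pc_dir)
    os.link(pool, os.path.join(pc_dir, "linked"))
    open(os.path.join(pc_dir, "unrelated"), "wb").close()
    demo_matches = [os.path.relpath(p, tmp) for p in find_links(pool, pc_dir)]
    demo_nlink = os.stat(pool).st_nlink

print(demo_matches, demo_nlink)
```

The st_nlink value gives the link count up front, so a walk can stop
early once that many paths (minus the pool entry itself) have been found.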

I will continue to investigate this, but it is very strange and
worrying, since truncated cpool files mean data loss!

In summary, what could possibly cause BackupPC to truncate the data
sometime between reading the file/calculating the partial file md5sum
and compressing/writing the file to the cpool?
