BackupPC-users

Re: [BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG

2011-10-07 11:22:06
From: Holger Parplies <wbppc AT parplies DOT de>
To: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
Date: Fri, 7 Oct 2011 17:19:46 +0200
Hi,

Jeffrey J. Kosowsky wrote on 2011-10-07 01:08:03 -0400 [Re: [BackupPC-users] 
Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG]:
> Holger Parplies wrote at about 05:46:36 +0200 on Friday, October 7, 2011:
>  > Jeffrey J. Kosowsky wrote on 2011-10-06 22:54:44 -0400 [Re:
>  > [BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files -
>  > WEIRD BUG]:
>  > > [...]
>  > > Why would PoolWrite.pm change the mod time of a pool file that is not
>  > > in the actual backup?

OK, so by now we seem to have concluded that Xfer::Rsync *might* modify a file
in a previous backup. Without looking at the code, that's just speculation, and
I don't currently have the time to look, considering how complex the code is.

After some reflection, it *does* make sense to add checksums to a file in a
previous backup, even if the file is found to have changed in the current
backup, because the *previous* backup may still be the reference for a future
backup (also, the pool file might be reused).

>  > Also: can you give a better resolution on the mod times, i.e. which one is
>  > older?
> 
> OK...
> #82: Modify: 2011-04-27 03:05:04.551226502 -0400
> #110: Modify: 2011-04-27 03:05:19.813321479 -040
> 
> So #110 was modified 15 seconds after #82. Hmmm

Strange time zone on #110 ;-).
15 seconds seems to be *ages*, considering we're talking about small files
(right?). I agree with you: Hmmm.
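
For comparing pool file mod times at full resolution, a small sketch along
these lines might help. It is Python rather than BackupPC's Perl, the function
names are made up for illustration, and it only reports what stat(2) returns;
whether the nanosecond part is meaningful depends on the filesystem.

```python
import os


def mtime_delta_ns(path_a, path_b):
    """Return mtime(path_b) - mtime(path_a) in integer nanoseconds."""
    return os.stat(path_b).st_mtime_ns - os.stat(path_a).st_mtime_ns


def format_delta(delta_ns):
    """Format a nanosecond difference as seconds with nine decimals."""
    sign = "-" if delta_ns < 0 else ""
    secs, nanos = divmod(abs(delta_ns), 1_000_000_000)
    return f"{sign}{secs}.{nanos:09d} s"
```

For the two timestamps quoted above, the difference works out to
15.262094977 seconds.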

> Note both of those files have rsync checksums.
> 
> When I looked at a couple of files without the rsync checksums, the
> mod times differed by a day.

Meaning they corresponded to the backup times?

> As an aside, I noticed that when I looked at a version without the rsync
> checksum, that the corrected version also doesn't have an rsync
> checksum even after having been backed up many times subsequently --
> Now I thought that the rsync checksum should be added after the 2nd or
> 3rd time the file is read... This makes me wonder whether there is
> potentially an issue with the rsync checksum...

I had thought the same (2nd backup - for the 3rd they should be present and
give a speedup). This is another thing we could check - which files in our
backups have checksum caches. What makes *me* wonder is that this still only
seems to happen to you.
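
A rough sketch of such a check, in Python rather than BackupPC's Perl. It
rests on an assumption that should be verified against the BackupPC source
before trusting the result: that in BackupPC 3.x a compressed pool file with
an appended rsync checksum cache has its first byte changed from 0x78 to 0xd6
or 0xd7. The function names and the directory argument are hypothetical.

```python
import os

# ASSUMPTION (verify against the BackupPC source): a compressed pool file
# carrying an appended rsync checksum cache starts with 0xd6 or 0xd7
# instead of the usual zlib header byte 0x78.
CACHE_MARKERS = {0xD6, 0xD7}


def has_checksum_cache(path):
    """Return True if the file's first byte is one of the assumed markers."""
    with open(path, "rb") as f:
        first = f.read(1)
    return len(first) == 1 and first[0] in CACHE_MARKERS


def scan_pool(root):
    """Walk root and split regular files into (with_cache, without_cache)."""
    with_cache, without_cache = [], []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if has_checksum_cache(path):
                with_cache.append(path)
            else:
                without_cache.append(path)
    return with_cache, without_cache
```

Pointing scan_pool() at a cpool directory (or at one backup's tree) would give
two lists to compare between installations. It only reads one byte per file,
so it does not modify anything.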

>  > > Also, the XferLOG entry for both backups #82 and #110 have the line:
>  > >  pool     644       0/0         252 usr/share/FlightGear/Timezone/America/Port-au-Prince
>  > > 
>  > > But this doesn't make sense since if the new pool file was created as
>  > > part of backup #110, shouldn't it say 'create' and not 'pool'?
>  > 
>  > Considering the mtime, yes.

Or, put differently: no.

If the file *has* rsync checksums, they wouldn't have been added on the first
backup. mtime says they were added by #110, thus the file would have been in
the pool without checksums then. Or would it have been the *reference file* at
the time checksums were added!?

> And we know BackupPC *thinks* it's a new file since it creates a new
> pool file chain member. But what clobbered the original file just
> before then, and why?

No, we don't really know what the situation was then. We're trying to
reconstruct it from evidence, which we are having a hard time interpreting
(at least I am).

>  > > None of this makes sense to me but somehow I suspect that herein may be a
>  > > clue to the problem...
>  > 
>  > Xfer::Rsync opening the reference file?
> 
> But what would cause it to truncate the data portion?

Use the force, read the source. I don't think it's *meant* to clobber the data
portion.

> Maybe it's something with rsync checksum caching/seeding when it tries
> to add a checksum?

There does seem to be a connection, though there are those 8-byte files
*without* checksum cache. Are these failed attempts of some sort, or are they
- like you first said - an indication that it's *not* checksum caching?
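
As a side note on the 8-byte size: zlib's encoding of an empty input is
exactly 8 bytes (2 header bytes, a 2-byte empty final block, 4 bytes of
Adler-32), which is consistent with pool files whose uncompressed size is
zero. A minimal sketch for testing whether a given compressed file inflates
to nothing follows. It is Python, not BackupPC code, the function name is
made up, and the handling of a 0xd6/0xd7 first byte is an assumption about
how checksum-cached files are marked.

```python
import zlib


def inflates_to_empty(data):
    """Return True if data is a complete zlib stream with an empty payload."""
    if not data:
        return False
    # ASSUMPTION: files with an appended checksum cache start with 0xd6 or
    # 0xd7; restore the normal zlib header byte before inflating.
    if data[0] in (0xD6, 0xD7):
        data = b"\x78" + data[1:]
    inflater = zlib.decompressobj()
    try:
        # Ask for at most one byte: enough to tell empty from non-empty
        # without inflating a large file. Trailing bytes after the stream
        # (such as an appended cache) end up in unused_data and are ignored.
        out = inflater.decompress(data, 1)
    except zlib.error:
        return False
    return inflater.eof and out == b""
```

Reading a suspect pool file in binary mode and passing its contents to
inflates_to_empty() would separate genuinely empty streams from files that
are merely short or damaged.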

> I'm just guessing here...

So am I.

Regards,
Holger
