BackupPC-users

Re: [BackupPC-users] BackupPC_dump hangs with: .: size doesn't match (12288 vs 17592185913344)

2011-04-04 21:52:45
Subject: Re: [BackupPC-users] BackupPC_dump hangs with: .: size doesn't match (12288 vs 17592185913344)
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Mon, 04 Apr 2011 21:49:55 -0400
Holger Parplies wrote at about 18:59:00 +0200 on Monday, April 4, 2011:
 > Hi,
 > 
 > John Rouillard wrote on 2011-03-31 15:20:23 +0000 [[BackupPC-users] 
 > BackupPC_dump hangs with: .: size doesn't match (12288 vs 17592185913344)]:
 > > [...]
 > > I get a bunch of output (the share being backed up /etc on a centos
 > > 5.5. box) which ends with:
 > > 
 > >   attribSet: dir=f%2fetc exists
 > >   attribSet(dir=f%2fetc, file=zshrc, size=640, placeholder=1)
 > >   Starting file 0 (.), blkCnt=134217728, blkSize=131072, remainder=0
 > >   .: size doesn't match (12288 vs 17592185913344)
 > 
 > at first glance, this would appear to be an indication of something I have
 > been suspecting for a long time: corruption - caused by whatever - in an
 > attrib file leading to the SIGALRM abort. If I remember correctly, someone
 > (presumably File::RsyncP) would ordinarily try to allocate space for the file
 > (though that doesn't seem to make sense, so I probably remember incorrectly)
 > and either gives up when that fails or refrains from trying in the first
 > place, because the amount is obviously insane.
 > 
 > The weird thing in this case is that we're seeing a directory. There is
 > absolutely no reason (unless I am missing something) to worry about the
 > *size* of a directory. The value is absolutely file system dependant and
 > not even necessarily an indication of the *current* amount of entries in
 > the directory. In any case, you restore the contents of a directory by
 > restoring the files in it, and you (incrementally) backup a directory by
 > determining if any files have changed or been added. The *size* of a
 > directory will not help with that decision.
 > 

I too don't understand why rsync would try to treat a directory '.'
like a file. I mean block count and block size aren't meaningful when
rsyncing a directory (at least according to my understanding of the
rsync algorithm).
Could it possibly be a character encoding issue where there is some
file (maybe with spaces, non-printable characters, or some
non-recognized font) that rsync ends up seeing as '.' rather than as
the true file name. For example, I know Windows and Linux have different 
concepts
of what characters are allowed in file names and that cygwin tries to
do its best to make sense of the differences but doesn't always
succeed. 

 > Then again, the problematic file (or attrib file entry) may or may
 > not be the last one reported (maybe it's the first one not
 > reported?).

I would suggest some trial-and-error.
What if you move the directory somewhere else and try a new backup
(using say a new machine alias)?
Can you do a binary-search to narrow down the error?
In particular, is the problem only when doing an incremental? Does it
work if you force a full? Does it work if you make this the first
backup on a new machine alias?
 > 
 > > [...] I have had similar hanging issues before
 > > but usully scheduling a full backup or removing a prior backup or two
 > > in the chain will let things work again. However I would like to
 > > actually get this fixed this time around as it seems to be occurring
 > > more often recently (on different backuppc servers and against
 > > different hosts).
 > 
 > I agree with you there. This is probably one of the most frustrating problems
 > to be encountered with BackupPC, because there is no obvious cause and 
 > nothing
 > obvious to correct (throwing away part of your backup history for no better
 > reason than "after that it works again" is somewhat unsatisfactory).
 > 
 > The reason not to investigate this matter any further so far seems to have
 > been that it is usually "solved" by removing the reference backup (I believe
 > simply running a full backup will encounter the same problem again), because
 > people tend to want to get their backups back up and running. There are two
 > things to think about here:
 > 
 > 1.) Why does attrib file corruption cause the backup to hang? Is there no
 >     sane(r) way to deal with the situation?
 > 2.) How does the attrib file get corrupted in the first place?
 > 
 > Presuming it *is* attrib file corruption. Could you please send me a copy of
 > the attrib file off-list?
 > 
 > > If I dump the root attrib file (where /etc starts) for either last
 > > successful or the current (partial) failing backup I see:
 > > 
 > >   '/etc' => {
 > >     'uid' => 0,
 > >     'mtime' => 1300766985,
 > >     'mode' => 16877,
 > >     'size' => 12288,
 > >     'sizeDiv4GB' => 0,
 > >     'type' => 5,
 > >     'gid' => 0,
 > >     'sizeMod4GB' => 12288
 > >   },
 > 
 > I would expect the interesting part to be the '.' entry in the attrib file 
 > for
 > '/etc' (f%2fetc/attrib of the last successful backup, that is). And I would 
 > be
 > curious about how the attrib file was decoded, because I'd implement decoding
 > differently from how BackupPC does, though BackupPC's method does appear to 
 > be
 > well tested ;-).
 > 
 > > [...] the last few lines of strace show:
 > > 
 > > [...]
 > >   19368 15:00:38.199634 select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
 > >     <59.994597>
 > 
 > I believe this is the result of File::RsyncP having given up on the transfer
 > because of either a failing malloc() or a suppressed malloc(). I'll have to
 > find some time to check in more detail. I vaguely remember it was a rather
 > complicated matter, and there was never really enough evidence to support 
 > that
 > corrupted attrib files were really the cause. But I sure would like to get to
 > the bottom of this :-).
 > 
 > Regards,
 > Holger
 > 
 > ------------------------------------------------------------------------------
 > Create and publish websites with WebMatrix
 > Use the most popular FREE web apps or write code yourself; 
 > WebMatrix provides all the features you need to develop and 
 > publish your website. http://p.sf.net/sfu/ms-webmatrix-sf
 > _______________________________________________
 > BackupPC-users mailing list
 > BackupPC-users AT lists.sourceforge DOT net
 > List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
 > Wiki:    http://backuppc.wiki.sourceforge.net
 > Project: http://backuppc.sourceforge.net/

------------------------------------------------------------------------------
Xperia(TM) PLAY
It's a major breakthrough. An authentic gaming
smartphone on the nation's most reliable network.
And it wants your games.
http://p.sf.net/sfu/verizon-sfdev
_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/