Subject: Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Thu, 30 Oct 2008 12:26:21 -0400
Jeffrey J. Kosowsky wrote at about 10:04:26 -0400 on Thursday, October 30, 2008:
 > Holger Parplies wrote at about 11:29:49 +0100 on Thursday, October 30, 2008:
 >  > Hi,
 >  > 
 >  > Jeffrey J. Kosowsky wrote on 2008-10-30 03:55:16 -0400 [[BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS]:
 >  > > I have found a number of files in my pool that have the same checksum
 >  > > (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy
 >  > > has a few links to it by the way.
 >  > > 
 >  > > Why is this happening? 
 >  > 
 >  > presumably creating a link sometimes fails, so BackupPC copies the file,
 >  > assuming the hard link limit has been reached. I suspect problems with your
 >  > NFS server, though not a "stale NFS file handle" in this case, since copying
 >  > the file succeeds. Strange.
 > 
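[Interjecting for anyone who finds this in the archives: a quick way to
list such chains is to group pool files by name minus the trailing _N
suffix and compare contents. A rough Python sketch -- untested, and the
pool path is an assumption, so point it at your own TopDir:

    import os, re, filecmp
    from collections import defaultdict

    POOL = '/var/lib/backuppc/cpool'   # assumed location; adjust

    for dirpath, _, names in os.walk(POOL):
        chains = defaultdict(list)
        for name in names:
            # strip a trailing _0, _1, ... to find the chain head
            chains[re.sub(r'_\d+$', '', name)].append(
                os.path.join(dirpath, name))
        for paths in chains.values():
            if len(paths) < 2:
                continue
            paths.sort()
            head = paths[0]
            dups = [p for p in paths[1:]
                    if filecmp.cmp(head, p, shallow=False)]
            if dups:
                print(head, '==', ' '.join(dups))

All members of one chain live in the same pool subdirectory, which is
why grouping per directory is enough.]
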
 > Yes - I am beginning to think that may be true. However, as I mentioned
 > in the other thread, the syslog on the NFS server is clean and the one
 > on the client shows only about a dozen or so NFS timeouts over the
 > past 12 hours, which is the time period I am looking at now. Otherwise,
 > I don't see any NFS errors.
Actually, I traced these errors to a timeout due to disks on the NAS
spinning up. They appear to be just soft timeouts (and not related to
this link problem).
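
(In case anyone hits the same thing: spin-up stalls are exactly where
the soft-vs-hard NFS mount distinction matters, since a hard mount just
retries through the stall instead of returning an error. For
illustration only -- server name and paths made up:

    nas:/export/backuppc  /var/lib/backuppc  nfs  hard,intr,timeo=600,retrans=5  0  0

I'm not claiming that explains the link failures, just the timeouts.)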

 > So if it is an NFS problem, something seems to be happening somewhat
 > randomly and invisibly to the filesystem.
 > 
 >  > 
 >  > >   Isn't this against the whole theory of pooling.
 >  > 
 >  > Well, yes :). But copying the file when the method used to implement
 >  > pooling (hard links) does not work for some reason (max link count
 >  > reached, or NFS file server errors) is perfectly reasonable. If you
 >  > think about it, you *do* get some level of pooling; otherwise you'd
 >  > have an independent copy or a missing file each time linking fails.
 >  > 
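(Just to make sure I follow the fallback being described, here it is as
a rough Python sketch -- my paraphrase of the idea, not BackupPC's
actual Perl:

    import errno, os

    def link_into_pool(pc_file, pool_path):
        tmp = pc_file + '.tmp'
        try:
            os.link(pool_path, tmp)       # normal pooling: share storage
            os.replace(tmp, pc_file)
            return pool_path
        except OSError as e:
            if e.errno != errno.EMLINK:   # not a link-count overflow
                raise
        # Fallback: start a new chain member pool_path_0, _1, ...;
        # pc_file becomes its first link, so later files with the same
        # checksum can still pool against it.
        n = 0
        while os.path.exists('%s_%d' % (pool_path, n)):
            n += 1
        chain = '%s_%d' % (pool_path, n)
        os.link(pc_file, chain)
        return chain

So even the "duplicate" still gets pooled against, which matches the
point about not losing pooling entirely.)
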
 >  > >   It also doesn't seem
 >  > >   to get cleaned up by BackupPC_nightly since that has run several times
 >  > >   and the pool files are now several days old.
 >  > 
 >  > BackupPC_nightly is not supposed to clean up that situation. It could be
 >  > designed to do so (the situation may arise when a "link count overflow" is
 >  > resolved by expired backups), but it would be horribly inefficient: for the
 >  > file to be eliminated, you would have to find() every occurrence of the
 >  > inode in all pc/* trees and replace them with links to the duplicate(s) to
 >  > be kept. You don't want that.
 > 
 > Yes, but it would be nice to have a switch that allowed this more
 > comprehensive cleanup.
 > Even in a non-error case, I can imagine situations where at some point
 > the maximum link count may have been exceeded and then backups were
 > deleted, so that the link count came back down below the max.
 > 
 > The logic wouldn't seem to be that horrendous, since you would only
 > need to walk down the pc/* trees once: first walk down (c)pool/* to
 > compile a list of repeated but identical checksums, then walk down
 > the pc/* trees to find the files on that list.
 > 
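(Roughly, the two passes I have in mind, as an untested Python sketch --
pool layout simplified, TopDir assumed, and locking against a running
backup ignored:

    import os, re, filecmp
    from collections import defaultdict

    TOP = '/var/lib/backuppc'   # assumed TopDir; adjust

    # Pass 1: walk (c)pool once, recording which chain members are
    # byte-identical duplicates of their chain head.
    keep = {}   # inode of a redundant copy -> path of the copy to keep
    for dirpath, _, names in os.walk(os.path.join(TOP, 'cpool')):
        chains = defaultdict(list)
        for name in names:
            chains[re.sub(r'_\d+$', '', name)].append(
                os.path.join(dirpath, name))
        for paths in chains.values():
            paths.sort()
            for p in paths[1:]:
                if filecmp.cmp(paths[0], p, shallow=False):
                    keep[os.stat(p).st_ino] = paths[0]

    # Pass 2: walk pc/* once, relinking anything that points at a
    # redundant copy to the kept copy instead.
    for dirpath, _, names in os.walk(os.path.join(TOP, 'pc')):
        for name in names:
            path = os.path.join(dirpath, name)
            head = keep.get(os.lstat(path).st_ino)
            if head is not None:
                os.unlink(path)
                os.link(head, path)

    # A real version would then unlink the now-orphaned _N pool files
    # and would have to cope with the kept copy hitting the link limit
    # again.)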
 >  > 
 >  > > What can I do to clean it up?
 >  > 
 >  > Fix your NFS server? :) Is there a consistent maximum number of links, or
 >  > do the copies seem to happen randomly? Honestly, I don't think the savings
 >  > you may gain from storing the pool over NFS are worth the headaches. What
 >  > is cheaper about putting a large disk into a NAS device than into your
 >  > BackupPC server? Well, yes, you can share it ... how about exporting part
 >  > of the disk from the BackupPC server (I would still recommend distinct
 >  > partitions)?
 >  > 
 > 
 > You are right in theory. But I would still like to get NFS working for
 > various reasons and it is always a good "learning experience" to
 > troubleshoot such things ;)
 > 

Now this is interesting...
Looking through my BackupPC log files, I noticed that this problem
*FIRST* occurred on Oct 27 and has affected every backup since. The
errors only occur when BackupPC_link runs (and I didn't have any
problems with BackupPC_link in the previous 10 or so days that I had
been using BackupPC).

So, I used both find and the incremental backups themselves to see
what happened between the last error-free backup at 18:08 on Oct 26
and the first bad one at 1 AM on Oct 27. But it doesn't seem like any
files changed on either the BackupPC server or the NFS server.
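
(For anyone who wants to repeat that kind of check, it amounts to
scanning for anything modified in the window -- a crude Python sketch,
with the timestamps from the backups above:

    import os, time

    start = time.mktime((2008, 10, 26, 18,  8, 0, 0, 0, -1))
    end   = time.mktime((2008, 10, 27,  1,  0, 0, 0, 0, -1))

    for dirpath, dirs, names in os.walk('/'):
        # skip virtual filesystems
        dirs[:] = [d for d in dirs
                   if os.path.join(dirpath, d) not in ('/proc', '/sys')]
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                m = os.lstat(path).st_mtime
            except OSError:
                continue
            if start <= m <= end:
                print(path)

As I said, that turned up nothing interesting on either machine.)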

Also, interestingly, this problem occurred on the first backup attempt
*after* I rebooted my Linux server (which I hadn't rebooted in several
weeks). 

So, I'm starting to wonder whether the problem is the reboot...
I will try rebooting my server (again) to see what happens.
I will also run memtest86 for a bit just in case...

Any other suggestions?

