Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS

2008-10-30 10:06:30
Subject: Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: General list for user discussion <backuppc-users AT lists.sourceforge DOT net>
Date: Thu, 30 Oct 2008 10:04:26 -0400
Holger Parplies wrote at about 11:29:49 +0100 on Thursday, October 30, 2008:
 > Hi,
 > 
 > Jeffrey J. Kosowsky wrote on 2008-10-30 03:55:16 -0400 [[BackupPC-users] 
 > Duplicate files in pool with same CHECKSUM and same CONTENTS]:
 > > I have found a number of files in my pool that have the same checksum
 > > (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy
 > > has a few links to it by the way.
 > > 
 > > Why is this happening? 
 > 
 > presumably creating a link sometimes fails, so BackupPC copies the file,
 > assuming the hard link limit has been reached. I suspect problems with your
 > NFS server, though not a "stale NFS file handle" in this case, since copying
 > the file succeeds. Strange.

Yes - I am beginning to think that may be true. However, as I mentioned
in the other thread, the syslog on the NFS server is clean, and the one
on the client shows only a dozen or so NFS timeouts over the past 12
hours, which is the period I am looking at now. Otherwise, I don't see
any NFS errors.
So if it is an NFS problem, it seems to be happening more or less
randomly and without the filesystem reporting anything.

 > 
 > >   Isn't this against the whole theory of pooling.
 > 
 > Well, yes :). But the action of copying the file when the method to implement
 > pooling (hard links) does not work for some reason (max link count reached, or
 > NFS file server errors if you think about it - you *do* get some level of
 > pooling; otherwise you'd have an independent copy or a missing file each time
 > linking fails) is perfectly reasonable.
 > 
 > >   It also doesn't seem
 > >   to get cleaned up by BackupPC_nightly since that has run several times
 > >   and the pool files are now several days old.
 > 
 > BackupPC_nightly is not supposed to clean up that situation. It could be
 > designed to do so (the situation may arise when a "link count overflow" is
 > resolved by expired backups), but it would be horribly inefficient: for the
 > file to be eliminated, you would have to find() every occurrence of the inode
 > in all pc/* trees and replace them with links to the duplicate(s) to be kept.
 > You don't want that.

Yes, but it would be nice to have a switch that enabled this more
comprehensive cleanup.
Even in a non-error case, I can imagine situations where the max link
count was exceeded at some point and backups were later deleted, so
that the link count came back down below the max.

The logic doesn't seem that horrendous, since you would only need to
walk down the pc/* trees once: first walk down (c)pool/* to compile a
list of files that share a checksum and have identical contents, then
walk down the pc/* trees to find the files on that list.
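
The two passes described above could be sketched roughly as follows.
This is not BackupPC code, only an illustration under some assumptions:
that duplicates sit in the same pool directory and differ only by a
trailing _N suffix, that comparing the stored files byte for byte is a
good enough test of "same contents", and that pool and pc/* are on one
filesystem so inode numbers can be matched. The function names are
made up for this example, and nothing is modified; the second pass
only produces a list of candidate relinks.

```python
import filecmp
import os
import re
from collections import defaultdict

# Assumed naming: duplicates differ only by a trailing "_0", "_1", ...
SUFFIX = re.compile(r"_\d+$")

def find_redundant_pool_files(pool_root):
    """Pass 1: map inode of each redundant pool file -> path of the copy to keep."""
    chains = defaultdict(list)
    for dirpath, _dirs, files in os.walk(pool_root):
        for name in files:
            base = SUFFIX.sub("", name)
            chains[(dirpath, base)].append(os.path.join(dirpath, name))

    redundant = {}
    for paths in chains.values():
        paths.sort()
        keepers = []  # one representative per distinct content
        for path in paths:
            for keep in keepers:
                if os.path.samefile(keep, path):
                    break  # already the same inode, nothing to merge
                if filecmp.cmp(keep, path, shallow=False):
                    redundant[os.stat(path).st_ino] = keep
                    break
            else:
                keepers.append(path)
    return redundant

def plan_relinks(pc_root, redundant):
    """Pass 2: a single walk over pc/*, listing (pc_file, keep_path) pairs."""
    plan = []
    for dirpath, _dirs, files in os.walk(pc_root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Inode match only; assumes pc/* and the pool share a filesystem.
            ino = os.lstat(path).st_ino
            if ino in redundant:
                plan.append((path, redundant[ino]))
    return plan
```

Anything that actually acted on such a plan would also have to check
that relinking does not push the kept file over the hard link limit,
which is the very condition that creates legitimate duplicates.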

 > 
 > > What can I do to clean it up?
 > 
 > Fix your NFS server? :) Is there a consistent maximum number of links, or do
 > the copies seem to happen randomly? Honestly, I don't think the savings you
 > may gain from storing the pool over NFS are worth the headaches. What is
 > cheaper about putting a large disk into a NAS device than into your BackupPC
 > server? Well, yes, you can share it ... how about exporting part of the disk
 > from the BackupPC server (I would still recommend distinct partitions)?
 > 

You are right in theory. But I would still like to get NFS working for
various reasons and it is always a good "learning experience" to
troubleshoot such things ;)
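
On the question of whether the copies appear at a consistent link
count or at random, a small report of link counts per group of
same-checksum pool files might help. The sketch below is only an
illustration: the function name is invented, and it assumes the same
_N suffix naming as in the original report. If the earlier members of
a group have only a few links each, the hard link limit was not what
triggered the copy.

```python
import os
import re
from collections import defaultdict

def chain_link_counts(pool_root):
    """Return {base_path: [(file_name, link_count), ...]} for groups with duplicates."""
    chains = defaultdict(list)
    for dirpath, _dirs, files in os.walk(pool_root):
        for name in files:
            # Assumed naming: strip a trailing "_0", "_1", ... to get the base name.
            base = re.sub(r"_\d+$", "", name)
            nlink = os.stat(os.path.join(dirpath, name)).st_nlink
            chains[os.path.join(dirpath, base)].append((name, nlink))
    # Keep only groups with more than one member.
    return {base: sorted(members)
            for base, members in chains.items() if len(members) > 1}
```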

