Holger Parplies wrote at about 05:39:15 +0200 on Sunday, October 2, 2011:
> Mike Dresser wrote on 2011-09-29 14:11:20 -0400 [[BackupPC-users] Fairly
> large backuppc pool (4TB) moved with backuppc_tarpccopy]:
> > [...] Did see a few errors, all of them were related to the attrib files,
> > similar to "Can't find xx/116/f%2f/fvar/flog/attrib in pool, will copy
> > file"
> > [...]
> > Out of curiosity, where are those errors (the attrib in pool ones)
> > coming from?
>
> (which is a question, and a good one).
>
> I can't promise that this is the correct answer, but it's a possibility:
> prior to BackupPC 3.2.0, *top-level* attrib files (i.e. those for the
> directory containing all the share subdirectories) were linked into the
> pool with an incorrect digest, presuming there was more than one share.
> This would mean that BackupPC_tarPCCopy would not find the content in
> the pool, because it would look for a file with the *correct* digest
> (i.e. file name). Please note that your quote above does *not* reference
> a *top-level* attrib file (that would be "xx/116/attrib"), and, beyond
> that, you don't seem to have multiple shares, so it might well be a
> different problem.
>
> According to the ChangeLog, Jeffrey should have pointed this out, because he
> discovered the bug and supplied a patch ;-).
>
> I notice this problem on my pool when investigating where the longest hash
> collision chain comes from: it's a chain of top-level attrib files - all for
> the same host and with different contents and thus certainly different
> digests.
As Holger points out, the bug I reported and suggested a patch for
involved top-level attribs where you have more than one share. This
has been fixed in 3.2.0.
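For reference, here is roughly why a wrong digest matters. The digest
below is made up, and the layout is just my understanding of how the 3.x
pool is organized:

    # hypothetical digest, only to show the layout
    DIGEST=4a7d1ed414474e4033ac29ccb8653d9b
    TOPDIR=/var/lib/backuppc        # adjust to your installation
    # BackupPC 3.x files a pool entry under the first three hex
    # characters of its partial-file md5sum:
    echo "$TOPDIR/cpool/${DIGEST:0:1}/${DIGEST:1:1}/${DIGEST:2:1}/$DIGEST"
    # -> /var/lib/backuppc/cpool/4/a/7/4a7d1ed414474e4033ac29ccb8653d9b
    # (a _0, _1, ... suffix is appended when different files collide on
    # the same digest)

BackupPC_tarPCCopy recomputes the digest from the file contents and looks
only under that path, so an attrib file that was linked in under a *wrong*
digest is invisible to it - hence the "will copy file" messages.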
That being said, in the past I did find a couple of broken attrib file
md5sums out of many hundreds of thousands, but I assumed at the time that
it was an artifact of some other messing around I may have been doing.
If you are finding missing pooled attrib files outside the top level,
then it would be interesting to figure out what is causing it, since
there may be a real bug somewhere (though again, I haven't seen the
problem recently, but then I haven't really looked lately either).
If you want to troubleshoot, I would do the following (a rough shell
sketch of the first few steps follows the list):
- Look up the inode of the bad attrib file in the pc tree
- Check how many links it has
- Assuming it has nlinks >1, search for that inode in the pool using
say find <topdir>/cpool -inum <inode number>
- If the file is indeed hard-linked into the pool, calculate the
  actual partial md5sum of the file (not the *nix md5sum) using say
  one of my routines, or the quick check sketched at the end of this
  message. See whether the calculated partial-file md5sum matches the
  pool file name; if this is the bug, presumably it won't.
- If the file is not there, then that is another issue.
- Also, look back through your logs to see when the attrib file was
  actually created and written to the pool, and see if anything looks
  weird/wrong there.
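Concretely, the first few steps might look something like this (a rough
sketch; the host and attrib path are just examples, and $TOPDIR is
wherever your pool actually lives):

    TOPDIR=/var/lib/backuppc        # adjust to your setup
    # an attrib path from the pc tree that BackupPC_tarPCCopy complained
    # about (this one is made up)
    ATTRIB="$TOPDIR/pc/somehost/116/f%2f/fvar/flog/attrib"

    # 1. inode and hard link count of the attrib file in the pc tree
    stat -c 'inode=%i links=%h' "$ATTRIB"

    # 2. if the link count is > 1, look for the same inode in the
    #    compressed pool
    find "$TOPDIR/cpool" -inum "$(stat -c %i "$ATTRIB")"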
Assuming that you do have a real issue with non-top-level attrib file
md5sums or pool links, it would be interesting to see whether anybody
has encountered the same problem in versions >= 3.2.0.
>
> > I still have the old filesystem online if it's something I
> > should look at.
>
> I don't think it's really important. If the attrib file was not in the pool
> previously, then that may simply have wasted a small amount of space. As I
> understand the matter, the file will remain unpooled in the copy. You could
> fix that with one of Jeffrey's scripts or just live with a few wasted bytes.
> If you are running a BackupPC version < 3.2.0, pooling likely won't work for
> those attrib files anyway.
>
> It might be interesting to determine whether the non-top-level attrib files
> you got errors for are also, in fact, pooled under an incorrect pool file
> name, though that would involve finding the pool file by inode number and
> calculating the correct pool hash (or ruling out the existence of a pool file
> due to a link count of 1 :-).
Agreed - see my suggestion above; a rough sketch of the digest check
follows.
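In case it helps, the partial-file md5sum can be approximated from the
shell. This assumes the 3.x digest scheme as I remember it (for files
under 256KB, which attrib files essentially always are, the digest is
the md5 of the decimal uncompressed length followed by the uncompressed
contents), so treat it as a sketch rather than gospel:

    # POOLFILE is whatever the find -inum above turned up (ignore any _N
    # suffix when comparing names); run this as your backuppc user so
    # BackupPC_zcat (in the BackupPC bin directory) can read the file
    BackupPC_zcat "$POOLFILE" > /tmp/attrib.raw

    # digest = md5( decimal length . contents ) of the *uncompressed* data
    SIZE=$(stat -c %s /tmp/attrib.raw)
    { printf '%s' "$SIZE"; cat /tmp/attrib.raw; } | md5sum

    # compare the result against the pool file's basename; for a file
    # that tripped the bug, they presumably won't match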