Re: [BackupPC-users] BackupPC_verifyPool mismatchs found - what to do?

2012-12-02 13:53:59
From: <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Sun, 02 Dec 2012 12:51:51 -0500
Holger Parplies wrote at about 17:59:43 +0100 on Sunday, December 2, 2012:
 > Hi,
 > 
 > Matthias Meyer wrote on 2012-12-02 01:57:07 +0100 [[BackupPC-users] 
 > BackupPC_verifyPool mismatchs found - what to do?]:
 > > I've tried the BackupPC_verifyPool.pl from Holger Parplies.
 > > Unfortunately, some MD5 errors were found and printed, e.g.:
 > > [28878] 083212b41c2482783128c9212f1f8a26     (        78) != 70f6bdb839eed7efdfe8f8b01f4dcbc7
 > > 
 > > Do I have a problem?
 > 
 > if there are no other indications, not necessarily. While this is not
 > *supposed* to happen, there used to be a bug in BackupPC that caused
 > top-level attrib files for backups with more than one share to be linked
 > into the pool with an incorrect pool file name. The only adverse effect was
 > that the (small amount of) space for the attrib file was not shared between
 > backups. I'm not sure when this bug was fixed. Perhaps Jeffrey can provide
 > you with more detail - he's the one that debugged the problem. The indicated
 > file size (78 bytes) seems plausible for a top-level attrib file, so this
 > may well be what you are seeing.

Here is a link to the thread explaining the original problem and
outlining several possible solutions.
http://adsm.org/lists/html/BackupPC-users/2009-12/msg00259.html

I think that Craig ultimately corrected it in the current release but
I am not sure.

Note that I have also diagnosed problems when switching between x86
and ARM architectures, caused by bugs in the checksumming algorithms,
though I would guess this is unlikely to be your issue. Besides being
an unlikely use case, it makes all of the file checksums wrong.

 > 
 > > What should/can I do?
 > 
 > Investigate whether this is the case. If not, look at the files in question.
 > See below.
 > 
 > > How to find out which file it is?
 > 
 > The above file is cpool/0/8/3/083212b41c2482783128c9212f1f8a26 (or pool/...
 > if you used the -u switch). Which file(s) in the pc/ tree this links to is
 > more difficult to determine. First of all, for the problem I mentioned
 > above, my understanding is that the link count of the pool file should be 2
 > (one pool link, one attrib file link; after that the pool file will never be
 > re-used, because it will never match, as it has a name not matching its
 > contents). If the link count is *not* 2, this seems to be an indication that
 > the contents changed on disk when they in no case should have, which is
 > Not Good(tm).
 > 
 > From the pool file ('ls -i cpool/0/8/3/083212b41c2482783128c9212f1f8a26')
 > you can determine the inode and search for that in the pc/ tree. Assuming
 > this is a top-level attrib file, you can speed things up greatly by not
 > traversing the backups. Depending on the number of hosts and backups, you
 > might get away with something as simple (though not necessarily as fast as
 > you might expect) as
 > 'ls -i1 pc/*/*/attrib | sort > /tmp/top-level-attrib-inums'. Search the
 > output file for the inode numbers you determined.
 > 
 > If it's not top-level attrib files, you're looking at something like
 > 'find pc/ -inum ... -ls', which will take *long*. You'll probably want to at
 > least look for all inodes in one traversal, which is probably easier to code
 > in Perl than type into one find invocation ;-).
 > 
 > In any case, you can look at the contents using just the pool file name and
 > BackupPC_zcat. That might give you enough information to be able to locate
 > the file.
 > For attrib files, BackupPC_zcat produces output that is not very human
 > readable, though it *does* contain the file names (meaning share names, in
 > the case of a top-level attrib file), so it might be good enough. I'm not
 > sure whether BackupPC_attribPrint will work with the pool file name, but you
 > could try that as well.
 > 
 > Hope that helps.
 > 
 > Regards,
 > Holger
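
As a rough illustration of the single-traversal search suggested above
(all inodes looked up in one pass over pc/), here is a minimal Perl
sketch. It is not part of BackupPC or jLib: the function name
find_links_by_inode and the command-line layout are made up for this
example, and it assumes the pool and the pc/ tree are on the same
filesystem, which hard links require anyway.

```perl
#!/usr/bin/perl
# Sketch only: list every file under a tree that is a hard link to one of
# the given pool files, using a single traversal for all of them.
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not in BackupPC or jLib).
# Returns a hash ref: pool file => [paths under $root with the same inode].
sub find_links_by_inode {
    my ($root, @poolfiles) = @_;
    my (%wanted, %found);
    foreach my $pf (@poolfiles) {
        my @st = lstat($pf);
        unless (@st) {
            warn "Can't stat $pf: $!\n";
            next;
        }
        $wanted{"$st[0]:$st[1]"} = $pf;    # key on device:inode
        $found{$pf} = [];
    }
    find({
        no_chdir => 1,                     # keep $_ as the full path
        wanted   => sub {
            my @st = lstat($_) or return;
            return unless -f _;            # regular files only
            my $pf = $wanted{"$st[0]:$st[1]"};
            push @{ $found{$pf} }, $_ if defined $pf;
        },
    }, $root);
    return \%found;
}

# Command-line use: <tree to search> <pool file> [<pool file> ...]
if (@ARGV >= 2) {
    my ($root, @poolfiles) = @ARGV;
    my $found = find_links_by_inode($root, @poolfiles);
    foreach my $pf (sort keys %$found) {
        my $nlink = (lstat($pf))[3];
        print "$pf (link count $nlink)\n";
        print "    $_\n" for sort @{ $found->{$pf} };
    }
}
```

Run from TopDir, the first argument would be pc and the remaining
arguments the mismatched pool files. Walking all of pc/ will still be
slow on a large installation; the point is only that it is done once
for all inodes rather than once per file.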

Thanks Holger for giving the above pointers.
While I am not familiar with Holger's routine, I have written a
similar routine that verifies and also *fixes* broken md5sum file
names. It does so by renaming (and, where necessary, moving) the pool
file.

Note that the program is more complicated than one might expect, since
it needs to deal with "chains" of pool files that share the same partial
md5sum but have different suffixes. That includes knowing where and when
to add to the chain: linking rather than copying if the target already
exists in the chain, or adding a new suffix to the chain if the content
is distinct or if MAXLINKS is exceeded. Also, if the moved file is part
of a chain, then the original chain needs to be renumbered to fill the
hole left by the renaming.
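
To make the chain bookkeeping concrete, here is a minimal,
self-contained sketch of the two naming operations involved: finding
the first free slot in a target chain and closing holes in a source
chain. The helper names first_free_chain_slot and close_chain_holes are
invented for this illustration and are not the jLib routines; the
sketch assumes chain members are named <digest>, <digest>_0,
<digest>_1, and so on, and that paths contain no glob metacharacters.

```perl
#!/usr/bin/perl
# Sketch only: chain bookkeeping for pool files that share one digest.
# Chain order is: <digest>, <digest>_0, <digest>_1, ...
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: first unused path in the chain rooted at $base.
sub first_free_chain_slot {
    my ($base) = @_;
    my $path = $base;
    for (my $i = 0; -e $path; $i++) {
        $path = "${base}_$i";
    }
    return $path;
}

# Hypothetical helper: rename chain members so the chain has no holes,
# preserving their relative order. Returns the number of files renamed.
sub close_chain_holes {
    my ($base) = @_;
    # Suffixed members, sorted by numeric suffix
    my @suffixed = grep { /^\Q$base\E_\d+$/ } bsd_glob("${base}_*");
    @suffixed = sort { ($a =~ /_(\d+)$/)[0] <=> ($b =~ /_(\d+)$/)[0] }
                @suffixed;
    my @members = ((-f $base ? ($base) : ()), @suffixed);

    my $renamed = 0;
    my $slot    = -1;                       # -1 means "no suffix"
    foreach my $member (@members) {
        my $want = $slot < 0 ? $base : "${base}_$slot";
        $slot++;
        next if $member eq $want;
        # Members are handled lowest first, so $want is always free here.
        rename($member, $want) or die "Can't rename $member -> $want: $!\n";
        $renamed++;
    }
    return $renamed;
}
```

This only covers the naming side. Comparing file contents and checking
the link count against MAXLINKS are left out.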


Here is the code in case you are interested...
Note: it requires my jLib library of routines, which should be
available on the wiki.

########################################################################


#!/usr/bin/perl
#============================================================= -*-perl-*-
#
# BackupPC_fixPoolMdsums: Rename/move pool files if mdsum path name invalid
#
# DESCRIPTION
#   See 'usage' for more detailed description of what it does
#   
# AUTHOR
#   Jeff Kosowsky
#
# COPYRIGHT
#   Copyright (C) 2011  Jeff Kosowsky
#
#   This program is free software; you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation; either version 2 of the License, or
#   (at your option) any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details.
#
#   You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#
#========================================================================
#
# Version 0.2, released January 2011
#
#========================================================================

use strict;
use warnings;

use lib "/usr/share/BackupPC/lib";
use BackupPC::Lib;
use BackupPC::jLib 0.4.0;  # Requires version >= 0.4.0
use Digest::MD5;
use File::Glob ':glob';
use Getopt::Long qw(:config no_ignore_case bundling);

#Variables
my $bpc = BackupPC::Lib->new or die("BackupPC::Lib->new failed\n");
my $md5 = Digest::MD5->new;
my $MAXLINKS = $bpc->{Conf}{HardLinkMax};

#Option variables:
my $Force;
my $warndups;
my $outfile;
my $TopDir = my $TopDir_def = $bpc->{TopDir};
my $verbose=0;
#$dryrun=0;  #Global variable defined in jLib.pm (do not use 'my')
$dryrun=1;  #Global variable defined in jLib.pm (do not use 'my')

usage() unless( 
        GetOptions( 
                "dryrun|d!"        => \$dryrun,
                "Force"            => \$Force,     #Override stuff...
                "outfile|o=s"      => \$outfile,
                "topdir|t=s"       => \$TopDir,    #Location of TopDir
                "verbose|v+"       => \$verbose,   #Verbosity (repeats allowed)
                "warndups|w"       => \$warndups,  #Warn if dup created in new chain
        )
        && defined $outfile
        );

#############################################################################
if($TopDir ne $TopDir_def) {
        #NOTE: if we are not using the TopDir in the config file, then we
        # need to manually override the settings of BackupPC::Lib->new
        # which *doesn't* allow you to set TopDir (even though it seems so
        # from the function definition, it gets overwritten later when the
        # config file is read)
        $TopDir =~ s|//*|/|g; #Remove any lurking double slashes
        $TopDir =~ s|/*$||g; #Remove trailing slash
        $bpc->{TopDir} = $TopDir;
        $bpc->{Conf}{TopDir} = $TopDir;

        $bpc->{storage}->setPaths({TopDir => $TopDir});
        $bpc->{PoolDir}  = "$bpc->{TopDir}/pool";
        $bpc->{CPoolDir} = "$bpc->{TopDir}/cpool";
}

%Conf   = $bpc->Conf(); #Global variable defined in jLib.pm (do not use 'my')
#############################################################################
my $compress = $Conf{CompressLevel};
my $pool = $compress > 0 ? "cpool" : "pool";
my $compare = $compress > 0 ? \&zcompare2 : \&jcompare;
my $file2md5 = $compress > 0 ? \&zFile2MD5 : \&File2MD5;

my ($OUT);
die "ERROR: '$outfile' already exists!\n" if -e $outfile;
open($OUT, '>', "$outfile") or
        die "ERROR: Can't open '$outfile' for writing!($!)\n";

chdir $TopDir or die "ERROR: Can't chdir to '$TopDir' ($!)\n";

if(!$Force && 
   (my @partialbackups = glob("pc/*/NewFileList{,.[0-9]*}"))) {
        #Double quotes in the join so that \n is a real newline
        die("ERROR: Pool conflicts will occur if NewFileList present (--Force overrides):\n          " . 
                join("\n          ", @partialbackups) . "\n");
}

system("$bpc->{InstallDir}/bin/BackupPC_serverMesg status jobs >/dev/null 2>&1");
unless(($? >>8) == 1) {
        die "Dangerous to run when BackupPC is running!!!\n"
                if $TopDir eq $TopDir_def;
        #Warn but don't die if we *appear* to be in a different TopDir
        warn "WARNING: May be dangerous to run when BackupPC is running!!!\n";
}

my $totalfiles = 0;
my $md5errors = 0;
my $fixed = 0;
my $badpoolentry = 0;
my $chaindups = 0;
my $norename = 0;
my $renumbererr = 0;

scan_pool($pool);

#Note: Total = total pool files scanned
#     NotFixed = NoRename (i.e. rename fails - shouldn't happen...)
#     ChainDups (only calculated if --warndups set)
#     RenumberErrs = Errors renumbering old chain after move (shouldn't happen...)
printf("Total=%d Md5PathErrors=%d [Fixed=%d, NotFixed=%d]%s\n",
           $totalfiles, $md5errors, $fixed, ($md5errors-$fixed), 
           ($dryrun ? " DRY-RUN" : ""));
printf("BadPoolEntries=%d RenumberErrs=%d\n", $badpoolentry, $renumbererr);
printf("ChainDups=%d\n", $chaindups) if $warndups;
print "\n";
exit;

#######################################################################
#Run through the pool looking for misnamed md5sum paths
sub scan_pool
{
        my ($fpool) = @_;
        my ($dh, @fstat);

        return unless glob("$fpool/[0-9a-f]"); #No entries in pool
        my @hexlist = ('0'..'9', 'a'..'f');
        my ($idir, $jdir, $kdir);
        foreach my $i (@hexlist) {
                print STDERR "\n**$fpool/$i: " if $verbose >=2;
                $idir = $fpool . "/" . $i . "/";
                foreach my $j (@hexlist) {
                        print STDERR "$j " if $verbose >=2;
                        $jdir = $idir . $j . "/";
                        foreach my $k (@hexlist) {
                                $kdir = $jdir . $k . "/";
                                unless(opendir($dh, $kdir)) {
                                        warn "Can't open pool directory: $kdir\n" if $verbose>=4;
                                        next;
                                }
                                #Sort directory entries so that chains are ordered lowest to
                                #highest - This preserves sequential order between source and
                                #target chains PLUS ensures that we fill holes correctly and
                                #most efficiently
                                my @entries = sort {poolname2number($a) cmp poolname2number($b)}
                                                    (readdir ($dh));
                                closedir($dh); #Directory handles need closedir, not close
                                warn "POOLDIR: $kdir (" . ($#entries-1) ." files)\n"
                                        if $verbose >=3;

                                my $chainholes = 0;
                                my $chainstart;
                                my $lastdigest='';
                                foreach (@entries) {
                                        next if /^\.\.?$/;     # skip dot files (. and ..)

                                        $totalfiles++;
                                        my $origfile = ${kdir} . $_;
                                        unless(m|^([0-9a-f]+)(_[0-9]+)?|) {
                                                warn "ERROR: '$origfile' is not a valid pool entry\n";
                                                $badpoolentry++;
                                                next;
                                        }
                                        my $origdigest = $1;
                                        my $newdigest = $file2md5->($bpc, $md5, $origfile, -1, $compress);
                                        if($newdigest eq "-1") {
                                                $badpoolentry++;
                                                warn "ERROR: Can't calculate md5sum name for: $origfile\n";
                                                next;
                                        }
                                        if($newdigest ne $origdigest) { #MD5sum Path is WRONG
                                                $md5errors++;
                                                if($origdigest ne $lastdigest) { #New chain
                                                        #So go back and renumber last chain to remove holes
                                                        if($chainholes > 0) {
                                                                renumber_pool_chain($chainstart, $chainholes)>0
                                                                        or $renumbererr++;
                                                        }
                                                        $lastdigest=$origdigest; #Reset to new chain base
                                                        $chainholes = 0;
                                                        #lowest element of chain, since we are
                                                        #sorting directory in chain order
                                                        $chainstart = $origfile;
                                                }
                                                if(rename_entry($origfile, $newdigest)>0) {$chainholes++}
                                        }
                                }
                                #Check in case any holes unfixed at end of directory scan
                                if($chainholes > 0) {
                                        renumber_pool_chain($chainstart, $chainholes) >0
                                                or $renumbererr++;
                                }
                        }
                }
        }
        print STDERR "\n" if $verbose >=2;
}

#Rename/move pool chain entry $source to first open position
#in $digest chain if permitted. Renumber source chain as
#needed after the move
sub rename_entry
{
        my ($source, $digest) = @_;

        my $dups = '';
        my @dups = ();

        my $poolpath = $bpc->MD52Path($digest,$compress);
        my $poolbase_ = $poolpath . "_";
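        # Iterate through pool chain with same md5sum to find first free entry.
        # $i counts chain positions: position 0 is the file with no suffix,
        # position $i > 0 is the file with suffix _($i-1).
        for(my $i=0; -f $poolpath; $i++) {
                if($warndups && (stat(_))[3] < $MAXLINKS &&
                   ! $compare->($source,$poolpath)) { #Matches existing pool entry
                        push(@dups,$i-1); #Record the suffix (-1 = no suffix), as described in usage
                }
                $poolpath = $poolbase_ . $i;
        }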
        $poolpath =~ m|^$TopDir/?((.*)/.*)|;
        my $target = $1; #Path relative to TopDir
        my $dir = $2; #Relative to TopDir
#       print "$source $target [$md5errors/$totalfiles]$dups\n";

        if(@dups) {
                $dups = " CHAINDUPS(" . join(',', @dups) . ")";
                warn "WARN: $dups: $source->$target\n" if $verbose >=1;
                $chaindups++;
        }

        if((!-d $dir && !jmkpath($dir, 0, 0750)) || #Directory creation error
           !jrename($source,$poolpath)) { #Rename error
                warn "ERROR: Can't rename: $source->$target\n" if $verbose >=1;
                print $OUT "$source $target NO_RENAME$dups\n";
                $norename++;
                return -1;
                }

        #Fixed without errors
        print $OUT "$source $target FIXED$dups\n";
        $fixed++;
        return 1;
}

sub usage
{
    print STDERR <<EOF;

usage: $0 [options] --outfile|-o <outfile>  
  Options:
   --dryrun|-d         Dry-run 
                       Negate with: --nodryrun
   --Force             Force - MAY BE DANGEROUS!
   --topdir|-t         Location of TopDir. [Default = $TopDir_def]
                       Note you may want to change from default for example
                       if you are working on a shadow copy.
   --verbose|-v        Verbose (repeat for more verbosity)
   --warndups|-w       Warn if renamed pool entry is a pool duplicate

  DESCRIPTION:
    Find and fix md5sum pool name errors in pool and cpool

  DETAILS:
    Recurses through pool and cpool trees to test if the md5sum name of each
    pool file is correct relative to the file data. If not, the program attempts
    to rename (i.e. move) it to its proper md5sum name.

    If there already are pool files with the new name, then move it to
    the end of the target chain. After moving, renumber the source
    chain (if needed) to fill in holes left by the move. Note that the relative
    ordering of each chain is preserved.

    If the contents of the file match the contents of any of the files in the
    target chain, note the duplicate suffix numbers.

    If the --warndups|-w flag is set then check to see if the renamed pool
    entry duplicates an existing pool entry (with <MAX LINKS). This may be
    slow if you have a lot of chains since to check for duplicates you need
    to compare files.
    Note: it is not generally an error to have two pool entries in the same
    chain with the same data (in fact, it occurs intentionally when you exceed
    MAX LINKS), it just may waste some space. My routine BackupPC_fixLinks.pl
    can correct just such duplicates later if that is an issue.

    In any case, if all your misnumbering was consistent you won\'t have this
    situation anyway.

    <outfile> records all the changes made plus appends a status code:

    FIXED = Pool file moved/renamed and original chain renumbered if needed.
    NO_RENAME = Signals error in renaming/moving the pool file. The md5sum name
                was thus not corrected.
    CHAINDUPS(n1,n2,n3) = Signals duplicates in the target chain and lists the
                          suffixes (-1 = no suffix)

    The following errors are tabulated but shouldn\'t occur:
    BadPoolEntries = Files in pool with name not of form [0-9a-f]+(_[0-9]+)?
                     or that can\'t be read to compute the MD5 path digest
    RenumberErrs = Failures to renumber original chain to fill hole
                   after renaming
    
EOF
exit(1)
}
