BackupPC-users

Re: [BackupPC-users] Filesystem corruption: consistency of the pool

2010-11-16 19:01:46
Subject: Re: [BackupPC-users] Filesystem corruption: consistency of the pool
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Tue, 16 Nov 2010 18:59:17 -0500
martin f krafft wrote at about 19:16:52 +0100 on Wednesday, November 3, 2010:
 > Hello,
 > 
 > My filesystem holding the backuppc pool was corrupted. While e2fsck
 > managed to fix it all and now doesn't complain anymore, I am a bit
 > scared that the backuppc pool isn't consistent anymore.
 > 
 > Is there a tool to check the consistency of the pool?
 > 
 > Is there a tool to repair an inconsistent pool?
 > 

I wrote two programs that might be helpful here:
1. BackupPC_digestVerify.pl
   If you use rsync with checksum caching then this program checks the
   (uncompressed) contents of each pool file against the stored md4
   checksum. This should catch any bit errors in the pool. (Note
   though that I seem to recall that the checksum only gets stored the
   second time a file in the pool is backed up so some pool files may
   not have a checksum included - I may be wrong since it's been a
   while...)

2. BackupPC_fixLinks.pl
   This program scans through both the pool and pc trees to look for
   wrong, duplicate, or missing links. It can fix most errors.

The second program is on the wiki somewhere.
I will attach below a copy of the first program.
I find that the above two routines do a pretty good job of checking
for corruption in the pc and pool trees.
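
As a rough illustration of the "first byte" test that the usage text and
return codes below refer to, here is a minimal standalone sketch. It is
not BackupPC's own code: the helper name looks_digest_cached is
hypothetical, and it only inspects the first byte (0xd7) of a file. It
does not validate the digest itself, which is what the full script does
through BackupPC::Xfer::RsyncDigest.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BackupPC): returns 1 if the first byte
# of $file is 0xd7, 0 if it is anything else, and undef if the file is
# empty or cannot be read.
sub looks_digest_cached {
    my ($file) = @_;
    open(my $fh, '<:raw', $file) or return undef;
    my $n = read($fh, my $byte, 1);
    close($fh);
    return undef unless $n;              # empty file or read error
    return ord($byte) == 0xd7 ? 1 : 0;
}

# Command-line use: report on each file named as an argument
unless (caller) {
    for my $file (@ARGV) {
        my $r = looks_digest_cached($file);
        printf "%s: %s\n", $file,
            !defined($r) ? "unreadable or empty"
            : $r         ? "first byte is 0xd7"
            :              "first byte is not 0xd7";
    }
}
```

This can be handy for a quick spot check of how many pool files carry a
digest at all before running the full verification.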

-----------------------------------------------------------------------------

#!/usr/bin/perl
#========================================================================
#
# BackupPC_digestVerify.pl
#                       
#
# DESCRIPTION
#   Check contents of cpool and/or pc tree entries (or the entire tree) 
#   against the stored rsync md4 checksum digests (when available)
#
# AUTHOR
#   Jeff Kosowsky
#
# COPYRIGHT
#   Copyright (C) 2010  Jeff Kosowsky
#
#   This program is free software; you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation; either version 2 of the License, or
#   (at your option) any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details.
#
#   You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#
#========================================================================
#
# Version 0.1, released Nov 2010
#
#========================================================================


use strict;
use Getopt::Std;

use lib "/usr/share/BackupPC/lib";
use BackupPC::Xfer::RsyncDigest;
use BackupPC::Lib;
use File::Find;

use constant RSYNC_CSUMSEED_CACHE     => 32761;
use constant DEFAULT_BLOCKSIZE     => 2048;


my $dotfreq=100;
my %opts;
if ( !getopts("cCpdv", \%opts) || @ARGV !=1
         || ($opts{c} + $opts{C} + $opts{p} > 1)
         || ($opts{d} + $opts{v} > 1)) {
    print STDERR <<EOF;
usage: $0 [-c|-C|-p] [-d|-v] [File or Directory]
  Verify Rsync digest in compressed files containing digests
  (firstbyte = 0xd7).
  Ignores directories and files without digests.
  Only prints if digest does not match content unless verbose flag is set.
  Options:
    -c   Consider path relative to cpool directory
    -C   Entry is a single cpool file name (no path)
    -p   Consider path relative to pc directory
    -d   Print a '.' for every $dotfreq digest checks
    -v   Verbose - print result of each check;

EOF
exit(1);
}

die("BackupPC::Lib->new failed\n") if ( !(my $bpc = BackupPC::Lib->new) );
#die("BackupPC::Lib->new failed\n") if ( !(my $bpc = BackupPC::Lib->new("", "", "", 1)) ); #No user check

my $Topdir = $bpc->TopDir();
my $root;
$root = $Topdir . "/pc/" if $opts{p};
$root = "$bpc->{CPoolDir}/" if $opts{c};
$root =~ s|//*|/|g;

my $path = $ARGV[0];
if ($opts{C}) {
        $path = $bpc->MD52Path($ARGV[0], 1, $bpc->{CPoolDir});
        $path =~ m|(.*/)|;
        $root = $1; 
}
else {
        $path = $root . $ARGV[0];
}
my $verbose=$opts{v};
my $progress= $opts{d};

die "$0: Cannot read $path\n" unless (-r $path);


my ($totfiles, $totdigfiles, $totbadfiles) = (0, 0 , 0);
find(\&verify_digest, $path); 
print "\n" if $progress;
print "Looked at $totfiles files including $totdigfiles digest files of which $totbadfiles have bad digests\n";
exit;

sub verify_digest {
        return -200 unless (-f);
        $totfiles++;
        return -200 unless -s > 0;
        # Skip files that are not the cached type (i.e. first byte not 0xd7)
        return -201 unless BackupPC::Xfer::RsyncDigest->fileDigestIsCached($_);
        $totdigfiles++;

        # Note setting blocksize=0 results in using the default blocksize of
        # 2048 also, but it generates an error message.
        # Also leave out final protocol_version input since by setting it
        # undefined we make it read it from the digest.
        my $ret = BackupPC::Xfer::RsyncDigest->digestAdd($_, DEFAULT_BLOCKSIZE,
                RSYNC_CSUMSEED_CACHE, 2);  #2=verify
        $totbadfiles++ if $ret!=1;

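        # Strip the leading root; \Q..\E keeps path characters from being
        # treated as regex metacharacters, and ^ avoids an empty pattern
        (my $file = $File::Find::name) =~ s|^\Q$root\E||;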
        if ($progress && !($totdigfiles%$dotfreq)) {
                print STDERR "."; 
                ++$|; # flush print buffer
        }
        if ($verbose || $ret!=1) {
                my $inode = (stat($File::Find::name))[1];
                print "$inode $ret $file\n";
        }
        return $ret;
}

# Return codes:
# -100: Wrong RSYNC_CSUMSEED_CACHE or zero file size
# -101: Bad/missing RsyncLib
# -102: ZIO can't open file
# -103: sysopen can't open file
# -104: sysread can't read file
# -105: Bad first byte (not 0x78, 0xd6 or 0xd7)
# -106: Can't seek to end of file
# -107: First byte not 0xd7
# -108: Error on reading digest
# -109: Can't seek when trying to position to rewrite digest data
#       (shouldn't happen if only verifying)
# -110: Can't write digest data (shouldn't happen if only verifying)
# -111: Can't seek looking for extraneous data after digest
#       (shouldn't happen if only verifying)
# -112: Can't truncate extraneous data after digest
#       (shouldn't happen if only verifying)
# -113: If can't sysseek back to file beginning
#       (shouldn't happen if only verifying)
# -114: If can't write out first byte (0xd7)
#       (shouldn't happen if only verifying)
# 1: Digest verified
# 2: Digest wrong
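
-----------------------------------------------------------------------------

Since the script prints one "inode ret file" line for each file it
reports, a saved copy of its output can be tallied by return code. The
following is only a sketch under that assumption: the function name
tally_results and the idea of passing the saved output as a file
argument are hypothetical, and the %meaning table copies just a few
entries from the return code list above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A few of the return codes listed above (not the complete table)
my %meaning = (
    1    => "Digest verified",
    2    => "Digest wrong",
    -102 => "ZIO can't open file",
    -104 => "sysread can't read file",
    -107 => "First byte not 0xd7",
);

# Count lines of the form "<inode> <ret> <file>". Anything else, such as
# the final "Looked at ..." summary line, is skipped.
sub tally_results {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        next unless $line =~ /^(\d+) (-?\d+) (.+)$/;
        $count{$2}++;
    }
    return \%count;
}

# Command-line use: pass one or more files holding saved output
if (!caller && @ARGV) {
    my $count = tally_results(<>);
    for my $code (sort { $a <=> $b } keys %$count) {
        printf "%6d  %-5s %s\n", $count->{$code}, $code,
            $meaning{$code} // "(see return code list)";
    }
}
```

With the -v flag every checked file is printed, so the tally then covers
verified files as well as bad ones. Without it only the failures appear.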
