Bacula-users

Re: [Bacula-users] Verify differences: SHA1 sum doesn't match but it should

2010-08-30 13:01:05
From: Bruno Friedmann <bruno AT ioda-net DOT ch>
To: bacula-users AT lists.sourceforge DOT net
Date: Mon, 30 Aug 2010 18:58:48 +0200
On 08/30/2010 06:35 PM, Tobias Brink wrote:
> Steve Costaras <stevecs AT chaven DOT com> writes:
> 
>>  Could be due to a transient error (transmission or wild/torn read at
>> time of calculation).   I see this a lot with integrity checking of
>> files here (50TiB of storage).
>>
>> Only way to get around this now is to do a known-good sha1/md5 hash of
>> data (2-3 reads of the file make sure that they all match and that the
>> file is not corrupted) save that as a baseline and then when doing
>> reads/compares if one fails do another re-read and see if the first
>> one was in error and compare that with your baseline.     This is one
>> reason why I'm switching to the new generation of sas drives that have
>> ioecc checks on READS not just writes to help cut down on some of
>> this.
>>
>> Corruption does occur as well and is more probable with the higher the
>> capacity of the drive.     Ideally you would have a drive that would
>> do ioecc on reads, plus using T10 PI extensions (DIX/DIF) from drive
>> to controller up to your file system layer.    It won't always prevent
>> it by itself but would allow if you have a raid setup to do some
>> self-healing when a drive reports a non transient (i.e. corrupted
>> sector of data).
> 
> First off, thanks for the answers.  The thing is that I am well aware of
> the reliability problems of hard drives and I would love to use some
> advanced file system like ZFS or btrfs, but I am on Debian and I will
> stay on Debian.  And btrfs is not mature enough to be used in production
> at the moment.  The other thing is that I do not think that this is an
> issue of corruption of the data itself!  As I said I checked the files
> against backups and MD5 sums supplied by Debian (several times and from
> cold cache) and the data seems to be OK.  The executables that are
> reported by Bacula to have changed continue to work well and bug-free
> just as before.
> 
> So I think this is a problem/bug with either the Postgresql database or
> Bacula, not with my hard drives.  I just wonder how something like this
> could happen and how I could avoid this.  I'm also not willing to do
> additional checksums with other programs (AIDE or similar) because they
> take _lots_ of time to run.  With Bacula I get the checksums for free.
> I just want to use them to detect corruption on disk from time to time
> and because I use VirtualFull and want to know if my differential
> backups have missed something.
> 
> So I still don't know how to proceed.  Apart from that I will try to
> upgrade my director and sd to 5.0.2 as soon as Debian backports are
> available and see if the problem goes away.  I will also re-run the
> DiskToCatalog after my next differential backup and see if something
> is different.
> 
> Thanks,
> Tobias
> 

Tobias, I use this little Python script to extract file information from the
catalog; I wrote it to track duplicate files
(users are users :-)

I hope it gives you some inspiration and helps you decode the lstat column.
(If I remember correctly, someone has also done the same in PL/pgSQL: check the
list archives.)

#!/usr/bin/python
# -*- coding: utf-8 -*-
#
#       Call it with a jobid and redirect the output to a CSV file.
#
import sys
import MySQLdb

jobid = sys.argv[1]

def base64_decode_lstat(record, position):
    """Decode the space-separated LStat field at `position` into an integer."""
    # Bacula's base64 alphabet; digits are stored most significant first.
    b64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    val = 0
    for char in record.split(' ')[position]:
        val = val * 64 + b64.find(char)
    return val

# Adjust host, user, passwd and db to match your catalog.

db = MySQLdb.connect(host="localhost", user="bacula", passwd="bacula",
                     db="bacula")
db.set_character_set('utf8')
cursor = db.cursor()
cursor.execute('SET NAMES utf8;')
cursor.execute('SET CHARACTER SET utf8;')
cursor.execute('SET character_set_connection=utf8;')
# The job id is passed as a query parameter rather than pasted into the SQL.
cursor.execute("SELECT File.MD5 AS checksum, "
               "CONVERT(Filename.Name USING utf8) AS filename, "
               "CONVERT(Path.Path USING utf8) AS path, "
               "File.LStat AS lstat "
               "FROM File, Filename, Path "
               "WHERE File.JobId = %s "
               "AND Filename.FilenameId = File.FilenameId "
               "AND Path.PathId = File.PathId "
               "ORDER BY File.MD5, Filename.Name, Path.Path", (jobid,))
result = cursor.fetchall()

# No header row is printed. The columns are:
# "checksum";"filename";"path";"lstat";"gid";"uid";"bytes";"blocksize";"blocks_allocated";"atime";"mtime";"ctime"

for record in result:
    print('"%s";"%s";"%s";"%s";"%s";"%s";"%s";"%s";"%s";"%s";"%s";"%s"' % (
        record[0], record[1], record[2], record[3],
        base64_decode_lstat(record[3], 5),    # gid
        base64_decode_lstat(record[3], 4),    # uid (field 6 is st_rdev, not the uid)
        base64_decode_lstat(record[3], 7),    # bytes
        base64_decode_lstat(record[3], 8),    # blocksize
        base64_decode_lstat(record[3], 9),    # blocks_allocated
        base64_decode_lstat(record[3], 10),   # atime
        base64_decode_lstat(record[3], 11),   # mtime
        base64_decode_lstat(record[3], 12),   # ctime
    ))

# No empty line is printed at the end.
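
To experiment with the lstat decoding without a database connection, something like the following sketch may help. It assumes the field order used by Bacula's encode_stat() (st_dev, st_ino, st_mode, st_nlink, st_uid, st_gid, st_rdev, st_size, st_blksize, st_blocks, st_atime, st_mtime, st_ctime) and that negative numbers carry a leading "-". The names LSTAT_FIELDS, decode_b64_int and decode_lstat are made up for this example, and the sample string is invented, not taken from a real catalog.

```python
# Assumed field order, following Bacula's encode_stat(); trailing fields
# (LinkFI and friends) are ignored by this sketch.
LSTAT_FIELDS = (
    "st_dev", "st_ino", "st_mode", "st_nlink", "st_uid", "st_gid",
    "st_rdev", "st_size", "st_blksize", "st_blocks",
    "st_atime", "st_mtime", "st_ctime",
)

B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def decode_b64_int(text):
    """Decode one base64-encoded integer, most significant digit first."""
    negative = text.startswith("-")  # assumption: "-" marks a negative value
    if negative:
        text = text[1:]
    value = 0
    for char in text:
        value = value * 64 + B64.index(char)  # raises ValueError on bad input
    return -value if negative else value


def decode_lstat(lstat):
    """Return a dict mapping the assumed field names to decoded integers."""
    values = [decode_b64_int(part) for part in lstat.split()]
    return dict(zip(LSTAT_FIELDS, values))


# Invented sample value, only to show the call.
sample = "gB Ab IGk B Po Po A BAA BAA I BMe BMf BMg A A C"
fields = decode_lstat(sample)
print(fields["st_mode"], fields["st_uid"], fields["st_gid"], fields["st_size"])
```

Unlike the find() call in the script above, index() fails loudly on a character outside the alphabet, which makes a damaged lstat value easier to spot.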


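Once the output has been redirected to a CSV file, the duplicates can be listed by grouping the rows on the checksum column. This is only a rough sketch: find_duplicates is a made-up helper, the rows in the sample are invented and shortened to the first four columns, and the skip values ("0" and the empty string for entries without a checksum) are an assumption you may need to adjust.

```python
import csv
import io
from collections import defaultdict


def find_duplicates(csv_file, skip=("0", "")):
    """Group full paths by checksum and keep only checksums seen more than once.

    `skip` lists checksum values to ignore (assumed placeholders for entries
    such as directories that carry no real checksum).
    """
    groups = defaultdict(list)
    for row in csv.reader(csv_file, delimiter=";", quotechar='"'):
        checksum, filename, path = row[0], row[1], row[2]
        if checksum in skip:
            continue
        # Assumes the path column ends with "/", so plain concatenation works.
        groups[checksum].append(path + filename)
    return {c: paths for c, paths in groups.items() if len(paths) > 1}


# Invented rows in the script's output format (first four columns only).
sample = io.StringIO(
    '"abc";"a.txt";"/home/u1/";"x"\n'
    '"abc";"copy.txt";"/home/u2/";"x"\n'
    '"def";"b.txt";"/home/u1/";"x"\n'
    '"0";"";"/home/u1/";"x"\n'
)
dups = find_duplicates(sample)
for checksum, paths in dups.items():
    print(checksum, paths)
```

For a real run, replace the StringIO object with the opened CSV file.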



-- 

Bruno Friedmann  bruno AT ioda-net DOT ch

Ioda-Net Sàrl www.ioda-net.ch

  openSUSE Member
    User www.ioda.net/r/osu
    Blog www.ioda.net/r/blog

  fsfe fellowship www.fsfe.org
  (bruno.friedmann (at) fsfe.org )

  tigerfoot on irc

GPG KEY : D5C9B751C4653227
