On 08/30/2010 06:35 PM, Tobias Brink wrote:
> Steve Costaras <stevecs AT chaven DOT com> writes:
>
>> Could be due to a transient error (transmission or wild/torn read at
>> time of calculation). I see this a lot with integrity checking of
>> files here (50TiB of storage).
>>
>> The only way to get around this now is to build a known-good sha1/md5
>> hash of the data (read the file 2-3 times and make sure that all the
>> hashes match and that the file is not corrupted) and save that as a
>> baseline. Then, when a later read/compare fails, re-read the file and
>> compare against the baseline again to see whether the first read was
>> in error. This is one reason why I'm switching to the new generation
>> of SAS drives that have ioecc checks on READS, not just writes, to
>> help cut down on some of this.
>>
>> Corruption does occur as well, and becomes more probable the higher
>> the capacity of the drive. Ideally you would have a drive that does
>> ioecc on reads, plus T10 PI extensions (DIX/DIF) from the drive to
>> the controller and up to your file system layer. That won't always
>> prevent corruption by itself, but if you have a RAID setup it would
>> allow some self-healing when a drive reports a non-transient error
>> (i.e. a corrupted sector of data).
>
> First off, thanks for the answers. The thing is that I am well aware of
> the reliability problems of hard drives and I would love to use some
> advanced file system like ZFS or btrfs, but I am on Debian and I will
> stay on Debian. And btrfs is not mature enough to be used in production
> at the moment. The other thing is that I do not think that this is an
> issue of corruption of the data itself! As I said I checked the files
> against backups and MD5 sums supplied by Debian (several times and from
> cold cache) and the data seems to be OK. The executables that are
> reported by Bacula to have changed continue to work well and bug-free
> just as before.
>
> So I think this is a problem/bug with either the PostgreSQL database or
> Bacula, not with my hard drives. I just wonder how something like this
> could happen and how I could avoid this. I'm also not willing to do
> additional checksums with other programs (AIDE or similar) because they
> take _lots_ of time to run. With Bacula I get the checksums for free.
> I just want to use them to detect corruption on disk from time to time
> and because I use VirtualFull and want to know if my differential
> backups have missed something.
>
> So I still don't know how to proceed. Apart from that I will try to
> upgrade my director and sd to 5.0.2 as soon as Debian backports are
> available and see if the problem goes away. I will also re-run the
> DiskToCatalog after my next differential backup and see if something
> is different.
>
> Thanks,
> Tobias
>
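The baseline-and-re-read procedure quoted above could be sketched roughly as follows. This is only an illustration: the function names (file_digest, stable_digest, check_against_baseline) are made up for this example, and the choice of three reads follows the "2-3 reads" suggestion. Note that repeated reads are usually served from the page cache, so a re-read only exercises the disk if the cache is cold.

```python
import hashlib


def file_digest(path, algo="sha1", chunk_size=1 << 20):
    """Hash one full read of the file, chunk by chunk."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def stable_digest(path, reads=3, algo="sha1"):
    """Read the file several times; return the digest only if all reads agree."""
    digests = {file_digest(path, algo) for _ in range(reads)}
    return digests.pop() if len(digests) == 1 else None


def check_against_baseline(path, baseline, algo="sha1"):
    """Return 'ok', 'transient' (only the first read failed) or 'mismatch'."""
    if file_digest(path, algo) == baseline:
        return "ok"
    # The first read disagreed: re-read once before calling it corruption.
    if file_digest(path, algo) == baseline:
        return "transient"
    return "mismatch"
```

A baseline would be taken once with stable_digest(path) and stored; later runs call check_against_baseline(path, baseline) and only treat "mismatch" as a real problem.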
Tobias, I use this little Python script to extract the information I need to
track down duplicate files
(users are users :-)
I hope it gives you some inspiration and helps you decode the lstat column.
(If I remember correctly, someone has also done the same in PL/pgSQL: check the
list archives.)
#!/usr/bin/python
# -*- coding: utf-8 -*-
#
# Call it with a jobid and redirect the output to a CSV file.
#
import sys

import MySQLdb

B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def base64_decode_lstat(record, position):
    """Decode the base64 number at `position` in the LStat string."""
    val = 0
    for char in record.split(' ')[position]:
        val = val * 64 + B64.find(char)
    return val


jobid = sys.argv[1]

# Adjust localhost, username, passwd, dbname
db = MySQLdb.connect(host="localhost", user="bacula", passwd="bacula",
                     db="bacula")
db.set_character_set('utf8')
cursor = db.cursor()
cursor.execute('SET NAMES utf8;')
cursor.execute('SET CHARACTER SET utf8;')
cursor.execute('SET character_set_connection=utf8;')
# The jobid is passed as a query parameter instead of being formatted
# into the SQL string.
cursor.execute("""
    SELECT File.MD5 AS checksum,
           convert(Filename.Name using utf8) AS filename,
           convert(Path.Path using utf8) AS path,
           File.LStat AS lstat
    FROM File, Filename, Path
    WHERE File.JobId = %s
      AND Filename.FilenameId = File.FilenameId
      AND Path.PathId = File.PathId
    ORDER BY File.MD5, Filename.Name, Path.Path""", (jobid,))
result = cursor.fetchall()

# CSV header row
print('"checksum";"filename";"path";"lstat";"gid";"uid";"bytes";'
      '"blocksize";"blocks_allocated";"atime";"mtime";"ctime"')
for record in result:
    # LStat positions: 4=uid, 5=gid, 7=size, 8=blksize, 9=blocks,
    # 10=atime, 11=mtime, 12=ctime (position 6 is rdev, not uid).
    fields = list(record[:4])
    fields += [base64_decode_lstat(record[3], pos)
               for pos in (5, 4, 7, 8, 9, 10, 11, 12)]
    print(';'.join('"%s"' % field for field in fields))
# empty line at the end
print('')
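To try the lstat decoding without a database connection, a standalone sketch could look like the one below. It rests on two assumptions: that the field order is the one written by Bacula's encode_stat (dev, ino, mode, nlink, uid, gid, rdev, size, blksize, blocks, atime, mtime, ctime, followed by extra fields that are ignored here), and that negative values carry a leading '-'. The names LSTAT_FIELDS, b64_to_int and decode_lstat, as well as the SAMPLE string, are invented for this example and do not come from a real catalog.

```python
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

# Assumed order of the space-separated fields in File.LStat.
LSTAT_FIELDS = ("dev", "ino", "mode", "nlink", "uid", "gid", "rdev",
                "size", "blksize", "blocks", "atime", "mtime", "ctime")

# Hand-built sample value, not taken from a real backup job.
SAMPLE = "P A IGk B Po Po A Bc BAA I BMeOZg BMeOZg BMeOZg A A C"


def b64_to_int(token):
    """Decode one base64-style number as used in the LStat column."""
    negative = token.startswith("-")
    val = 0
    for ch in token.lstrip("-"):
        val = val * 64 + B64.index(ch)  # raises ValueError on bad input
    return -val if negative else val


def decode_lstat(lstat):
    """Map the first 13 LStat fields to their assumed stat names."""
    return {name: b64_to_int(tok)
            for name, tok in zip(LSTAT_FIELDS, lstat.split())}


fields = decode_lstat(SAMPLE)
print(oct(fields["mode"]), fields["uid"], fields["size"])
# prints: 0o100644 1000 92
```

Comparing the decoded size and mtime of the same file between two jobs is one way to see whether only the stored attributes differ or the checksum as well.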
--
Bruno Friedmann bruno AT ioda-net DOT ch
Ioda-Net Sàrl www.ioda-net.ch
openSUSE Member
User www.ioda.net/r/osu
Blog www.ioda.net/r/blog
fsfe fellowship www.fsfe.org
(bruno.friedmann (at) fsfe.org )
tigerfoot on irc
GPG KEY : D5C9B751C4653227
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users