Bacula-users

[Bacula-users] Bacula tape errors?

2010-07-15 09:10:53
Subject: [Bacula-users] Bacula tape errors?
From: Andrei <bacula-debug AT gbif DOT org>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 15 Jul 2010 15:07:31 +0200
Hello,

I'm running some backups and I'm not quite sure what to make of the reports:

10-Jul 23:26 dio-sd JobId 438: Alert: smartctl 5.39.1 2010-01-28 r3054 [x86_64-redhat-linux-gnu] (local build)

10-Jul 23:26 dio-sd JobId 438: Alert: Copyright (C) 2002-10 by Bruce Allen,http://smartmontools.sourceforge.net
10-Jul 23:26 dio-sd JobId 438: Alert:
10-Jul 23:26 dio-sd JobId 438: Alert: TapeAlert: OK
10-Jul 23:26 dio-sd JobId 438: Alert:
10-Jul 23:26 dio-sd JobId 438: Alert: Error Counter logging not supported
10-Jul 23:26 dio-sd JobId 438: Alert:
10-Jul 23:26 dio-sd JobId 438: Alert: Last n error events log page
10-Jul 23:26 dio-sd JobId 438: Alert:   Error event 29556:
10-Jul 23:26 dio-sd JobId 438: Alert:     [binary]:
10-Jul 23:26 dio-sd JobId 438: Alert:  00     63 63 75 72 72 65 6e 63  65 2f 6c 69 73 74 3f 6d
10-Jul 23:26 dio-sd JobId 438: Alert:  10     6f 64 65 3d 72 61 77 26  66 6f 72 6d 61 74 3d 64
10-Jul 23:26 dio-sd JobId 438: Alert:  20     61 72 77 69 6e 26 63 6f  6f 72 64 69 6e 61 74 65
10-Jul 23:26 dio-sd JobId 438: Alert:  30     73 74 61 74 75 73 3d 74  72 75 65 26 68 6f 73 74
10-Jul 23:26 dio-sd JobId 438: Alert:  40     69 73 6f 63 6f 75 6e 74  72 79 63 6f 64 65 3d 4d
10-Jul 23:26 dio-sd JobId 438: Alert:  50     58 26 73 63 69 65 6e 74  00 00 00 00 00 00 00 00
10-Jul 23:26 dio-sd JobId 438: Alert:  60     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00
........
SD Errors: 0
FD termination status: OK
SD termination status: OK
Termination: Backup OK
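
The [binary] payload of the error event above can be turned back into text to see what the drive actually logged. Below is a minimal sketch, assuming the dump layout shown in the report (an offset column followed by up to 16 hex bytes per line); the function name decode_alert_hexdump is hypothetical and not part of Bacula or smartctl. For the dump above, the bytes are printable ASCII that looks like a fragment of a URL query string rather than a documented error code.

```python
import re

# Matches "Alert:  <offset>     <hex bytes...>" as printed in the job report.
# The layout is assumed from the log excerpt above, not from smartctl docs.
HEX_LINE = re.compile(
    r"Alert:\s+[0-9a-fA-F]+\s{2,}((?:[0-9a-fA-F]{2}\s+)*[0-9a-fA-F]{2})\s*$"
)

def decode_alert_hexdump(lines):
    """Collect the hex bytes from the dump lines and return them as text."""
    data = bytearray()
    for line in lines:
        match = HEX_LINE.search(line)
        if match:
            # Remove all whitespace between byte pairs before converting.
            data += bytes.fromhex("".join(match.group(1).split()))
    # Trailing NUL bytes are padding in the dump; drop them.
    return bytes(data).rstrip(b"\x00").decode("ascii", errors="replace")

sample = [
    "10-Jul 23:26 dio-sd JobId 438: Alert:  00     63 63 75 72 72 65 6e 63  65 2f 6c 69 73 74 3f 6d",
    "10-Jul 23:26 dio-sd JobId 438: Alert:  10     6f 64 65 3d 72 61 77 26  66 6f 72 6d 61 74 3d 64",
]
print(decode_alert_hexdump(sample))  # prints: ccurrence/list?mode=raw&format=d
```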

I'm using smartctl as suggested (Alert Command = "sh -c 'smartctl -H -l error %c'").

- some backups give errors like the above, some don't; unrelated to which server is backed up
- some restore jobs give errors like the above, some don't; I could not yet establish a correlation between restore jobs and errors (i.e. whether restoring one specific backup job always gives the same errors, or none at all)
- when they occur, the errors are not identical; a specific job run repeatedly will give different errors or even none at all
- the errors occur regardless of how the jobs are run (concurrently, independently, with or without disk spooling) and regardless of the physical tape cartridge (tried with different tapes)
- the backup job statuses reported by Bacula are all OK in spite of the smartctl errors
- the restore of jobs backed up with the smartctl errors (sometimes they have their own smartctl errors) always ends with Bacula status OK

The Bacula btape tape and changer tests were all successful.

I ran the following test: I restored from a Bacula job that had smartctl errors (the restore job had smartctl errors of its own), then computed the MD5 sums of the restored files and compared them with the sums of the files on the server that was backed up. I did this for 4 different jobs from 4 separate servers (~800,000 files and 40 GB in total), and the only MD5 differences were in a handful of files that had good reasons to differ (logs, bash history and such). So there appear to be no errors in the restored files.
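
The MD5 comparison described above can be sketched roughly as follows. This is only an illustration, assuming the original files and the restored files are both reachable as local directory trees; the function names are hypothetical and the script is not part of Bacula.

```python
import hashlib
import os

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def md5_tree(root):
    """Map each regular file's path (relative to root) to its MD5 digest."""
    sums = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.isfile(full) and not os.path.islink(full):
                sums[os.path.relpath(full, root)] = md5_of_file(full)
    return sums

def compare_trees(original_root, restored_root):
    """Return (changed, missing, extra) relative paths between the two trees."""
    orig = md5_tree(original_root)
    rest = md5_tree(restored_root)
    changed = sorted(p for p in orig.keys() & rest.keys() if orig[p] != rest[p])
    missing = sorted(orig.keys() - rest.keys())   # on the server, not restored
    extra = sorted(rest.keys() - orig.keys())     # restored, gone from server
    return changed, missing, extra
```

Files that legitimately change between backup and comparison (logs, shell history) will show up in the changed list and have to be judged by hand.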

So far I have been unable to match the errors with any of the codes in the tape library documentation.
The TL2000 tape library is brand new.

I'm wondering if this might be a false positive?!

I'm using:
Fedora 13 (x86_64)
Bacula 5.0.2-5 (RPM packages coming with FC13 updates)
PostgreSQL 8.4.4-1 (RPMs same as above)

Dell PowerVault TL2000 (LTO 4)
(the drive is IBM ULT3580-TD4 with the A232 firmware)
IBM LTO 4 tapes


Thank you!

Best regards,
Andrei Cenja

