Bacula-users

Re: [Bacula-users] bacula-sd crashing

2008-08-06 04:04:43
Subject: Re: [Bacula-users] bacula-sd crashing
From: Ronald Buder <rbuder AT proficom-ag DOT de>
To: Daniel Harper <daniel AT srcentre.com DOT au>
Date: Wed, 6 Aug 2008 10:04:07 +0200 (CEST)
Hi,

I'm not entirely sure, but this could be related to a known 2.4.1 sd
bug (http://bugs.bacula.org/view.php?id=1125).

You might want to upgrade to 2.4.2.

Interestingly enough, we haven't had a crash yet today, even though we
had quite a bit of trouble on the last two Wednesdays (seriously...).
We have been running 2.4.2 since Monday and have had one crash so far.

We're currently running bacula-sd in debug mode as the root user so
that it has full access rights to everything. Might be helpful for you,
too.

/usr/sbin/bacula-sd -c /etc/bacula/bacula-sd.conf -f -d 100 -u root -g root
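
If the debug output should survive a crash, one option is to wrap the
command in a small script that timestamps every line and appends it to a
file. The following is only a minimal sketch, assuming Python 3.7 or
later on the storage host; the function names and the log path are made
up for illustration, and only the command line itself comes from this
message.

```python
import subprocess
import sys
from datetime import datetime

# Command line taken from the message above.
SD_COMMAND = [
    "/usr/sbin/bacula-sd", "-c", "/etc/bacula/bacula-sd.conf",
    "-f", "-d", "100", "-u", "root", "-g", "root",
]


def run_with_timestamps(command, sink):
    """Run command and copy its output to sink, one timestamped line at a time.

    Returns the exit status of the command.
    """
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same stream
        text=True,
        errors="replace",
    )
    for line in process.stdout:
        stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        sink.write(f"{stamp} {line.rstrip()}\n")
        sink.flush()  # keep the file current in case the daemon dies
    return process.wait()


def capture_sd_debug(log_path="/var/tmp/bacula-sd-debug.log"):
    # log_path is a hypothetical location; pick one with enough free space.
    with open(log_path, "a") as log:
        return run_with_timestamps(SD_COMMAND, log)
```

Calling capture_sd_debug() as root would then block for as long as the
daemon runs, and the last lines of the file show what it printed before
it went away.
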

Give it a try, might at least give you an idea of what's happening
before the crash.
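
To line the daemon output up with the kernel messages quoted below, it
can help to pull just the tape sense errors out of /var/log/messages.
This is a rough sketch that assumes the syslog line format shown in the
quoted excerpt (timestamp, host, "kernel:", then the st message); the
function names are hypothetical.

```python
import re

# Matches lines such as:
#   Aug 6 01:59:29 rimu kernel: st0: Current: sense key: Medium Error
SENSE_KEY = re.compile(
    r"(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+"
    r"(?P<dev>st\d+): Current: sense key: (?P<key>.+?)\s*$"
)
# Matches the follow-up line:
#   Aug 6 01:59:29 rimu kernel: Add. Sense: Excessive write errors
ADD_SENSE = re.compile(r"kernel:\s+Add\. Sense: (?P<detail>.+?)\s*$")


def tape_sense_events(lines):
    """Return (timestamp, device, sense key, additional sense) tuples."""
    events = []
    pending = None  # sense-key line still waiting for its "Add. Sense" line
    for line in lines:
        key_match = SENSE_KEY.search(line)
        if key_match:
            if pending:
                # Previous sense key never got a detail line; keep it anyway.
                events.append(pending + ("",))
            pending = (key_match["stamp"], key_match["dev"], key_match["key"])
            continue
        detail_match = ADD_SENSE.search(line)
        if detail_match and pending:
            events.append(pending + (detail_match["detail"],))
            pending = None
    if pending:
        events.append(pending + ("",))
    return events


def print_tape_events(path="/var/log/messages"):
    # Reading this file usually requires root.
    with open(path, errors="replace") as log:
        for event in tape_sense_events(log):
            print(" | ".join(event))
```

The timestamps it prints can then be compared with the times of the
Bacula write errors in the job log.
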

Ronald

On Wed 06.08.2008 02:38, Daniel Harper <daniel AT srcentre.com DOT au> wrote:

>I am having some problems with bacula and any help would be
>appreciated.
>
>I am running version 2.4.1 on a Dell 2900 with CentOS 5.2 and a
>TANDBERG TS400 (LTO2).
>
>Firstly I get the following error from bacula ...
>
>06-Aug 01:05 vic-dir JobId 109: Start Backup JobId 109, Job=rimu.2008-08-06_01.05.33
>06-Aug 01:05 vic-dir JobId 109: Using Device "TS400"
>06-Aug 01:05 vic-sd JobId 109: Wrote label to prelabeled Volume "v-monthly-001" on device "TS400" (/dev/nst0)
>06-Aug 01:59 vic-sd JobId 109: Error: block.c:568 Write error at 24:12836 on device "TS400" (/dev/nst0). ERR=Input/output error.
>06-Aug 01:59 vic-sd JobId 109: Error: Error writing final EOF to tape. This Volume may not be readable.
>dev.c:1681 ioctl MTWEOF error on "TS400" (/dev/nst0). ERR=Input/output error.
>06-Aug 01:59 vic-sd JobId 109: End of medium on Volume "v-monthly-001" Bytes=24,826,540,032 Blocks=384,835 at 06-Aug-2008 01:59.
>06-Aug 09:40 rimu-fd JobId 109: Fatal error: backup.c:892 Network send error to SD. ERR=Input/output error
>06-Aug 09:40 vic-sd JobId 109: Job rimu.2008-08-06_01.05.33 marked to be canceled.
>
>The tape isn't full. It was brand new when it was inserted last night,
>and I also inserted a cleaning tape before the new tape.
>
>In /var/log/messages, at the same time bacula gives errors I get ...
>
>Aug 6 01:59:29 rimu kernel: st0: Current: sense key: Medium Error
>Aug 6 01:59:29 rimu kernel: Add. Sense: Excessive write errors
>Aug 6 01:59:29 rimu kernel:
>Aug 6 01:59:29 rimu kernel: Info fld=0xfc00
>Aug 6 01:59:29 rimu kernel: st0: Current: sense key: Medium Error
>Aug 6 01:59:29 rimu kernel: Add. Sense: Excessive write errors
>Aug 6 01:59:29 rimu kernel:
>Aug 6 01:59:29 rimu kernel: Info fld=0x1
>
>And a bit later:
>Aug 6 05:52:50 rimu kernel: mptscsih: ioc0: attempting task abort! (sc=f4479ac0)
>Aug 6 05:52:50 rimu kernel: st 2:0:6:0:
>Aug 6 05:52:50 rimu kernel: command: Rezero Unit/Rewind: 01 00 00 00 00 00
>Aug 6 05:52:50 rimu kernel: mptscsih: ioc0: task abort: FAILED (sc=f4479ac0)
>Aug 6 05:52:50 rimu kernel: mptscsih: ioc0: attempting target reset! (sc=f4479ac0)
>Aug 6 05:52:50 rimu kernel: st 2:0:6:0:
>Aug 6 05:52:50 rimu kernel: command: Rezero Unit/Rewind: 01 00 00 00 00 00
>Aug 6 05:52:51 rimu kernel: mptscsih: ioc0: target reset: SUCCESS (sc=f4479ac0)
>Aug 6 05:52:51 rimu kernel: st0: Current: sense key: Not Ready
>Aug 6 05:52:51 rimu kernel: Add. Sense: Logical unit not ready, cause not reportable
>Aug 6 05:52:51 rimu kernel:
>Aug 6 05:53:01 rimu kernel: mptscsih: ioc0: attempting task abort! (sc=f4479ac0)
>Aug 6 05:53:01 rimu kernel: st 2:0:6:0:
>Aug 6 05:53:01 rimu kernel: command: Prevent/Allow Medium Removal: 1e 00 00 00 01 00
>Aug 6 05:53:01 rimu kernel: mptscsih: ioc0: task abort: FAILED (sc=f4479ac0)
>Aug 6 05:53:01 rimu kernel: mptscsih: ioc0: attempting target reset! (sc=f4479ac0)
>Aug 6 05:53:01 rimu kernel: st 2:0:6:0:
>
>Running the status command against the storage daemon hangs. When I
>then stop or kill bacula, I am left with a defunct bacula-sd process
>that locks the tape drive:
>bacula 3592 1 0 Jul30 ? 00:10:42 [bacula-sd] <defunct>
>
>I then have to reboot the server to get the tape and bacula working
>again.
>
>We also have an identical setup at another site, where the same thing
>was occurring until I upgraded to 2.4.1. But this server, also on 2.4.1,
>is still having the problem every couple of weeks.
>
>Cheers,
>
>Daniel

>-------------------------------------------------------------------------
>This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
>Build the coolest Linux based applications with Moblin SDK & win great prizes
>Grand prize is a trip for two to an Open Source event anywhere in the world
>http://moblin-contest.org/redirect.php?banner_id=100&url=/
>_______________________________________________
>Bacula-users mailing list
>Bacula-users AT lists.sourceforge DOT net
>https://lists.sourceforge.net/lists/listinfo/bacula-users


Kind regards

Ronald Buder
Dipl-Ing.(BA)

Profi.Com AG
Dresden Office
Stresemannplatz 3
D-01309 Dresden
mail: rbuder AT proficom-ag DOT de

