[Bacula-users] sd died after Device /dev/... - not ready, retrying

2008-10-07 03:58:32
Subject: [Bacula-users] sd died after Device /dev/... - not ready, retrying
From: Ralf Gross <Ralf-Lists AT ralfgross DOT de>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 7 Oct 2008 09:54:56 +0200
Hi,

Last night I was hit by an mtx/drive problem:

20081007-00:02:22 Doing mtx -f /dev/Neo4100 load 96 2
20081007-00:02:22 Device /dev/ULTRIUM-TD4-D3 - not ready, retrying...
20081007-00:02:23 Device /dev/ULTRIUM-TD4-D3 - not ready, retrying...
[...]
20081007-00:07:35 Parms: /dev/Neo4100 loaded 96 /dev/ULTRIUM-TD4-D3 2
20081007-00:07:35 Doing mtx -f /dev/Neo4100 2 -- to find what is loaded
20081007-00:07:35 Parms: /dev/Neo4100 load 96 /dev/ULTRIUM-TD4-D3 2
20081007-00:07:35 Doing mtx -f /dev/Neo4100 load 96 2
20081007-00:07:35 Device /dev/ULTRIUM-TD4-D3 - not ready, retrying...
[...]
20081007-00:12:34 Device /dev/ULTRIUM-TD4-D3 - not ready, retrying...
20081007-00:12:36 Parms: /dev/Neo4100 loaded 96 /dev/ULTRIUM-TD4-D3 2
20081007-00:12:36 Doing mtx -f /dev/Neo4100 2 -- to find what is loaded
20081007-00:12:36 Parms: /dev/Neo4100 load 96 /dev/ULTRIUM-TD4-D3 2
20081007-00:12:36 Doing mtx -f /dev/Neo4100 load 96 2
20081007-00:12:37 Device /dev/ULTRIUM-TD4-D3 - not ready, retrying...
[...]
20081007-00:17:35 Device /dev/ULTRIUM-TD4-D3 - not ready, retrying...
20081007-00:17:37 Parms: /dev/Neo4100 loaded 111 /dev/ULTRIUM-TD4-D3 2
20081007-00:17:37 Doing mtx -f /dev/Neo4100 2 -- to find what is loaded
20081007-00:17:37 Parms: /dev/Neo4100 load 111 /dev/ULTRIUM-TD4-D3 2
20081007-00:17:37 Doing mtx -f /dev/Neo4100 load 111 2
20081007-00:18:14 Parms: /dev/Neo4100 loaded 111 /dev/ULTRIUM-TD4-D3 2
20081007-00:18:14 Doing mtx -f /dev/Neo4100 2 -- to find what is loaded
20081007-00:18:18 Parms: /dev/Neo4100 unload 111 /dev/ULTRIUM-TD4-D3 2
20081007-00:18:18 Doing mtx -f /dev/Neo4100 unload 111 2


Then the sd died:

07-Okt 00:19 VU0EA003-sd: ABORTING due to ERROR in dev.c:724
dev.c:723 Bad call to rewind. Device "ULTRIUM-TD4-D3" (/dev/ULTRIUM-TD4-D3) not open
Kaboom! bacula-sd, VU0EA003-sd got signal 11 - Segmentation violation.
Attempting traceback.
Kaboom! exepath=/usr/sbin/
Calling: /usr/sbin/btraceback /usr/sbin/bacula-sd 15802


http://www.bacula.org/en/dev-manual/What_Do_When_Bacula.html

gdb is installed, but bacula-sd is not running as root; maybe that is
the reason why I got no traceback by mail.


Anyway, I've seen this 'not ready, retrying...' problem only once
before, 5 months ago. There is nothing in the system logs or the
changer logfile when it happens.

Any ideas what I have to do to prevent bacula from crashing at that point?

I've changed the mtx-changer script to wait a bit longer:

wait_for_drive() {
  i=0
  while [ $i -le 1000 ]; do  # Wait max 1000 seconds (one 1-second sleep per pass)
    if mt -f "$1" status | grep "${ready}" >/dev/null 2>&1; then
      break  # drive reports ready
    fi
    debug "Device $1 - not ready, retrying..."
    sleep 1
    i=`expr $i + 1`
  done
}
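
A possible refinement, sketched under assumptions and not tested against this
changer: as written, wait_for_drive returns success both when the drive
becomes ready and when the loop simply runs out, so the caller cannot tell
the two cases apart. The variant below returns non-zero on timeout. The names
wait_for_drive_or_fail, drive_status, STATUS_CMD, MAX_WAIT and POLL_INTERVAL
are hypothetical and not part of the stock mtx-changer script, and the ONLINE
default for ready is only a placeholder for whatever the script already uses.

```shell
#!/bin/sh
# Sketch only: every name below except "ready" is hypothetical.
ready="${ready:-ONLINE}"                  # placeholder: text that marks a ready drive
MAX_WAIT="${MAX_WAIT:-1000}"              # give up after this many seconds
POLL_INTERVAL="${POLL_INTERVAL:-1}"       # seconds between status checks
STATUS_CMD="${STATUS_CMD:-drive_status}"  # command that prints the drive status

# Default status query: same mt call as in wait_for_drive.
drive_status() {
  mt -f "$1" status
}

# Returns 0 once the drive reports ready, 1 if MAX_WAIT seconds pass first.
wait_for_drive_or_fail() {
  waited=0
  while [ "$waited" -lt "$MAX_WAIT" ]; do
    if $STATUS_CMD "$1" 2>/dev/null | grep "${ready}" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$POLL_INTERVAL"
    waited=$((waited + POLL_INTERVAL))
  done
  return 1
}
```

A caller could then abort the load when the function fails, for example with
wait_for_drive_or_fail followed by "|| exit 1" and the device path as the
argument. Whether bacula-sd handles such a failed load without crashing is
not something this sketch establishes.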

I've no idea what the drive was doing during those 15 minutes last night...

Ralf
