[Bacula-users] btape fill test fails ioctl MTWEOF and final unload

2009-03-18 12:12:50
Subject: [Bacula-users] btape fill test fails ioctl MTWEOF and final unload
From: "(private) HKS" <hks.private AT gmail DOT com>
To: "bacula-users AT lists.sourceforge DOT net" <Bacula-users AT lists.sourceforge DOT net>
Date: Wed, 18 Mar 2009 11:59:36 -0400

OpenBSD 4.4, Bacula 2.2.8, Dell Powervault 124T autochanger.

In the continuing saga of making my autochanger work with Bacula, I've
finally got my mtx-changer script squared away (coming soon) and both
drive and autochanger tests work fine. The last obstacles in my way
right now are a pair of errors when I run the "fill" test on multiple
tapes.

First is the ioctl MTWEOF error after all the data is written to the first tape:
----
17-Mar 15:06 btape JobId 0: Error: block.c:569 Write error at
617:14776 on device "124T-Drive" (/dev/nrst0). ERR=Input/output error.
17-Mar 15:06 btape JobId 0: Error: Error writing final EOF to tape.
This Volume may not be readable. dev.c:1669 ioctl MTWEOF error on
"124T-Drive" (/dev/nrst0). ERR=Input/output error.
----

Even though this fails, the read-back works fine:
----
15:10:55 Done filling tapes at 0:13. Now beginning re-read of first tape ...
<snip>
The last block of the first tape matches.
----

I've updated the firmware, cleaned the drive, and tested multiple tapes.
Since the read-back seems to work anyhow, is this something I even
need to be concerned about?



The second error is when it tries to unload the first tape to read the
second (ignore the slot # weirdness):
----
17-Mar 16:43 btape JobId 0: 3307 Issuing autochanger "unload slot 13,
drive 0" command.
17-Mar 16:43 btape JobId 0: 3995 Bad autochanger "unload slot 13,
drive 0": ERR=Child died from signal 9: Kill, unblockable
Results=Program killed by Bacula watchdog (timeout)
----

I can't figure out why this is being killed, and so quickly. This is
the /only/ unload that is failing in these tests - there are at least
two others in the "fill" test that work correctly, and the autochanger
test works flawlessly. According to my debug log, the script is killed
while (or perhaps immediately after?) trying to take the drive offline
(mt offline).
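
One way to pin down exactly where the time goes might be to wrap the
offline step in a small timing helper inside the changer script. This is
only a sketch: the function name `timed` and the log path in `DEBUG_LOG`
are hypothetical and not part of the stock mtx-changer. It logs once
before and once after the command, so a kill in between shows up as a
"start" line with no matching "done" line.

```shell
#!/bin/sh
# Hypothetical helper, not part of the stock mtx-changer script.
# DEBUG_LOG is an assumed location; point it wherever the debug log lives.
DEBUG_LOG="${DEBUG_LOG:-/tmp/mtx-changer-timing.log}"

timed() {
    # Log before running, so a watchdog kill mid-command is still visible.
    echo "$(date) start cmd='$*'" >> "$DEBUG_LOG"
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    # Log the exit status and wall-clock duration in seconds.
    echo "$(date) done  cmd='$*' rc=$rc elapsed=$((end - start))s" >> "$DEBUG_LOG"
    return "$rc"
}

# Possible use in the unload branch (device name taken from the config below):
#   timed mt -f /dev/nrst0 offline
```

Comparing the "start" timestamp against the time of the 3995 message
should show how long the script actually ran before it was killed.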

Why would this be timing out in less than a minute? "Maximum Changer
Wait" is set to 300 on the Device resource. Is there another timeout
value that needs to be set? Or does this need to be set on the
Autochanger resource?
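
To double-check which resource each wait setting actually lives in, a
quick listing like the following could help. It is a rough sketch that
assumes the config is laid out the way the file below is (resource name
and opening brace on one line, closing brace at the start of a line);
`list_waits` is a hypothetical name and the config path in the usage
comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print every directive containing "wait",
# prefixed with the name of the resource block it appears in.
list_waits() {
    awk '
        # "Device {" style line: remember the resource name.
        /^[A-Za-z][A-Za-z ]*[{]/ { sub(/[ \t]*[{].*/, ""); res = $0; next }
        # Closing brace at column one ends the resource.
        /^[}]/                   { res = ""; next }
        # Any directive mentioning "wait", case-insensitively.
        tolower($0) ~ /wait/     { sub(/^[ \t]+/, ""); print res ": " $0 }
    ' "$1"
}

# Possible use (path is a placeholder):
#   list_waits /path/to/bacula-sd.conf
```

This only reports what is written in the file; it says nothing about
which timeout the watchdog actually applies to the unload.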



Thanks for any help.

-HKS



bacula-sd.conf:
-----
Storage {                             # definition of myself
  Name = bacula-sd
  SDPort = 9103                  # Director's port
  WorkingDirectory = "/bacula"
  Pid Directory = "/var/run"
  Maximum Concurrent Jobs = 20
}

Director {
  Name = bacula-dir
  Password = "<munged>"
}

# Dell Powervault 124T autochanger
Autochanger {
  Name = 124T-Autochanger
  Device = 124T-Drive
  Changer Command = "/usr/local/libexec/bacula/mtx-changer %c %o %S %a %d"
  Changer Device = /dev/ch0
}

Device {
  Name = 124T-Drive
  Drive Index = 0
  Media Type = 124T
  Archive Device = /dev/nrst0
  AutomaticMount = yes;               # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Maximum Open Wait = 300
  Maximum Changer Wait = 300

  # requested by btape
  Hardware End of Medium = No
  Fast Forward Space File = No
  BSF at EOM = yes
}
-----
