Re: [Bacula-users] multiple mtx-changer scripts

2008-07-22 12:25:57
From: Mark Nienberg <gmane AT tippingmar DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 22 Jul 2008 09:25:36 -0700
Timo Neuvonen wrote:

>> P.S. Why?  Because I have modified the "load" command to use "btape" to
>> read the bacula label on the volume after loading it.  If it fails, then it
>> unloads and reloads the tape and tries again. This is an attempt to
>> automatically fix the I/O problem that happens about 10-20% of the times
>> bacula changes a tape in the middle of a backup.  Note that it is not a
>> timing problem.  It happens even after "wait_for_drive" returns ready.
>> Obviously, when working at bconsole, there are some commands that I have to
>> perform on tapes that have no labels, so I need a different "load" command.
>> But don't get bogged down thinking about all this, just think about the
>> question above, OK?
>>
> 
> Could you tell something more about this problem, what hardware you have etc?
> 
> I have experienced some, possibly similar problems with Exabyte 1U 10 tape 
> autochanger. No way to reproduce the problem, but happens quite often when 
> tapes need to be changed in real work :-(

Yes, I have an Exabyte 1U 10 tape changer with a VXA 320 drive in it.  The
problem looks like this:

19-Jul 03:32 khyber-sd JobId 2235: 3305 Autochanger "load slot 6, drive 0", status is OK.
19-Jul 03:32 khyber-sd JobId 2235: 3301 Issuing autochanger "loaded? drive 0" command.
19-Jul 03:32 khyber-sd JobId 2235: 3302 Autochanger "loaded? drive 0", result is Slot 6.
19-Jul 03:32 khyber-sd JobId 2235: Error: block.c:995 Read error on fd=3 at file:blk 0:0 on device "VXA3drive" (/dev/nst0). ERR=Input/output error.
19-Jul 03:32 khyber-sd JobId 2235: Error: block.c:995 Read error on fd=3 at file:blk 0:0 on device "VXA3drive" (/dev/nst0). ERR=Input/output error.

When I first got the changer I saw other errors too, but those were all
related to timing problems with mtx-changer not waiting long enough
(specifically the wait_for_drive subroutine).  I have since resolved all of
that by tweaking mtx-changer and the Maximum Changer Wait setting in the
Device resource.  But the error above still happens.  As you say, it is not
reproducible.  Generally I can use bconsole to unmount and then mount the
tape again and it will proceed.  But this is putting a strain on my Saturday
mornings, since my full tape backup runs Friday night!

Since I have never been able to find the cause of the problem I have decided
instead to try to work around it as I described in my initial message.  In
effect, I want the mtx-changer script to do what I have to do manually
(unmount and mount again).
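
As a rough illustration of that idea (not the actual modified script), the
retry logic could look something like the sketch below.  The function names,
CHANGER, MAX_TRIES and VERIFY_CMD are all hypothetical, and the label check
is left as a placeholder command rather than a real btape invocation.  Only
the "mtx -f <device> load/unload <slot> <drive>" calls are standard mtx usage.

```shell
#!/bin/sh
# Sketch only: retry a tape load until the volume label can be read.
# CHANGER, MAX_TRIES, VERIFY_CMD and all function names are hypothetical.

CHANGER=${CHANGER:-/dev/sg0}     # assumed changer control device
MAX_TRIES=${MAX_TRIES:-3}        # assumed retry limit
VERIFY_CMD=${VERIFY_CMD:-true}   # placeholder: must exit 0 if the label is readable

# Thin wrappers around mtx; arguments are slot number and drive number.
do_load()   { mtx -f "$CHANGER" load "$1" "$2"; }
do_unload() { mtx -f "$CHANGER" unload "$1" "$2"; }

# Placeholder for the label check (for example, a btape-based read).
verify_label() { sh -c "$VERIFY_CMD"; }

# Load the tape, check the label, and unload/reload on failure.
# Returns 0 once a load is verified, 1 after MAX_TRIES failed attempts.
load_with_verify() {
    slot=$1
    drive=$2
    try=1
    while [ "$try" -le "$MAX_TRIES" ]; do
        if do_load "$slot" "$drive" && verify_label "$drive"; then
            return 0
        fi
        # Load or label read failed: put the tape back and try again.
        do_unload "$slot" "$drive" || :
        try=$((try + 1))
    done
    return 1
}
```

A call such as "load_with_verify 6 0" would then stand in for the plain load
step.  A real script would still need to wait for the drive to become ready
before attempting the label check.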

I do daily backups to disk, so my tape backups only happen once a week.
Since the problem doesn't happen every time, it can be a month or so before
I have real-world results showing how effective my revisions are.  So if you
are interested in testing my script, let me know.

Mark

