Bacula-users

Re: [Bacula-users] Tape Jobs failing since upgrade to 2.4.0

2008-08-06 12:24:39
From: Kern Sibbald <kern AT sibbald DOT com>
To: "Mingus Dew" <shon.stephens AT gmail DOT com>
Date: Wed, 6 Aug 2008 18:23:58 +0200
On Wednesday 06 August 2008 16:39:03 Mingus Dew wrote:
> Yes,
>     The bacula-users list did solve my problem very well. John Drescher
> recommended an upgrade to 2.4.2, which I performed and tested. It seems to
> have solved both my problems.

Thanks for the feedback.

Sorry for the problems we had in 2.4.0 and 2.4.1. It was the first really
major change in the SD acquire/mount algorithm in at least five years, so some
of the finer points escaped me and the regression tests.  Hopefully
everything is covered in 2.4.2, though I am still worried about a few of the
more esoteric options, such as polling, as they are not automatically tested.

Best regards,

Kern

>
> Thank you,
> Shon
>
> On Mon, Aug 4, 2008 at 2:48 PM, Kern Sibbald <kern AT sibbald DOT com> wrote:
> > I haven't looked at your problems, since the bacula-users list does that
> > very well, but I can recommend two things:
> >
> > 1. Upgrade right away to version 2.4.2
> >
> > 2. If you ever modified your mtx-changer script, all those changes are
> > lost during an upgrade, so you will need to recover your old script or
> > re-make the same changes.  This problem is fixed in the development code
> > for the next version.
> >
> > Regards,
> >
> > Kern
> >
> > On Monday 04 August 2008 18:35:12 Mingus Dew wrote:
> > > All,
> > >
> > >      I have been having problems with my tape jobs failing since
> > > upgrading to 2.4.0. I am running Bacula 2.4.0, compiled from source, on
> > > Solaris 10_x86 platform. My tape drive is a SCSI attached Exabyte Magnum
> > > 224 Autoloader with LTO-3 drive. Previously I rarely had issues with tape
> > > jobs beyond the occasional volume replacement.
> > >
> > >      Initially I thought I might need to increase the timeout for drive
> > > responses. I did so in mtx-changer script. However, I am not sure this
> > > is the problem. I am having multiple issues now. The first is problems
> > > like this:
> > >
> > > 02-Aug 19:03 adm8 JobId 9820: Fatal error: job.c:1811 Bad response to
> > > Append Data command. Wanted 3000 OK data , got 3903 Error append data
> > >
> > > mt-back4.storage JobId 9820: Fatal error: 3992 Bad autochanger "load
> > > slot 13, drive 0": ERR=Child died from signal 15: Terminated.
> > > Results=mtx: Request Sense: Long Report=yes
> > > mtx: Request Sense: Valid Residual=no
> > > mtx: Request Sense: Error Code=0 (Unknown?!)
> > > mtx: Request Sense: Sense Key=No Sense
> > > mtx: Request Sense: FileMark=no
> > > mtx: Request Sense: EOM=no
> > > mtx: Request Sense: ILI=no
> > > mtx: Request Sense: Additional Sense Code = 00
> > > mtx: Request Sense: Additional Sense Qualifier = 00
> > > mtx: Request Sense: BPV=no
> > > mtx: Request Sense: Error in CDB=no
> > > mtx: Request Sense: SKSV=no
> > > MOVE MEDIUM from Element Address 13 to 81 Failed Program killed by
> > > Bacula watchdog (timeout)
> > >
> > >      First, I'm not sure why this would start timing out now, when it's
> > > been running properly for over a year. Secondly, I know that Element
> > > Address 13 is the slot that Bacula wanted to load, but I don't know what
> > > destination 81 is. Usually the tape drive is Data Element 0. I've run
> > > mtx manually to load and unload tapes, run mt to check tape status, and
> > > have done all the testing in the manual that I did when initially
> > > configuring the drive. All these tests performed as expected. I also
> > > recorded the response time of the drive for
> > > loading/forwarding/rewinding/unloading tapes, and made sure that
> > > mtx-changer was configured to account for these times with a margin for
> > > error.
> > >
> > >      I am also now experiencing problems with Bacula being able to pick
> > > tapes from Pools correctly. The job is waiting for an "Appendable
> > > Volume":
> > >
> > > 04-Aug 11:40 mt-back4.storage JobId 9877: Job
> > > Oracle_Weekly_Tape.2008-08-03_08.00.05 waiting. Cannot find any
> > > appendable volumes.
> > > Please use the "label"  command to create a new Volume for:
> > >     Storage:      "Ultrium-TD3" (/dev/rmt/0cbn)
> > >     Pool:         Oracle_Tapes
> > >     Media type:   LTO-3
> > >
> > >      However, when I query for what tapes Bacula thinks are in the
> > > Changer (and they are in the changer), I see that the Oracle_Tapes pool
> > > has 3 volumes that are in an "Append" status...
> > >
> > > Choose a query (1-16): 15
> >
> > > +---------+------------+-----------+-------------+------+--------------+-----------+-----------+
> > > | MediaId | VolumeName | GB        | Storage     | Slot | Pool         | MediaType | VolStatus |
> > > +---------+------------+-----------+-------------+------+--------------+-----------+-----------+
> > > |       1 | A00001     | 0.00      | Exabyte_224 |    1 | Full_Tapes   | LTO-3     | Recycle   |
> > > |       2 | A00002     | 0.00      | Exabyte_224 |    2 | Incr_Tapes   | LTO-3     | Recycle   |
> > > |       3 | A00003     | 1067.91   | Exabyte_224 |    3 | Incr_Tapes   | LTO-3     | Full      |
> > > |       4 | A00004     | 427.60    | Exabyte_224 |    4 | Dump_Tapes   | LTO-3     | Full      |
> > > |       5 | A00005     | 626.40    | Exabyte_224 |    5 | Oracle_Tapes | LTO-3     | Full      |
> > > |       6 | A00006     | 0.00      | Exabyte_224 |    6 | Full_Tapes   | LTO-3     | Recycle   |
> > > |       7 | A00007     | 735.95    | Exabyte_224 |    7 | Full_Tapes   | LTO-3     | Append    |
> > > |       8 | A00008     | 57.18     | Exabyte_224 |    8 | Dump_Tapes   | LTO-3     | Append    |
> > > |       9 | A00009     | 0.00      | Exabyte_224 |    9 | Full_Tapes   | LTO-3     | Recycle   |
> > > |      10 | A00010     | 0.00      | Exabyte_224 |   10 | Full_Tapes   | LTO-3     | Recycle   |
> > > |      11 | A00011     | 972.31    | Exabyte_224 |   11 | Full_Tapes   | LTO-3     | Full      |
> > > |      12 | A00012     | 1055.86   | Exabyte_224 |   12 | Incr_Tapes   | LTO-3     | Full      |
> > > |      95 | B00001     | 297.14    | Exabyte_224 |   13 | Diff_Tapes   | LTO-3     | Append    |
> > > |      94 | B00002     | 551.02    | Exabyte_224 |   14 | Incr_Tapes   | LTO-3     | Append    |
> > > |      98 | B00003     | 499.80    | Exabyte_224 |   15 | Diff_Tapes   | LTO-3     | Full      |
> > > |      97 | B00004     | 413.32    | Exabyte_224 |   16 | Diff_Tapes   | LTO-3     | Full      |
> > > |      96 | B00005     | 0.00      | Exabyte_224 |   17 | Diff_Tapes   | LTO-3     | Recycle   |
> > > |     101 | B00006     | 0.00      | Exabyte_224 |   18 | Oracle_Tapes | LTO-3     | Append    |
> > > |     100 | B00007     | 0.00      | Exabyte_224 |   19 | Oracle_Tapes | LTO-3     | Append    |
> > > |      99 | B00008     | 36.32     | Exabyte_224 |   20 | Oracle_Tapes | LTO-3     | Append    |
> > > +---------+------------+-----------+-------------+------+--------------+-----------+-----------+
> > >
> > >
> > >      I have absolutely no idea why this is happening and any help or
> > > advice is very much appreciated.
> > >
> > > Thank you,
> > > Shon
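
Editorial aside, not part of the original thread: the check Shon did by eye on the "query 15" output can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The helper names parse_table and appendable_volumes are hypothetical, and the sample text is a subset of the rows quoted in the original message. It confirms what the catalog listing shows; it does not explain why the Storage daemon refused those volumes in 2.4.0.

```python
# A minimal sketch: parse a pipe-delimited bconsole-style table and list the
# volumes of one pool whose VolStatus is "Append".
# SAMPLE is a subset of the rows quoted in the thread; the function names
# are hypothetical and exist only for this illustration.

SAMPLE = """
+---------+------------+---------+-------------+------+--------------+-----------+-----------+
| MediaId | VolumeName | GB      | Storage     | Slot | Pool         | MediaType | VolStatus |
+---------+------------+---------+-------------+------+--------------+-----------+-----------+
|       5 | A00005     | 626.40  | Exabyte_224 |    5 | Oracle_Tapes | LTO-3     | Full      |
|       7 | A00007     | 735.95  | Exabyte_224 |    7 | Full_Tapes   | LTO-3     | Append    |
|     101 | B00006     | 0.00    | Exabyte_224 |   18 | Oracle_Tapes | LTO-3     | Append    |
|     100 | B00007     | 0.00    | Exabyte_224 |   19 | Oracle_Tapes | LTO-3     | Append    |
|      99 | B00008     | 36.32   | Exabyte_224 |   20 | Oracle_Tapes | LTO-3     | Append    |
+---------+------------+---------+-------------+------+--------------+-----------+-----------+
"""


def parse_table(text):
    """Return one dict per data row, keyed by the header cells."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and +----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first pipe-delimited line is the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def appendable_volumes(rows, pool):
    """Names of volumes in `pool` whose status is exactly 'Append'."""
    return [
        row["VolumeName"]
        for row in rows
        if row["Pool"] == pool and row["VolStatus"] == "Append"
    ]


if __name__ == "__main__":
    print(appendable_volumes(parse_table(SAMPLE), "Oracle_Tapes"))
```

For the sample rows this lists the three Oracle_Tapes volumes in Append status, which is the mismatch with the "Cannot find any appendable volumes" message that the upgrade to 2.4.2 reportedly resolved.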


