Hi,
04.05.2008 14:18, Erik Persson wrote:
> On 4 maj 2008, at 04.40, John Drescher wrote:
>
>> On Sat, May 3, 2008 at 10:31 AM, Erik Persson <erik AT lysator.liu DOT se>
>> wrote:
>>> Hello!
>>>
>>> This would be my first post to this list. We have deployed three
>>> bacula installations at the company I work for and it has been working
>>> fairly well except for one thing:
>>>
>>> Whenever we are running a backup job that causes the tape library to
>>> run out of tapes we usually get sd crashes while labelling (or
>>> possibly even just inventorying) new tapes. Sequence as follows:
>>>
>>> * Bacula wants to mount a tape for pool foo
>>> * unmount is performed
>>> * Magazine is ejected
>>> * New unlabeled tapes are loaded
>>> * Labeling is requested
>>> * bacula-sd dies
That would be a bug.
>>>
>>> This happens pretty much on every attempt. I am unsure whether it also
>>> happens if just an update slots on appendable/purged media is performed.
>>>
>>> We have not seen any crashes while labeling or scanning if bacula is
>>> otherwise idle so defining the jobs so that they are guaranteed not to
>>> run out of tapes does kind of take care of the problem.
>>>
>>> A sysadmin friend who has been dealing with mtx (with bacula and other
>>> backup software) told me that it may be a bit lacking when it comes to
>>> error handling
I kind of agree... mtx simply returns some sort of dump of the SCSI
error data, which is definitely not easily human readable.
>>> and his theory was that the sd may get confused if mtx
>>> returns something bad.
It shouldn't... usually, if the mtx process (or mtx-changer) returns
something unexpected, the SD considers this a problem, dumps the whole
stuff to the defined logging places (log file, console, and mail
usually) and asks for intervention.
>>> One common denominator for these systems is that they are using
>>> Overland 20-slot libraries and we have noticed that if an mtx command
>>> is issued while a library operation already is in progress we
>>> typically get a SCSI error in return. Could this have something to do
>>> with it?
It could, though I guess it would be a bit deeper in the code than
simply the SD stumbling over malformed output.
>>> Any hints would be greatly appreciated.
>>>
>>>
>> Do you wait 5 minutes after inserting the magazine before trying to
>> issue bacula commands to allow the archive inventory to finish? Did
>> you do an update slots before you tried to label the tapes?
>>
>> John
>
> It's a bit of a walk back to the office, but I cannot rule out that
> the library was still inventorying itself. I'll do an
> mtx status outside of bacula next time to make sure it's ready.
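If you want to script that check, here is a minimal sketch in Python. It is only an illustration, not something Bacula ships: it repeatedly runs `mtx -f <device> status` and treats a zero exit code as "ready". The device path, the retry count, the delay, and the assumption that a busy library makes mtx exit non-zero are all mine, so please verify them against your Overland units.

```python
import subprocess
import time


def mtx_status(device):
    """Run `mtx -f <device> status`; return (exit_code, combined_output)."""
    proc = subprocess.run(
        ["mtx", "-f", device, "status"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode, proc.stdout


def wait_until_ready(device, run=mtx_status, attempts=30, delay=10.0,
                     sleep=time.sleep):
    """Poll the changer until a status query succeeds.

    Assumption: a library that is still inventorying answers with a SCSI
    error, which mtx reports through a non-zero exit code.
    Returns True once a query succeeds, False if all attempts fail.
    """
    for attempt in range(attempts):
        code, _output = run(device)
        if code == 0:
            return True
        if attempt + 1 < attempts:
            sleep(delay)  # give the library time to finish its inventory
    return False


if __name__ == "__main__":
    # "/dev/sg0" is a placeholder; use your changer's generic SCSI device.
    print("ready" if wait_until_ready("/dev/sg0") else "still busy")
```

Only once this reports the library as ready would I issue the update slots and label commands from the console.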
>
> No, I don't usually do an update slots before labeling. I'll try that
> too and see what happens.
>
> But still: Is it not a bit odd that the sd just dies if it runs into
> a transient error or inconsistency?
Definitely. I'd try capturing debug trace output from the SD while
triggering the error condition.
Also, you should get a backtrace of the SD if you've got the SD
running un-stripped, and gdb is available.
With a backtrace plus debug output, this would be worth a bug report.
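One possible way to get that backtrace is to attach gdb to the running SD before you trigger the problem and let it wait for the crash. The Python sketch below only builds and runs such a gdb command line; the helper names and the output file are made up for illustration, and it assumes gdb may attach to the process (typically as root) and that the SD stops on a fatal signal rather than exiting cleanly.

```python
import subprocess


def gdb_crash_backtrace_command(pid):
    """Build a gdb command that attaches to `pid`, lets it run until it
    stops (for example on SIGSEGV), then prints all thread backtraces."""
    return [
        "gdb", "-batch",
        "-p", str(pid),
        "-ex", "continue",
        "-ex", "thread apply all bt",
    ]


def capture_backtrace(pid, outfile):
    """Run gdb and write everything it prints to `outfile` (hypothetical path).

    Blocks until the attached process stops or exits; returns gdb's exit code.
    """
    with open(outfile, "w") as fh:
        result = subprocess.run(
            gdb_crash_backtrace_command(pid),
            stdout=fh,
            stderr=subprocess.STDOUT,
        )
    return result.returncode
```

You would call something like `capture_backtrace(sd_pid, "/tmp/bacula-sd.bt")` with the PID of bacula-sd, then reproduce the labeling sequence and attach the resulting file to the bug report.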
Arno
> Best regards,
>
> /Erik
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
> Don't miss this year's exciting event. There's still time to save $100.
> Use priority code J8TL2D2.
> http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
> _______________________________________________
> Bacula-users mailing list
> Bacula-users AT lists.sourceforge DOT net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>
--
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de