Bacula-users

Re: [Bacula-users] [Bacula-devel] bacula mount request

2011-02-03 12:37:58
Subject: Re: [Bacula-users] [Bacula-devel] bacula mount request
From: Alan Brown <ajb2 AT mssl.ucl.ac DOT uk>
To: Kern Sibbald <kern AT sibbald DOT com>
Date: Thu, 03 Feb 2011 17:35:20 +0000
Kern Sibbald wrote:

> I would recommend:
> 
> 1. Ask HP *exactly* what the tape alert means.

It means what it says.

> 2. Write a little program that you can put in the Bacula "unmount" script
> that explicitly unlocks the drive door, then use mt offline to eject the tape.

I've been trying this by hand. It doesn't work.

> The combination of the two should unload the tape.  If they do not, ask HP if
> they support Linux, and if they say yes, don't mention Bacula but ask them why
> you are unable to unload a drive.

I've already been doing this - for 6 years. HP don't want to know about it.

HOWEVER, I _have_ managed to work out what's causing my problem, thanks 
to an offhand comment from someone on one of the HP tape support forums.

Apparently, drive locking is supposed to be a per-host issue - so if one 
host issues a drive lock, another host can't unlock it.

To make things even more confusing, a locked drive can still be 
read/written/forward spaced/etc by hosts other than the one which set 
the lock.

The kicker: the documentation is wrong. It's _not_ per-host, it's PER INITIATOR.

In the case of multiply connected Fibre Channel systems (i.e. more than 
one fibre connection to the fabric), one host is seen by the tape 
drives as _several initiators_.

I've got 2 connections to the fabric from the -sd machine, and each drive 
shows up as 2 st devices. The order they appear in is random.

On Linux, udev then (randomly) maps /dev/tape/by-id/{WWID}-nst to one of 
the /dev/nst devices.
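
To see which device node each by-id link currently points at, something like the following Python sketch could be used. It assumes the links live in /dev/tape/by-id as described above; the function name resolve_tape_links is made up for illustration and is not part of Bacula or udev.

```python
import os
from pathlib import Path


def resolve_tape_links(by_id_dir="/dev/tape/by-id"):
    """Return {link name: resolved device path} for each symlink in by_id_dir.

    Hypothetical helper: it only reports where the links point right now.
    It cannot tell which fabric path (initiator) sits behind a device node.
    """
    base = Path(by_id_dir)
    if not base.is_dir():
        # No such directory (no tape devices, or not a udev system).
        return {}
    return {
        entry.name: os.path.realpath(entry)
        for entry in sorted(base.iterdir())
        if entry.is_symlink()
    }
```

Calling resolve_tape_links() before and after a reboot or fabric event and comparing the two results would show whether a link has moved to a different /dev/nst node.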

I've verified the problem by locking a drive using /dev/nstA and then 
attempting to unlock using /dev/nstB - it repeatably gives the same 
symptoms as I've been seeing. Unlocking on /dev/nstA works fine.
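
The test above can be scripted roughly as follows. This is only a sketch: it assumes the mt-st version of mt (which provides the lock and unlock operations) is installed, and /dev/nstA and /dev/nstB are placeholders for the two device nodes of the same drive. The function names are made up for illustration. How the failed unlock is reported (exit status or otherwise) is not something I'd rely on, so the sketch just records the exit status of each command.

```python
import subprocess


def lock_test_commands(lock_dev, unlock_dev):
    """Build the mt commands: lock via one path, then unlock via another."""
    return [
        ["mt", "-f", lock_dev, "lock"],
        ["mt", "-f", unlock_dev, "unlock"],
    ]


def run_lock_test(lock_dev, unlock_dev, runner=subprocess.run):
    """Run each command in order and return (command, exit status) pairs.

    runner is injectable so the sequence can be dry-run without a drive.
    """
    results = []
    for cmd in lock_test_commands(lock_dev, unlock_dev):
        completed = runner(cmd)
        results.append((cmd, completed.returncode))
    return results
```

Running run_lock_test("/dev/nstA", "/dev/nstB") and then run_lock_test("/dev/nstA", "/dev/nstA") on an idle drive would repeat the comparison between unlocking on the other path and unlocking on the same path.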

On a fabric the path to a drive may change for a number of reasons. Not 
being able to cope with path changes is a very bad thing.

This throws a huge spanner in the works of using multiple connections 
for path redundancy and/or bandwidth and it appears to be the same 
mistake made across the entire spectrum of drive manufacturers.

I've flagged this to our drive robot maker (Overland). At the moment 
they appear seriously concerned about this as they push the use of 
multiple ports on servers for path redundancy.

I feel the lock/unlock semantics are just plain wrong and require too 
much manual intervention if there are problems. The whole "reboot/power 
cycle it if it plays up" attitude is something instilled from 
DOS/Windows and other unreliable systems. I'd hate to have to do that if 
my car ignition computer died while out on the highway and I don't want 
to be forced to do it on what's supposed to be a lights-out installation.

There has to be a better way of addressing the issue.

AB


