Bacula-users

Re: [Bacula-users] Maximum Changer Wait directive

2014-04-02 14:24:56
Subject: Re: [Bacula-users] Maximum Changer Wait directive
From: Alan Brown <ajb2 AT mssl.ucl.ac DOT uk>
To: "Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC]" <uthra.r.rao AT nasa DOT gov>, Alan Brown <ajb2 AT mssl.ucl.ac DOT uk>, "Ana Emília M. Arruda" <emiliaarruda AT gmail DOT com>
Date: Wed, 02 Apr 2014 19:19:27 +0100
On 02/04/14 18:07, Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC] wrote:
> Alan,
>
> In my case it looks like the autochanger is unable to get the tape out of its 
> slot. I have already opened a case with IBM and they suggested a Firmware 
> upgrade which I did last week. This problem has not occurred since the 
> firmware upgrade but I have to wait longer for this conclusion. I was looking 
> into adding "Maximum Changer Wait directive 3d" so that if the tape gets 
> stuck in the slot during the weekend the job will not fail and I will have 
> some time to manually mount the needed tape from the command line.  I would 
> also be interested in the script that you have mentioned to unlock the drives.
>


What does "mt -f /dev/{drive} unlock" return?

What about "mt -f /dev/{drive} eject"?

Can you look inside the changer to see if the tape ejects?


If these don't work then the problem is most likely a locking 
issue: SCSI locks set via one interface have to be unset on the same 
interface.


The quick'n'dirty solution is to power cycle the autochanger, but it's 
far better to identify if the same tape drive is being seen multiple times.

The attached (ugly) script will do the trick, but first:

mkdir -p /etc/bacula/DEVICES/
ln -s /dev/tape/by-id/*-nst /etc/bacula/DEVICES/

If you can ID which drive is in which position on which changer, then

ln -s /dev/tape/by-id/{wwid}-nst /etc/bacula/DEVICES/CHANGER_{foo}-DRIVE-{bar}

this will give stable anchor points for bacula to use instead of having 
to reconfigure bacula-sd and bacula-dir every time a drive is changed 
out or the fabric changes.

Note that it is _extremely dangerous_ to refer to /dev/nst* (or 
/dev/st*) in a fabric environment as they tend to jump around at the 
slightest provocation, resulting in attempts to load/unload the wrong 
drive (or worse, write to the wrong one).
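
A minimal sketch of how such an anchor could be checked before it is 
handed to bacula-sd, assuming the symlink layout above (DEVICES -> by-id 
-> nstN). The function name resolve_drive is hypothetical and not part 
of Bacula; it relies only on "readlink -f" from GNU coreutils.

```shell
#!/bin/bash
# Hypothetical helper: print the kernel device node that a stable
# /etc/bacula/DEVICES/{drive} symlink currently points at.
resolve_drive() {
    local link="$1" target

    # Refuse anything that is not a symlink, e.g. a raw /dev/nst0 path.
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link" >&2
        return 1
    fi

    # readlink -f follows the whole chain (DEVICES -> by-id -> nstN).
    target=$(readlink -f -- "$link") || return 1

    # A dangling chain means udev no longer knows the drive.
    if [ ! -e "$target" ]; then
        echo "dangling symlink: $link" >&2
        return 1
    fi

    printf '%s\n' "$target"
}
```

Called as "resolve_drive /etc/bacula/DEVICES/CHANGER_{foo}-DRIVE-{bar}", 
it would print whichever device node the drive currently sits on, or 
fail if the chain is broken.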

=====
$ cat /usr/local/bin/unlocktapedrive.sh

#!/bin/bash

# on a Fibrechannel host with 2 connections to the fabric,
# a tape drive will show up twice.
#
# The same thing will occur if the drive has 2 fabric connections
# and if both ends are multiply connected then the numbers end up
# multiplied (dual + dual = 4 instances seen)
#
# This script assumes only 2 instances and will need modifying for more.
#
# It's normal for udev to swap /dev/tape/by-id around from time to time,
# which is fine for standard operations, BUT:
#
# Tape drive door locks are set per-initiator and ORed.
# Therefore a drive might be locked by one initiator
# and unlocked from the other. That doesn't work, as locks
# set by one initiator have to be released by the same initiator.
#
# This script gets BOTH /dev/nst devices for each
# /etc/bacula/DEVICES/{drive} (symlinks to /dev/tape/by-id/)
# and sends unlocks to them, to be on the safe side.
#
# input is assumed to be /etc/bacula/DEVICES/drive

#echo 0 $0 1 $1 2 $2 3 $3

if test -z "$1"
         then
         echo Argument: /etc/bacula/DEVICES/drive
         exit 1
fi

if ! test -L "$1"
         then
         echo Argument: /etc/bacula/DEVICES/drive
         exit 1
fi

# GET THE /dev/tape/by-id for this device
export INDIRECT=`/bin/ls -l $1 | /bin/cut -f2 -d\>`

if ! test -L $INDIRECT
         then
         echo I have lost that drive! Consider running udevtrigger to recover it.
         exit 1
fi

#echo $INDIRECT

# get the /dev/nst being used.
export DEVICE=`/bin/ls -l $INDIRECT | /bin/cut -f2 -d\>  | /bin/cut -f3 -d/`

#echo $DEVICE

# get the device's /dev/tape/by-path entry
# (stored in TAPEPATH rather than PATH so that the shell's command
# search path is not overwritten)
export TAPEPATH=`/bin/ls -l /dev/tape/by-path | /bin/grep $DEVICE$  | /bin/cut -f4 -d-`


#echo $TAPEPATH

# get the OTHER device
export DEVICEGHOST=`/bin/ls -l /dev/tape/by-path/*$TAPEPATH*-nst | /bin/grep -v $DEVICE$ | /bin/cut -f2 -d\> | /bin/cut -f3 -d/`

# echo $DEVICEGHOST
echo $1 $INDIRECT $DEVICE $TAPEPATH $DEVICEGHOST
#echo $DEVICE $DEVICEGHOST
#
export STATUS=`/bin/mt -f /dev/$DEVICE status | /bin/grep OPEN`

if test -n "$STATUS"
   then echo $DEVICE already unlocked
   else /bin/mt -f /dev/$DEVICE unlock
        /bin/mt -f /dev/$DEVICEGHOST unlock
fi
#
exit 0
=================
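
If a drive shows up more than twice, the same idea can be extended to 
every instance. The following is only a sketch under the same assumption 
the script makes, namely that all by-path names for one drive share a 
common component (the key). The names list_instances, unlock_all, 
BYPATH_DIR and MT are invented here for illustration.

```shell
#!/bin/bash
# Hypothetical helper: list every nstN node whose by-path name contains
# the given key (for example the drive's target port identifier).
# BYPATH_DIR defaults to /dev/tape/by-path and can be overridden.
list_instances() {
    local key="$1" dir="${BYPATH_DIR:-/dev/tape/by-path}" link
    for link in "$dir"/*"$key"*-nst; do
        [ -L "$link" ] || continue      # skip an unexpanded glob
        basename "$(readlink -f -- "$link")"
    done | sort -u
}

# Send an unlock to each instance found for the key.
# MT can be set to a stub command for a dry run.
unlock_all() {
    local dev
    for dev in $(list_instances "$1"); do
        "${MT:-mt}" -f "/dev/$dev" unlock
    done
}
```

Setting MT to a harmless command first shows which devices would be 
touched before any real unlock is sent.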


This script is also useful. I use it as one of my RunAfterJobs.


============================
$ cat /usr/local/bin/gettapeinfo.sh

#!/bin/bash
# input is assumed to be /etc/bacula/DEVICES/{drive}

if test -z "$1"
         then
         echo Argument: /etc/bacula/DEVICES/drive
         exit 1
fi

# GET which tape is in the drive
export DRIVE=`echo $1 | cut -f2 -d"-"`
export CHANGER=`echo $1 | cut -f1 -d"-"`
export CONTENT=`mtx -f $CHANGER-changer status | grep "Data Transfer Element "$DRIVE`

# GET THE /dev/tape/by-id for this device
export INDIRECT=`ls -l $1 | cut -f2 -d\>`

# get the /dev/nst being used.
export DEVICE=`ls -l $INDIRECT | cut -f2 -d\>  | cut -f3 -d/`

# get the /dev/sg
export GENERIC=`ls /sys/class/scsi_tape/$DEVICE/device/scsi_generic | rev | cut -f1 -d" " | rev`

echo $1 $INDIRECT $DEVICE $GENERIC $CHANGER > /tmp/tapeinfo.log.$$
echo $1 $CONTENT >> /tmp/tapeinfo.log.$$

smartctl -T permissive -H -d scsi -a -l error /dev/$GENERIC >> /tmp/tapeinfo.log.$$
tapeinfo -f /dev/$GENERIC >> /tmp/tapeinfo.log.$$
smartctl -T permissive -H -d scsi -a -l error /dev/$GENERIC >> /tmp/tapeinfo.log.$$
tapeinfo -f /dev/$GENERIC >> /tmp/tapeinfo.log.$$

export TAPEALERT=`grep "TapeAlert Error" /tmp/tapeinfo.log.$$ | cut -c1-5`

if test -n "$TAPEALERT"
         then
         # this section is shamelessly ripped from http://wiki.bacula.org/doku.php?id=tapealert

         export BACULA_ETC="/opt/bacula/etc/bacula-dir.conf.d"
         export BACULA_DIR_CONF="messages"

         export MAIL_BIN="/usr/bin/mail"
         export SUBJECT="Bacula tapedrive TapeInfo alert"

         export TAPEINFO_LOG="/tmp/tapeinfo.log.$$"

         # --- get email-address of Bacula's Tape-Operator & System Administrator ---
         ## AJB2 ours is in /opt/bacula/etc/bacula-dir.conf.d/messages thanks to include statements - hence the "odd" locations.
         export BACDIRCONF=$BACULA_ETC"/"$BACULA_DIR_CONF

         ## get the first email-address of the Bacula Tape Operator(s)
         export bo=`cat $BACDIRCONF |sed -e 's/^[ \t]*//' | grep -w ^operator |cut -d"=" -f2 | head -1`
         export bo=${bo//[[:space:]]}

         ## get the first mail-address of the Bacula System Administrator(s)
         export bs=`cat $BACDIRCONF |sed -e 's/^[ \t]*//' | grep -w ^mail |cut -d"=" -f2 | head -1`
         export bs=${bs//[[:space:]]}

         #echo "Email-address of Bacula's Tape-operator : $bo"
         #echo "Email-address of Bacula's System Admin  : $bs"

         # sanity check for email-addresses:
         if [ ! -n "$bo" ]; then
           echo "Could not retrieve an email-address for the Bacula Tape-Operator."
           exit 1;
         fi

         if [ ! -n "$bs" ]; then
           echo "Could not retrieve an email-address for the Bacula System Administrator."
           exit 1;
         fi

         # sanity check for existence of the logfile
         if [ ! -f $TAPEINFO_LOG ]; then
           echo "Could not find the logfile containing TapeInfo results."
           exit 1;
         fi

         # now actually do what is intended to be done!
         export TA_OK=`cat $TAPEINFO_LOG | grep "TapeAlert: OK" -c`
         if [ $TA_OK != "1" ]; then
           # hmm. TapeInfo is not OK. Send an email!
           $MAIL_BIN -s "$SUBJECT" -c "$bo" "$bs" < $TAPEINFO_LOG
         fi
fi

cat /tmp/tapeinfo.log.$$
rm /tmp/tapeinfo.log.$$

echo $1 $INDIRECT $DEVICE $GENERIC $CHANGER $CONTENT

=======================
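
One shell detail matters for checks of the form "test -n $VAR", such as 
the TapeAlert check in the script above: if the variable is empty and 
the expansion is not quoted, test sees the single argument "-n" and 
returns true. A small demonstration (the variable name EMPTY is only for 
illustration):

```shell
#!/bin/bash
EMPTY=""

# Unquoted: expands to `test -n`, a one-argument test that is always true.
if test -n $EMPTY; then unquoted="true"; else unquoted="false"; fi

# Quoted: expands to `test -n ""`, which is false for an empty string.
if test -n "$EMPTY"; then quoted="true"; else quoted="false"; fi

echo "unquoted=$unquoted quoted=$quoted"
# prints: unquoted=true quoted=false
```

So the alert branch is only skipped on a clean drive when the variable 
is quoted inside the test.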


> Thank you.
> Uthra
>
> -----Original Message-----
> From: Alan Brown [mailto:ajb2 AT mssl.ucl.ac DOT uk]
> Sent: Wednesday, April 02, 2014 12:47 PM
> To: "Ana Emília M. Arruda"; Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC]
> Cc: Alan Brown; bacula-users AT lists.sourceforge DOT net
> Subject: Re: [Bacula-users] Maximum Changer Wait directive
>
> On 02/04/14 17:18, Ana Emília M. Arruda wrote:
>> Hi Uthra,
>>
>> I had this problem with my tape library. But it was just with one
>> specific slot. It had physical problems and frequently stuck the tape.
>
> I've run into jamming problems too (but that was on a Neo8000)
>
> Does the tape library have a front panel or webmin option to run a robot 
> recalibration routine?
>
> The issue is trying to find whether the drive is still locked (if the server 
> has multiple fibre connectors it's possible to lock the drive twice), or if 
> the tape is mechanically jamming or if there's another factor at work.
>
> If it's a locking issue I have a script which can be tweaked to unlock all 
> iterations of the same drive.
>
> If mechanical then a support call is probably required.
>
>
>> Regards,
>> Ana
>>
>>
>> On Tue, Apr 1, 2014 at 3:39 PM, Rao, Uthra R. (GSFC-672.0)[ADNET
>> SYSTEMS INC] <uthra.r.rao AT nasa DOT gov> wrote:
>>
>>      How is the library connected to the bacula server?
>>      -- The Tape Library is connected to the server through fiber.
>>
>>        What's the exact error message you get?
>>      -- I get Media Error when this happens.
>>
>>      What shows on the library front panel?
>>      -- "Media attention!"
>>
>>      Thanks,
>>      Uthra
>>
>>
>>      -----Original Message-----
>>      From: Alan Brown [mailto:ajb2 AT mssl.ucl.ac DOT uk]
>>      Sent: Tuesday, April 01, 2014 2:17 PM
>>      To: Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC];
>>      bacula-users AT lists.sourceforge DOT net
>>      Subject: Re: [Bacula-users] Maximum Changer Wait directive
>>
>>      On 01/04/14 18:23, Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC] wrote:
>>       > I am running bacula 5.2.12 on RHEL6 O.S. with IBM TS3200 Tape Library
>>       > with two LTO5 drives. Sometimes the autochanger is unable to mount a
>>       > required tape as the tape gets stuck in its slot and needs manual
>>       > intervention.
>>
>>      "stuck" can mean many things.
>>
>>      How is the library connected to the bacula server?
>>
>>      What's the exact error message you get?
>>      What shows on the library front panel?
>>
>>
>>
>>
>>
>>      
>> ------------------------------------------------------------------------------
>>      _______________________________________________
>>      Bacula-users mailing list
>>      Bacula-users AT lists.sourceforge DOT net
>>      <mailto:Bacula-users AT lists.sourceforge DOT net>
>>      https://lists.sourceforge.net/lists/listinfo/bacula-users
>>
>>




