Re: [Networker] How to get more verbose nsrjb information and error messages in logfile

2003-08-26 01:42:20
Subject: Re: [Networker] How to get more verbose nsrjb information and error messages in logfile
From: Al <citecatt AT CITECUB.CITEC.QLD.GOV DOT AU>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Tue, 26 Aug 2003 01:40:21 -0400
On Thu, 21 Aug 2003 11:35:22 +0200, Renty, Bart <bart.renty AT HP DOT COM> 
wrote:

>Yesterday a backup was looping with the following messages :
>
>08/20/03 02:34:49 PM nsrd: media info: suggest mounting CIR664 on beqtrntrn1ms2m2 for writing  to pool 'MS2 Incr'
>08/20/03 02:34:51 PM nsrd: media info: suggest mounting CIR633 on beqtrntrn1ms2m2 for writing to pool 'Index'
>08/20/03 02:36:25 PM nsrd: media event cleared: Waiting for 1 writable volumes to backup pool 'Index' tape(s) on beqtrntrn1ms2m2
>08/20/03 02:36:58 PM nsrd: media waiting event: Waiting for 1 writable volumes to backup pool 'Index' tape(s) on beqtrntrn1ms2m2
>08/20/03 02:39:14 PM nsrd: media info: suggest mounting CIR638 on beqtrntrn1ms2m2 for writing  to pool 'Index'
>08/20/03 02:39:15 PM nsrd: media info: Labeling a new writable volume for pool 'MS2 Incr'
>08/20/03 02:40:49 PM nsrd: media info: suggest mounting CIR664 on beqtrntrn1ms2m2 for writing to pool 'MS2 Incr'
>08/20/03 02:40:49 PM nsrd: media info: suggest mounting CIR633 on beqtrntrn1ms2m2 for writing  to pool 'Index'
>08/20/03 02:44:08 PM nsrd: media info: suggest mounting CIR638 on beqtrntrn1ms2m2 for writing  to pool 'Index'
>08/20/03 02:44:12 PM nsrd: media info: Labeling a new writable volume for pool 'MS2 Incr'
>08/20/03 02:47:41 PM nsrd: media critical event: Waiting for 1 writable volumes to backup pool 'MS2 Incr' tape(s) on beqtrntrn1ms2m2
>08/20/03 02:50:17 PM nsrd: media info: suggest mounting CIR664 on beqtrntrn1ms2m2 for writing  to pool 'MS2 Incr'
>
>Initially we considered this as 'normal' assuming that there was no
>device available, but later on we got similar problems waiting for a
>restore :
>
>08/20/03 03:54:46 PM nsrd: beqtrntrn1ms1m2: browsing
>08/20/03 03:54:59 PM nsrd: media waiting event: waiting for tz89 tape CIR657 on cl1mbr2.trn.lighting.philips.com
>08/20/03 04:10:09 PM nsrd: media critical event: waiting for tz89 tape CIR657 on cl1mbr2.trn.lighting.philips.com
>08/20/03 04:23:58 PM nsrd: beqtrntrn1ms1m2: browsing
>
>So we finally started investigating these 'waitings', and found out that
>when trying to mount a volume manually, we got a popup box "input/output
>error".  As we got this for all devices in this jukebox, we concluded it
>to be a jukebox problem.  The control panel on the TL895 jukebox itself
>showed indeed something like "interface to library busy", and didn't
>react when pressing on the touchscreen. We recycled the jukebox, and all
>problems were solved.
>
>My question is not about the problem itself, but about a way to be
>informed about this kind of problem, as we couldn't see anything
>"wrong" in the logfile (why didn't we see the input/output errors?).
>
>Is there a way
>
>- to get more verbose output in the logfile about the nsrjb commands
>and their return statuses (input/output error?)

You can try running an nsrjb load command with extra verbosity (-vvvv),
though I am not sure if this is what you are after.
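
As a rough sketch of getting that output and the return status into a file,
a small Python wrapper could run the command and append whatever it prints.
The function name run_and_log, the log path, and the nsrjb arguments shown in
the comment are assumptions for illustration only; check the nsrjb man page
for the options your version accepts.

```python
import subprocess
import time


def run_and_log(cmd, logfile, timeout=300):
    """Run cmd, append its combined output and exit status to logfile.

    Returns the exit status, or None if the command timed out.
    """
    stamp = time.strftime("%m/%d/%y %I:%M:%S %p")
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error text such as "input/output error"
            text=True,
            timeout=timeout,
        )
        status, output = proc.returncode, proc.stdout
    except subprocess.TimeoutExpired as exc:
        # A hung jukebox interface may never answer; record that as well.
        status, output = None, exc.output or ""
        if isinstance(output, bytes):
            output = output.decode(errors="replace")

    with open(logfile, "a") as log:
        log.write(f"{stamp} running: {' '.join(cmd)}\n")
        log.write(output)
        if output and not output.endswith("\n"):
            log.write("\n")
        log.write(f"{stamp} exit status: {status}\n")
    return status


# Hypothetical call (volume name taken from the log above, path made up):
# run_and_log(["nsrjb", "-vvvv", "-l", "CIR664"], "/tmp/nsrjb-debug.log")
```

A status of None here means the command did not return within the timeout,
which is worth treating as an error in its own right.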


> and/or
>
>- to trap this kind of problems (notification ?)

A few things, though I am not sure they all apply to your jukebox.  We have
had similar problems with L20 autoloaders.

1) You can set up a Legato Networker notification that performs some action
(say, paging with an alarm) when there is a critical wait request.  The
event type would be media with priority critical.  Bear in mind that this
could sometimes cause false alarms.

2) Do you have any other means of checking the autoloader at the OS level?
Our L20, for example, has a built-in web interface that you can browse to in
order to check its status.  If you cannot even reach the web interface of
the L20, then the interface has malfunctioned.

3) There is a jukebox tag command that comes with the storage node software
called sjirdtag, located in /etc/LGTOuscsi (under Solaris).  The man page
describes it as "test the SJI Jukebox Interface".  The command is meant to
query the autoloader's controlling interface for a list of physical tape
labels from the OS level.  That also makes it a very handy way to check
whether the controlling interface is responding at all.  At least in my
opinion, anyway.
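
If you would rather watch the logfile yourself than rely on point 1, the
"media critical event" lines in your excerpt are easy to pick out.  The
Python below is only a sketch: the regular expression is modelled on the log
lines quoted above, and the names EVENT_RE and find_media_events are my own,
not part of Networker.

```python
import re

# Modelled on lines such as:
# 08/20/03 02:47:41 PM nsrd: media critical event: Waiting for 1 writable ...
EVENT_RE = re.compile(
    r"(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} [AP]M) nsrd: "
    r"media (?P<level>critical|waiting) event: (?P<text>.*)"
)


def find_media_events(lines, levels=("critical",)):
    """Return (timestamp, level, message) for each matching media event line."""
    events = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match and match.group("level") in levels:
            events.append(
                (match.group("stamp"), match.group("level"), match.group("text").strip())
            )
    return events
```

Passing levels=("critical", "waiting") would also report the earlier
"media waiting event" lines, at the cost of more false alarms.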

I hope this helps you, as I have not dealt with a TL895 nor do I know what
OS you are using.

Cheers,

>
>
>Bart
>
>--
>Note: To sign off this list, send a "signoff networker" command via email
>to listserv AT listmail.temple DOT edu or visit the list's Web site at
>http://listmail.temple.edu/archives/networker.html where you can
>also view and post messages to the list.
>=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
