Networker

Re: [Networker] Drive unable to unload

2005-02-07 20:32:17
Subject: Re: [Networker] Drive unable to unload
From: Lee Stecklov <stecklov AT SYMPATICO DOT CA>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 7 Feb 2005 20:24:29 -0500
Brian Huffman wrote:

What does pres_val mean?  All other tape drives have pres_val=1, the one
that's malfunctioning has the following output.  And it *is* mounted and
has the cleaning light on.

       Elem[007]: tag_val=1 pres_val=0 med_pres=1 med_side=0
                  VolumeTag=<AAE634                          >

When I get to this state I try to eliminate Networker. Try sjiielm, which initializes the element status, and see if the value changes. In this case, pres_val=0 together with med_pres=1 means there could be a problem reading the bar-code label. If so, that could be the reason the cleaning indicator is on, assuming there is a regular volume in the drive. Don't run the command while there's jukebox activity, i.e., loading or unloading.
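To check many elements at once for the combination described above, a small script can scan saved element-status output. This is only a sketch: the field layout is inferred from the single sample quoted in this thread, the function names are made up for illustration, and reading pres_val=0 with med_pres=1 as "suspicious" is the interpretation suggested above, not a documented rule.

```python
import re

# Sample copied from the output quoted in this thread. The layout is assumed
# from this one sample, not taken from any documented format.
SAMPLE = """\
       Elem[007]: tag_val=1 pres_val=0 med_pres=1 med_side=0
                  VolumeTag=<AAE634                          >
"""

ELEM_RE = re.compile(r"Elem\[(\d+)\]:\s*(.*)")
FIELD_RE = re.compile(r"(\w+)=(\d+)")
TAG_RE = re.compile(r"VolumeTag=<([^>]*)>")


def parse_elements(text):
    """Return one dict per Elem[...] line, with its flags and volume tag."""
    elements = []
    for line in text.splitlines():
        m = ELEM_RE.search(line)
        if m:
            flags = {k: int(v) for k, v in FIELD_RE.findall(m.group(2))}
            elements.append({"elem": int(m.group(1)), "tag": None, **flags})
            continue
        t = TAG_RE.search(line)
        if t and elements:
            # A VolumeTag line belongs to the most recent Elem[...] line.
            elements[-1]["tag"] = t.group(1).strip()
    return elements


def suspicious(elements):
    """Elements that report med_pres=1 but pres_val=0 (hypothetical check)."""
    return [e for e in elements
            if e.get("med_pres") == 1 and e.get("pres_val") == 0]


if __name__ == "__main__":
    for e in suspicious(parse_elements(SAMPLE)):
        print(f"element {e['elem']}: tag={e['tag']!r} pres_val=0 med_pres=1")
```

Save the tool's output to a file while the jukebox is idle and pass the file's contents to parse_elements() in place of SAMPLE.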

You can compare the state of the element with the jukebox map kept by nsrjb.
Just run nsrjb.
If there is a conflict there, you'll need to re-inventory this element. At worst, you'll have to ask Networker to reset the jukebox: nsrjb -HE.

Have you checked the messages log for any SCSI-related errors? I/O errors could point to drive or SCSI problems. I've seen these kinds of problems on overloaded adapters - and yes, on the weekends when backups are busiest.

Lee Stecklov


-----Original Message-----
From: Legato NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU]
On Behalf Of Itzik Meirson
Sent: Monday, February 07, 2005 4:13 PM
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: Re: [Networker] Drive unable to unload

You do not really have to walk to the jukebox...
You can use the sjirdtag command that will talk directly to the jukebox
and provide you the REAL status of the drives/slots.
Itzik
-----Original Message-----
From: Legato NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU] On Behalf Of Brian Huffman
Sent: Monday, February 07, 2005 18:59
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: Re: [Networker] Drive unable to unload


Thanks - Unfortunately this only ends in the drive being put into service mode, so this doesn't seem like the correct way for things to work. In addition, I don't really believe that the tape has been unmounted. I'll have to run over to the library to check this out, but the last time this happened, the drive was *not* unmounted, and it was completely unresponsive. It couldn't even be unmounted from the front panel of the jukebox. I needed to power-cycle the jukebox to resolve this issue.

Brian

-----Original Message-----
From: Legato NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU]
On Behalf Of thierry.faidherbe AT HP DOT COM
Sent: Monday, February 07, 2005 11:43 AM
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: Re: [Networker] Drive unable to unload

In a highly active datazone, nsrd can fail to update the jukebox map (the loaded barcode, loaded volume and loaded slot fields) while unloading volumes. Most of the time this happens when multiple nsrjb processes are running concurrently and the changes made by the first nsrjb have not yet been committed to the config files and the in-memory jukebox maps.

Because the jukebox map was not updated, Networker still thinks a tape is loaded in the device and tries to unload it, when in reality no tape is loaded.

The result is that nsrmmd fails with an opening I/O error. nsrjb then tries for 20 minutes to unload the tape. After that, it reports an error and marks the slot as unloaded. (In some cases, you can also lose the inventory of the slot containing the ghost volume before the second nsrjb.)

I never found a way to reduce these 20 minutes to something more acceptable, but I am working with it. It has nothing to do with load/unload sleep; those values are just the time Networker waits after a jukebox PUT to the device or UNLOAD from the device before opening the OS driver (/dev/rmt/...., \\.\tapex).

HTH,

Th


I have a similar problem on an ATL-P7000 tape library. However I *am*
still on a Sun server platform.  Does anyone know what the unload sleep
for that type of library should be?  Mine is set at "5".

I get these types of errors almost every weekend:

02/05/05 06:38:22 nsrd: /dev/rmt/45cbn Eject operation in progress
02/05/05 06:42:33 nsrd: media warning: /dev/rmt/45cbn opening: I/O error
02/05/05 06:42:53 nsrd: media info: unload error for jukebox `ATL-P7000' detected.  Retrying
02/05/05 06:42:53 nsrd: media info: unload retry for jukebox `ATL-P7000': sleeping 30 seconds
02/05/05 06:43:38 nsrd: media info: unload retry for jukebox `ATL-P7000' failed - will retry again.
02/05/05 06:43:38 nsrd: media info: unload retry for jukebox `ATL-P7000': sleeping 30 seconds
<snip>

This continues until the drive goes into service mode.  Is this possibly
related to the unload sleep?  Or is there something more wrong here?
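To measure how long a retry loop like the one logged above actually ran, the nsrd messages can be summarized with a short script. This is a rough sketch under stated assumptions: the timestamps are taken to be month/day/year, the message wording is matched exactly as it appears in the excerpt above, and the function name summarize_unload_retries is invented for this example.

```python
import re
from datetime import datetime

# Log excerpt from this message, with the mail-wrapped lines rejoined.
LOG = """\
02/05/05 06:38:22 nsrd: /dev/rmt/45cbn Eject operation in progress
02/05/05 06:42:33 nsrd: media warning: /dev/rmt/45cbn opening: I/O error
02/05/05 06:42:53 nsrd: media info: unload error for jukebox `ATL-P7000' detected.  Retrying
02/05/05 06:42:53 nsrd: media info: unload retry for jukebox `ATL-P7000': sleeping 30 seconds
02/05/05 06:43:38 nsrd: media info: unload retry for jukebox `ATL-P7000' failed - will retry again.
02/05/05 06:43:38 nsrd: media info: unload retry for jukebox `ATL-P7000': sleeping 30 seconds
"""

LINE_RE = re.compile(r"^(\d\d/\d\d/\d\d \d\d:\d\d:\d\d) nsrd: (.*)$")
JUKEBOX_RE = re.compile(r"unload (?:error|retry) for jukebox `([^']+)'")


def summarize_unload_retries(log):
    """Per jukebox: first/last unload message time and count of failed retries."""
    stats = {}
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        jb = JUKEBOX_RE.search(m.group(2))
        if not jb:
            continue
        # Assumption: timestamps are month/day/two-digit year.
        ts = datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S")
        entry = stats.setdefault(
            jb.group(1), {"first": ts, "last": ts, "failed": 0})
        entry["last"] = ts
        if "failed" in m.group(2):
            entry["failed"] += 1
    return stats


if __name__ == "__main__":
    for name, s in summarize_unload_retries(LOG).items():
        elapsed = (s["last"] - s["first"]).total_seconds()
        print(f"{name}: {s['failed']} failed retries over {elapsed:.0f}s")
```

Run against the full log rather than the snipped excerpt, the elapsed time shows whether the loop ran for roughly the 20 minutes mentioned elsewhere in this thread before the drive went into service mode.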

Thanks,
Brian

--
Note: To sign off this list, send a "signoff networker" command via email
to listserv AT listserv.temple DOT edu or visit the list's Web site at
http://listserv.temple.edu/archives/networker.html where you can
also view and post messages to the list. Questions regarding this list
should be sent to stan AT temple DOT edu
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

