[Networker] Endless loop in 7.3.1

2007-03-19 18:36:01
Subject: [Networker] Endless loop in 7.3.1
From: Andrew Dietz <andrew.dietz AT OIT.GATECH DOT EDU>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 19 Mar 2007 18:31:25 -0400
Starting sometime on Friday evening, one of the four LTO3 drives in our SL500 library decided it needed cleaning. EBS 7.3.1 grabbed a cleaning tape from an appropriate slot and finished the operation with the message "device cleaned notice: Jukebox `jb0' Device `/dev/rmt/3hbn' in jukebox `jb0' cleaned at" <date on Friday>. Ever since then, the same message has appeared for that drive every couple of minutes, non-stop!
After poking around, I found the following:

In the Java Admin Console, the following entry shows up in the 'Operations' tab under 'Monitoring', over and over, one after another:

Origin   Operation Data              Duration Progress Message
nsrmmgd  Clean device /dev/rmt/3hbn  30       retryable

In /nsr/logs/daemon.log, I see the same combination of lines repeat:

03/19/07 18:19:23 nsrd: [Jukebox `jb0', operation 2098]. Initiated operation `Clean device /dev/rmt/3hbn using cleaning slot 373'.
03/19/07 18:19:49 nsrlcpd #1: Jukebox error: Jukebox:jb0 access:/dev/scsi/changer/c2t2d0 failed:MOVE MEDIUM key:4 status:CHECK CONDITION No Additional Sense, Media Load or Eject Failed

03/19/07 18:19:49 nsrmmgd: lcpd 1 at host scotch.tape.gatech.edu reported error 'Jukebox:jb0 access:/dev/scsi/changer/c2t2d0 failed:MOVE MEDIUM key:4 status:CHECK CONDITION No Additional Sense, Media Load or Eject Failed
' for the command `4'.
03/19/07 18:19:49 nsrd: [Jukebox `jb0', operation # 2098]. Jukebox:jb0 access:/dev/scsi/changer/c2t2d0 failed:MOVE MEDIUM key:4 status:CHECK CONDITION No Additional Sense, Media Load or Eject Failed

03/19/07 18:19:49 nsrmmgd: Jukebox:jb0 access:/dev/scsi/changer/c2t2d0 failed:MOVE MEDIUM key:4 status:CHECK CONDITION No Additional Sense, Media Load or Eject Failed

03/19/07 18:19:49 nsrd: device cleaned notice: Jukebox `jb0' Device `/dev/rmt/3hbn' in jukebox `jb0' cleaned at `Mon Mar 19 18:19:49 GMT-0400 2007'.
03/19/07 18:19:53 nsrd: [Jukebox `jb0', operation # 2098]. Finished with status: retryable
03/19/07 18:19:53 nsrmmgd: RAP error: Invalid resource data.
03/19/07 18:19:53 nsrmmgd: Cannot update operation status resource (instance 2098).
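
To put a number on how often this cycle repeats, something like the rough sketch below could tally the "device cleaned notice" lines per device in daemon.log. The regular expression and the function name `count_cleanings` are my own assumptions, based only on the log lines quoted above; this is not a NetWorker tool, and other versions may format these lines differently.

```python
import re
from collections import Counter

# Pattern assumed from the excerpt above: timestamp, "nsrd:", then
# "device cleaned notice: ... Device `<path>' ...".
CLEANED_RE = re.compile(
    r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) nsrd: device cleaned notice: "
    r".*?Device `([^']+)'"
)


def count_cleanings(text):
    """Return (Counter mapping device -> notice count, list of (timestamp, device))."""
    events = [(m.group(1), m.group(2)) for m in CLEANED_RE.finditer(text)]
    return Counter(device for _, device in events), events
```

It would be fed the log contents, for example `count_cleanings(open("/nsr/logs/daemon.log", errors="replace").read())`; the timestamps in the returned list then show the interval between repeats.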

What do you think is causing this, and how on earth can I stop it??! Drive /dev/rmt/3hbn still obeys commands such as "nsrjb -L -S 100 -f /dev/rmt/3hbn" so the drive seems to still work...

TIA,
Andrew Dietz
