[Veritas-bu] Robot L180 errors out on robtest. - please help.

2002-07-26 13:55:54
Subject: [Veritas-bu] Robot L180 errors out on robtest. - please help.
From: MSyed AT xo DOT com (Syed, Mukarram)
Date: Fri, 26 Jul 2002 12:55:54 -0500
I am using Direct SCSI ... No Bridge, Fiber or SAN involved here.
-Mukarram.


-----Original Message-----
From: SIBLEY, Ken R. - ACCOREL [mailto:sibley_ken AT accorel DOT com]
Sent: Friday, July 26, 2002 9:35 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Robot L180 errors out on robtest. - please
help.




-----Original Message-----
From: Syed, Mukarram [mailto:MSyed AT xo DOT com]
Sent: Wednesday, July 24, 2002 1:46 PM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] Robot L180 errors out on robtest. - please help.



Hi
Here is my situation and I need some help
I am running NetBackup 3.4.1 on Solaris 7 (this host is both my master and
media manager).
Early this morning my backups failed.  I have an STK L180 (SCSI) with 8 DLT
7000 drives.  When I checked my Device Monitor, all the drives were DOWN'ed.
I went into robtest and saw that there were tapes stuck in all the drives.
One of my backups was active and the other one, which started later, was
queued.

_________
2 Questions first:
1) Are you using direct SCSI or going fiber to switch to bridge to drive?

2) If using a bridge, what HBA do you have?

That being said we have seen something like this when one of two things 
happens:
1) The bridge gets confused and we have to reboot the bridge.
2) If you have other hosts on your SAN that do not use persistent
   binding this can cause a problem.  When we rebooted one host
   it would change the addressing for the drives on the other system.
   This would show up exactly the way you are seeing it.  The easiest
   thing to do is write down all of the drive paths, delete the
   drives from NBU, re-create the drives in NBU (fastest way is to use
   the Java GUI to set them up or use robtest the old-fashioned way).
   Then compare the old paths with the new ones.  If they are different,
   start checking your hosts on the SAN.  If they are the same, then
   I don't know.
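
The path comparison in item 2 can be sketched roughly as below.  This is
only an illustration in Python, not anything NetBackup provides: the drive
names and device paths are made up, and the two tables would be typed in by
hand from what was written down before deleting the drives and what NBU
shows after re-creating them.

```python
def changed_paths(old, new):
    """Return {drive: (old_path, new_path)} for drives whose path differs.

    A drive that is missing on one side is reported with None for that side.
    """
    changed = {}
    for drive in sorted(set(old) | set(new)):
        before = old.get(drive)
        after = new.get(drive)
        if before != after:
            changed[drive] = (before, after)
    return changed


# Hypothetical example data, entered by hand (not taken from this thread).
old = {"drive1": "/dev/rmt/0cbn", "drive2": "/dev/rmt/1cbn", "drive3": "/dev/rmt/2cbn"}
new = {"drive1": "/dev/rmt/0cbn", "drive2": "/dev/rmt/2cbn", "drive3": "/dev/rmt/1cbn"}

for drive, (before, after) in changed_paths(old, new).items():
    print(f"{drive}: {before} -> {after}")
```

An empty result means the paths did not move; any entry is a drive worth
tracing back to a host on the SAN.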

Ken Sibley
Sr. Unix Admin
Accor Economy Lodging
ksibley AT accorel DOT com
469-737-3370

I killed both my backups and went to check the robot physically.  I saw that
all the drives were either in the REWOUND state or in some other state, I
don't recall.
I tried to unload the tapes using robtest and move them to their respective
slots, but I get errors such as the following:
unload d3
Opening /dev/rmt/2cbn, please wait...
Error - cannot open /dev/rmt/2cbn (I/O error)

When I initiate a move, for example, it gives me the following error:
m d5 s45
Initiating MOVE_MEDIUM from address 504 to 1044
move_medium failed, CHECK CONDITION
sense key = 0x5, asc = 0x3a, ascq = 0x0, MEDIUM NOT PRESENT
Even though the tapes were in the drives, when I ran an mt command to check
the status of the drives, it said the following:
# mt -f /dev/rmt/5cbn status
/dev/rmt/5cbn: no tape loaded or drive offline
When I went to check the drives, the DLT 7000 drives had green lights on
them.
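
For reference, the MOVE_MEDIUM lines show robtest translating names such as
d5 and s45 into element addresses (504 and 1044).  The sketch below, in
Python, reproduces that translation.  The two base addresses are assumptions
inferred only from the robtest lines in this message; they are not taken
from L180 documentation, and other libraries or configurations may number
their elements differently.

```python
# Offsets inferred only from the robtest output in this message
# (d3 -> 502, d5 -> 504, s3 -> 1002, s45 -> 1044); treat them as assumptions.
DRIVE_BASE = 500   # assumed element address of d1
SLOT_BASE = 1000   # assumed element address of s1


def element_address(name):
    """Translate a robtest-style name such as 'd5' or 's45' to an element address."""
    kind, number = name[0], int(name[1:])
    if number < 1:
        raise ValueError(f"element numbers start at 1: {name!r}")
    if kind == "d":
        return DRIVE_BASE + number - 1
    if kind == "s":
        return SLOT_BASE + number - 1
    raise ValueError(f"unknown element type: {name!r}")


print(element_address("d5"), element_address("s45"))
```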

I manually removed the tapes from the drives and put them back in their
respective slots.  I power-cycled the drives, not the robot.
When I then used robtest to move a tape back and forth between a slot and a
drive, it successfully moved the tape from the slot to the drive, but it
would not move it from the drive back to the slot, and the unload command
fails.
Here is the output:

m s3 d3
Initiating MOVE_MEDIUM from address 1002 to 502
MOVE_MEDIUM complete

unload d3
Opening /dev/rmt/2cbn, please wait...
Error - cannot open /dev/rmt/2cbn (I/O error)

m d3 s3
Initiating MOVE_MEDIUM from address 502 to 1002
move_medium failed, CHECK CONDITION
sense key = 0x5, asc = 0x3a, ascq = 0x0, MEDIUM NOT PRESENT
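
The sense data in the failed move can be decoded with a small lookup.  The
Python sketch below parses a line in the format robtest printed above and
maps the numbers to names.  The tables are a short excerpt of the standard
SCSI codes, limited to a few values, and the function name is made up for
this example.

```python
import re

# Short excerpt of the standard SCSI tables; not a complete list.
SENSE_KEYS = {0x2: "NOT READY", 0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION"}
ASC_ASCQ = {(0x3A, 0x00): "MEDIUM NOT PRESENT"}

# Matches the "sense key = ..., asc = ..., ascq = ..." line shown above.
SENSE_RE = re.compile(
    r"sense key = (0x[0-9a-fA-F]+), asc = (0x[0-9a-fA-F]+), ascq = (0x[0-9a-fA-F]+)"
)


def decode_sense(line):
    """Return (sense key name, ASC/ASCQ name), or None if the line has no sense data."""
    match = SENSE_RE.search(line)
    if match is None:
        return None
    key, asc, ascq = (int(group, 16) for group in match.groups())
    return (SENSE_KEYS.get(key, "unknown"), ASC_ASCQ.get((asc, ascq), "unknown"))


print(decode_sense("sense key = 0x5, asc = 0x3a, ascq = 0x0, MEDIUM NOT PRESENT"))
```

Sense key 0x5 is ILLEGAL REQUEST, which would be consistent with the robot
treating the drive element as empty, although robtest itself does not say
so.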

This is happening on all my drives.  Now I am going to power-cycle the robot
and see what happens.
What could be the cause of this?
What should I do next to troubleshoot this problem?  My gut feeling is that
the robot is not communicating with the Media Manager properly.
Any suggestions would be greatly appreciated.
Thanks

Mukarram Syed
UNIX Systems Administrator 
XO Communications


_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu