Veritas-bu

Re: [Veritas-bu] Robot disappeared from OS

2010-02-11 16:42:11
Subject: Re: [Veritas-bu] Robot disappeared from OS
From: Dushyant Mehta <scorpio21179 AT gmail DOT com>
To: "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
Date: Thu, 11 Feb 2010 16:42:03 -0500
That's what we have planned. Tomorrow we have two people from Sun and one from Symantec scheduled, and I will also be going to the datacenter to work this out. Thanks anyway for everyone's help and suggestions.
In any case, this has to get fixed tomorrow.

-Dushyant

On Thu, Feb 11, 2010 at 4:39 PM, Chapman, Scott <Scott.Chapman AT icbc DOT com> wrote:

Do you have a case open with Sun? I would escalate it to backline and get a storage specialist working on this. I really think that you are wasting time worrying about NetBackup until you get the OS to see the changer properly.

 

I know that the guys who work on our robots know NOTHING about how to make the OS see the robots or drives. You should be talking to someone who knows the OS and how to make it see the storage.

 

Scott Chapman

Senior Technical Specialist

Storage and Database Administration

ICBC - Victoria

Ph:  250.414.7650  Cell:  250.213.9295

From: Dushyant Mehta [mailto:scorpio21179 AT gmail DOT com]
Sent: Thursday, February 11, 2010 1:33 PM
To: Chapman, Scott; Justin Piszcz; neil AT mbari DOT org
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Robot disappeared from OS

 

Justin,

We have all logging enabled: bp.conf has VERBOSE=5 and vm.conf has a VERBOSE entry. The ltid log has been created, and it shows:

12:31:58.427 [24878] <4> InitLtid: connected to EMM server

12:31:58.440 [24878] <2> nbconf_get_info: nbconf_glue.cpp.237: NBCONF_LIB: /usr/openv/lib/libVnbconf.so

12:31:58.440 [24878] <2> mm_getnodename: (0) hostname njbkupmaster (from cached_hostname)

12:31:58.453 [24878] <2> nbconf_get_info: nbconf_glue.cpp.237: NBCONF_LIB: /usr/openv/lib/libVnbconf.so

12:31:58.478 [24878] <16> emmlib_GetMachineAliasList: (0) Number of Alias String Count 2

12:31:58.479 [24878] <16> emmlib_GetMachineAliasList: (0) Alias Strings (njbkupmaster)

12:31:58.479 [24878] <16> emmlib_GetMachineAliasList: (0) Alias Strings (njbkupmaster.lehman.com)

12:32:00.494 [24878] <4> InitLtid: Device Mappings version in EMM database is 1.83

12:32:00.521 [24878] <4> InitLtid: Local device mapping is up-to-date

12:32:00.538 [24878] <4> InitLtid: Resetting media server allocations

12:32:00.539 [24878] <2> VssGetFQDNHostName: vss_auth.cpp.4033: Function: VssGetFQDNHostName. Search name

12:32:00.540 [24878] <2> VssInit: vss_auth.cpp.720: Function: VssInit. Using Cached entries FALSE

12:32:00.540 [24878] <2> vnet_cached_gethostbyname: vnet_hosts.c.301: found host in cache: njbkupmaster

12:32:00.540 [24878] <2> vnet_cached_gethostbyaddr: vnet_hosts.c.454: found IP in cache: 127.0.0.1

12:32:00.540 [24878] <2> VssGetFQDNHostName: vss_auth.cpp.4033: Function: VssGetFQDNHostName. Search name njbkupmaster

12:32:00.541 [24878] <2> VssGetFQDNHostName: vss_auth.cpp.4380: Function: VssGetFQDNHostName. Match njbkupmaster.lehman.com

12:32:00.677 [24878] <4> InitLtid: RobotCount = 1

12:32:00.677 [24878] <4> InitLtid: DriveCount = 6

12:32:00.678 [24878] <4> InitLtid: Found 1 robots

12:32:00.686 [24878] <4> FillDriveStatusArray: [i = 0] DriveName: STK.T9840B.000, DrivePath: /dev/rmt/5cbn(), DriveIndex: 0

12:32:00.686 [24878] <6> WriteEntry: Updating drive STK.T9840B.000 serial number 461000024335 at path /dev/rmt/5cbn on attach host

12:32:00.740 [24878] <4> FillDriveStatusArray: [i = 1] DriveName: STK.T9840B.001, DrivePath: /dev/rmt/6cbn(), DriveIndex: 1

12:32:00.740 [24878] <6> WriteEntry: Updating drive STK.T9840B.001 serial number 461000024201 at path /dev/rmt/6cbn on attach host

12:32:00.792 [24878] <4> FillDriveStatusArray: [i = 2] DriveName: STK.T9940B.000, DrivePath: /dev/rmt/1cbn(), DriveIndex: 2

12:32:00.792 [24878] <6> WriteEntry: Updating drive STK.T9940B.000 serial number 479002010801 at path /dev/rmt/1cbn on attach host

12:32:00.840 [24878] <4> FillDriveStatusArray: [i = 3] DriveName: STK.T9940B.001, DrivePath: /dev/rmt/0cbn(), DriveIndex: 3

12:32:00.840 [24878] <6> WriteEntry: Updating drive STK.T9940B.001 serial number 479000017639 at path /dev/rmt/0cbn on attach host

12:32:00.885 [24878] <4> FillDriveStatusArray: [i = 4] DriveName: STK.T9940B.002, DrivePath: /dev/rmt/2cbn(), DriveIndex: 4

12:32:00.886 [24878] <6> WriteEntry: Updating drive STK.T9940B.002 serial number 479000030016 at path /dev/rmt/2cbn on attach host

12:32:00.934 [24878] <4> FillDriveStatusArray: [i = 5] DriveName: STK.T9940B.003, DrivePath: /dev/rmt/3cbn(), DriveIndex: 5

12:32:00.935 [24878] <6> WriteEntry: Updating drive STK.T9940B.003 serial number 479002029305 at path /dev/rmt/3cbn on attach host

12:32:00.999 [24878] <4> SendEmmHeartbeat: Detected change in MachineState...

12:32:00.999 [24878] <4> SendEmmHeartbeat: Detected change in MachineState... going Active

12:32:00.999 [24878] <4> SetRobotStatuses: ...checking for UP libraries

12:32:00.999 [24878] <4> start_device_daemons: Found 1 robots - starting robotic daemons now

12:32:03.009 [24878] <4> start_device_daemons: starting avrd daemon now

12:32:03.011 [24878] <4> LtidProcCmd: Pid=24933, Data.Pid=24933, Type=50, Param1=2, Param2=8, LongParam=0

12:32:03.026 [24878] <16> emmlib_SetRobotStatus: (0) Path < /dev/sg/c0tw500104f0005859c8l0 >

12:32:03.026 [24878] <16> emmlib_SetRobotStatus: (0) Host < njbkupmaster >, RobNum < 0 >, RobotStatus <0>

12:32:03.026 [24878] <16> emmlib_SetRobotStatus: (0) NdmpHost <  >

12:32:03.052 [24878] <16> emmlib_SetRobotStatus: (0) Path < /dev/sg/c0tw500104f0005859c8l0 >

12:32:03.052 [24878] <16> emmlib_SetRobotStatus: (0) Host < njbkupmaster >, RobNum < 0 >, RobotStatus < 0
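
For anyone reading a log like this, a small script can pull out which drives ltid found and which device path it is using for the robot. The following is only a minimal sketch: the line layout (timestamp, PID in brackets, level in angle brackets, function name, message) is inferred from the excerpt above, and the names `summarize`, `LINE_RE`, `DRIVE_RE` and `ROBOT_PATH_RE` are illustrative, not part of NetBackup.

```python
import re

# Sample lines copied from the ltid log excerpt above.
LOG = """\
12:32:00.677 [24878] <4> InitLtid: RobotCount = 1
12:32:00.677 [24878] <4> InitLtid: DriveCount = 6
12:32:00.686 [24878] <4> FillDriveStatusArray: [i = 0] DriveName: STK.T9840B.000, DrivePath: /dev/rmt/5cbn(), DriveIndex: 0
12:32:00.740 [24878] <4> FillDriveStatusArray: [i = 1] DriveName: STK.T9840B.001, DrivePath: /dev/rmt/6cbn(), DriveIndex: 1
12:32:03.026 [24878] <16> emmlib_SetRobotStatus: (0) Path < /dev/sg/c0tw500104f0005859c8l0 >
"""

# Assumed layout: "<time> [<pid>] <<level>> <function>: <message>"
LINE_RE = re.compile(r"^(\S+) \[(\d+)\] <(\d+)> (\w+): (.*)$")
DRIVE_RE = re.compile(r"DriveName: (\S+), DrivePath: (\S+?)\(\), DriveIndex: (\d+)")
ROBOT_PATH_RE = re.compile(r"Path < (\S+) >")


def summarize(log_text):
    """Return ({drive name: device path}, sorted list of robot paths)."""
    drives = {}
    robot_paths = set()
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # blank or unrecognized line
        func, msg = match.group(4), match.group(5)
        if func == "FillDriveStatusArray":
            drive = DRIVE_RE.search(msg)
            if drive:
                drives[drive.group(1)] = drive.group(2)
        elif func == "emmlib_SetRobotStatus":
            path = ROBOT_PATH_RE.search(msg)
            if path:
                robot_paths.add(path.group(1))
    return drives, sorted(robot_paths)


drives, robot_paths = summarize(LOG)
for name, path in drives.items():
    print(name, path)
print("robot path(s):", robot_paths)
```

A summary like this only reflects what ltid has recorded in its configuration; it does not show whether the OS can actually open the robot path.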

Neil,

sg.conf does not have an entry for this WWN. It only has entries for the 6 drives.
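
A check like this one can be scripted. The sketch below assumes the robot path from the ltid log (`/dev/sg/c0tw500104f0005859c8l0`) embeds the port WWN between `tw` and `l<lun>`, which is what the path suggests. The sample configuration text is hypothetical, and its WWN values and exact syntax are placeholders rather than a copy of any real sg.conf. The function names are illustrative.

```python
import re

# Assumption: sg device paths look like c<controller>tw<16-hex-digit WWN>l<lun>.
SG_PATH_RE = re.compile(r"c(\d+)tw([0-9a-fA-F]{16})l(\d+)$")


def wwn_from_sg_path(path):
    """Extract the lower-cased WWN from an sg device path, or None."""
    match = SG_PATH_RE.search(path)
    return match.group(2).lower() if match else None


def wwn_in_conf(conf_text, wwn):
    """True if any non-comment line of the config text mentions the WWN."""
    wwn = wwn.lower()
    return any(
        wwn in line.lower()
        for line in conf_text.splitlines()
        if not line.lstrip().startswith("#")
    )


# Hypothetical config text for illustration only.
SAMPLE_CONF = """\
# drive entries (placeholder values)
name="sg" parent="fp" target=0 lun=0 fc-port-wwn="500104f000000001";
name="sg" parent="fp" target=1 lun=0 fc-port-wwn="500104f000000002";
"""

robot_wwn = wwn_from_sg_path("/dev/sg/c0tw500104f0005859c8l0")
print(robot_wwn, "present in config:", wwn_in_conf(SAMPLE_CONF, robot_wwn))
```

In practice the text would be read from the actual sg.conf on the server instead of `SAMPLE_CONF`.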

Scott,

This is a fiber robot and it does not use any SCSI-to-fiber bridge. A Sun engineer was onsite yesterday and did everything that could possibly be done with the library/robot. The library panel shows 6 drives and 84 slots, and this was confirmed 3-4 times with the Sun engineer.

Thanks,

Dushyant Mehta

 

On Thu, Feb 11, 2010 at 4:21 PM, Chapman, Scott <Scott.Chapman AT icbc DOT com> wrote:

Dushyant, you shouldn’t be seeing:

c3::500104f0005859c8        unavailable      connected     configured      failed

 

Until that is resolved, you will not see things properly in NetBackup.

 

I run L700s, which are very similar, and I also run newer Emulex cards. This is what I see from cfgadm -al:

c6::500104f00052ed2e           med-changer  connected    configured   unknown
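
To make the comparison between the two cfgadm lines concrete, here is a minimal Python sketch that splits each line into the five columns cfgadm prints (Ap_Id, Type, Receptacle, Occupant, Condition) and flags attachment points that look unhealthy. The `Attachment` type and the `looks_healthy` rule are illustrative assumptions, not an official health check.

```python
from typing import NamedTuple, Optional


class Attachment(NamedTuple):
    ap_id: str
    dev_type: str
    receptacle: str
    occupant: str
    condition: str


def parse_line(line: str) -> Optional[Attachment]:
    """Parse one whitespace-separated cfgadm -al data line, or return None."""
    fields = line.split()
    if len(fields) != 5:
        return None
    return Attachment(*fields)


def looks_healthy(att: Attachment) -> bool:
    # Assumed rule of thumb: a type of "unavailable" or a condition of
    # "failed" means the OS cannot talk to the device properly.
    return att.dev_type != "unavailable" and att.condition != "failed"


# The two lines quoted in this thread.
LINES = [
    "c3::500104f0005859c8        unavailable      connected     configured      failed",
    "c6::500104f00052ed2e           med-changer  connected    configured   unknown",
]

for line in LINES:
    att = parse_line(line)
    if att is not None:
        status = "ok" if looks_healthy(att) else "PROBLEM"
        print(att.ap_id, att.dev_type, att.condition, status)
```

With the two lines above, the first attachment point is flagged and the second is not, which matches the point that the OS-level view has to be fixed first.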

 

Ports that have tape drives plugged into them used to need to be configured as G ports rather than the default F port. I don’t think that is the case for robots, though. Does this robot have fiber control? It is an older robot, so I just want to make sure it doesn’t have a SCSI bridge or something like that.

 

Scott Chapman

Senior Technical Specialist

Storage and Database Administration

ICBC - Victoria

Ph:  250.414.7650  Cell:  250.213.9295

From: veritas-bu-bounces AT mailman.eng.auburn DOT edu [mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Dushyant Mehta
Sent: Thursday, February 11, 2010 12:18 PM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] Robot disappeared from OS

 

All,



We have an STK L180 tape library connected to a Solaris 10 master/media server. Until 3 days ago we could see the tape drives and the robot; now the robot has disappeared. Even while the robot was still visible, we were not able to configure it in NetBackup: when we ran scan -changer against it, it reported "0" drives and "0" slots, whereas the library has 6 drives and 84 slots. To fix that problem, Symantec suggested rebuilding the sg driver and told me to remove everything from /dev/rmt/*cbn and the /dev/sg directories. After doing that and rebooting the master with a reconfiguration, we lost the robot.

Everything is connected to a McData switch, and on the switch side I see all the ports logging in without any error. However, when I run cfgadm -al it shows this for the robot WWN:

c3::500104f0005859c8        unavailable      connected     configured      failed

I have been working with Symantec and Sun and both are unable to identify the problem.

I have tried resetting the HBA, resetting the robot, rebooting the complete library, etc., but sgscan and scan -changer still do not show the robot.

I have 2 questions:

1) When the robot does not report the correct info in the scan -changer command, what can be the issue?
2) What can we do to get the robot back?

We are on NBU 6.0 MP6, with Solaris 10 on the master, an Emulex HBA, a McData switch, and an STK L180 tape library.

Thanks,

Dushyant Mehta


This email and any attachments are intended only for the named recipient and may contain confidential and/or privileged material. Any unauthorized copying, dissemination or other use by a person other than the named recipient of this communication is prohibited. If you received this in error or are not named as a recipient, please notify the sender and destroy all copies of this email immediately.



_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu