Veritas-bu

Re: [Veritas-bu] Robot disappeared from OS

2010-02-11 16:08:25
Subject: Re: [Veritas-bu] Robot disappeared from OS
From: Justin Piszcz <jpiszcz AT lucidpixels DOT com>
To: Dushyant Mehta <scorpio21179 AT gmail DOT com>
Date: Thu, 11 Feb 2010 16:08:21 -0500 (EST)
Hi,

What happens when you stop NetBackup and just run robtest? I assume the 
same issues?

Have you enabled e.g. VERBOSE in /usr/openv/volmgr/vm.conf and created all 
of the directories under /usr/openv/volmgr/debug, e.g. ltid, to try and find 
out more information?

You say the robot was rebooted; what type of robot is it?  Can you log in 
to the robot via SSH or robot console software to check the fiber link?

Justin.
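
For reference, the debug setup asked about above could be sketched roughly as 
follows. This is only a sketch under assumptions: the directory names (daemon, 
ltid, reqlib, tpcommand, robots) are the commonly used legacy Media Manager 
debug directories, the function name enable_volmgr_debug is made up for 
illustration, and the optional argument exists only so it can be tried against 
a scratch directory first.

```shell
#!/bin/sh
# Sketch: enable legacy Media Manager debug logging.
# enable_volmgr_debug is a hypothetical helper name, not a NetBackup command.
# $1 optionally overrides the volmgr directory (assumed default below).
enable_volmgr_debug() {
    volmgr_dir="${1:-/usr/openv/volmgr}"

    # The daemons only write debug logs into directories that already exist,
    # so create the ones of interest (assumed set of directory names).
    for d in daemon ltid reqlib tpcommand robots; do
        mkdir -p "$volmgr_dir/debug/$d" || return 1
    done

    # Add a VERBOSE entry to vm.conf, but only if it is not already there.
    conf="$volmgr_dir/vm.conf"
    touch "$conf" || return 1
    grep '^VERBOSE$' "$conf" >/dev/null 2>&1 || echo VERBOSE >> "$conf"
}
```

Calling enable_volmgr_debug with no argument would act on /usr/openv/volmgr. 
The device daemon normally has to be restarted before the setting takes 
effect (typically stopltid followed by ltid -v), and the logs then show up 
under the debug directories.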

On Thu, 11 Feb 2010, Dushyant Mehta wrote:

> Hi Justin,
>
> 1. Fibre cables were swapped so no issues there
> 2. Rebooted master/media server and robot also
> 3. Same as 1.
> 4. Switch cannot be rebooted as all the devices are logging in to switch.
> Even robot is logging in.
> 5. Zoning is correct. See the attached file.
>
>
>
> On Thu, Feb 11, 2010 at 3:59 PM, Justin Piszcz <jpiszcz AT lucidpixels DOT com> wrote:
>
>> Hi,
>>
>> 1. Fiber cable problem?
>> 2. Have you tried rebooting the robot/master server?
>> 3. Replacing the fiber cable?
>> 4. Rebooting the fiber switch (if applicable)?
>> 5. Zoning issue?
>>
>> Justin.
>>
>>
>> On Thu, 11 Feb 2010, Dushyant Mehta wrote:
>>
>>> Also, /var/adm/messages shows:
>>>
>>> bash-3.00# tail /var/adm/messages
>>> Feb 11 15:41:40 njbkupmaster last message repeated 1 time
>>> Feb 11 15:41:40 njbkupmaster tldcd[24962]: [ID 406877 daemon.error] TLD(0) mode_sense ioctl() failed: I/O error
>>> Feb 11 15:41:40 njbkupmaster tldd[24933]: [ID 641686 daemon.notice] DecodeQuery() Actual status: Unable to sense robotic device
>>> Feb 11 15:41:40 njbkupmaster tldd[24933]: [ID 320639 daemon.error] TLD(0) unavailable: initialization failed: Unable to sense robotic device
>>> Feb 11 15:43:42 njbkupmaster tldcd[24962]: [ID 623364 daemon.notice] TLD(0) opening robotic path /dev/sg/c0tw500104f0005859c8l0
>>> Feb 11 15:43:42 njbkupmaster tldcd[24962]: [ID 498531 daemon.error] user scsi ioctl() failed, may be timeout, errno = 5, I/O error
>>> Feb 11 15:43:42 njbkupmaster last message repeated 1 time
>>> Feb 11 15:43:42 njbkupmaster tldcd[24962]: [ID 406877 daemon.error] TLD(0) mode_sense ioctl() failed: I/O error
>>> Feb 11 15:43:42 njbkupmaster tldd[24933]: [ID 641686 daemon.notice] DecodeQuery() Actual status: Unable to sense robotic device
>>> Feb 11 15:43:42 njbkupmaster tldd[24933]: [ID 320639 daemon.error] TLD(0) unavailable: initialization failed: Unable to sense robotic device
>>>
>>>
>>>
>>>
>>> On Thu, Feb 11, 2010 at 3:42 PM, Dushyant Mehta <scorpio21179 AT gmail DOT com> wrote:
>>>
>>>> Scott,
>>>>
>>>> We have 2 dual-port HBAs, and one of the HBAs can see the robot properly.
>>>> See this:
>>>>
>>>> bash-3.00# fcinfo hba-port
>>>> HBA Port WWN: 10000000c9517ec6
>>>>        OS Device Name: /dev/cfg/c2
>>>>        Manufacturer: Emulex
>>>>        Model: LP10000
>>>>        Firmware Version: 1.92a1
>>>>        FCode/BIOS Version: 1.50a9
>>>>        Type: N-port
>>>>        State: online
>>>>        Supported Speeds: 1Gb 2Gb
>>>>        Current Speed: 2Gb
>>>>        Node WWN: 20000000c9517ec6
>>>> *HBA Port WWN: 10000000c9596c56*
>>>>        *OS Device Name: /dev/cfg/c3*
>>>>        Manufacturer: Emulex
>>>>        Model: LP10000
>>>>        Firmware Version: 1.92a1
>>>>        FCode/BIOS Version: 1.50a9
>>>>        Type: N-port
>>>>        State: online
>>>>        Supported Speeds: 1Gb 2Gb
>>>>        Current Speed: 2Gb
>>>>        Node WWN: 20000000c9596c56
>>>> HBA Port WWN: 10000000c95ede57
>>>>        OS Device Name: /dev/cfg/c6
>>>>        Manufacturer: Emulex
>>>>        Model: LP10000
>>>>        Firmware Version: 1.92a1
>>>>        FCode/BIOS Version: 1.50a9
>>>>        Type: N-port
>>>>        State: online
>>>>        Supported Speeds: 1Gb 2Gb
>>>>        Current Speed: 2Gb
>>>>        Node WWN: 20000000c95ede57
>>>> HBA Port WWN: 10000000c95ee0be
>>>>        OS Device Name: /dev/cfg/c5
>>>>        Manufacturer: Emulex
>>>>        Model: LP10000
>>>>        Firmware Version: 1.92a1
>>>>        FCode/BIOS Version: 1.50a9
>>>>        Type: N-port
>>>>        State: online
>>>>        Supported Speeds: 1Gb 2Gb
>>>>        Current Speed: 2Gb
>>>>        Node WWN: 20000000c95ee0be
>>>>
>>>> *From the HBA:*
>>>>
>>>> emlxadm> get_dev_list
>>>> -----------------------------------------------
>>>> Device 0:
>>>> Dtype: 0
>>>> FC4_type[proto]: 0x00000100, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
>>>> State: Logged_In
>>>> D_id: 615813
>>>> LILP: 0
>>>> Hard Addr: 0
>>>> WWPN: 500104f0005859d1
>>>> WWNN: 500104f0005859d0
>>>> -----------------------------------------------
>>>> Device 1:
>>>> Dtype: 0
>>>> FC4_type[proto]: 0x00000100, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
>>>> State: Logged_In
>>>> D_id: 615913
>>>> LILP: 0
>>>> Hard Addr: 0
>>>> WWPN: 500104f0005859db
>>>> WWNN: 500104f0005859d9
>>>> -----------------------------------------------
>>>> Device 2:
>>>> Dtype: 0
>>>> FC4_type[proto]: 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
>>>> *State: Logged_In*
>>>> D_id: 615c13
>>>> LILP: 0
>>>> Hard Addr: 0
>>>> *WWPN: 500104f0005859c8*
>>>> WWNN: 500104f0005859c7
>>>> -----------------------------------------------
>>>> Device 3:
>>>> Dtype: 0
>>>> FC4_type[proto]: 0x00000100, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
>>>> State: Logged_In
>>>> D_id: 615d13
>>>> LILP: 0
>>>> Hard Addr: 0
>>>> WWPN: 500104f0005859d4
>>>> WWNN: 500104f0005859d3
>>>> -----------------------------------------------
>>>>
>>>> HBA sees the robot and the state is logged in. That means there is no
>>>> issue with the HBA.
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Feb 11, 2010 at 3:32 PM, Chapman, Scott <Scott.Chapman AT icbc DOT com> wrote:
>>>>
>>>>> You aren't going to see anything with sgscan unless cfgadm -al sees the
>>>>> changer and the tape.  I would work more with Sun to confirm there is
>>>>> nothing wrong with the HBA; maybe try a different HBA port.
>>>>>
>>>>>
>>>>>
>>>>> *Scott Chapman*
>>>>>
>>>>> Senior Technical Specialist
>>>>>
>>>>> Storage and Database Administration
>>>>>
>>>>> ICBC - Victoria
>>>>>
>>>>> Ph:  250.414.7650  Cell:  250.213.9295
>>>>>
>>>>> *From:* veritas-bu-bounces AT mailman.eng.auburn DOT edu [mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] *On Behalf Of* Dushyant Mehta
>>>>> *Sent:* Thursday, February 11, 2010 12:18 PM
>>>>> *To:* veritas-bu AT mailman.eng.auburn DOT edu
>>>>> *Subject:* [Veritas-bu] Robot disappeared from OS
>>>>>
>>>>>
>>>>>
>>>>> All,
>>>>>
>>>>>
>>>>> We have an STK L180 tape library connected to a Solaris 10 master/media
>>>>> server. Until 3 days ago we were able to see the tape drives and the
>>>>> robot, and now the robot has disappeared. Initially, when we could still
>>>>> see the robot, we were not able to configure it in NetBackup. When we ran
>>>>> scan -changer on that robot, it reported "0" drives and "0" slots, whereas
>>>>> the library has 6 drives and 84 slots. To fix that problem, Symantec
>>>>> suggested rebuilding the sg driver and told me to remove everything from
>>>>> the /dev/rmt/*cbn and /dev/sg directories. After doing that and rebooting
>>>>> the master with a reconfiguration, we lost the robot.
>>>>>
>>>>> Everything is connected to a McData switch, and on the switch side I see
>>>>> all the ports logging in without any error. However, when I run
>>>>> cfgadm -al it shows this for the robot WWN:
>>>>>
>>>>> c3::500104f0005859c8        unavailable      connected     configured     failed
>>>>>
>>>>> I have been working with Symantec and Sun, and both are unable to
>>>>> identify the problem.
>>>>>
>>>>> I have tried resetting the HBA, resetting the robot, rebooting the
>>>>> complete library, etc., but sgscan and scan -changer still do not show
>>>>> the robot.
>>>>>
>>>>> I have 2 questions:
>>>>>
>>>>> 1) When the robot does not report correct info in the scan -changer
>>>>> command, what can be the issue?
>>>>> 2) What can we do to get the robot back?
>>>>>
>>>>> We are on NBU 6.0 MP6, with Solaris 10 on the master, Emulex HBAs, a
>>>>> McData switch, and an STK L180 tape library.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dushyant Mehta
>>>>>
>>>>
>>>>
>>>
>
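
As a quick way to re-check the symptom from the original post after each 
change (cable, zoning, sg rebuild), something like the filter below could be 
used. It is only a sketch: the function name failed_ap_ids is made up for 
illustration, and it assumes the condition is the last column of cfgadm -al 
output, as in the line quoted above.

```shell
#!/bin/sh
# Sketch: list attachment points whose last column (Condition) is "failed".
# failed_ap_ids is a hypothetical helper; it reads cfgadm -al output on stdin.
failed_ap_ids() {
    awk '$NF == "failed" { print $1 }'
}
```

Usage would be along the lines of: cfgadm -al | failed_ap_ids. With the 
output quoted in the thread, the robot's attachment point on c3 would be the 
one expected to show up until the path is healthy again.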
_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu