Networker

Re: [Networker] nsrmmd process does not stop

2005-02-09 16:58:30
Subject: Re: [Networker] nsrmmd process does not stop
From: Robert Maiello <robert.maiello AT PFIZER DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Wed, 9 Feb 2005 16:55:47 -0500
Interesting... so one can reset the device in Solaris with cfgadm. I'm
concerned, though: will this also reset the other drives on the same bus,
i.e. reset the entire SCSI bus? I don't want to interfere with the other
drives on the bus, but I would like to "unwedge" a drive if possible. In
this case there would be 1 or 2 other LTO1/2 drives on the same fiber
port/bus.
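
Before trying the reset, it may help to at least see which other drives
share the controller. A minimal sketch, assuming the `cfgadm -al` listing
follows the column layout of the sample line quoted below (attachment point
ID first, device type second). The function name `tapes_on_same_controller`
is made up for illustration, and the sketch only lists the neighbours; it
does not tell you whether `reset_device` would disturb them:

```shell
# Hypothetical helper: reads `cfgadm -al` output on stdin and prints the
# other tape attachment points on the same controller as the given Ap_Id.
tapes_on_same_controller() {
    ap_id=$1
    ctrl=${ap_id%%::*}          # "c4::rmt/4" -> "c4"
    awk -v ctrl="$ctrl" -v self="$ap_id" '
        # keep tape entries whose Ap_Id starts with "<ctrl>::", minus the drive itself
        $2 == "tape" && index($1, ctrl "::") == 1 && $1 != self { print $1 }
    '
}

# Usage on the Solaris host:
#   cfgadm -al | tapes_on_same_controller c4::rmt/4
```

If that prints anything, those are the drives I would want idle before
experimenting with the reset.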

Robert Maiello
Pioneer Data Systems

On Wed, 9 Feb 2005 12:43:06 -0500, thierry.faidherbe AT HP DOT COM wrote:

>Most of the time, you can force a hung read() or write() syscall to end
>by resetting the tape device. Powering the drive off is often used, but
>some OSes have commands to reset the device at the software level:
>
>First, locate the nsrmmd process:
> # fuser /dev/rmt/.....
> and note its PID.
>
>On Solaris 8 and higher:
>
> # cfgadm -al
> find the tape device in the listing,
> e.g.: c4::rmt/4      tape     connected    configured   unknown
> # cfgadm -x reset_device c4::rmt/4
>
>On Tru64 UNIX:
> # hwmgr -show scsi |grep tape4
>  note the bus, target, and LUN
> # scu
> scu> sbtl [bus] [target] [lun]
> scu> reset device
>
>Then finish by killing the nsrmmd process.
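
The Solaris steps above could be strung together roughly as follows. This
is only a sketch that prints the commands rather than running them. It
assumes that fuser writes the PIDs (with their usage-letter suffixes) to
stdout and the file name to stderr, and the function names
`pids_from_fuser` and `print_unwedge_plan` are hypothetical:

```shell
# Keep only the numeric PIDs from fuser output such as "  1234o  5678o".
pids_from_fuser() {
    tr -cs '0-9' '\n' | sed '/^$/d'
}

# Print (do not run) the reset and kill commands for one drive, given its
# cfgadm attachment point as $1 and the fuser output on stdin.
print_unwedge_plan() {
    ap_id=$1
    pids=$(pids_from_fuser)
    echo "cfgadm -x reset_device $ap_id"
    for pid in $pids; do
        echo "kill $pid"
    done
}

# Usage on the Solaris host:
#   fuser /dev/rmt/4 2>/dev/null | print_unwedge_plan c4::rmt/4
```

Printing the plan first gives a chance to check the attachment point and
the PIDs before anything is actually reset or killed.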
>
>Good luck
>HTH,
>
>Th
>
>>> Hello,
>>>
>>> Networker Server: Solaris 8
>>> Networker Software: 7.1.2
>>>
>>> I have run into a strange problem where two nsrmmd processes on the
>>> NetWorker server cannot be killed. I stopped NetWorker, and all of the
>>> other NetWorker processes shut down normally. Before this happened, I
>>> had a backup session writing to two LTO1 fiber drives, and that backup
>>> hung. I stopped the hung backup, then proceeded to stop NetWorker. I
>>> noticed the two hung nsrmmd processes by running ps -ef|grep nsr. I
>>> waited 30 minutes and the two processes were still in the process
>>> table. Checking daemon.log and /var/adm/messages, I don't see any tape
>>> drive errors or any reason why the backup would have hung in the first
>>> place. Does anyone have any suggestions as to why this would be
>>> happening?
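
For picking the leftover nsrmmd processes out of the `ps -ef` listing
mentioned above, something like the following could be used. A small
sketch, assuming the usual `ps -ef` layout with the PID in the second
column; the function name `nsrmmd_pids` is made up for illustration:

```shell
# Reads `ps -ef` output on stdin and prints the PID (second column) of each
# line that mentions nsrmmd. The [n] bracket keeps the pattern from matching
# the filter's own command line when it is fed directly from ps.
nsrmmd_pids() {
    awk '/[n]srmmd/ { print $2 }'
}

# Usage: ps -ef | nsrmmd_pids
```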
>>
>> If a device has an issue, then commands and processes that access the
>> device can hang.  If you truss the processes, are they in a read() or
>> write() system call?
>>
>> If you run fuser on the tape devices, do those processes have them open?
>>
>> If so, they're probably stuck in the system call.  They cannot be
>> killed, or have any signal act on them, until the call returns to user
>> space.
>>
>> Often it's difficult to force that to occur.  Sometimes power cycling
>> the drive is enough to kick the driver back to some bit of sanity.
>> Sometimes you just have to reboot.  It depends on the device and on the
>> driver.
>>
>>
>> --
>> Darren Dunham                                    ddunham AT taos DOT com
>> Senior Technical Consultant         TAOS            http://www.taos.com/
>> Got some Dr Pepper?                           San Francisco, CA bay area
>>          < This line left intentionally blank to confuse you. >
>>
>> --
>> Note: To sign off this list, send a "signoff networker" command via email
>> to listserv AT listserv.temple DOT edu or visit the list's Web site at
>> http://listserv.temple.edu/archives/networker.html where you can
>> also view and post messages to the list. Questions regarding this list
>> should be sent to stan AT temple DOT edu
>> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
>>
>
