Unfortunately, no, and that's been frustrating me further since I could
at least replace the drive on suspicion (any one DLT drive of ours tends
to get replaced every 6-9 months as we take them well over their duty
cycle - hence an LTO upgrade ;-).
What *has* appeared to help a bit is going through every client and
dropping its parallelism to 2 (or 1 for single-filesystem clients).
With the four remaining drives on 11 each and server parallelism at 20,
a group as I have them laid out should only kick off enough initial
savesets to initiate no more than three simultaneous nsrjb-s at any one
time. I'm also rotating/forcing out 'overused' tapes, since there's
nothing in messages but the library has been pulling for cleaning tapes
a lot recently too.
I'll continue to tweak and monitor; any other suggestions are
appreciated.
--TSK
________________________________
From: Darren Dunham [mailto:ddunham AT TAOS DOT COM]
Sent: Tue 2/7/06 20:45
Subject: Re: Unusual activity on local DLT drives
> I'm getting a recent condition where an occasional DLT drive on the server
> just 'hangs' on 'setting up for writing, moving forward . . .' etc.
>
> Both the nsrjb and nsrmmd are sitting there, doing nothing.
>
> I can kill the nsrjb, it clears out. Cannot kill the nsrmmd, even with -9;
> It will eventually clean out of its own accord by actions of nsrd, but it
> can take up to an hour or more.
I'm not sure that nsrd is doing it. This is a pretty standard
description of a driver that is hanging up, taking user processes with
it. You can't kill a process waiting on a kernel call return. Don't
think of the nsrmmd as the problem, it's (usually) just a symptom of a
problem with the driver.
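
One way to check the "waiting on a kernel call" theory is to look at the process state while the drive is hung. The sketch below is only an illustration and assumes a Linux host with /proc mounted, where a process blocked inside the kernel normally shows state "D" (uninterruptible sleep) in /proc/<pid>/stat; on other platforms the layout differs and this will not work as written. The function names are made up for the example and are not part of NetWorker.

```python
from pathlib import Path


def parse_state(stat_line: str) -> tuple[str, str]:
    """Return (command, state) from the text of a Linux /proc/<pid>/stat file."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' instead of on whitespace.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    command = stat_line[open_paren + 1:close_paren]
    # The single-letter state is the first field after the command name.
    state = stat_line[close_paren + 1:].split()[0]
    return command, state


def uninterruptible(name: str, proc_root: str = "/proc") -> list[int]:
    """List PIDs of processes called `name` that are in state 'D'."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a per-process directory
        try:
            command, state = parse_state((entry / "stat").read_text())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or stat was unreadable
        if command == name and state == "D":
            pids.append(int(entry.name))
    return sorted(pids)
```

Calling `uninterruptible("nsrmmd")` during a hang and getting a non-empty list back would be consistent with the process being stuck in a driver call rather than in NetWorker itself. An empty list does not rule out a driver problem.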
Do you ever get any log messages (SCSI/fiber channel) associated with
these events?
--
Darren Dunham ddunham AT taos DOT com
Senior Technical Consultant TAOS http://www.taos.com/
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
To sign off this list, send email to listserv AT listserv.temple DOT edu and
type "signoff networker" in the body of the email. Please write to
networker-request AT listserv.temple DOT edu if you have any problems
with this list. You can access the archives at
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER