Re: [Networker] Odd nsrd crash after savegrp cancel - reading empty tape drive

2006-03-23 04:13:36
From: "Maarten Boot (CWEU-USERS/CWNL)" <Maarten.Boot AT NL.COMPUWARE DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Thu, 23 Mar 2006 10:07:28 +0100
As there is a core file, run dbx against it and take a stack trace. That will
help Legato fix the problem, and it may reveal where and why nsrd crashed.
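
A minimal sketch of how the stack trace could be captured from a shell,
assuming dbx is installed and on the PATH and that a POSIX-style shell
(ksh, for example) is in use. The function name nsrd_backtrace and both
path arguments are placeholders for illustration, not taken from this
thread; substitute the real locations of the nsrd binary and the core file.

```shell
# Sketch: dump a stack trace from a core file with dbx.
# Usage: nsrd_backtrace <path-to-nsrd-binary> <path-to-core>
# Both paths are placeholders supplied by the caller.
nsrd_backtrace() {
    bin="$1"
    core="$2"

    # Refuse to continue if the core file is missing or unreadable.
    if [ ! -r "$core" ]; then
        echo "cannot read core file: $core" >&2
        return 1
    fi

    # dbx must be available on PATH.
    if ! command -v dbx >/dev/null 2>&1; then
        echo "dbx not found in PATH" >&2
        return 127
    fi

    # Feed dbx its commands on stdin: "where" prints the stack trace,
    # "quit" exits the debugger.
    printf 'where\nquit\n' | dbx "$bin" "$core"
}
```

Redirecting the output to a file, for example
nsrd_backtrace /path/to/nsrd /path/to/core > nsrd-stack.txt 2>&1,
gives something that can be attached to the support call.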

Maarten 


On Thursday 23 March 2006 09:47, T S Kimball wrote:
> We had a rather unusual experience here tonight.  Searching the archive is
> not finding anything specific to this, and calls to both Sun and Legato are
> pending.
>
> Pool ran out of tapes during a backup.  It paged me (a bit late) while I
> was at home, I logged in to see which group it was.  Not an important one,
> so cancelled the group pending a fix and restart later.  The tape drive (on
> a storage node) had already ejected the now-full tape; this was confirmed
> in daemon.log.
>
> Ten minutes later, nsrd core dumps.  The last message in the log is for it
> failing to read from that same tape drive on the storage node (huh?).  This
> is an AlphaStor library, so I checked that log; no drives on the node were
> requesting load or unload at that time - three empty drives.
>
> Anyway, a restart of Networker was needed, and things have been relatively
> happy since (three DLT drives that were spinning at the time have gone
> 'sour' and are being rotated out - we have enough spares).
>
> However, I'm very concerned about this.  It's the first time I've
> experienced it, and though it's likely just a fluke I'm looking for any
> input as to what the general cause may be (so we don't repeat it).  I've
> run across other odd situations that make me feel it's related to not enough
> CPU resources for the nsrd process, but can't readily prove it.
>
> Specs:
>   Server - Sun E450 (4x480 MHz), Solaris 8, Sun EBS 7.1.3, 4 x DLT7000
>     (only three enabled right now), library control (SCSI), gigabit (fiber).
>   SN1 - Sun V240 (2x1.2 GHz), Solaris 8, Sun EBS 7.1.3 SN, AlphaStor Server,
>     2 x LTO-2 (fiber) and 1 x DLT7000, 1 network port enabled in gigabit mode.
>   SN2 - Sun V880 (8x900 MHz), Solaris 8, Sun EBS 7.1.3 SN, 3 x LTO-2
>     (LVD-SCSI).
>
> We also have some large adv_file disk on the Server and SN1, but it was not
> in use at the time and this config has been stable for a while.
>
> Thanks in advance,
> --TSK
>
> To sign off this list, send email to listserv AT listserv.temple DOT edu and
> type "signoff networker" in the body of the email. Please write to
> networker-request AT listserv.temple DOT edu if you have any problems with
> this list. You can access the archives at
> http://listserv.temple.edu/archives/networker.html or via RSS at
> http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER

-- 
Maarten Boot, 
Compuware Europe B.V.
Hoogoorddreef 5
1101 BA Amsterdam
Tel: +31 20 312 6511

