Re: [Networker] Problem with nsrjb processes

2003-05-25 09:26:01
Subject: Re: [Networker] Problem with nsrjb processes
From: Yura Pismerov <ypismerov AT TUCOWS DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Sun, 25 May 2003 09:25:59 -0400
I don't see nsrjb in your "top" report. In any case, ask Legato support for
the LGTpa50956 hotfix.
That patch helped in our case (see below).
In our case, nsrjb hung when it was waiting for a new tape and there were
blank tapes eligible for labeling.
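
Since "top" only lists the busiest processes, a full process listing is a
more reliable way to check whether any nsrjb processes exist at all. A
minimal sketch, assuming a POSIX shell and "ps -ef"; the function name
find_nsrjb is made up for illustration and is not part of NetWorker:

```shell
# Hypothetical helper: print the lines of a process listing that mention nsrjb.
# Reads the listing from stdin, so it works with `ps -ef` or a saved file.
find_nsrjb() {
    # The [n] bracket keeps grep from matching its own command line
    # when the input comes from a live `ps` listing.
    grep '[n]srjb' || echo "no nsrjb processes found"
}

# Usage on a live system:
#   ps -ef | find_nsrjb
```

If this prints nsrjb lines during the evening slowdown, their PIDs and start
times show which requests are stuck.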


Subject: RE: [Networker] "Waiting for 1 writable volumes"
   Date: Tue, 22 Apr 2003 14:44:05 -0700
   From: Terry Clayton <tclayton AT legato DOT com>
     To: Yura Pismerov <ypismerov AT TUCOWS DOT COM>




If you are running NW 6.1.3 and using Auto Media Management, you may want
to contact Legato support and ask whether the hotfix for DDTS entry
LGTpa50956 addresses your problem.

> -----Original Message-----
> From: Yura Pismerov [mailto:ypismerov AT TUCOWS DOT COM]
> Sent: Tuesday, April 22, 2003 2:24 PM
> To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
> Subject: Re: [Networker] "Waiting for 1 writable volumes"
>
>
> "Porotikov, Vitaly [IT]" wrote:
> >
> > Yuriy,
> >
> >    Did you try to kill the nsrmm process which is responsible for this
> > tape drive?
> >
> > Give it a try:
> >
> > kill -9 `lsof |grep /dev/rmt/<idle device>|awk '{print $2}'`
>
>
> Thanks. My point is that there should be no manual intervention in the
> process.
> I have analyzed the messages (nsradmin->NSR jukebox->messages) and I
> found that the problem occurs only when the nsrjb request involves volume
> labeling (i.e. a new/blank tape is requested).
> Last night's backup did not label any tapes, so it went smoothly.
> My question is: is this the expected behavior when a request involves
> labeling?
> It looks like Networker proceeds with it only if there is an empty
> drive at that time.
> Otherwise it waits, even if the tape that is currently in the drive
> is idle and can be unloaded.
>
>
>
> >
> >  It works for me (Red Hat Linux, Legato 6.1.3).
> >
> >   Regards, Vitaly
> >
> > -----Original Message-----
> > From: Yura Pismerov [mailto:ypismerov AT TUCOWS DOT COM]
> > Sent: Monday, April 21, 2003 10:23 AM
> > To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
> > Subject: [Networker] "Waiting for 1 writable volumes"
> >
> >         Sometimes (not always) after a group starts, Networker keeps
> > asking for a tape that could be mounted on the second drive, which is
> > idle at the moment, but for some reason it does not eject the tape from
> > another pool that is already in that drive.
> > So instead of using 2 drives it ends up with one drive. I watch the
> > nsrjb process that is issued for the second drive, and it keeps
> > looping/trying.
> > Does anybody have an idea what is wrong? There is not much in the log
> > files that could shed light on the problem.
> > The version of Networker is 6.1.3 (running on Solaris 9).
> > The interesting thing is, if I kill the queued nsrjb process and issue
> > a manual tape unmount (nsrjb -u), the requested tape gets mounted right
> > away the next time it retries.
> > What do I do to troubleshoot the problem?
> >
> >         TIA.
> >
> > --
> > Note: To sign off this list, send a "signoff networker" command via email
> > to listserv AT listmail.temple DOT edu or visit the list's Web site at
> > http://listmail.temple.edu/archives/networker.html where you can
> > also view and post messages to the list.
> > =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

Stan Horwitz wrote:
>
> About a week ago, I migrated our Legato server here from a
> system running Tru64 Unix 4.0f with NetWorker Power Edition 6.1.1 to Power
> Edition 6.1.3 under Solaris 9 on a Sun Enterprise 450. For a variety of
> reasons, this migration has been very painful. Now, for the past three days,
> I have had a problem where nsrjb processes seem to hang. No tapes are
> mounted, none are unmounted, yet there are numerous tapes that are eligible
> for use, at least 15 of which are brand new. The problem seems to start at
> about 8:00pm. Until then, backups run fine and tapes mount and unmount
> fine. Then the system slows to a crawl where Legato is concerned, such as
> taking several minutes to open up nsrwatch.
>
> This is with a 600 slot Qualstar 412600 tape library that has six AIT-2 tape
> drives in the left side and six new ones that I am about to configure (but
> haven't yet) in the right side. I opened a call with Legato about this
> problem on Thursday morning, but Legato has not provided a solution yet,
> although they have requested lots of daemon.log and debugging data. So far,
> they say they see nothing wrong with the debug data, but the debug session
> was run during the day when this situation does not happen.
>
> I am wondering if anyone on this list has any suggestions on how I might deal
> with this problem.
>
> When this happens, system utilization seems normal, as in this "top"
> display:
>
> last pid: 11172;  load averages:  1.54,  1.53,  1.62
> 458 processes: 442 sleeping, 2 running, 13 zombie, 1 on cpu
> CPU states:  0.0% idle, 76.9% user, 23.1% kernel,  0.0% iowait,  0.0% swap
> Memory: 1024M real, 372M free, 505M swap in use, 2220M swap free
>
>    PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
>  13845 root       1  20    0   77M   76M run     21.2H 61.11% nsrd
>  13851 root       1   3   10   46M   45M run    529:19 27.03% nsrmmdbd
>  11166 root       1  39    0 3024K 1904K cpu      0:00  1.40% top
>  13859 root       2  59  -15   14M 9800K sleep   53:25  1.33% nsrmmd
>  13839 root       1  59    0 5408K 4000K sleep    1:38  0.06% nsrexecd
>  13838 root       1  59    0 4248K 2624K sleep    1:55  0.06% nsrexecd
>   3821 root       1  59    0 3296K 2120K sleep    0:00  0.05% nsrexec
>   7449 root       1  59    0 3296K 2120K sleep    0:00  0.04% nsrexec
>    292 root       1  59    0 3536K 2008K sleep    0:24  0.03% sshd2
>    428 root       1  59  -15 4992K 3752K sleep    0:00  0.02% nsrmmd
>  22973 root       1  59    0 3296K 2120K sleep    0:01  0.02% nsrexec
>  24242 root       1  59    0 3296K 2120K sleep    0:01  0.02% nsrexec
>  23028 root       1  59    0 3296K 2120K sleep    0:01  0.02% nsrexec
>  21304 root       1  59    0 3296K 2120K sleep    0:01  0.02% nsrexec
>  21468 root       1  59    0 3296K 2120K sleep    0:01  0.02% nsrexec
