Networker

Re: [Networker] nsrjobd fubar in 7.3.4 and 7.4 SP2?

2008-05-27 10:15:11
Subject: Re: [Networker] nsrjobd fubar in 7.3.4 and 7.4 SP2?
From: Fazil Saiyed <Fazil.Saiyed AT ANIXTER DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Tue, 27 May 2008 09:05:00 -0500
Hello,
This sounds like what I would call a "leaky app/process" on Windows. Could
this be OS-specific? Has anyone noticed it on other platforms? I believe
Oscar is using Solaris (correct me if I am wrong).
Can you use a tool such as Process Explorer (on Windows) to see what the
process is doing?
Thanks
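
On a Unix host without a GUI tool, one rough way to confirm the growth is
to sample the process with ps at intervals and log the numbers. The sketch
below is only an illustration, not an EMC or NetWorker tool: it assumes a
ps that accepts "-o vsz=" and "-o pcpu=" (vsz reported in KB), and the
function names and the 10 MB threshold are made up for the example.

```python
import subprocess
import time


def sample_process(pid):
    """Return (vsz_kb, cpu_percent) for pid, as reported by ps.

    Assumes a ps that supports '-o vsz=' and '-o pcpu=' (empty header text).
    """
    out = subprocess.check_output(
        ["ps", "-o", "vsz=", "-o", "pcpu=", "-p", str(pid)], text=True
    )
    vsz_kb, cpu = out.split()
    return int(vsz_kb), float(cpu)


def looks_leaky(vsz_samples, min_growth_kb=10240):
    """True if memory never shrinks between samples and grows by at
    least min_growth_kb overall (hypothetical threshold, 10 MB)."""
    if len(vsz_samples) < 2:
        return False
    never_shrinks = all(b >= a for a, b in zip(vsz_samples, vsz_samples[1:]))
    return never_shrinks and vsz_samples[-1] - vsz_samples[0] >= min_growth_kb


def watch(pid, interval=60, count=10):
    """Log count samples, interval seconds apart, then judge the trend."""
    history = []
    for _ in range(count):
        vsz_kb, cpu = sample_process(pid)
        history.append(vsz_kb)
        print(f"{time.strftime('%H:%M:%S')} vsz={vsz_kb}KB cpu={cpu}%")
        time.sleep(interval)
    return looks_leaky(history)
```

Calling watch() with the PID of nsrjobd would print one line per sample
and return True if the memory only ever went up. A steady climb together
with high CPU would support the leak/looping-thread theory, but it would
not say where in nsrjobd the problem is.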




Oscar Olsson <spam1 AT QBRANCH DOT SE>
Sent by: EMC NetWorker discussion <NETWORKER AT LISTSERV.TEMPLE DOT EDU>
05/27/2008 02:26 AM
Please respond to: EMC NetWorker discussion
<NETWORKER AT LISTSERV.TEMPLE DOT EDU>; Oscar Olsson <spam1 AT QBRANCH DOT SE>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: [Networker] nsrjobd fubar in 7.3.4 and 7.4 SP2?

We've had an ongoing issue with 7.3.4 and 7.4 SP2 (both with the security
fix applied) where nsrjobd misbehaves. The process takes up 100% CPU on
one thread (a looping thread is my guess), and it also grows continuously,
up to about 500 MB of allocated memory. After a while, other processes
seem to have trouble communicating with nsrjobd and end up waiting on it:
groups don't run and recover sessions take a long time before they can
browse, until eventually the backup system is rendered useless. A restart
doesn't really help either, since nsrjobd is still quite unresponsive
afterwards and quickly starts allocating memory and using lots of CPU
again. I also no longer see any messages saying that it has managed to
purge old records. Of course, this is a serious problem.

And why am I telling the list this? Well, as usual, EMC support is quite
unresponsive and takes a long time to collect the information needed to
troubleshoot this. For instance, we had a P1 case open from Monday to
Friday, and they still needed more info after an escalation was opened on
Friday, even though the cause of the trouble was quite clearly located
within nsrjobd. I'm guessing this is a more or less general problem with
nsrjobd, so I would like to hear from other users in the community who
might be experiencing similar issues. Contact me on- or off-list. I don't
mind sharing experiences about this on the list as well.

//Oscar

To sign off this list, send email to listserv AT listserv.temple DOT edu and
type "signoff networker" in the body of the email. Please write to
networker-request AT listserv.temple DOT edu if you have any problems with this 
list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER


