Frank,
Considering the many NetWorker server configurations out in the field,
expecting EMC's software testing to uncover every possible issue on every
platform with every OS version and every hardware driver is unrealistic. I
suspect what EMC does is test the most common configurations in its software
certification efforts, which leaves room for errors to surface on less common
platforms.
I too am one of the people having trouble with 7.5SP3 (on 32-bit Red Hat
Linux AS) running on a Dell 2950 with a Qualstar tape library that has
four LTO-3 tape drives in it. I am running build 533 and the new savegrp
binary, but I still find one core dump file per day in /nsr/cores/savegrp and
my daily cron that runs "savegrp -O" continues to generate output like this …
4690:savegrp: puss-index-backup waiting for 113 job(s) to complete
4690:savegrp: puss-index-backup waiting for 81 job(s) to complete
4690:savegrp: puss-index-backup waiting for 74 job(s) to complete
4690:savegrp: puss-index-backup waiting for 1 job(s) to complete
*** glibc detected *** corrupted double-linked list: 0x0984ff60 ***
/bin/sh: line 1: 13098 Aborted /usr/sbin/savegrp -O -l full -G
puss-index-backup
I tried the suggestion of running a trace on savegrp, but I had no clue what
I was looking at in the output. I am wondering if anyone else running 7.5SP3 is
getting savegrp core dumps. In case you don't know, NetWorker stashes core dump
files in /nsr/cores. If your NetWorker server is generating savegrp core files
and you haven't opened a case with EMC about it, please consider doing so. The
more core dump files EMC has, the more likely they are to uncover the cause of
the problem and fix it. What also confuses me is that there seems to be no
pattern to what time of day these core files are generated on my server.
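For anyone who wants a quick way to watch for this, here is a minimal sketch
(the /nsr/cores/savegrp path follows the NetWorker convention mentioned above;
the function name is mine, not part of NetWorker):

```shell
#!/bin/sh
# Sketch only: count core files written in the last day under a given
# directory (defaulting to NetWorker's /nsr/cores/savegrp on this server).
count_recent_cores() {
    dir="$1"
    if [ -d "$dir" ]; then
        # -mtime -1 matches files modified within the last 24 hours
        find "$dir" -type f -mtime -1 | wc -l | tr -d ' '
    else
        echo 0
    fi
}

count_recent_cores "${1:-/nsr/cores/savegrp}"
```

Something like that could be dropped into the same daily cron so a nonzero
count flags a fresh core before anyone goes looking by hand.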
On Aug 30, 2010, at 7:05 AM, Francis Swasey wrote:
> My goodness! All that time in development, testing by QA, I assume there was
> a beta program as
> well... and still it takes three fixes to get it to operate as it goes out
> the door? That is
> depressing news.
>
> Perhaps EMC needs to look to their customers who are reporting these problems
> and provide
> incentives to them to take part in a beta program since it appears their own
> QA group is not
> quite up to the task.
>
> Frank
>
> On 8/30/10 5:38 AM, Jóhannes Karl Karlsson wrote:
>> We had some problems with 7.5.3 to begin with, when we installed build 514:
>> groups of Oracle clients were not finishing properly (hanging).
>>
>> EMC then released build 531 and, a few days later, build 533. We installed
>> NetWorker 7.5.3.1 build 533 and our problems got even worse.
>>
>> EMC then released a patched version of savegrp.exe build 533. After
>> installing that patched savegrp.exe binary we have not had any problems.
>>
>> NetWorker 7.5.3.1 build 533 with patched savegrp.exe binary seems to be
>> stable and good.
>>
>> Johannes
>>
>>
>>
>> -----Original Message-----
>> From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU]
>> On Behalf Of Len Philpot
>> Sent: 17 August 2010 15:26
>> To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
>> Subject: Re: [Networker] NSR 75SP3 : Stable for prod ?
>>
>>> STANLEY R. HORWITZ
>>>
>>> What ulimit settings are you using and how many clients are you backing
>> up?
>>>
>>> Here's what I have …
>>>
>>> [root@puss nsr_scripts]# ulimit -a
>>> core file size (blocks, -c) 0
>>> data seg size (kbytes, -d) unlimited
>>> file size (blocks, -f) unlimited
>>> pending signals (-i) 1024
>>> max locked memory (kbytes, -l) 32
>>> max memory size (kbytes, -m) unlimited
>>> open files (-n) 1024
>>> pipe size (512 bytes, -p) 8
>>> POSIX message queues (bytes, -q) 819200
>>> stack size (kbytes, -s) 10240
>>> cpu time (seconds, -t) unlimited
>>> max user processes (-u) 143360
>>> virtual memory (kbytes, -v) unlimited
>>> file locks (-x) unlimited
>>
>> Yours looks like Solaris 10, but this is on 9 (SPARC):
>>
>> # ulimit -a
>> core file size (blocks) unlimited
>> data seg size (kbytes) unlimited
>> file size (blocks) unlimited
>> open files unlimited
>> pipe size (512 bytes) 10
>> stack size (kbytes) 8192
>> cpu time (seconds) unlimited
>> max user processes 29995
>> virtual memory (kbytes) unlimited
>>
>> The two groups that were abending had 41 and 25 clients each (not huge)
>> and we have a little over 100 clients total. However, the old ulimit
>> settings (which I don't recall) were from the original Solaris 8
>> installation back in 2003 (Networker 6.1). So, they weren't exactly big.
>>
>> To sign off this list, send email to listserv AT listserv.temple DOT edu and
>> type "signoff networker" in the body of the email. Please write to
>> networker-request AT listserv.temple DOT edu if you have any problems with
>> this list. You can access the archives at
>> http://listserv.temple.edu/archives/networker.html or
>> via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
>
> --
> Frank Swasey | http://www.uvm.edu/~fcs
> Sr Systems Administrator | Always remember: You are UNIQUE,
> University of Vermont | just like everyone else.
> "I am not young enough to know everything." - Oscar Wilde (1854-1900)
>