Subject: Re: [Networker] NSR 75SP3 : Stable for prod ?
From: "STANLEY R. HORWITZ" <stan AT TEMPLE DOT EDU>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 30 Aug 2010 09:34:26 -0400
Thierry,

This is in conjunction with service request 36150044. I do not have a specific 
patch reference, but EMC is tracking this savegrp bug under the code NW120811.


On Aug 30, 2010, at 9:20 AM, Thierry Faidherbe wrote:

> Stanley,
> 
> Any patch reference for the patched savegrp.exe on top of 7.5.3.1 Build 533 ?
> 
> I have an open case with EMC now, but my contact cannot locate the
> patched savegrp binary (for Windows x64).
> 
> Thanks
> 
> Th
> 
> Kind regards - Bien cordialement - Vriendelijke groeten,
> 
> Thierry FAIDHERBE
> Backup/Storage & System Management
> 
> LE FOREM - Administration Centrale
> Département des Systèmes d'Information
> 
> Boulevard Tirou, 104  Tel: + 32 (0)71/206730
> B-6000 CHARLEROI      Fax: + 32 (0)71/206199 
> BELGIUM               Mail : Thierry.faidherbe<at>forem.be
> 
> 
> -----Original Message-----
> From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU] 
> On Behalf Of STANLEY R. HORWITZ
> Sent: Monday, 30 August 2010 15:09
> To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
> Subject: Re: [Networker] NSR 75SP3 : Stable for prod ?
> 
> Frank,
> 
> Considering the many NetWorker server configurations out in the field, 
> expecting EMC's software testing to uncover every possible issue on every 
> platform with every OS version and every hardware driver is unrealistic. I 
> suspect that what EMC does is test the most common configurations in its 
> software certification efforts, which leaves room for errors to surface on 
> less common platforms. 
> 
> I too am one of the people who are having trouble with 7.5SP3 (on 32-bit Red 
> Hat Linux AS) running on a Dell 2950 with a Qualstar tape library that has 
> four LTO-3 tape drives in it. I am running build 533 and the new savegrp 
> binary, but I still find one core dump file per day in /nsr/cores/savegrp, and 
> my daily cron that runs "savegrp -O" continues to generate output like this …
> 
> 4690:savegrp: puss-index-backup waiting for 113 job(s) to complete
> 4690:savegrp: puss-index-backup waiting for 81 job(s) to complete
> 4690:savegrp: puss-index-backup waiting for 74 job(s) to complete
> 4690:savegrp: puss-index-backup waiting for 1 job(s) to complete
> *** glibc detected *** corrupted double-linked list: 0x0984ff60 ***
> /bin/sh: line 1: 13098 Aborted                 /usr/sbin/savegrp -O -l full 
> -G puss-index-backup
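> 
> For context, that job is launched by a plain cron entry; the sketch below is 
> only illustrative (the schedule and mail recipient are placeholders, not my 
> actual crontab):
> 
> # hypothetical root crontab entry: run the full backup of the group and
> # mail any output (such as the glibc abort above) back to the admin
> 0 22 * * * /usr/sbin/savegrp -O -l full -G puss-index-backup 2>&1 | mail -s "savegrp -O output" root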
> 
> I tried the suggestion of running a trace on savegrp, but I had no clue what 
> I was looking at in the output. I am wondering if anyone else running 7.5SP3 
> is getting savegrp core dumps. In case you don't know, NetWorker stashes core 
> dump files in /nsr/cores. If your NetWorker server is generating savegrp core 
> files and you haven't opened a case with EMC about it, please consider doing 
> so. The more core dump files EMC has, the more likely they are to uncover the 
> cause of the problem and fix it. What also puzzles me is that there seems to 
> be no pattern to what time of day these core files are generated on my server.
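> 
> If you want to take a quick look on your own server, something along these 
> lines should do it (paths assume a default install; adjust to your layout):
> 
> # list any core files NetWorker has dropped in the last day
> find /nsr/cores -type f -name 'core*' -mtime -1 -ls
> # on Linux, confirm which binary produced a given core before sending it to EMC
> file /nsr/cores/savegrp/core.*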
> 
> 
> On Aug 30, 2010, at 7:05 AM, Francis Swasey wrote:
> 
>> My goodness!  All that time in development, testing by QA, I assume there 
>> was a beta program as
>> well... and still it takes three fixes to get it to operate as it goes out 
>> the door?  That is
>> depressing news.
>> 
>> Perhaps EMC needs to look to the customers who are reporting these 
>> problems and offer them incentives to take part in a beta program, since it 
>> appears their own QA group is not quite up to the task.
>> 
>> Frank
>> 
>> On 8/30/10 5:38 AM, Jóhannes Karl Karlsson wrote:
>>> We had some problems with 7.5.3 to begin with when we installed build 514: 
>>> groups of Oracle clients were not finishing properly (they were hanging).
>>> 
>>> EMC then released build 531 and, a few days later, build 533. We installed 
>>> NetWorker 7.5.3.1 build 533 and our problems got even worse.
>>> 
>>> EMC then released a patched version of savegrp.exe for build 533. After 
>>> installing that patched savegrp.exe binary, we have not had any problems.
>>> 
>>> NetWorker 7.5.3.1 build 533 with the patched savegrp.exe binary seems to be 
>>> stable and good.
>>> 
>>> Johannes
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT 
>>> EDU] On Behalf Of Len Philpot
>>> Sent: 17 August 2010 15:26
>>> To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
>>> Subject: Re: [Networker] NSR 75SP3 : Stable for prod ?
>>> 
>>>> STANLEY R. HORWITZ 
>>>> 
>>>> What ulimit settings are you using and how many clients are you backing 
>>>> up?
>>>> 
>>>> Here's what I have …
>>>> 
>>>> [root@puss nsr_scripts]# ulimit -a
>>>> core file size          (blocks, -c) 0
>>>> data seg size           (kbytes, -d) unlimited
>>>> file size               (blocks, -f) unlimited
>>>> pending signals                 (-i) 1024
>>>> max locked memory       (kbytes, -l) 32
>>>> max memory size         (kbytes, -m) unlimited
>>>> open files                      (-n) 1024
>>>> pipe size            (512 bytes, -p) 8
>>>> POSIX message queues     (bytes, -q) 819200
>>>> stack size              (kbytes, -s) 10240
>>>> cpu time               (seconds, -t) unlimited
>>>> max user processes              (-u) 143360
>>>> virtual memory          (kbytes, -v) unlimited
>>>> file locks                      (-x) unlimited
>>> 
>>> Yours looks like Solaris 10, but this is on 9 (SPARC):
>>> 
>>> # ulimit -a
>>> core file size (blocks)     unlimited
>>> data seg size (kbytes)      unlimited
>>> file size (blocks)          unlimited
>>> open files                  unlimited
>>> pipe size (512 bytes)       10
>>> stack size (kbytes)         8192
>>> cpu time (seconds)          unlimited
>>> max user processes          29995
>>> virtual memory (kbytes)     unlimited
>>> 
>>> The two groups that were abending had 41 and 25 clients each (not huge) 
>>> and we have a little over 100 clients total. However, the old ulimit 
>>> settings (which I don't recall) were from the original Solaris 8 
>>> installation back in 2003 (Networker 6.1). So, they weren't exactly big.
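>>> 
>>> For what it's worth, on Solaris this kind of limit is usually raised in 
>>> /etc/system and picked up after a reboot. A rough sketch only -- the values 
>>> here are made up, so check the Solaris tunable parameters guide before 
>>> copying anything:
>>> 
>>> * /etc/system -- raise per-process file descriptor and per-user process limits
>>> set rlim_fd_max=65536
>>> set rlim_fd_cur=8192
>>> set maxuprc=30000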
>>> 
>> 
>> -- 
>> Frank Swasey                    | http://www.uvm.edu/~fcs
>> Sr Systems Administrator        | Always remember: You are UNIQUE,
>> University of Vermont           |    just like everyone else.
>> "I am not young enough to know everything." - Oscar Wilde (1854-1900)
>> 
> 

To sign off this list, send email to listserv AT listserv.temple DOT edu and 
type "signoff networker" in the body of the email. Please write to 
networker-request AT listserv.temple DOT edu if you have any problems with this 
list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER