Re: [Networker] Problem with scheduled savegrp -O job

2010-08-10 21:02:44
Subject: Re: [Networker] Problem with scheduled savegrp -O job
From: Craig Faller <craigf AT XSIDATA.COM DOT AU>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Wed, 11 Aug 2010 10:56:32 +1000
Does it only fail when triggered from cron? What about when it's run
manually from the command line?

If savegrp works when run manually from the command line, the only
difference is your login environment (.profile). Try changing the cron
job to call a wrapper script around the savegrp command.
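
A minimal sketch of such a wrapper, assuming a POSIX-style shell. The
function name run_savegrp, the LOG path and the PROFILE default are
placeholders made up for illustration, not NetWorker conventions; only
the savegrp path and options come from this thread. SAVEGRP_BIN can be
overridden for a dry run.

```shell
#!/bin/sh
# Hypothetical cron wrapper around savegrp (a sketch, not a tested fix).
# Assumed names: run_savegrp, PROFILE, LOG, SAVEGRP_BIN.

SAVEGRP_BIN="${SAVEGRP_BIN:-/usr/sbin/savegrp}"  # path as shown in the abort message
PROFILE="${PROFILE:-$HOME/.profile}"             # environment used by a manual login
LOG="${LOG:-/tmp/savegrp-wrapper.log}"           # placeholder log location

run_savegrp() {
    group="$1"

    # Load the login environment so the cron run resembles a manual run.
    if [ -r "$PROFILE" ]; then
        . "$PROFILE"
    fi

    echo "=== $(date): savegrp -O -l full $group" >>"$LOG"
    # Record the environment so it can be compared with an interactive shell.
    env | sort >>"$LOG"

    "$SAVEGRP_BIN" -O -l full "$group" >>"$LOG" 2>&1
    rc=$?

    echo "=== exit status: $rc" >>"$LOG"
    return "$rc"
}

# Run only when a group name is given on the command line.
if [ "$#" -gt 0 ]; then
    run_savegrp "$1"
fi
```

The cron entry would then call the script with the group name, for
example "45 6 * * * /path/to/wrapper.sh puss-index-backup", where the
script path is again a placeholder. Comparing the logged environment
with the output of "env | sort" in a login shell should show whether the
two runs really differ.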

Also try running it with increased verbosity and debugging; it may give
more detail on the issue.
"savegrp -vvv -D4 -l full puss-index-backup"

Just a thought... the "waiting for 125 job(s) to complete" message may
mean that savegrp has spawned all the backups at the same time and
nsrjobd can't handle it. Try running the savegrp with parallelism set;
it's worth a shot.
"savegrp -vvv -D4 -N 4 -l full puss-index-backup"

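If the debug run is redirected to a file, the flood of messages can be
boiled down by type. This is only a sketch: summarize_failures is a
made-up helper name, and it assumes a grep that supports -o (GNU and BSD
grep do). It counts each kind of "Failed to ..." message, including
several run together on one line as in the output quoted below.

```shell
#!/bin/sh
# Hypothetical helper: count savegrp "Failed to ..." messages by type.
# Assumes grep -o is available (GNU/BSD extension, not strict POSIX).

summarize_failures() {
    logfile="$1"
    # -o prints every match on its own line, so messages that were
    # written without a newline between them are still counted separately.
    grep -o 'Failed to [a-z ]*' "$logfile" | sort | uniq -c
}
```

For example, after capturing a run with
"savegrp -vvv -D4 -N 4 -l full puss-index-backup > /tmp/savegrp-debug.log 2>&1"
(the log path is a placeholder), "summarize_failures /tmp/savegrp-debug.log"
prints one count per message type.
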
-----Original Message-----
From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU] On
Behalf Of STANLEY R. HORWITZ
Sent: Wednesday, 11 August 2010 3:33 AM
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: Re: [Networker] Problem with scheduled savegrp -O job

Sorry, my mistake, it all should say "puss-index-backup" but I screwed
up in typing the message.
I haven't thought about trying the other ideas you suggested. I will do
that.

On Aug 10, 2010, at 1:26 PM, George Sinclair wrote:

> STANLEY R. HORWITZ wrote:
>> Greetings everyone;
>> 
>> I updated my Red Hat Linux NetWorker server from 7.5.2 to 7.5.3
>> yesterday. I also updated the NetWorker Console on a separate Linux
>> server to 7.5.3. Thus far, my backup and restore testing is working
>> fine, but I am seeing one problem with this new installation. Every
>> morning at 6:45 I have a cron job that runs ...
>> 
>> savegrp -O -l full server-index-backup
> 
> Why does it say 'server-index-backup' here but 'puss-index-backup'
> below?
> 
> Have you tried running the savegrp command on a non-index group with,
> say, just a single client like: 'savegrp -l incr group' or maybe
> 'savegrp -l full group'? What does that do? Or how about forcing it to
> process a single client like 'savegrp -l level group -c client'.
> 
> How about an estimate, e.g. 'savegrp -n -l incr group'?
> 
> Just curious if other incantations of savegrp work properly and it's 
> just an issue with the '-O' option?
> 
> George
> 
>> 
>> where server-index-backup is a deactivated group that has every
>> active client on the server as a member (125 clients). The first
>> scheduled run of this job yielded the results below. I tried running
>> the same job again by adjusting the job's start time in cron and I
>> still got results like what you see below. I also tried shutting down
>> this NetWorker server, deleting /nsr/tmp, restarting it and rerunning
>> the savegrp -O job.
>> 
>> This data zone is a fairly simple set-up. It consists of one Dell
>> 2950 connected to a Qualstar tape library via fiber channel to four
>> LTO-3 tape drives. Neither drive nor library sharing is in use. There
>> is also no disk-to-tape backup technology involved here. I checked
>> /nsr/logs/daemon.raw just after the most recent savegrp -O run and
>> nothing out of the ordinary appeared. It also still appears to
>> generate a bootstrap, which is certainly a good thing. Also, the
>> actual emailed savegrp report I receive from NetWorker for this group
>> shows no errors of any kind.
>> 
>> Has anyone else experienced this issue? If so, how did you resolve
>> it?
>> 
>> 4690:savegrp: puss-index-backup waiting for 125 job(s) to complete
>> 4690:savegrp: puss-index-backup waiting for 119 job(s) to complete
>> 4690:savegrp: puss-index-backup waiting for 49 job(s) to complete
>> 4690:savegrp: puss-index-backup waiting for 1 job(s) to complete
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112129
>> 76638:savegrp: Failed to handle job sdtio message: No save instance found with job id 2112138
>> 76638:savegrp: Failed to handle job sdtio message: No save instance found with job id 2112138
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112130
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112131
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112132
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112138
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112128
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112134
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112136
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112135
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112133
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112139
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112140
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112137
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112141
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112142
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112143
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112145
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112144
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112146
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112147
>> 80250:savegrp: Failed to process job exit status: No save instance found with job id 2112148
>> *** glibc detected *** corrupted double-linked list: 0x08d688a8 ***
>> /bin/sh: line 1:  9078 Aborted                 /usr/sbin/savegrp -O -l full puss-index-backup
>> 
>> To sign off this list, send email to listserv AT listserv.temple DOT edu and
>> type "signoff networker" in the body of the email. Please write to
>> networker-request AT listserv.temple DOT edu if you have any problems with this
>> list. You can access the archives at
>> http://listserv.temple.edu/archives/networker.html or
>> via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
>> 
> 
> 
> -- 
> George Sinclair
> Voice: (301) 713-3284 x210
> - The preceding message is personal and does not reflect any official
> or unofficial position of the United States Department of Commerce -
> - Any opinions expressed in this message are NOT those of the US Govt. -
> 

