Re: [Networker] NMDA Oracle RMAN backups fail only when nsrexecd is running

2011-08-20 07:42:05
From: jee <jee AT ERESMAS DOT NET>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Sat, 20 Aug 2011 12:38:11 +0100
 

One of my customers has the same issue with NMO 5.0 and NW 7.5.*.
The NW server is Windows 2003 and the client is Linux (RHEL 5.*).

An SR has been open for 9 months now and is still unresolved. We are still 
trying to get Engineering engaged and obtain a debug binary for 
troubleshooting. Sadly, the request has been systematically ignored...

I have worked with the DBA for months and we tried everything. At some point 
we added more channels so that the backups could retry on a different channel 
when the error occurred. That didn't work on every occasion, but it was better 
than having failures every night.
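
For illustration, a multi-channel RMAN command file of that kind could be 
generated roughly as below. This is only a sketch: the channel count, the 
chNN channel names and the NSR_SERVER value are placeholders I made up, not 
the customer's actual settings, and the backup commands are kept to the bare 
minimum.

```shell
#!/bin/sh
# Sketch: print an RMAN command file that allocates several SBT channels,
# so RMAN can move a failed job to another channel.
# NSR_SERVER below is a hypothetical name, not a value from this thread.

NSR_SERVER="${NSR_SERVER:-nwserver.example.com}"

gen_rman_script() {
    channels=${1:-3}            # number of channels to allocate (assumed default)
    echo "RUN {"
    i=0
    while [ "$i" -lt "$channels" ]; do
        printf "  ALLOCATE CHANNEL ch%02d TYPE 'SBT_TAPE';\n" "$i"
        i=$((i + 1))
    done
    # SEND inside RUN passes the string to every allocated channel
    printf "  SEND 'NSR_ENV=(NSR_SERVER=%s)';\n" "$NSR_SERVER"
    echo "  BACKUP DATABASE;"
    echo "  BACKUP ARCHIVELOG ALL;"
    i=0
    while [ "$i" -lt "$channels" ]; do
        printf "  RELEASE CHANNEL ch%02d;\n" "$i"
        i=$((i + 1))
    done
    echo "}"
}

# Usage (the output path is hypothetical):
#   gen_rman_script 4 > /home/oracle/scripts/backup_full.rman
```

Extra channels only give RMAN somewhere else to send the failed job; they do 
nothing about the underlying index connection error.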

We managed to get rid of the error by using cron. When the RMAN script is 
executed by cron on the Linux client, the error doesn't occur.
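
A minimal sketch of such a cron-driven wrapper follows. The paths, the 
ORACLE_SID, the schedule and the script names are all invented for 
illustration (our real ones differ); the only part that is standard is the 
"rman target / cmdfile=... log=..." invocation.

```shell
#!/bin/sh
# Hypothetical wrapper to run an RMAN backup from cron instead of from a
# NetWorker scheduled group. Every default below is a placeholder.
#
# Example crontab entry for the oracle user (01:00 every night):
#   0 1 * * * /home/oracle/scripts/rman_cron.sh --run

ORACLE_HOME="${ORACLE_HOME:-/u01/app/oracle/product/db_1}"
ORACLE_SID="${ORACLE_SID:-ORCL}"
RMAN_BIN="${RMAN_BIN:-$ORACLE_HOME/bin/rman}"
RMAN_SCRIPT="${RMAN_SCRIPT:-/home/oracle/scripts/backup_full.rman}"
LOG_DIR="${LOG_DIR:-/var/tmp/rman_logs}"

run_backup() {
    # cron starts with a minimal environment, so export what rman needs
    export ORACLE_HOME ORACLE_SID
    mkdir -p "$LOG_DIR" || return 1
    "$RMAN_BIN" target / cmdfile="$RMAN_SCRIPT" \
        log="$LOG_DIR/rman_$(date +%Y%m%d_%H%M%S).log"
}

# Only start the backup when explicitly asked to
if [ "${1:-}" = "--run" ]; then
    run_backup
fi
```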

However, this is just a workaround and not the proper way to schedule 
NetWorker backups. I wonder if there could be some connection between the 
cron workaround and the nsrexecd issue mentioned in this thread.

BTW, the error produced by NMO contains a typo that makes it easy to 
identify. It says:
   "...See privious error messages. (3:3:11)" instead of "previous".
Was that corrected in NMDA?
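
Since the message is so recognisable, a quick way to find the affected RMAN 
logs is to grep for it. The sketch below matches both spellings, the NMO one 
with the typo and the NMDA one quoted further down. The function name and 
the way the log files are passed in are just my own choices.

```shell
#!/bin/sh
# Sketch: report RMAN log files that contain the index-connect failure.
# Matches "See privious error messages. (3:3:11)" as well as the
# corrected "See previous error messages. (3:3:11)".

has_index_error() {
    grep -Eq 'See pr[ei]vious error messages\. \(3:3:11\)' "$1"
}

# When log files are given as arguments, print the names of those that match
if [ "$#" -gt 0 ]; then
    for f in "$@"; do
        if has_index_error "$f"; then
            echo "$f"
        fi
    done
fi
```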



jee 



On Thursday 18 August 2011 22:34:29 Ron Benton wrote:
> I have a case open with EMC Support right now for a similar issue, using
> NMO 5.0 and NSR 6.4.1.
>
> When we run with the recommended setting of filesperset=1 for all three
> sections of the RMAN script (archive logs, data files, archive logs), it
> will fail every time with the sequential file error for the Oracle instance
> and client I have been using for the test.
>
> The debug log files they have me collecting indicate that it is timing out
> connecting to the NSR server for the client file index. If we run with
> filesperset of 10/40/10, which has been our standard config, it only fails
> intermittently. We need to run with filesperset of 1 to get better
> deduplication on the Data Domain library.
>
> We've been working on this off and on for a month. EMC engineering just got
> involved but has no answers yet.
>
> Ron
>
> -----Original Message-----
> From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU]
> On Behalf Of Paul Robertson
> Sent: Thursday, August 18, 2011 10:42 AM
> To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
> Subject: [Networker] NMDA Oracle RMAN backups fail only when nsrexecd
> is running
>
> We're running oracle backups via NMDA 1.1 on Solaris x86, and we've
> run into an odd issue. First, the client details:
>
>   $ uname -a
>   SunOS client 5.10 Generic_142910-17 i86pc i386 i86pc
>
>   $ pkginfo -x LGTOclnt
>   LGTOclnt  NetWorker Client
>             (i386) 7.6.2.2.Build.651
>
>   $ pkginfo -x LGTOnmda
>   LGTOnmda  NetWorker Module for Databases and Applications
>             (i386) 1.1 [LNMs_2010.Build.231]
>
> The NSR server is a Solaris 10 sparc server running 7.6.2.2.Build.651.
>
> The issue we see is that Oracle backups via NMDA succeed only when the
> nsrexecd process is not running. When nsrexecd is running, we get the
> following error in the RMAN logs:
>
>   RMAN-03009: failure of backup command on ch00 channel at 08/17/2011 15:57:04
>   ORA-19506: failed to create sequential file, name="09mk7rku_1_1", parms=""
>   ORA-27028: skgfqcre: sbtbackup returned error
>   ORA-19511: Error received from media manager layer, error text:
>   lnm_index_cfx_connect retry failed. See previous error messages. (3:3:11)
>   channel ch00 disabled, job failed on it will be run on another channel
>
> We didn't have this issue with the older NMO 5.X package. So now we have to
> start nsrexecd to do "normal" filesystem backups, and then stop it to run
> the RMAN NMDA backups -- clearly not an optimal approach :-)
>
> In case it's relevant, the client is a Solaris non-global zone. Let me know
> if you need any additional information, but any insight is
> appreciated.
>
> Cheers,
>
> Paul
>
> To sign off this list, send email to listserv AT listserv.temple DOT edu and
> type "signoff networker" in the body of the email. Please write to
> networker-request AT listserv.temple DOT edu if you have any problems with
> this list. You can access the archives at
> http://listserv.temple.edu/archives/networker.html or via RSS at
> http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER