[Veritas-bu] RMAN Failures - Was Error 54 Solaris client

2002-04-15 07:16:49
Subject: [Veritas-bu] RMAN Failures - Was Error 54 Solaris client
From: Philip.Weber AT egg DOT com (Weber, Philip)
Date: Mon, 15 Apr 2002 12:16:49 +0100
Dear all,

Thank you for all the replies about "Error 54"s.  We have had some network
maintenance carried out and for now we are not getting 54s.  I'm not
convinced the problem has gone away, but it does point to a networking issue.
Frustratingly, we can't pin down exactly what has changed, though.

Are there any NetBackup/Oracle DB Ext/RMAN experts out there?  We are now
getting RMAN backups (Oracle 8 - 8.1.7) failing regularly with error code 6
on the same server on which we were previously getting EBU (Oracle 7.3.4)
backups failing with code 54.  This may still point to a network issue, but
the RMAN logs aren't clear:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-10035: exception raised in RPC: O
RMAN-10031: ORA-19624 occurred during call to DBMS_BACKUP_RESTORE.BACKUPPIECECREATE
RMAN-03006: non-retryable error occurred during execution of command: backup
RMAN-07004: unhandled exception during command execution on channel c1
RMAN-10032: unhandled exception during execution of job step 1: ORA-19583: conversation terminated due to error
ORA-27206: requested file not found in media management catalog
ORA-06512: at "SYS.DBMS_BACKUP_RESTORE", line 498
ORA-06512: at "SYS.DBMS_BACKUP_RESTORE", line 464
ORA-06512: at line 275
RMAN-10035: exception raised in RPC: ORA-19583: conversation terminated due to error
ORA-27206: requested file not found in media management catalog
ORA-06512: at "SYS.DBMS_BACKUP_RESTORE", line 498
ORA-06512: at "SYS.DBMS_BACKUP_RESTORE", line 464
RMAN-10031: ORA-19583 occurred during call to DBMS_BACKUP_RESTORE.BACKUPPIECECREATE
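
For anyone triaging a stack like the one above, here is a minimal Python sketch that pulls out the distinct RMAN/ORA codes in order of first appearance. The function names (`error_codes`, `interesting_codes`) and the sample text are illustrative assumptions, not part of any Oracle or NetBackup tooling. ORA-06512 is dropped in the second helper because it only carries the PL/SQL "at ... line N" backtrace.

```python
import re

# Matches Oracle-style message codes such as RMAN-10035 or ORA-27206.
CODE_RE = re.compile(r"\b(?:RMAN|ORA)-\d{5}\b")


def error_codes(stack_text):
    """Return the distinct codes in the text, in order of first appearance."""
    seen = []
    for match in CODE_RE.finditer(stack_text):
        code = match.group(0)
        if code not in seen:
            seen.append(code)
    return seen


def interesting_codes(stack_text, skip=("ORA-06512",)):
    """Same as error_codes, minus backtrace-only codes (assumed list)."""
    return [code for code in error_codes(stack_text) if code not in skip]


# Sample lines copied from the stack above, with wrapped lines joined.
SAMPLE = """\
RMAN-10031: ORA-19624 occurred during call to DBMS_BACKUP_RESTORE.BACKUPPIECECREATE
RMAN-10032: unhandled exception during execution of job step 1: ORA-19583: conversation terminated due to error
ORA-27206: requested file not found in media management catalog
ORA-06512: at "SYS.DBMS_BACKUP_RESTORE", line 498
"""

if __name__ == "__main__":
    for code in interesting_codes(SAMPLE):
        print(code)
```

This only lists the codes; it does not say which one is the root cause. That still needs reading the messages themselves.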

Any ideas?

Phil Weber
IT Infrastructure Unix Systems Engineer

Phone: 01384 26 4136
Mobile: n/a



-----Original Message-----
From: Weber, Philip [mailto:Philip.Weber AT egg DOT com]
Sent: 04 April 2002 10:28
To: 'larry.kingery AT veritas DOT com'; 'veritas-bu AT mailman.eng.auburn DOT edu'
Subject: RE: [Veritas-bu] Error 54 Solaris client 


> Do you get them (54 and 50) together?  Just wondering if you get a 54
> (because the db was busy enough that it didn't start the stream in
> time) on the NBU side, then the db sees something goofy is up and
> starts killing the others (but not in a nice way that NBU
> understands).  

> Just a thought.

Yes, it seems that way; we tend to get one 54 plus a set of 50s.  The end
times for the 50s seem to be before the end time for the 54 (in xbpmon).
The "auto" schedule that started all of these off finishes with 0 - when
this occurs it is all over in a matter of minutes.  It doesn't always work
that way - sometimes we just get one stream failing with 54 and the remainder
completing OK, or some failing and some OK.  It looks to me as though once
one stream incurs a 54 error, that signals the rest to fail.

The EBU log seems to bear this out.  I get some tablespaces put into backup
mode, followed by some "Starting BFS" statements for data files being backed
up, followed by "EBU-4306: Error occured while writing data to tape",
followed by some "BFS in progress xxx cancelled" messages and eventually an
"EBU-2012: Job 12262 failed due to tape management error".  I'm not sure I
believe these errors, as sometimes the same backup can be rerun successfully,
tapes are getting mounted, etc.

At the moment of failure recorded in the EBU log the media servers' bpbrm
log records :

04:28:16 [15883] <2> bpbrm listen_for_client_timeout: timed out listening for the client
04:28:16 [15883] <16> bpbrm listen_for_client: listen for client timeout during accept from data listen socket after 60 seconds
04:28:16 [15883] <2> bind_on_port_addr: bound to port 47498
04:28:16 [15883] <2> check_authentication: no authentication required
04:28:16 [15883] <2> bpbrm kill_child_process: start
04:28:16 [15883] <4> bpbrm Exit: client backup EXIT STATUS 54: timed out connecting to client

According to Veritas this 60 second timeout is BPCD_ACCEPT_TIMEOUT and is
not tunable.
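
To check how often this happens across a whole bpbrm log, a rough Python sketch along these lines could pull out each exit status per process. The line layout (`HH:MM:SS [pid] <severity> message`) is assumed purely from the excerpt above, and the function name `exit_statuses` is made up for illustration; real logs may differ between NetBackup versions.

```python
import re

# Assumed layout, taken from the excerpt: "HH:MM:SS [pid] <severity> message"
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) \[(\d+)\] <(\d+)> (.*)$")
EXIT_RE = re.compile(r"EXIT STATUS (\d+)")


def exit_statuses(lines):
    """Map each bpbrm PID to (timestamp, exit status) from EXIT STATUS lines."""
    result = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip wrapped fragments or lines in another format
        timestamp, pid, _severity, message = match.groups()
        found = EXIT_RE.search(message)
        if found:
            result[int(pid)] = (timestamp, int(found.group(1)))
    return result


# Sample lines copied from the excerpt above, with wrapped lines joined.
SAMPLE_LOG = [
    "04:28:16 [15883] <2> bpbrm listen_for_client_timeout: timed out listening for the client",
    "04:28:16 [15883] <2> bpbrm kill_child_process: start",
    "04:28:16 [15883] <4> bpbrm Exit: client backup EXIT STATUS 54: timed out connecting to client",
]

if __name__ == "__main__":
    for pid, (timestamp, status) in exit_statuses(SAMPLE_LOG).items():
        print(pid, timestamp, status)
```

Lines wrapped mid-message, as in a pasted e-mail, would need joining first; the sketch simply skips anything that does not start with a timestamp.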

Phil Weber
Egg



