Networker

Re: [Networker] Backup hung when encounter a bad tape

2008-06-02 22:12:17
From: Matthew Huff <mhuff AT OX DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 2 Jun 2008 22:07:59 -0400
I had a long-running case with Legato about this (almost a full year) with
no real resolution, just explanations. The problem only shows up with the
combination of Legato NetWorker, NetWorker Module for Oracle (NMO), and bad
media. Evidently there is a design issue where the communication back and
forth between them breaks down. I recently finished migrating to NetWorker
7.4.2, Oracle 10g, and NMO 4.2. We haven't seen the issue in a while, but
since we haven't had bad media either, I can't say whether it has been fixed.

After you kill the backup, check whether any Oracle sessions (Oracle shadow
processes running on the database server) are still hanging around, because
they can make the issue worse. Also make sure your RMAN script uses a
non-MTS (dedicated server) connection string, i.e. (SERVER=DEDICATED).
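
As a rough sketch of both checks (this is illustration only, not something
that came out of the Legato case): the Python below filters `ps -eo pid,args`
output for leftover shadow processes and builds a connect descriptor that
asks for a dedicated server. It assumes shadow processes appear with a
command such as "oracle<SID> (LOCAL=NO)", which can differ by platform and
Oracle version. The function names, host, port, and service name are
hypothetical. SERVER=DEDICATED in CONNECT_DATA is the standard Oracle Net
setting.

```python
import re

# Assumption: dedicated server ("shadow") processes show up in ps output
# with a command such as "oracleDWPRDDB (LOCAL=NO)". Adjust for your platform.
SHADOW_RE = re.compile(r"^oracle(?P<sid>\w+)\s+\((?:LOCAL=|DESCRIPTION=)")


def find_shadow_processes(ps_output, sid=None):
    """Return (pid, sid, command) tuples parsed from `ps -eo pid,args` text."""
    found = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # header line or malformed row
        pid, command = int(parts[0]), parts[1]
        match = SHADOW_RE.match(command)
        if match and (sid is None or match.group("sid").lower() == sid.lower()):
            found.append((pid, match.group("sid"), command))
    return found


def dedicated_connect_descriptor(host, port, service_name):
    """Build an Oracle Net connect descriptor that requests a dedicated server."""
    return (
        "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)"
        f"(HOST={host})(PORT={port}))"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})(SERVER=DEDICATED)))"
    )
```

On the database server you would feed it the real output, for example
find_shadow_processes(subprocess.check_output(["ps", "-eo", "pid,args"],
text=True), sid="dwprddb"), and review the list by hand before killing
anything.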

On 6/2/08 7:00 PM, "Long Nguyen" <lnguyen AT CALPOLY DOT EDU> wrote:

> NW gurus,
> 
> Have you seen NetWorker backups hang when one of the tapes is bad? Here is
> the NetWorker server log (/nsr/logs/daemon.log):
> --------------------------------------------------------------------
> 
> May 31 06:52:48 prmgls01 logger: NetWorker media: (warning) /dev/nst5
> moving: fsr 2878: Input/output error
> May 31 06:52:48 prmgls01 logger: NetWorker media: (emergency) could not
> position 000268L2 to file 113, record 2880
> May 31 06:52:48 prmgls01 logger: NetWorker media: (warning) /dev/nst5
> reading: Input/output error
> May 31 06:52:49 prmgls01 last message repeated 4 times
> May 31 06:52:49 prmgls01 logger: NetWorker media: (notice) Volume "000268L2"
> on device "/dev/nst5": Cannot decode block. Verify the device configuration.
> Tape positioning by record is disabled.
> May 31 06:52:49 prmgls01 logger: NetWorker media: (warning) /dev/nst5
> reading: Input/output error
> May 31 06:52:49 prmgls01 logger: NetWorker media: (warning) verification of
> volume "000268L2", volid 1665199573 failed, can not read record 2880 of file
> 113 on LTO Ultrium-2 tape 000268L2
> May 31 06:52:49 prmgls01 logger: NetWorker media: (notice) verification of
> volume "000268L2", volid 1665199573 failed, volume is being marked as full.
> May 31 06:52:49 prmgls01 logger: NetWorker media: (notice) Save set
> (440489257) prcsdb21-i:RMAN:/home/oracle/rman_scripts/itsprd_NMO volume
> 000268L2 on /dev/nst5 is being terminated because: Media verification failed
> May 31 06:52:49 prmgls01 logger: NetWorker media: (notice) Save set
> (457266465) prcsdb21-i:RMAN:/home/oracle/rman_scripts/itsprd_NMO volume
> 000268L2 on /dev/nst5 is being terminated because: Media verification failed
> May 31 06:52:49 prmgls01 logger: NetWorker media: (notice) Save set
> (474043677) prcsdb21-i:RMAN:/home/oracle/rman_scripts/itsprd_NMO volume
> 000268L2 on /dev/nst5 is being terminated because: Media verification failed
> May 31 06:52:49 prmgls01 logger: NetWorker media: (notice) Save set
> (490820893) prcsdb21-i:RMAN:/home/oracle/rman_scripts/itsprd_NMO volume
> 000268L2 on /dev/nst5 is being terminated because: Media verification failed
> May 31 06:52:49 prmgls01 logger: NetWorker media: (notice) Save set
> (524374977) prcsdb11-i:RMAN:/home/oracle/rman_scripts/cpprd_NMO volume
> 000268L2 on /dev/nst5 is being terminated because: Media verification failed
> May 31 06:52:49 prmgls01 logger: NetWorker media: (notice) Save set
> (608260984) prcsdb11-i:RMAN:/home/oracle/rman_scripts/hcprd_NMO volume
> 000268L2 on /dev/nst5 is being terminated because: Media verification failed
> May 31 06:52:49 prmgls01 logger: NetWorker media: (notice) Save set
> (641815388) prcsdb01-i:RMAN:/home/oracle/rman_scripts/dwprddb_NMO volume
> 000268L2 on /dev/nst5 is being terminated because: Media verification failed
> May 31 06:52:49 prmgls01 logger: NetWorker media: (notice) Save set
> (658592604) prcsdb01-i:RMAN:/home/oracle/rman_scripts/dwprddb_NMO volume
> 000268L2 on /dev/nst5 is being terminated because: Media verification failed
> May 31 06:52:56 prmgls01 logger: NetWorker media: (waiting) Waiting for 1
> writable volumes to backup pool
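
The "is being terminated" lines above tell you which clients had sessions
cut off, and those are the database servers worth checking for leftover
Oracle sessions. A minimal Python sketch for pulling them out of the log
follows. It assumes the raw daemon.log text (without the "> " quoting added
by email) and the message wording shown above. The function name is
hypothetical.

```python
import re
from collections import defaultdict

# Matches messages like:
#   Save set (440489257) client:RMAN:/path volume 000268L2 on /dev/nst5
#   is being terminated because: ...
# \s+ is used between tokens so a wrapped line still matches.
TERMINATED_RE = re.compile(
    r"Save set\s+\((?P<ssid>\d+)\)\s+(?P<client>[^:\s]+):(?P<name>\S+)\s+"
    r"volume\s+(?P<volume>\S+)\s+on\s+(?P<device>\S+)\s+is\s+being\s+terminated"
)


def terminated_save_sets(log_text):
    """Group terminated save sets by client: {client: [(ssid, name, volume)]}."""
    by_client = defaultdict(list)
    for match in TERMINATED_RE.finditer(log_text):
        by_client[match.group("client")].append(
            (match.group("ssid"), match.group("name"), match.group("volume"))
        )
    return dict(by_client)
```

Calling terminated_save_sets(open("/nsr/logs/daemon.log").read()) on the
NetWorker server would give one entry per affected client.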
> 
> 
> Here is the Oracle database log
> -----------------------------------------
> The database backups hung on May 31st and were still running in the
> database on Jun 1st.
> 
> I found the entries below in $ORACLE_BASE/admin/dwprddb/udump/sbtio.log
> (on prcsdb01):
> 
> SBT-17101 (0) 05/31/08 06:52:48 lnm_stream_write: asdf_output_section1()
> failed
> xdr=0x0x66c3530: bp=0x0x2a9bd7e048: send_len=262144: type=12800:
> fhand=0x0x66d0190: wrapper=0x(nil): directp=0x0x2a9a82de00 (1:4:0)
> SBT-17095 (0) 05/31/08 06:52:48 lnm_stream_write: asdf_output_section1()
> failed
> xdr=0x0x66c3530: bp=0x0x2a9bd7e048: send_len=262144: type=12800:
> fhand=0x0x66d0190: wrapper=0x(nil): directp=0x0x2a9a7dde00 (1:4:0)
> SBT-17095 (0) 05/31/08 06:52:49 lnm_stream_write: asdf_output_section1()
> failed
> xdr=0x0x66c3530: bp=0x0x2a9bd7e048: send_len=262144: type=12800:
> fhand=0x0x66d0190: wrapper=0x(nil): directp=0x0x2a9a82de00 (1:4:0)
> SBT-17101 (0) 05/31/08 06:52:49 lnm_stream_write: asdf_output_section1()
> failed
> xdr=0x0x66c3530: bp=0x0x2a9bd7e048: send_len=262144: type=12800:
> fhand=0x0x66d0190: wrapper=0x(nil): directp=0x0x2a9a88de00 (1:4:0)
> SBT-17101 (0) 05/31/08 06:52:49 lnm_stream_end_file: The savefile_fini()
> call failed. (1:5:0)
> SBT-17095 (0) 05/31/08 06:52:49 lnm_stream_write: asdf_output_section1()
> failed
> xdr=0x0x66c3530: bp=0x0x2a9bd7e048: send_len=262144: type=12800:
> fhand=0x0x66d0190: wrapper=0x(nil): directp=0x0x2a9a88de00 (1:4:0)
> SBT-17095 (0) 05/31/08 06:52:49 lnm_stream_end_file: The savefile_fini()
> call failed. (1:5:0)
> SBT-17095 (0) 05/31/08 06:52:54 lnm_index_remove_SSID: The removal of SSID
> '641815388' failed with error: Delete save set
> operation already in progress' (2:2:0)
> 
> --
> Long Nguyen
> email: lnguyen AT calpoly DOT edu
> http://www.calpoly.edu/~lnguyen
> work phone: 805-756-1550
> 
> To sign off this list, send email to listserv AT listserv.temple DOT edu and
> type "signoff networker" in the body of the email. Please write to
> networker-request AT listserv.temple DOT edu if you have any problems with
> this list.
> You can access the archives at
> http://listserv.temple.edu/archives/networker.html or
> via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER

