Veritas-bu

[Veritas-bu] NB4.5 Datacenter bad? tape problem + L700 cleaning

From: rafamiga AT uucp.polbox DOT pl (rafal wiosna)
Date: Sun, 25 May 2003 11:37:10 +0200
        This morning I've discovered a third frozen tape in my L700. That's
strange; the library sits in a comfortable place, nobody touches it, and it
runs on NB autopilot. Bad tapes?...

        I can still read the images from this FROZEN tape. When grepping the
logs I found this:

08:16:21.665 [1443] <4> write_backup: begin writing backup id 
tdcf03_1053839725, copy 1, fragment 2, to media id 000140 on drive index 2
08:16:21.666 [1443] <2> signal_parent: sending SIGUSR1 to bpbrm (pid = 1439)
08:16:21.666 [1443] <2> write_data: twin_index: 0 active: 1 dont_process: 0 
wrote_backup_hdr: 0 finished_buff: 0 saved_cindex: 0
08:16:21.672 [1443] <2> write_data: absolute block position prior to writing 
backup header(s) is 389156, copy 1
08:16:21.672 [1443] <2> write_data: block position check: actual 389156, 
expected 504267
08:16:21.695 [1443] <16> write_data: FREEZING media id 000140, External event 
caused rewind during write, all data on media is lost
08:16:21.731 [1443] <2> log_media_error: successfully wrote to error file - 
05/25/03 08:16:21 000140 2 WRITE_ERROR
08:16:21.743 [1443] <2> check_error_history: called from bptm line 14685, 
EXIT_Status = 84
08:16:21.752 [1443] <2> check_error_history: drive index = 2, media id = 
000140, time = 05/25/03 08:16:21, both_match = 0, media_match = 0, drive_match 
= 0
08:16:21.752 [1443] <2> io_close: closing 
/usr/openv/netbackup/db/media/tpreq/000140, from bptm.c.12914
08:16:21.753 [1443] <2> tpunmount: tpunmount'ing 
/usr/openv/netbackup/db/media/tpreq/000140
08:16:21.754 [1443] <2> TpUnmountWrapper: SCSI RELEASE

        What external event at 8am on a Sunday?!? The good thing is I believe
I can unfreeze the tape and get some contents out of it anyway. I will
unfreeze the tape and see what happens.
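If I understand the docs right, the unfreeze is done with bpmedia; I plan
to try something like this [path assumes a default install, and please
correct me if the NB 4.5 syntax differs]:

```shell
# Unfreeze the media so NetBackup will consider it writable again
# (run on the media server that owns the tape).
/usr/openv/netbackup/bin/admincmd/bpmedia -unfreeze -m 000140

# Check the media state afterwards.
/usr/openv/netbackup/bin/admincmd/bpmedialist -m 000140
```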

        But with another tape [frozen earlier] the matter is worse. When I
try to read the catalog off the media I get:

10:31:34.879 [9424] <2> io_open: SCSI RESERVE
10:31:34.936 [9424] <2> io_open: file 
/usr/openv/netbackup/db/media/tpreq/000028 successfully opened
10:31:34.937 [9424] <2> load_media: media id 000028 mounted on drive index 1
10:31:34.965 [9424] <2> io_read_media_header: drive index 1, reading media 
header, buflen = 32768, buff = 0x22aaa0, copy 1
10:31:35.222 [9424] <2> io_ioctl: command (5)MTREW 1 from (bptm.c.6182) on 
drive index 1
10:31:46.129 [9424] <16> io_read_media_header: tapemark or blank tape 
encountered reading media header, drive index 1
10:31:46.474 [9424] <2> check_error_history: called from bptm line 2724, 
EXIT_Status = 85
10:31:46.483 [9424] <2> check_error_history: drive index = 1, media id = 
000028, time = 05/25/03 10:31:46, both_match = 0, media_match = 0, drive_match 
= 0
10:31:46.483 [9424] <2> io_close: closing 
/usr/openv/netbackup/db/media/tpreq/000028, from bptm.c.12914
10:31:46.483 [9424] <2> tpunmount: tpunmount'ing 
/usr/openv/netbackup/db/media/tpreq/000028
10:31:46.485 [9424] <2> TpUnmountWrapper: SCSI RELEASE
10:31:46.544 [9424] <2> bptm: EXITING with status 85 <----------
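For the record, I believe this catalog read is what import phase 1 does
under the hood; what I ran was roughly [hedged, NB 4.5 syntax as I
understand it]:

```shell
# Phase 1 import: re-read the media header and backup headers to rebuild
# catalog information for this tape -- this is the step that dies with
# status 85 (media read error) above.
/usr/openv/netbackup/bin/admincmd/bpimport -create_db_info -id 000028
```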

        The good thing is it holds only 8 GB [out of 100 GB] of images, all
expiring June 2nd, so it's not critical to the data consistency of my backup
library. I can live without it. But the question is: how can I remove this
tape from the library and re-use it? With the previous tape I had problems
with, I tried to expire it, but in the frozen state that was impossible [or
was I doing things in the wrong order?]. Only when I got the replacement
tape [since it was the first mount that caused the trouble, I managed to
replace the tape for free] was I able to re-claim the tape as a clean one,
with no contents.

        Any suggestions on what I should do now? This tape is only about 3
months old. I know this is magnetic media and problems can happen, but is
there any way to check the tape [I don't care about erasing the contents]
and/or re-use it?
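What I *think* should work -- please correct me if the order is wrong --
is to expire the images first and then relabel the tape [the -d density
below is just a placeholder; substitute whatever your drives use]:

```shell
# Expire all images on the tape immediately.
# WARNING: this removes the catalog references to those backups.
/usr/openv/netbackup/bin/admincmd/bpexpdate -m 000028 -d 0 -force

# Then relabel the tape so it comes back as a clean, reusable one
# (-o overwrites the existing media header).
/usr/openv/netbackup/bin/bplabel -m 000028 -d dlt -o
```

If the media is still frozen, I suppose a bpmedia -unfreeze has to come
first, since frozen media seems to resist expiration.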

        As for cleaning, I believe my StorageTek L700 is managing the
cleaning process itself -- it has the autoclean feature on. Okay, but the
cleaning tapes have never been used in 8 months. I wish I could clean the
drives once every 3-4 months JUST IN CASE, to prevent an error rather than
wait for the first error to trigger a cleaning. The error count on each of
the 3 drives is 0 -- and that's probably why the cleaning never happened.
And I wish the L700 wouldn't put the cleaning tapes in slots NB cannot
reach [it does not see them in the robot].
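I had thought Media Manager's tpclean could force this on a schedule of my
choosing -- something like the sketch below -- but as I understand it that
only works if NB [not the library] owns a cleaning tape it can actually see:

```shell
# List per-drive cleaning statistics and configured frequencies.
/usr/openv/volmgr/bin/tpclean -L

# Clean a drive on demand (drive name as shown by tpclean -L / tpconfig).
/usr/openv/volmgr/bin/tpclean -C drive_name

# Set a cleaning frequency of 25 mount-hours for a drive.
/usr/openv/volmgr/bin/tpclean -F drive_name 25
```

With library autoclean on and the cleaning tapes hidden from the robot
inventory, I don't see how NB could ever satisfy a tpclean request -- which
is exactly my complaint.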

-- 
__________________________________________________________________________
rafal wiosna * Świat Internet S.A. * Polbox * In ARP we trust * AR164 V6TB
RAFD-RIPE * Those who surf the net get to play with the mouse * PGPID: 3CDCB7A9
