[Veritas-bu] media position error/media position error

Subject: [Veritas-bu] media position error/media position error
From: DTeklu AT amlaw DOT com (Daniel Teklu)
Date: Tue, 4 Jan 2005 11:51:25 -0500
        I am using NetBackup Server 5.0 on Solaris 9.

        I recently replaced a failed tape drive (L9) and the new one seems to
be working fine. I can inventory, load, and unload tapes, and a user backup
from a client server ran with no problem. However, the full backups keep
failing with error 84 (Media write error) and 86 (Media position error),
and the drive is subsequently downed.

        This is the error I see in the bptm log. I even tried changing the
tapes and I get the same problem.

23:45:08.704 [6239] <2> bptm: INITIATING (VERBOSE = 0): -delete_expired
23:45:08.722 [6239] <2> db_lock_media: unable to lock media at offset 3 (000015)
23:45:08.724 [6239] <2> bptm: EXITING with status 0 <----------
23:45:38.102 [4080] <2> signal_parent: sending SIGUSR1 to bpbrm (pid = 4060)
23:45:38.104 [4080] <2> write_data: attempting write error recovery, err = 5
23:45:38.104 [4080] <2> tape_error_rec: error recovery to block 52918 requested
23:45:38.104 [4080] <2> tape_error_rec: attempting error recovery, delay 3 minutes before next attempt, tries left = 5
23:46:55.079 [6247] <2> bptm: INITIATING (VERBOSE = 0): -count -cmd -rt 8 -rn 0 -stunit TapeL9 -den 13 -mt 2 -masterversion 500000
23:46:55.082 [6247] <2> bptm: EXITING with status 0 <----------
23:48:38.104 [4080] <2> io_ioctl: command (0)MTWEOF 0 from (overwrite.c.191) on drive index 0
23:48:38.104 [4080] <2> io_ioctl: MTWEOF failed during error recovery, I/O error
23:48:38.108 [4080] <2> tape_error_rec: SCSI RESERVE
23:48:38.110 [4080] <2> tape_error_rec: absolute block position after error is 52665
23:48:38.110 [4080] <2> set_job_details: Done
23:48:38.147 [4080] <16> write_data: cannot write image to media id 000015, drive index 0, I/O error
23:48:38.193 [4080] <2> log_media_error: successfully wrote to error file - 01/03/05 23:48:38 000015 0 WRITE_ERROR
23:48:38.216 [4080] <2> check_error_history: called from bptm line 17142, EXIT_Status = 84
23:48:38.223 [4080] <2> check_error_history: drive index = 0, media id = 000015, time = 01/03/05 23:48:38, both_match = 0, media_match = 0, drive_match = 2
23:48:38.223 [4080] <2> io_close: closing /usr/openv/netbackup/db/media/tpreq/000015, from bptm.c.15265
23:48:38.223 [4080] <2> tpunmount: tpunmount'ing /usr/openv/netbackup/db/media/tpreq/000015
23:48:38.224 [4080] <2> TpUnmountWrapper: SCSI RELEASE
23:48:38.228 [4080] <2> TpUnmountWrapper: retrying open, errno = I/O error
23:49:02.171 [4080] <2> set_job_details: Done
23:49:02.203 [4080] <8> check_error_history: DOWN'ing drive index 0, it has had at least 3 errors in last 12 hour(s)
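
        To pick the failing job's entries out of the bptm log, a filter along
these lines may help. It is only a rough sketch in Python, not a NetBackup
tool. It assumes each entry starts with "HH:MM:SS.mmm [pid] <n> function:"
as in the excerpt above, treats the number in angle brackets as a severity
where higher means more serious (my assumption), and folds any line that
does not start with a timestamp into the previous entry. The names ENTRY,
parse_entries and filter_entries are made up for this example.

```python
import re

# Assumed entry layout, inferred from the bptm excerpt in this post:
#   HH:MM:SS.mmm [pid] <severity> function: message
ENTRY = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] <(\d+)> (\w+): (.*)$")


def parse_entries(lines):
    """Yield (time, pid, severity, function, message) tuples.

    Lines that do not start a new entry are treated as wrapped
    continuations and appended to the previous message.
    """
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = ENTRY.match(line)
        if match:
            if current:
                yield tuple(current)
            time, pid, severity, function, message = match.groups()
            current = [time, int(pid), int(severity), function, message]
        elif current:
            current[4] += " " + line
    if current:
        yield tuple(current)


def filter_entries(lines, pid=None, min_severity=0):
    """Keep entries for one pid (if given) at or above min_severity."""
    return [
        entry
        for entry in parse_entries(lines)
        if (pid is None or entry[1] == pid) and entry[2] >= min_severity
    ]


# A few lines copied from the log above, including wrapped ones.
sample = [
    "23:45:08.724 [6239] <2> bptm: EXITING with status 0 <----------",
    "23:48:38.147 [4080] <16> write_data: cannot write image to media id 000015,",
    "drive index 0, I/O error",
    "23:49:02.203 [4080] <8> check_error_history: DOWN'ing drive index 0, it has",
    "had at least 3 errors in last 12 hour(s)",
]

for entry in filter_entries(sample, pid=4080, min_severity=8):
    print(entry)
```

        With pid 4080 and a minimum severity of 8, only the write_data and
check_error_history entries from the sample are kept.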


        I also see this in /var/adm/messages:

Jan  3 23:45:38 backup scsi: [ID 107833 kern.warning] WARNING: /pci@1f,0/pci@1/scsi@5/st@1,0 (st15):
Jan  3 23:45:38 backup  Error for Command: write                   Error Level: Fatal
Jan  3 23:45:38 backup scsi: [ID 107833 kern.notice]    Requested Block: 52916                     Error Block: 52916
Jan  3 23:45:38 backup scsi: [ID 107833 kern.notice]    Vendor: QUANTUM Serial Number:  ;
Jan  3 23:45:38 backup scsi: [ID 107833 kern.notice]    Sense Key: Media Error
Jan  3 23:45:38 backup scsi: [ID 107833 kern.notice]    ASC: 0x80 (<vendor unique code 0x80>), ASCQ: 0x1, FRU: 0x0
Jan  3 23:49:02 backup bptm[4080]: [ID 557615 daemon.error] Application (NetBackup) has DOWN'ed drive index 0, see application error log for further information



        I ran tpautoconf after the drive was replaced. If I can do a manual
client backup, why do the full backups fail? Any ideas, please?

        -Daniel


