Veritas-bu

[Veritas-bu] Drives keep getting downed.

2002-12-20 09:28:31
Subject: [Veritas-bu] Drives keep getting downed.
From: Octave Orgeron <oorgeron AT hibbertgroup DOT com> (Octave Orgeron)
Date: Fri, 20 Dec 2002 09:28:31 -0500 (EST)
Hi,

I recently upgraded to NetBackup 4.5 DataCenter on our Solaris server. We
currently have three old Exabyte EZ17 autoloaders connected to the system.
Since I activated the schedules, I keep running into a problem where the tape
drives error out during backups and get downed. As a result, only a few of my
servers get backed up, while the rest fail because none of the tape drives are
available. I've cleaned the tape drives, and that was not the problem. The
tape is brand new. Here are some of the error messages I see in the
/var/adm/messages file:

Dec 19 19:05:49 hgsun27 scsi: [ID 243001 kern.info] /sbus@1f,0/QLGC,isp@3,10000/st@0,0 (st21):
Dec 19 19:05:49 hgsun27         Fixed record length (1024 byte blocks) I/O
Dec 19 19:08:21 hgsun27 bptm[1109]: [ID 557619 daemon.error] Application (NetBackup) has DOWN'ed drive index 2, see application error log for further information
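
For anyone who wants to check their own messages file for the same symptom,
here is a minimal Python sketch that pulls the block size out of the st
driver's "Fixed record length" lines. The pattern is based only on the excerpt
above, and the function name is made up for illustration; other Solaris
releases or drivers may word the message differently.

```python
import re

# Pattern copied from the wording of the syslog excerpt above (an assumption:
# other st driver versions may phrase this line differently).
FIXED_RE = re.compile(r"Fixed record length \((\d+) byte blocks\) I/O")


def fixed_block_sizes(lines):
    """Return the block sizes (in bytes) reported by 'Fixed record length' lines."""
    sizes = []
    for line in lines:
        match = FIXED_RE.search(line)
        if match:
            sizes.append(int(match.group(1)))
    return sizes


# Sample input: the two st driver lines quoted in this message.
sample = [
    "Dec 19 19:05:49 hgsun27 scsi: [ID 243001 kern.info] "
    "/sbus@1f,0/QLGC,isp@3,10000/st@0,0 (st21):",
    "Dec 19 19:05:49 hgsun27         Fixed record length (1024 byte blocks) I/O",
]

print(fixed_block_sizes(sample))
```

To run it against a real file, pass an open file object instead of the sample
list, for example fixed_block_sizes(open("/var/adm/messages")).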

This is what I see in the bptm log file:

19:08:20.547 [1109] <2> log_media_error: successfully wrote to error file - 12/19/02 19:08:20 C02503 2 WRITE_ERROR
TIR file, size is 264810 bytes + 0 GB
19:08:20.504 [1109] <2> write_data_tir: absolute block position prior to writing backup header(s) is 36, copy 1
19:08:20.504 [1109] <2> write_data_tir: block position check: actual 36, expected 5
19:08:20.528 [1109] <16> write_data_tir: FREEZING media id C02503, too many data blocks written, check tape/driver block size configuration
19:08:20.547 [1109] <2> log_media_error: successfully wrote to error file - 12/19/02 19:08:20 C02503 2 WRITE_ERROR
19:08:20.564 [1109] <2> check_error_history: called from bptm line 15872, EXIT_Status = 84
19:08:21.070 [1109] <2> check_error_history: drive index = 2, media id = C02503, time = 12/19/02 19:08:20, both_match = 0, media_match = 0, drive_match = 2
19:08:21.070 [1109] <2> io_close: closing /usr/openv/netbackup/db/media/tpreq/C02503, from bptm.c.12711
19:08:21.071 [1109] <2> tpunmount: tpunmount'ing /usr/openv/netbackup/db/media/tpreq/C02503
19:08:21.071 [1109] <2> TpUnmountWrapper: SCSI RELEASE
19:08:21.118 [1109] <8> check_error_history: DOWN'ing drive index 2, it has had at least 3 errors in last 12 hour(s)
19:08:21.120 [1109] <2> bptm: EXITING with status 84 <----------
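
To pick the interesting entries out of a longer bptm log, something like the
following Python sketch could be used. It looks for "block position check"
entries where the actual and expected positions differ, and for drives that
were DOWN'ed. The patterns are taken only from the excerpt above and the
function name is hypothetical; other NetBackup versions may log these events
with different wording.

```python
import re

# Both patterns mirror the wording in the bptm excerpt above (an assumption
# for any other NetBackup release). \s+ tolerates lines wrapped by a mailer.
POSITION_RE = re.compile(
    r"block position check:\s+actual\s+(\d+),\s+expected\s+(\d+)"
)
DOWN_RE = re.compile(r"DOWN'ing\s+drive\s+index\s+(\d+)")


def summarize_bptm(text):
    """Return (mismatches, downed_drives) found in bptm log text.

    mismatches is a list of (actual, expected) block positions that differ;
    downed_drives is a list of drive indexes that were DOWN'ed.
    """
    mismatches = []
    for actual, expected in POSITION_RE.findall(text):
        if int(actual) != int(expected):
            mismatches.append((int(actual), int(expected)))
    downed_drives = [int(index) for index in DOWN_RE.findall(text)]
    return mismatches, downed_drives


# Sample input: two entries from the log above, wrapped as a mailer might.
sample = (
    "19:08:20.504 [1109] <2> write_data_tir: block position check: actual 36, \n"
    "expected 5\n"
    "19:08:21.118 [1109] <8> check_error_history: DOWN'ing drive index 2, it has had\n"
    "at least 3 errors in last 12 hour(s)\n"
)

mismatches, downed_drives = summarize_bptm(sample)
print("position mismatches (actual, expected):", mismatches)
print("downed drive indexes:", downed_drives)
```

Working on the whole text rather than line by line means entries that were
split by line wrapping are still matched.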

These error messages seem to point to a problem with the block size. Also, the
backups that do run before the drives get downed take a lot longer than they
used to. I have a full backup of a test system that I started three days ago,
and it's still running! Normally, I can do a full backup of all 25 of my
systems within 7 hours. Nothing has changed in the hardware or the network.
What can I do to fix this problem?

Thanks in advance for any help!


******************************************************
*     Octave J. Orgeron      *   Specializing in :   *
* Unix Systems Administrator *  Solaris/Tru64/Linux  *
*     The Hibbert Group      *   Certified Solaris   *
* oorgeron AT hibbertgroup DOT com  * Systems Administrator *
******************************************************




