[Veritas-bu] STK Experts, need help

2002-07-01 15:22:18
Subject: [Veritas-bu] STK Experts, need help
From: Ajay.Sharma AT hostcentric DOT com (Ajay Sharma)
Date: Mon, 1 Jul 2002 15:22:18 -0400
I have a problem with our STK 9714. The cleaning tapes get stuck in
the drives, and then we have to take the tape out manually. Auto
cleaning on the library is disabled; NetBackup keeps track of it. Are
there any recommendations?

-----Original Message-----
From: Wilhelm, Patrick [mailto:pawilhelm AT davidson DOT edu]
Sent: Monday, July 01, 2002 11:59 AM
To: David A. Chapa; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] STK Experts, need help


David - 

Is the autoclean feature enabled on this STK?  We were told that it
shouldn't be on our L700 using NB 3.4.1.  Someone else can maybe
confirm.  Do you have syslog on the box controlling the STK?

-----Original Message-----
From: David A. Chapa [mailto:david AT datastaff DOT com]
Sent: Monday, July 01, 2002 11:54 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] STK Experts, need help



Attached is an excerpt from a bptm log for a particular bptm process
that was running over the weekend.  What eventually happened is one of
the drives was down'd by NBU during the duplication process.  I was
wondering if it was a media problem, but then I noticed that the media
that was in use was used by another duplication stream, then another,
etc.  It just finished writing a few minutes ago, successfully I might
add.  Now I'm wondering if it is the physical drive.

The entries in particular that I'm interested in hashing out are the 
tape_error_rec: entries in the bptm log.

What does that mean?

What's with the "delay 3 minutes before next attempt, tries left = 5"?

[That means a total of 18 minutes of possible delays. What do the
delays constitute?]
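
The 18-minute figure can be checked with a small sketch. It assumes one
3-minute delay for the attempt being logged plus one for each remaining
try, which is an interpretation of the log message and not documented
bptm behavior. The function name max_recovery_delay is made up for
illustration.

```python
from datetime import timedelta


def max_recovery_delay(tries_left: int, delay_minutes: int = 3) -> timedelta:
    """Upper bound on time spent waiting between recovery attempts.

    Assumption: the attempt that logs "tries left = N" waits once, and
    each of the N remaining tries waits once more, giving N + 1 delays.
    """
    if tries_left < 0:
        raise ValueError("tries_left must be non-negative")
    return timedelta(minutes=delay_minutes * (tries_left + 1))


if __name__ == "__main__":
    # "tries left = 5" with a 3-minute delay gives 6 waits in total.
    print(max_recovery_delay(5))  # 0:18:00
```

The timestamps in the excerpt are consistent with a 3-minute wait per
attempt (03:57:16 to 04:00:16, and 04:01:02 to 04:04:02).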

And then the entries at 
03:57:16 <2> tape_error_rec: attempting error recovery, delay 3 minutes before next attempt, tries left = 3
04:00:16 <2> tape_error_rec: absolute block position after error is 280103
04:00:16 <2> tape_error_rec: locating to absolute block number 280103 for error recovery
                                                                                ^^^^^^^^
What kind of recovery is it attempting???

04:01:02 <2> tape_error_rec: locate failed in error recovery, locate scsi command failed, key = 0x4, asc = 0x44, ascq = 0xb6
                                                                          ^^^^^^^^^^^^^^
Did the recovery fail with a SCSI command failure?  Does this point to
the DRIVE?  Or to ACS/LS?  (Incidentally, ACS/LS has been installed and
working great with no problems for quite some time.)
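
As a rough aid for reading the key/asc/ascq triple, here is a minimal
sketch that pulls the values out of a bptm line and names the sense key.
The sense key names are the generic ones from the SCSI standard. In that
standard, key 0x4 is HARDWARE ERROR (as opposed to 0x3, MEDIUM ERROR),
ASC 0x44 with ASCQ 0x00 is "internal target failure", and ASCQ values
of 0x80 and above are vendor-specific, so the exact meaning of 0xb6
would have to come from the drive vendor's documentation. The helper
name decode_sense is hypothetical and not part of NetBackup.

```python
import re

# Generic SCSI sense key names (subset of the standard table).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0x8: "BLANK CHECK",
    0xB: "ABORTED COMMAND",
}

# Matches the "key = 0x4, asc = 0x44, ascq = 0xb6" part of a bptm line.
SENSE_RE = re.compile(
    r"key = (0x[0-9a-fA-F]+), asc = (0x[0-9a-fA-F]+), ascq = (0x[0-9a-fA-F]+)"
)


def decode_sense(line):
    """Return the sense data found in a log line, or None if absent."""
    match = SENSE_RE.search(line)
    if match is None:
        return None
    key, asc, ascq = (int(group, 16) for group in match.groups())
    return {
        "key": key,
        "key_name": SENSE_KEYS.get(key, "UNKNOWN"),
        "asc": asc,
        "ascq": ascq,
        # ASCQ 0x80-0xFF is reserved for vendor-specific codes.
        "vendor_specific_ascq": ascq >= 0x80,
    }


if __name__ == "__main__":
    line = ("04:01:02 <2> tape_error_rec: locate failed in error recovery, "
            "locate scsi command failed, key = 0x4, asc = 0x44, ascq = 0xb6")
    print(decode_sense(line))
```

A hardware-error sense key reported by the device suggests looking at
the drive before the media, though this alone is not conclusive.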

04:01:02 <2> tape_error_rec: attempting error recovery, delay 3 minutes before next attempt, tries left = 2
04:04:02 <2> tape_error_rec: absolute block position after error is 280035
04:04:02 <16> write_data: cannot write image to media id ZA0962, drive index 104, I/O error
04:04:02 <2> log_media_error: successfully wrote to error file - 07/01/02 04:04:02 ZA0962 104 WRITE_ERROR
04:04:02 <2> wait_for_sigcld: waiting for child to exit, timeout is 300
04:04:02 <2> check_error_history: called from bptm line 12312, EXIT_Status = 84
04:04:03 <2> check_error_history: drive index = 104, media id = ZA0962, time = 07/01/02 04:04:02, both_match = 0, media_match = 0, drive_match = 2
04:04:03 <2> tpunmount: tpunmount'ing /usr/openv/netbackup/db/media/tpreq/ZA0962
04:04:03 <8> check_error_history: DOWN'ing drive index 104, it has had at least 3 errors in last 12 hour(s)
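
The last few lines suggest that the drive is down'd once enough errors
in the error file match it within a time window. The following is a
minimal sketch of that kind of check, not NetBackup's actual code. It
assumes the error-file record layout shown in the log_media_error line
above (date, time, media id, drive index, error type), and takes the
12-hour window and threshold of 3 from the DOWN'ing message. The
function names and the two earlier sample records are made up for
illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def parse_error_line(line):
    """Parse 'MM/DD/YY HH:MM:SS MEDIA_ID DRIVE_INDEX ERROR_TYPE'."""
    date, time, media_id, drive, error_type = line.split()
    when = datetime.strptime(f"{date} {time}", "%m/%d/%y %H:%M:%S")
    return when, media_id, int(drive), error_type


def drives_over_threshold(lines, now, window_hours=12, threshold=3):
    """Return drive indexes with at least `threshold` errors in the window."""
    cutoff = now - timedelta(hours=window_hours)
    counts = defaultdict(int)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        when, _media_id, drive, _error_type = parse_error_line(line)
        if cutoff <= when <= now:
            counts[drive] += 1
    return sorted(d for d, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    records = [
        "06/30/02 20:15:40 ZA0001 104 WRITE_ERROR",  # hypothetical record
        "07/01/02 01:30:05 ZA0002 104 WRITE_ERROR",  # hypothetical record
        "07/01/02 04:04:02 ZA0962 104 WRITE_ERROR",  # from the log above
    ]
    now = datetime(2002, 7, 1, 4, 4, 3)
    print(drives_over_threshold(records, now))
```

Under this reading, "drive_match = 2" would mean two earlier errors on
drive index 104 with different media, and the current write error makes
three. That points at the drive more than at any single tape, but it is
only an interpretation of the log fields.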

I've also attached a small excerpt from the messages file.

Any ideas would be greatly appreciated.

TIA
David
_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
