[Veritas-bu] Tape/media errors/HP LTO-3

2011-12-14 12:30:39
Subject: [Veritas-bu] Tape/media errors/HP LTO-3
From: tape0999 <nbu-forum AT backupcentral DOT com>
To: VERITAS-BU AT MAILMAN.ENG.AUBURN DOT EDU
Date: Wed, 14 Dec 2011 08:54:08 -0800
The External Event Caused Rewind error is not what it appears to be at first 
blush, and it's a particularly bad error. A couple of years ago we noticed we 
had disappearing backups: data was lost and it wasn't readily apparent why. I 
worked on this for almost a year in total because it can get very bad in some 
instances. A SCSI error generally returns a normal write error or hardware 
error OpCode, but sometimes either the event is so bad that there is a natural 
reason, or there is a protocol error that triggers an unnatural event, and a 
full rewind OpCode is sent instead. This rewinds the tape and kicks it out. 
The big deal here is that the next time that cartridge is loaded by a drive, 
the images on it are overwritten. So, if that was the last backup being 
written to that cartridge, you could have lost another terabyte of other data 
at the same time it occurred. Not to mention that any images that spanned 
multiple tapes are now defunct, because the parts of them on that cartridge 
are gone the instant the rewind occurs.
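
As a rough illustration of why the rewind is so destructive, here is a toy 
Python model. It is not NetBackup or drive code, and the class and function 
names are hypothetical. It assumes only what is described above: the tape is 
written sequentially, and after an unexpected rewind the next write starts at 
the beginning of the tape instead of after the last image.

```python
class ToyTape:
    """Toy model of sequential media: images are written at a single position."""

    def __init__(self):
        self.images = []   # images currently readable on the tape, in order
        self.position = 0  # index where the next write will land

    def write(self, image):
        # Writing at a position discards everything from that position onward.
        del self.images[self.position:]
        self.images.append(image)
        self.position = len(self.images)

    def unexpected_rewind(self):
        # Assumption: the tape goes back to the start and the append
        # position is not restored afterwards.
        self.position = 0


def surviving_images(existing, new_image):
    """Return what is left on the tape after a rewind followed by one write."""
    tape = ToyTape()
    for image in existing:
        tape.write(image)
    tape.unexpected_rewind()
    tape.write(new_image)
    return tape.images
```

In this model, a tape holding three images that takes a rewind and then one 
more write ends up holding only the new image, which is the pattern of loss 
described above.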

A rewind event can be a protocol error. In our case the #1 cause was a problem 
in certain SAN card firmware that triggered a protocol problem; I won't 
mention a vendor because it's been fixed. Another cause can be a drive whose 
main board is going bad and, rather than throwing write errors, it's throwing 
protocol errors instead. A lot of things can cause this problem, but it always 
occurs between the interface and the drive (including the drivers on the host, 
which can be a part of the problem). In some cases, the only way to track 
down the real cause is with a sniffer, if it's on a fabric. A good diagnostic 
step is to have the drive vendor examine the drive's log buffer after 
a rewind and before the drive's error buffer is overwritten.
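
Since the drive's buffer has to be examined before it is overwritten, it helps 
to notice the event quickly. A minimal sketch, assuming your host logs are 
plain text and contain the error name as written in this post (the function 
name and the exact log wording are assumptions, so adjust the pattern to what 
your logs actually say):

```python
import re

# Assumption: the log text contains this phrase; case is ignored.
REWIND_PATTERN = re.compile(r"external event caused rewind", re.IGNORECASE)


def find_rewind_events(lines):
    """Return (line_number, text) for every line that mentions the rewind error."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if REWIND_PATTERN.search(line)
    ]
```

It accepts any iterable of lines, so an open log file can be passed in 
directly; which log file to read depends on your installation.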

But, yes, if there's a chance it could be the drive, then replace it right 
away. The error is causing you data loss during your backup period, so it's a 
no-brainer.

K-
--Tape is dead. Long live the tape.

+----------------------------------------------------------------------
|This was sent by kevin.trotman AT citi DOT com via Backup Central.
|Forward SPAM to abuse AT backupcentral DOT com.
+----------------------------------------------------------------------

