Re: [Veritas-bu] Frozen Tapes

2009-09-03 10:05:15
Subject: Re: [Veritas-bu] Frozen Tapes
From: "Martin, Jonathan" <JMARTI05 AT intersil DOT com>
To: <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Thu, 3 Sep 2009 10:01:55 -0400
Not to be a prude, but I don't concur with simply destroying any tape
that ever gets an error. Ten minutes of troubleshooting isn't going to
kill you, and I would add one item to your troubleshooting list: check
the Problems report for more information. I've had media frozen because
of robotic / tape mount errors, SCSI HBA conflicts, lack of cleaning and
occasionally a bad write.  Back in the DLT tape days I saw write errors
all the time, but since our switch to LTO3 media I've only had 5 or so
go bad, and that's with me shipping media all over the world and
supporting 15+ remote sites.

-Jonathan 

-----Original Message-----
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of bob944
Sent: Thursday, September 03, 2009 4:21 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Frozen Tapes

> 1) NetBackup detects Non-NetBackup data format [...]
> 
> 2) NetBackup detects that [it is a catalog tape...]
> 
> 3) NetBackup tried to read/write to the tape and [got write or 
> positioning errors...]

... if the barcode and recorded mediaID don't match...  if the tape
winds up in the wrong drive, ...

> Assuming I'm correct so far, then is the proper method of 
> troubleshooting Frozen media to:
> 
> 1) Ensure there isn't some catalog data on the tape.
> 
> 2) Ensure that the tapes aren't from some other commercial backup 
> product environment's tape pool (for those of you running multiple 
> commercial backup applications at a single site).
> 
> 3) Make sure your tape drives have been cleaned recently.

No matter what the reason, it should be in the logs; IMO, that should
always be your first troubleshooting step:  find out why it was frozen
and go from there.  Special mention to:

> 4) Use bpmedia -m <media id> -unfreeze to unfreeze the tape(s), make a
> note of the tape you're unfreezing, and leave it in the scratch pool 
> to see if it gets used for tonight's backups.

No.

Either toss it immediately, or, if you _must_ try to re-use it or do
root-cause, put it in the None pool until you can thoroughly test it
end-to-end error-free.  But even if it passes, how much of your time
does it take to exceed the cost of a replacement tape?  How much
time/money will you spend rerunning a backup that fails on that tape
again?  How much time/money/résumé will you spend if you cannot recover
a backup from that tape when you need it?  (I see Simon has commented on
this and I concur.)
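
To put a rough number on the "how much of your time" question above,
here is a minimal sketch. The tape price and hourly rate are made-up
figures for illustration only; plug in your own.

```python
def breakeven_minutes(tape_cost: float, hourly_rate: float) -> float:
    """Minutes of admin time whose cost equals one replacement tape."""
    if tape_cost < 0:
        raise ValueError("tape_cost must not be negative")
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    # cost per minute is hourly_rate / 60, so minutes = tape_cost / (hourly_rate / 60)
    return tape_cost * 60.0 / hourly_rate


def cheaper_to_replace(minutes_spent: float, tape_cost: float, hourly_rate: float) -> bool:
    """True once the time already spent costs more than a new tape."""
    return minutes_spent > breakeven_minutes(tape_cost, hourly_rate)


if __name__ == "__main__":
    # Hypothetical numbers: a 40-unit tape and a 60-per-hour admin.
    print(breakeven_minutes(tape_cost=40.0, hourly_rate=60.0))
    print(cheaper_to_replace(minutes_spent=10, tape_cost=40.0, hourly_rate=60.0))
```

This only counts troubleshooting time.  It ignores the cost of a rerun
backup or a failed restore, which is the bigger part of the argument.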

> Now for my question: Assuming I was correct on my selection criteria 
> and my troubleshooting steps, am I correct in saying that if I came in
> tomorrow and that media from step 4 was frozen a second time, that it 
> indicates that the media is more than likely defective? Is there any 
> other troubleshooting steps anyone would care to add?

Kudos for doing the research you show above.  But why did you list all
those causes but not look in the logs to see which one caused the error
and address it directly?  

If it's a media-overwrite that you haven't allowed, there's no point in
re-running; it'll still be ANSI or whatever.  Then, "why do you have
tapes in inventory that you must preserve but rely on a method that's
only a mouse-click away from causing someone a disaster"
becomes a critical question.

If it was media errors, NetBackup already made the educated guess of
whether it was drive or media (see the manual), and that'll show up in
the logs.  

If it was a cold-catalog-backup tape, that's in the logs but why/how did
it get put into a scratch or data pool?
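
The triage above can be written down as a small lookup, if that helps
anyone turn it into a checklist.  This is only a sketch of the advice in
this message: the cause names and the wording of each step are my own
labels, not anything NetBackup prints, and you still have to read the
logs yourself to decide which cause applies.

```python
from enum import Enum


class FreezeCause(Enum):
    # Hypothetical labels for the causes discussed in this thread.
    NON_NETBACKUP_FORMAT = "non-NetBackup data format"
    CATALOG_TAPE = "catalog backup tape"
    MEDIA_ERRORS = "read/write or positioning errors"
    UNKNOWN = "not yet determined"


NEXT_STEP = {
    FreezeCause.NON_NETBACKUP_FORMAT: (
        "Do not unfreeze and re-run; the tape will still hold foreign data. "
        "Find out why a tape you must preserve is in this inventory."
    ),
    FreezeCause.CATALOG_TAPE: (
        "Find out how a catalog tape got into a scratch or data pool."
    ),
    FreezeCause.MEDIA_ERRORS: (
        "Check the logs for whether the drive or the media was blamed. "
        "Toss the tape, or park it in the None pool until it tests clean."
    ),
    FreezeCause.UNKNOWN: (
        "Read the logs first and find out why the tape was frozen."
    ),
}


def next_step(cause: FreezeCause) -> str:
    """Return the suggested follow-up for a given freeze cause."""
    return NEXT_STEP[cause]


if __name__ == "__main__":
    for cause in FreezeCause:
        print(f"{cause.value}: {next_step(cause)}")
```

Note that none of the branches ends in "unfreeze it and leave it in
scratch to see what happens tonight."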


_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu