Re: [Veritas-bu] Frozen Tapes

2009-09-03 04:24:37
From: "bob944" <bob944 AT attglobal DOT net>
To: <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Thu, 3 Sep 2009 04:21:08 -0400
> 1) NetBackup detects Non-NetBackup data format [...]
> 
> 2) NetBackup detects that [it is a catalog tape...]
> 
> 3) NetBackup tried to read/write to the tape and
> [got write or positioning errors...]

... if the barcode and recorded mediaID don't match...  if the tape
winds up in the wrong drive, ...

> Assuming I'm correct so far, then is the proper method
> of troubleshooting Frozen media to:
> 
> 1) Ensure there isn't some catalog data on the tape.
> 
> 2) Ensure that the tapes aren't from some other
> commercial backup product environment's tape pool
> (for those of you running multiple commercial
> backup applications at a single site).
> 
> 3) Make sure your tape drives have been cleaned
> recently.

No matter what the reason, it should be in the logs; IMO, that
should always be your first troubleshooting step:  find out why it
was frozen and go from there.  Special mention to:

> 4) Use bpmedia -m <media id> -unfreeze to unfreeze the
> tape(s), make a note of the tape you're unfreezing, and
> leave it in the scratch pool to see if it gets used for
> tonight's backups.

No.

Either toss it immediately, or, if you _must_ try to re-use it or do
root-cause, put it in the None pool until you can thoroughly test it
end-to-end error-free.  But even if it passes, how much of your time
does it take to exceed the cost of a replacement tape?  How much
time/money will you spend rerunning a backup that fails on that tape
again?  How much time/money/résumé will you spend if you cannot
recover a backup from that tape when you need it?  (I see Simon has
commented on this and I concur.)

> Now for my question: Assuming I was correct on my selection
> criteria and my troubleshooting steps, am I correct in
> saying that if I came in tomorrow and that media from
> step 4 was frozen a second time, that it indicates that
> the media is more than likely defective? Is there any
> other troubleshooting steps anyone would care to add?

Kudos for doing the research you show above.  But why did you list
all those causes but not look in the logs to see which one caused
the error and address it directly?  

If it's a media-overwrite that you haven't allowed, there's no point
in re-running; it'll still be ANSI or whatever.  Then, "why do you
have tapes in inventory that you must preserve but rely on a method
that's only a mouse-click away from causing someone a disaster"
becomes a critical question.

If it was media errors, NetBackup already made the educated guess of
whether it was drive or media (see the manual), and that'll show up
in the logs.  
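
To illustrate the drive-versus-media reasoning, here is a minimal
Python sketch that cross-references which tapes failed in which
drives.  A tape that errors in several drives points at the media; a
drive that errors with several tapes points at the drive.  The log
layout, the sample entries and the function name cross_reference are
all assumptions made up for this example; the real log location and
format depend on your NetBackup version, so adjust the field
positions to match what you actually see.

```python
from collections import defaultdict

# Made-up sample entries. Assumed layout (not taken from NetBackup docs):
#   DATE TIME MEDIA_ID DRIVE_INDEX ERROR_TYPE
SAMPLE_LOG = """\
09/01/09 02:14:07 A00012 3 WRITE_ERROR
09/01/09 23:40:51 A00012 1 WRITE_ERROR
09/02/09 01:05:33 A00047 3 POSITION_ERROR
09/02/09 22:17:19 A00012 2 READ_ERROR
"""

def cross_reference(log_text):
    """Map each media ID to the drives it failed in, and each drive
    to the media that failed in it."""
    drives_by_media = defaultdict(set)
    media_by_drive = defaultdict(set)
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        media_id, drive = fields[2], fields[3]
        drives_by_media[media_id].add(drive)
        media_by_drive[drive].add(media_id)
    return drives_by_media, media_by_drive

drives_by_media, media_by_drive = cross_reference(SAMPLE_LOG)
for media_id, drives in sorted(drives_by_media.items()):
    print(media_id, "failed in drives", sorted(drives))
for drive, media in sorted(media_by_drive.items()):
    print("drive", drive, "failed with media", sorted(media))
```

In the sample, A00012 fails in three different drives, which is the
pattern that suggests the tape rather than any one drive.  This only
supplements what the logs already tell you; it does not replace
NetBackup's own drive-or-media determination.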

If it was a cold-catalog-backup tape, that's in the logs but why/how
did it get put into a scratch or data pool?


