Veritas-bu

[Veritas-bu] Re: Help! All Tapes From Scratch Gone - All jobs failed :(

2006-05-03 23:18:10
Subject: [Veritas-bu] Re: Help! All Tapes From Scratch Gone - All jobs failed :(
From: bob944 AT attglobal DOT net (bob944)
Date: Wed, 3 May 2006 23:18:10 -0400
> I have had this happen to me on multiple occasions. I have had
> no good explanation for this NBU behavior from Symantec/VERITAS. My 

Jeez, how hard is it to key in "frozen media" at support.veritas.com?
And the first one listed is... ta-da!... "How to troubleshoot frozen
media on Unix and Windows."  Just tryin' to help.  :-)

> best guess:
> NBU is unable to verify status or information from the robot for
> what ever the reason may be. Due to the lack of information it
> chooses to freeze the available media until an admin sorts things out 

It's not "lack of information," it's "errors."  NetBackup tried to use
the tape and a Bad Thing caused the operation to *fail*.  I'd suggest
troubleshooting the Bad Thing.

> I have a script that executes early in the morning daily to try and
> unfreeze all frozen media because I have lost multiple backups due to
> this problem. There are many reasons for frozen media. I choose not to
> investigate unless a media id shows up on the script output more than
> once.

Amazing.  So, in your environment, it's considered more cost-effective
to risk losing data or failing another backup than to throw out a $50
tape?  Sorry, that's a rude way to put it, but it definitely seems like
our levels of risk tolerance in the "no do-overs" disaster-recovery
world are different.

IF everything has been running normally
   IF a tape failed last night in three drives
      THEN duplicate any data already on the tape
           throw the freakin' tape away
   ELSE IF a drive is down because three tapes failed using it
      THEN fix the drive
           up the drive
   ELSE IF a boatload of your tapes are frozen
      THEN put the tapes in right-side up
           turn off the write-protect tab
           back out yesterday's configuration change
           fix your fibre or SCSI
      IOW, your OS, drives or media are broken or misconfigured--
           fix the root cause so this never happens again
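
For anyone who would rather script the triage than script the unfreeze,
the decision tree above could be sketched roughly like this.  It is only
an illustration, not anything NetBackup ships: the function name, the
(media_id, drive) input format and the "boatload" threshold of 10 are
all made up here, and you would have to collect the failure records
from your own logs or reports.

```python
from collections import defaultdict

def triage(failures, frozen_count=0, boatload=10):
    """Suggest an action from last night's media errors.

    failures     -- iterable of (media_id, drive) pairs, one per failed
                    attempt (hypothetical format, gather it yourself)
    frozen_count -- number of tapes currently frozen
    boatload     -- assumed threshold for "something systemic is wrong"
    """
    drives_per_tape = defaultdict(set)
    tapes_per_drive = defaultdict(set)
    for media_id, drive in failures:
        drives_per_tape[media_id].add(drive)
        tapes_per_drive[drive].add(media_id)

    # A tape that failed in three different drives: the tape is the problem.
    bad_tapes = sorted(t for t, d in drives_per_tape.items() if len(d) >= 3)
    # A drive that failed with three different tapes: the drive is the problem.
    bad_drives = sorted(d for d, t in tapes_per_drive.items() if len(t) >= 3)

    if bad_tapes:
        return ("duplicate_then_discard_tapes", bad_tapes)
    if bad_drives:
        return ("fix_then_up_drives", bad_drives)
    if frozen_count >= boatload:
        # OS, drives or media are broken or misconfigured.
        return ("fix_root_cause", [])
    return ("nothing_to_do", [])
```

Note that nothing in it unfreezes a tape automatically; every branch
ends with a human fixing or discarding something.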

Here's my really easy rule:  on the first occurrence of a media error, I
toss the tape.  I think my time and my data are worth it.

Personally, I can't imagine being in a recovery scenario--whether it's a
user file or a full DR scenario--and saying "Hey, it took five tries to
get past that write error, but I finally got the backup to finish; beats
me why I can't restore it.  Must be stupid NetBackup's fault."


