Veritas-bu

[Veritas-bu] How to prevent NBU from immediately using a media that failed before

2005-10-28 20:46:33
Subject: [Veritas-bu] How to prevent NBU from immediately using a media that failed before
From: netbacker AT gmail DOT com (Sto Rage© )
Date: Fri, 28 Oct 2005 17:46:33 -0700
Thanks to all who replied. Looking at the issues we have been having,
I think setting MEDIA_ERROR_THRESHOLD to 0 is the best option for us,
i.e. freezing the tape immediately. We can then investigate the frozen
tapes later to see what the issue actually was, and unfreeze and reuse
the media if appropriate. (Mark, would you mind sending us the script
you mentioned?)
We would like to freeze the tape on the first error so that NBU doesn't
waste time using the same tape for the next 4 or 5 jobs in the
queue. Last time this happened, we lost more than 8 hours of backup
time. The fault on that tape was somewhere at the end, where it failed
to seek. So each job that failed wrote anywhere from 85GB to 100GB to
that tape before it failed (LTO-1 media).


-G

On 10/28/05, Mark.Donaldson AT cexp DOT com <Mark.Donaldson AT cexp DOT com> 
wrote:
> Frozen, though, doesn't necessarily mean broken.  A media fault is possible
> but then there's the drive faults too, loader error, sunspots, plague.
>
> I've got a script that sweeps the frozen tapes, keeps a count, and unfreezes
> them if there haven't been enough failures.  Any tape that freezes over 3
> times stays frozen.  It may be a method you could adapt.
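
The sweep policy described above could be sketched roughly as follows. This is not Mark's script, only a minimal illustration of the counting rule; the function name `sweep_frozen`, the `freeze_counts` dictionary, and the second media ID are hypothetical, and the NetBackup commands for listing and unfreezing media are deliberately left out.

```python
def sweep_frozen(frozen_ids, freeze_counts, max_freezes=3):
    """Decide which currently frozen tapes to unfreeze.

    frozen_ids    -- media IDs found frozen during this sweep
    freeze_counts -- running count of freezes per media ID (updated in place)
    max_freezes   -- a tape frozen more than this many times stays frozen
    """
    to_unfreeze, keep_frozen = [], []
    for media_id in frozen_ids:
        # Every sweep that finds the tape frozen counts as one more freeze.
        freeze_counts[media_id] = freeze_counts.get(media_id, 0) + 1
        if freeze_counts[media_id] > max_freezes:
            keep_frozen.append(media_id)
        else:
            to_unfreeze.append(media_id)
    return to_unfreeze, keep_frozen


# Hypothetical history: 001956 has already been frozen three times.
counts = {"001956": 3}
unfreeze, keep = sweep_frozen(["001956", "002001"], counts)
print(unfreeze)  # ['002001']
print(keep)      # ['001956']
```

In a real sweep the counts would need to be persisted between runs (for example in a small text file), and the tapes in the first list would then be unfrozen with whatever NetBackup tooling the site uses.
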
>
> -M
>
> -----Original Message-----
> From: veritas-bu-admin AT mailman.eng.auburn DOT edu
> [mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of
> ida3248b AT post.cybercity DOT dk
> Sent: Friday, October 28, 2005 2:28 AM
> To: Sto Rage(c); Veritas NBU Mailing List (E-mail)
> Subject: Re: [Veritas-bu] How to prevent NBU from immediately using a
> media that failed before
>
>
> Hi G
>
> Under INSTALLPATH/netbackup you can create these files:
>
> MEDIA_ERROR_THRESHOLD: the number of allowed errors
>
> TIME_WINDOW: the period in which that number of errors must occur (in hours)
>
> If you put 0 in the first file, the tape should get frozen at the first error
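
The rule described above can be illustrated with a small sketch: a tape is frozen once the errors inside the time window exceed the allowed number. This only models the rule as stated in this thread, not NetBackup's actual implementation; the function name `should_freeze` and the 12-hour window are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def should_freeze(error_times, now, threshold, window_hours):
    """Return True if the errors inside the window exceed the allowed count."""
    window_start = now - timedelta(hours=window_hours)
    recent = [t for t in error_times if window_start <= t <= now]
    return len(recent) > threshold


# One error has just been logged against the tape.
now = datetime(2005, 10, 27, 2, 1, 58)
errors = [now]

print(should_freeze(errors, now, threshold=0, window_hours=12))  # True
print(should_freeze(errors, now, threshold=2, window_hours=12))  # False
```

With a threshold of 0 the very first error is already "more than allowed", which matches the behaviour described above.
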
>
> Regards
> Michael
>
> On Thu, 27 Oct 2005 11:11:11 -0700, Sto Rage(c) wrote
> > Hi,
> >   Here's my problem: a backup job writes to a media and then fails
> > with a write error, position error, etc. The job then gets re-queued and
> > runs again, and NBU uses this very same tape, writes, and fails
> > again. This happens until the max retries of the job are exceeded and
> > then the job fails.
> > Why does it reuse the same tape again and again for the same
> > job/policy? Is there a counter that we can set to prevent NBU from
> > retrying a media that errors out the first time?
> > The logs below from bptm show the media ID 001956 being repeatedly used.
> >
> > 02:01:58.703 [5842] <2> log_media_error: successfully wrote to error
> > file - 10/27/05 02:01:58 001956 13 POSITION_ERROR
> > 02:29:33.454 [21029] <2> log_media_error: successfully wrote to error
> > file - 10/27/05 02:29:33 001956 13 POSITION_ERROR
> > 03:19:20.128 [22766] <2> log_media_error: successfully wrote to error
> > file - 10/27/05 03:19:20 001956 13 POSITION_ERROR
> > 04:30:34.394 [25958] <2> log_media_error: successfully wrote to error
> > file - 10/27/05 04:30:34 001956 13 POSITION_ERROR
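
For spotting a repeatedly failing tape, entries like the ones above can be counted per media ID. The sketch below is an assumption-laden illustration: the regular expression is inferred only from the four lines quoted here (each entry joined onto one line), the meaning of the numeric field after the media ID is not stated in the thread so it is skipped, and the names `LINE_RE` and `parse_media_errors` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the tail of a log_media_error entry, e.g.
# "10/27/05 02:01:58 001956 13 POSITION_ERROR"
LINE_RE = re.compile(
    r"(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<media>\S+) \d+ (?P<error>[A-Z_]+)"
)

SAMPLE = """\
02:01:58.703 [5842] <2> log_media_error: successfully wrote to error file - 10/27/05 02:01:58 001956 13 POSITION_ERROR
02:29:33.454 [21029] <2> log_media_error: successfully wrote to error file - 10/27/05 02:29:33 001956 13 POSITION_ERROR
03:19:20.128 [22766] <2> log_media_error: successfully wrote to error file - 10/27/05 03:19:20 001956 13 POSITION_ERROR
04:30:34.394 [25958] <2> log_media_error: successfully wrote to error file - 10/27/05 04:30:34 001956 13 POSITION_ERROR
"""


def parse_media_errors(text):
    """Return a list of (timestamp, media_id, error_type) tuples."""
    records = []
    for m in LINE_RE.finditer(text):
        when = datetime.strptime(m.group("stamp"), "%m/%d/%y %H:%M:%S")
        records.append((when, m.group("media"), m.group("error")))
    return records


records = parse_media_errors(SAMPLE)
per_media = Counter(media for _, media, _ in records)
print(per_media["001956"])  # 4
```
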
> >
> >   Ironically, the 5th time it successfully wrote to this tape and
> > continued with the job.
> > We run huge NDMP jobs (average size of each is 2 TB), so when this
> > happens, say, 70% into a job, NBU has to start from the beginning;
> > sadly, checkpoint restart is not an option for NDMP backups. Is this
> > available in 6.0?
> >
> > -G
> >
> > _______________________________________________
> > Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> > http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>
>
> --
> Cybercity Webhosting (http://www.cybercity.dk)
>
>

