Veritas-bu

[Veritas-bu] How to prevent NBU from immediately using a media that failed before

2005-10-29 02:07:54
Subject: [Veritas-bu] How to prevent NBU from immediately using a media that failed before
From: Ray.Hill AT ny.frb DOT org (Ray.Hill AT ny.frb DOT org)
Date: Sat, 29 Oct 2005 02:07:54 -0400

Is it possible to get a copy of the script you describe below?
Thanks in advance.





Ray H.
ext 8527
*****************************************************************
Communication Is the Gateway between Ideas and Results
*****************************************************************



Mark.Donaldson AT cexp DOT com
Sent by: veritas-bu-admin AT mailman.eng.auburn DOT edu
10/28/2005 12:16 PM

To
ida3248b AT post.cybercity DOT dk, netbacker AT gmail DOT com,
veritas-bu AT mailman.eng.auburn DOT edu
cc

Subject
RE: [Veritas-bu] How to prevent NBU from immediately using a media that failed before






Frozen, though, doesn't necessarily mean broken.  A media fault is possible,
but then there are drive faults too, loader errors, sunspots, plague.

I've got a script that sweeps the frozen tapes, keeps a count, and unfreezes
them if there haven't been enough failures.  Any tape that freezes over 3
times stays frozen.  It may be a method you could adapt.

-M
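
The script itself is not posted in this thread. As a rough illustration of
the approach described above, here is a minimal Python sketch of the
counting logic only. The function names, the JSON count file, and the way
frozen media IDs are obtained are all assumptions made for the example, not
the author's implementation.

```python
import json
from pathlib import Path

# From the post: a tape that freezes more than 3 times stays frozen.
MAX_FREEZES = 3


def sweep_frozen(frozen_ids, counts, max_freezes=MAX_FREEZES):
    """Record one more freeze per frozen tape and decide what to unfreeze.

    frozen_ids: media IDs currently frozen (how they are listed is
                site-specific and left to the caller).
    counts:     dict of media ID -> freezes seen so far, updated in place.
    Returns (to_unfreeze, stay_frozen) as two sorted lists.
    """
    to_unfreeze, stay_frozen = [], []
    for media_id in sorted(set(frozen_ids)):
        counts[media_id] = counts.get(media_id, 0) + 1
        if counts[media_id] > max_freezes:
            stay_frozen.append(media_id)
        else:
            to_unfreeze.append(media_id)
    return to_unfreeze, stay_frozen


def load_counts(path):
    """Read the freeze counts from a JSON file (hypothetical storage)."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def save_counts(path, counts):
    """Write the freeze counts back so the next sweep can see them."""
    Path(path).write_text(json.dumps(counts, indent=2, sort_keys=True))
```

Each ID in to_unfreeze would then be handed to whatever unfreeze command
the site uses (on many installations that is bpmedia -unfreeze -m <media_id>,
but check the command reference for your release). Tapes in stay_frozen are
left alone for a human to look at.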

-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of
ida3248b AT post.cybercity DOT dk
Sent: Friday, October 28, 2005 2:28 AM
To: Sto Rage©; Veritas NBU Mailing List (E-mail)
Subject: Re: [Veritas-bu] How to prevent NBU from immediately using a
media that failed before


Hi G

You can create these files under INSTALLPATH/netbackup:

MEDIA_ERROR_THRESHOLD  number of allowed errors

TIME_WINDOW  window in which that number of errors occurs (number of hours)

If you put 0 in the first file, the tape should get frozen at the first error.
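
For illustration, a small Python sketch that creates the two files named
above. It assumes each file holds a single integer on one line, which the
post implies but does not spell out; the helper name is hypothetical, and
the directory is whatever INSTALLPATH/netbackup resolves to on your server,
passed in by the caller.

```python
from pathlib import Path


def write_media_error_settings(config_dir, threshold, window_hours):
    """Create the two touch files, each holding one integer (assumed format)."""
    if threshold < 0 or window_hours < 0:
        raise ValueError("threshold and window_hours must be non-negative")
    config_dir = Path(config_dir)
    written = {}
    for name, value in (("MEDIA_ERROR_THRESHOLD", threshold),
                        ("TIME_WINDOW", window_hours)):
        path = config_dir / name
        path.write_text(f"{value}\n")  # one number per file
        written[name] = path
    return written
```

Calling write_media_error_settings(config_dir, 0, 12) would write 0 to
MEDIA_ERROR_THRESHOLD, the value suggested above for freezing on the first
error; the 12-hour window is only an example value.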

Regards
Michael

On Thu, 27 Oct 2005 11:11:11 -0700, Sto Rage© wrote
> Hi,
>   Here's my problem, a backup job writes to a media and then fails
> with write error/position error etc. The job then gets re-queued and
> runs again, then NBU uses this very same tape and writes and fails
> again, this happens till the max retries of the job is exceeded and
> then the job fails.
> Why does it reuse the same tape again and again for the same
> job/policy? Is there a counter that we can set to prevent NBU from
> retrying a media that errors out the first time?
> The logs below from bptm show the media ID 001956 being repeatedly used.
>
> 02:01:58.703 [5842] <2> log_media_error: successfully wrote to error
> file - 10/27/05 02:01:58 001956 13 POSITION_ERROR
> 02:29:33.454 [21029] <2> log_media_error: successfully wrote to error
> file - 10/27/05 02:29:33 001956 13 POSITION_ERROR
> 03:19:20.128 [22766] <2> log_media_error: successfully wrote to error
> file - 10/27/05 03:19:20 001956 13 POSITION_ERROR
> 04:30:34.394 [25958] <2> log_media_error: successfully wrote to error
> file - 10/27/05 04:30:34 001956 13 POSITION_ERROR
>
>   Ironically, the 5th time it successfully wrote to this tape and
> continued with the job.
> We run huge NDMP jobs (average size of each is 2 TB) so when this
> happens say 70% into a job, NBU has to start from the beginning,
> sadly checkpoint restart is not an option for NDMP backups. Is this
> available in 6.0?
>
> -G
>
> _______________________________________________
> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
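
To see how often a given tape has been failing, the bptm lines quoted above
can be tallied per media ID. A minimal Python sketch follows; the regular
expression is written against the four sample lines only, and the number
between the media ID and the error name is captured without interpretation
because the thread does not say what it means.

```python
import re
from collections import Counter

# Matches one logged media error, whether or not the line was wrapped
# between "error" and "file" as in the quoted mail.
ENTRY = re.compile(
    r"log_media_error: successfully wrote to error\s+file - "
    r"(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<media_id>\S+) (?P<num>\d+) (?P<error>\w+)"
)


def count_media_errors(log_text):
    """Return a Counter keyed by (media_id, error) for each logged error."""
    # Strip mail quote markers first so wrapped, quoted lines still match.
    cleaned = re.sub(r"^>\s?", "", log_text, flags=re.MULTILINE)
    return Counter((m["media_id"], m["error"]) for m in ENTRY.finditer(cleaned))
```

Feeding it the bptm log text gives one count per tape and error type, which
makes a repeat offender like 001956 easy to spot.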


--
Cybercity Webhosting (http://www.cybercity.dk)

_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu



