[Veritas-bu] Re: Help! All Tapes From Scratch

2006-05-05 02:01:03
Subject: [Veritas-bu] Re: Help! All Tapes From Scratch
From: simon.weaver AT astrium.eads DOT net (WEAVER, Simon)
Date: Fri, 5 May 2006 07:01:03 +0100
Nor was I, Bob :-)
I was creating a scenario about tapes with status 84 or 86 and the ability
to recover ALL data from those tapes :-)

Regards

Simon Weaver
3rd Line Technical Support
Windows Domain Administrator 

EADS Astrium Limited, B32AA IM (DCS)
Anchorage Road, Portsmouth, PO3 5PU

Email: Simon.Weaver AT Astrium-eads DOT net



-----Original Message-----
From: bob944 [mailto:bob944 AT attglobal DOT net] 
Sent: 05 May 2006 00:53
To: veritas-bu AT mailman.eng.auburn DOT edu; WEAVER, Simon
Subject: RE: [Veritas-bu] Re: Help! All Tapes From Scratch


> I agree with your comments about throwing the tape away. However, can 
> we assume this scenario:
> 
> Full Backup Friday of critical Server. Completed all ok, but
> during the job a status appeared (Media Write Error, or
> Media Position Error) although only once.
> 
> Sunday Server Dies
> 
> Come in Monday, and you can ONLY restore from Friday's backup! Are you 
> telling me that it's VERY unlikely NetBackup will restore ANY data from 
> the tapes it used?

Simon, I wasn't addressing your fried-robot situation earlier, so let me
clarify.  

Backups that were successful (status 0 or 1) are restorable unless the tapes
were damaged.  If the backups weren't successful, there is no data to restore.

(The rest of this note is probably too long, too pedantic and too boring for
anyone without a masochistic streak.  You've been warned.  :-)

If you were referring to "losing data" in

> > Amazing.  So, in your environment, it's considered more 
> > cost-effective to risk losing data or fail another backup than to 
> > throw out a $50 tape?

my point was about the nature of magnetic media[1].  A successful write
doesn't guarantee that you can read that block later (maybe a bit of oxide
flakes off, for instance).  And that's best case.  Now, when a tape has
already demonstrated that it has flaws (the write error you mentioned above,
for instance), we _know_ it has at least one problem spot, and that tape is
much more likely to cost you another backup than a tape without known problems.
Or, worse, the drive's retry logic gets a successful write on the 17th
automatic retry, you think the tape is now "good," and next month you're
trying to recover that payroll master file and that block just isn't quite
good enough to read any more. Nobody's happy.

So, an uncorrectable write error (the only kind you're going to see, since
the drive/driver hide the self-corrected ones) means the backup job is a
failure and no data is retained.  Do I want to use tapes with problem
histories for that payroll master server?  Or anything else?  Since the
_only_ reason to do backups in the first place is to be able to restore
data, and if we're restoring we must _need_ that data, I think not.  -  bob
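
As a rough illustration of that retire-on-first-error policy, here is a
minimal Python sketch.  It assumes job history is already available as
(media_id, status) pairs; that record format, the sample media IDs and the
function name are hypothetical and are not NetBackup output or a NetBackup
API.  The only facts taken from this thread are the two status codes, 84
(media write error) and 86 (media position error).

```python
from collections import defaultdict

# Status codes discussed in this thread:
# 84 = media write error, 86 = media position error.
MEDIA_ERROR_STATUSES = {84, 86}


def tapes_to_retire(job_records, max_errors=0):
    """Return sorted media IDs with more than max_errors media errors.

    job_records is an iterable of (media_id, status) tuples.  This is a
    hypothetical format chosen for the example, not a real log layout.
    """
    errors = defaultdict(int)
    for media_id, status in job_records:
        if status in MEDIA_ERROR_STATUSES:
            errors[media_id] += 1
    return sorted(m for m, count in errors.items() if count > max_errors)


# Made-up history: A00002 had one write error and later a clean job,
# A00004 had one position error.
records = [
    ("A00001", 0),
    ("A00002", 84),
    ("A00003", 1),
    ("A00002", 0),
    ("A00004", 86),
]

print(tapes_to_retire(records))
```

With max_errors=0 a single media error is enough to flag a tape, even if a
later job on the same tape succeeded, which is the point being argued above.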

p.s.  In another lifetime, I was involved on the vendor side with a customer
who cooked his mainframe.  Two different times.  Wound up replacing the
entire room full of equipment both times--the boxes that weren't flat dead
on Monday were flaky and intermittent so everything got tossed.  I don't
know the specifics of your weekend incident, but I wouldn't trust anything
in the room, hardware or media, without at least thoroughly testing it.

1.  Especially sequential media (tape versus disk).  There are many
mechanicals involved--head alignment, tracking, wear, drag, dirt--and these
change from drive to drive.  The same type of drive may be built by
different companies, or with different mechanical revisions, or firmware
changes.  The tape can stretch, wrinkle or get its edges damaged by use,
temperature/humidity and improper storage.  But the oxide... there's where
the variables really are.  It can flake, get scratched, pick up a thickness
of crud, be worn down and just plain be manufactured with flaws.  It always
has, which is why mag media devices/controllers/drivers all have error
compensation measures, including retry logic.

You can write the same data to the same tape in the same drive and get
different results.  With a disk (and all start conditions identical), data X
gets written to sector Y, every single time.  With tape... all you can
guarantee is the order of the data.  There are gaps between blocks and
between tape marks.  A common drive/ctrlr/driver response to write errors is
to back up, wipe out the failed block, erase a bit of tape and try
again--now everything downstream is being written to a different place than
before.  Gaps are not all consistent.  Some drives vary the tape transport
speed to match the data rate...  Lots of variables.  Lots of possibilities
for a bad tape to pass a subsequent test--or a "good" one to fail tomorrow.
All good reasons to replace flaky hardware and media at the flaky stage
instead of waiting until it is flat down.  And good reasons to make duplicates.
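
To make the "different place than before" point concrete, here is a toy
Python model of a write pass.  It is only a sketch: the block, gap and erase
lengths are arbitrary made-up units, and the retry behaviour (skip a fixed
stretch of tape per failed attempt) is a simplification, not how any
particular drive works.

```python
def write_pass(num_blocks, failed_attempts, block_len=10, gap_len=2, erase_len=3):
    """Return the start position of each block after one write pass.

    failed_attempts maps a block index to the number of failed write
    attempts before that block finally succeeds.  All lengths are
    arbitrary units chosen for the example.
    """
    positions = []
    pos = 0
    for i in range(num_blocks):
        # Each failed attempt: back up, erase a bit of tape, retry further on.
        pos += failed_attempts.get(i, 0) * erase_len
        positions.append(pos)
        pos += block_len + gap_len
    return positions


clean = write_pass(5, {})        # no write errors
retried = write_pass(5, {1: 2})  # block 1 succeeds on its third attempt

print(clean)    # [0, 12, 24, 36, 48]
print(retried)  # [0, 18, 30, 42, 54]
```

The order of the blocks is identical in both passes, but every block from the
retried one onward lands on a different stretch of tape.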

