Veritas-bu

[Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)

2005-01-12 09:08:46
From: kfhemness AT ucdavis DOT edu (Kathryn Hemness)
Date: Wed, 12 Jan 2005 06:08:46 -0800 (PST)
Always.  When I had checkpoint restart enabled with 15-minute checkpoints,
I would get the error 84s about half an hour into the backup; I don't know
why the backup didn't fail on the first checkpoint.  After I disabled the
checkpoints, the backup failed at the very end instead.

After the backup fails, the tape is frozen and NetBackup retries the
backup using a different tape.  Within the span of about 4 hours,
most of the tapes (I only have about 10 tapes in the library at this
point) are frozen and most of the drives are DOWNed because of the
3-failures-in-12-hours rule.
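
To make the 3-failures-in-12-hours rule concrete, here is a minimal Python
sketch of a sliding-window check.  It is only an illustration of the rule as
described above, not NetBackup's actual logic; the function name, its
parameters, and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def drive_would_be_downed(failure_times, max_failures=3,
                          window=timedelta(hours=12)):
    """Return True if max_failures failures fall inside any one window.

    Hypothetical helper: it only models the rule of thumb described in
    this message, not how NetBackup tracks drive errors internally.
    """
    times = sorted(failure_times)
    # Compare each failure with the one (max_failures - 1) positions later.
    for i in range(len(times) - max_failures + 1):
        if times[i + max_failures - 1] - times[i] <= window:
            return True
    return False


# Made-up timestamps: three failures on one drive within about 3 hours.
start = datetime(2005, 1, 10, 13, 0)
failures = [start, start + timedelta(hours=1), start + timedelta(hours=3)]
print(drive_would_be_downed(failures))
```

With every failed backup retried on another tape and drive, each drive reaches
the threshold quickly, which matches what I see within about 4 hours.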

I'm not putting any limits on the fragment size of the tape drives
and I've tried disabling the specific LTO settings in my st.conf.
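
For comparing drive configurations across servers, a small Python sketch like
the following could pick out which drives sgscan flags with
NOT-IN-ST-CONFIG-FILE.  The line format is assumed from the sgscan output
quoted later in this thread (with the mail wrapping undone); the regular
expression, function name, and field names are my own assumptions, and the
second sample line simply shows the same format without the flag.

```python
import re

# Assumed line shape, taken from the output quoted in this thread:
# /dev/sg/c2t0l0: Tape (/dev/rmt/0): "IBM ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
LINE_RE = re.compile(
    r'^(?P<sg>/dev/sg/\S+): Tape \((?P<rmt>/dev/rmt/\d+)\): '
    r'"(?P<inquiry>[^"]*)"(?P<rest>.*)$'
)


def parse_sgscan(text):
    """Parse tape lines from sgscan-style output into dictionaries."""
    drives = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a tape line
        drives.append({
            "sg": match.group("sg"),
            "rmt": match.group("rmt"),
            # Collapse the padding inside the inquiry string.
            "inquiry": " ".join(match.group("inquiry").split()),
            "not_in_st_conf": "NOT-IN-ST-CONFIG-FILE" in match.group("rest"),
        })
    return drives


sample = '''
/dev/sg/c2t0l0: Tape (/dev/rmt/0): "IBM ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
/dev/sg/c0t3l2: Tape (/dev/rmt/0): "IBM ULTRIUM-TD2     4770"
'''
for drive in parse_sgscan(sample):
    print(drive["sg"], drive["inquiry"], drive["not_in_st_conf"])
```

A missing flag is not conclusive on its own, since the flag may only be
printed when sgscan is run with the "conf" argument.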

Because I'm getting the same type of failure with and without enabling the
checkpoint restart, I was thinking that maybe Veritas is now using a different
type of WRITE for the end of a backup.  That's why I keep asking about
the NetBackup version.  I'd like to hear from people who are successfully
using NetBackup 5.1.

These are all problems I'm having with a brand-new install on new hardware
and OS.  I'm trying to migrate a successful NB45FP5 backup server using AIT2
libraries to NB51, using an LTO2 library and a 3TB storage array for staging
backups.  The LTO2 library has an internal fiber-to-SCSI bridge and is
directly connected to an LSI Logic 1702 HBA; the storage array is connected
through a zoned Brocade switch to the other LSI Logic HBA.  The fiber links
are 2Gb.



On Tue, 11 Jan 2005, K Chapman wrote:

> Date: Tue, 11 Jan 2005 19:21:01 -0800 (PST)
> From: K Chapman <tech2187 AT yahoo DOT com>
> To: "Chapman, Scott" <Scott.Chapman AT icbc DOT com>,
>      'Kathryn Hemness' <kfhemness AT ucdavis DOT edu>
> Cc: 'K Chapman' <tech2187 AT yahoo DOT com>
> Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
>
> scott, have any relatives in ny??
>
> did the error 84's just start or have you always been
> getting them kathryn?
>
> make sure your bridge/switch supports the
> reserve/release commands.  i hit this problem due to
> the patch in the doc below and support and i went
> through the variable length block mess and just ended
> up leaving the position check off.
>
> folks may suggest not turning this off if you use sso
> a lot (we dont).
> --- "Chapman, Scott" <Scott.Chapman AT icbc DOT com> wrote:
>
> > Kathryn, I didn't suggest the NO_POSITION_CHECK . .
> > . It was K Chapman
> > <tech2187 AT yahoo DOT com> . . . Turns out to be no
> > relation that I know of!!  ;-)
> >
> > I am afraid that I can't take the credit for
> > suggesting this one . . .
> >
> > -----Original Message-----
> > From: Kathryn Hemness [mailto:kfhemness AT ucdavis DOT edu]
> >
> > Sent: Tuesday, January 11, 2005 4:36 PM
> > To: veritas-bu AT mailman.eng.auburn DOT edu
> > Cc: scott.chapman AT icbc DOT com; song_1977 AT yahoo DOT com
> > Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape
> > (from Nov. 2004)
> >
> >
> > Greetings --
> >
> > I ran a successful backup using the
> > /opt/openv/netbackup/db/config/NO_POSITION_CHECK
> > setting suggested by Scott Chapman.
> >
> > Then I google'd for NO_POSITION_CHECK and found the
> > following Veritas
> > Support patch readme which had a good explanation
> > for the behavior I'm
> > seeing:
> >
> > http://seer.support.veritas.com/docs/246368.htm
> >
> > What's really funny is that this readme is for NB3.4
> > in 2002.
> >
> > Now that I know the cause of the problem, I need to
> > determine a solution
> > which will enable me to use the checkpoint restart
> > feature of NetBackup 5.1.
> >
> > I welcome any suggestions.  I'm hoping there are
> > easy Solaris or LSI Logic
> > HBA commands for the final solution.
> >
> > On Tue, 11 Jan 2005, Chapman, Scott wrote:
> >
> > > Date: Tue, 11 Jan 2005 15:37:29 -0800
> > > From: "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> > > To: 'Kathryn Hemness' <kfhemness AT ucdavis DOT edu>
> > > Subject: RE: [Veritas-bu] RE:Veritas-bu] End of
> > Tape (from Nov. 2004)
> > >
> > >
> > > I am running 4.5fp5 and 5.0 at a different site.
> > You aren't running
> > > the IBM driver for the tape drives are you?  I
> > know that has caused
> > > some problems for people.
> > >
> > > What does "sgscan -v conf" show?  When I run that
> > it confirms that the
> > > drive config does not come from the st.conf by
> > putting
> > > "NOT-IN-ST-CONFIG-FILE" at the end of each tape
> > drive line . . .
> > >
> > > Scott Chapman
> > > ICBC - Victoria, Government St.
> > > Phone: 250.414.7650  Cell: 250.213.9295
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Kathryn Hemness
> > [mailto:kfhemness AT ucdavis DOT edu]
> > > Sent: Tuesday, January 11, 2005 3:13 PM
> > > To: K Chapman
> > > Cc: Chapman, Scott;
> > veritas-bu AT mailman.eng.auburn DOT edu;
> > > song_1977 AT yahoo DOT com
> > > Subject: RE: [Veritas-bu] RE:Veritas-bu] End of
> > Tape (from Nov. 2004)
> > >
> > >
> > >
> > >
> > > Turning off checkpoints was something I did early
> > in my
> > > troubleshooting attempts.
> > >
> > > I've just turned off a couple of Solaris storage
> > management daemons
> > > (ssdgrptd and
> > > ssagent) on my server and am running another test
> > backup now.  It
> > > should finish in about 15 more minutes.
> > >
> > > I'll try the NO_POSITION_CHECK after this test
> > finishes and let you
> > > know what happens.
> > >
> > >
> > > On Tue, 11 Jan 2005, K Chapman wrote:
> > >
> > > > Date: Tue, 11 Jan 2005 14:54:31 -0800 (PST)
> > > > From: K Chapman <tech2187 AT yahoo DOT com>
> > > > To: Kathryn Hemness <kfhemness AT ucdavis DOT edu>,
> > > >      "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> > > > Cc: veritas-bu AT mailman.eng.auburn DOT edu,
> > song_1977 AT yahoo DOT com
> > > > Subject: RE: [Veritas-bu] RE:Veritas-bu] End of
> > Tape (from Nov.
> > > > 2004)
> > > >
> > > > as a test, can you try with the position check
> > turned
> > > > off?
> > > >
> > > > touch
> > /opt/openv/netbackup/db/config/NO_POSITION_CHECK
> > > >
> > > > --- Kathryn Hemness <kfhemness AT ucdavis DOT edu>
> > wrote:
> > > >
> > > > > Hi, Scott -
> > > > >
> > > > > Here's the output of my sgscan -v:
> > > > >
> > > > > /dev/sg/c0t3l2: Tape (/dev/rmt/0): "IBM
> > > > > ULTRIUM-TD2     4770"
> > > > > /dev/sg/c0t3l3: Tape (/dev/rmt/1): "IBM
> > > > > ULTRIUM-TD2     4770"
> > > > > /dev/sg/c0t3l4: Tape (/dev/rmt/2): "IBM
> > > > > ULTRIUM-TD2     4770"
> > > > >
> > > > > We got the library in October.  The drives
> > should be
> > > > > at the current FW level.
> > > > >
> > > > > Are you using NB51?
> > > > >
> > > > > On Tue, 11 Jan 2005, Chapman, Scott wrote:
> > > > >
> > > > > > Date: Tue, 11 Jan 2005 10:42:36 -0800
> > > > > > From: "Chapman, Scott"
> > <Scott.Chapman AT icbc DOT com>
> > > > > > To: 'Kathryn Hemness'
> > <kfhemness AT ucdavis DOT edu>,
> > > > > >      veritas-bu AT mailman.eng.auburn DOT edu
> > > > > > Cc: song_1977 AT yahoo DOT com
> > > > > > Subject: RE: [Veritas-bu] RE:Veritas-bu] End
> > of
> > > > > Tape (from Nov. 2004)
> > > > > >
> > > > > > Kathryn are you running current firmware on
> > the
> > > > > LTO2 drives?  I seem to
> > > > > > remember something about old firmware doing
> > > > > rewinds before netbackup was
> > > > > > done with the drive . . .
> > > > > > > From your logs:
> > > > > > 01/10/2005 13:48:50 albus.ucdavis.edu
> > > > > albus.ucdavis.edu  FREEZING media id
> > > > > > 040004, External event caused rewind during
> > write,
> > > > > all data on media is lost
> > > > > >
> > > > > > I am running IBM drives (we don't use the
> > LSI
> > > > > logic HBA's) and here is some
> > > > > > output from sgscan -v conf:
> > > > > > /dev/sg/c2t0l0: Tape (/dev/rmt/0): "IBM
> > > > > ULTRIUM-TD2     38D0" :
> > > > > > NOT-IN-ST-CONFIG-FILE
> > > > > > /dev/sg/c2t1l0: Tape (/dev/rmt/1): "IBM
> > > > > ULTRIUM-TD2     38D0" :
> > > > > > NOT-IN-ST-CONFIG-FILE
> > > > > > /dev/sg/c2t2l0: Tape (/dev/rmt/2): "IBM
> > > > > ULTRIUM-TD2     38D0" :
> > > > > > NOT-IN-ST-CONFIG-FILE
> > > > > > ...
> > > > > >
> > > > > > I don't have anything in the st.conf for the
> > > > > drives as they have been added
> > > > > > to the st several patches ago.  You might
> > check
> > > > > > your st patch level as well .
> > > > > > . .
> > > > > >
> > > > > > Hope this helps.
> > > > > >
> > > > > > Scott Chapman
> > > > > > ICBC - Victoria, Government St.
> > > > > > Phone: 250.414.7650  Cell: 250.213.9295
> > > > > >
> > > > > >
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Kathryn Hemness
> > > > > [mailto:kfhemness AT ucdavis DOT edu]
> > > > > > Sent: Tuesday, January 11, 2005 10:02 AM
> >
> === message truncated ===
>
>
>
> =====
> aaarrrggghhh!!!!
> FreeBSD rocks
>
>
>
>

--kathy

===============================================================================
Kathryn Hemness                        kfhemness AT ucdavis DOT edu
System Administrator                   phone: 530.752.6547
Campus Data Center & Client Services   fax:   530.752.9154