[Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)

2005-01-11 20:12:23
Subject: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
From: Dean <dean.deano AT gmail DOT com> (Dean)
Date: Wed, 12 Jan 2005 12:12:23 +1100
Be careful with this. You're probably still getting the errors, but not
seeing them because NetBackup is no longer checking for them. Make sure you
run test restores, several times and from different tapes.
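
A minimal sketch of how one might confirm whether the touch file is in
place before trusting a "successful" backup. The function name and its
optional directory argument are made up for illustration; the default path
is the one quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reports whether the NO_POSITION_CHECK touch file
# discussed in this thread is present in the given config directory.
position_check_state() {
    config_dir=${1:-/opt/openv/netbackup/db/config}
    if [ -e "$config_dir/NO_POSITION_CHECK" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

# Remind the operator that a clean exit status proves less than usual
# while the position check is switched off.
if [ "$(position_check_state)" = "disabled" ]; then
    echo "Position check is off: verify backups with test restores."
fi
```

Since the setting is a touch file, removing it should presumably turn the
check back on, but I have not confirmed that here.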




On Tue, 11 Jan 2005 16:36:27 -0800 (PST), Kathryn Hemness
<kfhemness AT ucdavis DOT edu> wrote:
> Greetings --
> 
> I ran a successful backup using the 
> /opt/openv/netbackup/db/config/NO_POSITION_CHECK
> setting suggested by Scott Chapman.
> 
> Then I google'd for NO_POSITION_CHECK and found the following Veritas Support
> patch readme which had a good explanation for the behavior I'm seeing:
> 
> http://seer.support.veritas.com/docs/246368.htm
> 
> What's really funny is that this readme is for NB3.4 in 2002.
> 
> Now that I know the cause of the problem, I need to determine a
> solution which will enable me to use the checkpoint restart feature
> of NetBackup 5.1.
> 
> I welcome any suggestions.  I'm hoping there are easy Solaris or LSI Logic HBA
> commands for the final solution.
> 
> On Tue, 11 Jan 2005, Chapman, Scott wrote:
> 
> > Date: Tue, 11 Jan 2005 15:37:29 -0800
> > From: "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> > To: 'Kathryn Hemness' <kfhemness AT ucdavis DOT edu>
> > Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> >
> >
> > I am running 4.5fp5 and 5.0 at a different site.  You aren't running the IBM
> > driver for the tape drives are you?  I know that has caused some problems
> > for people.
> >
> > What does "sgscan -v conf" show?  When I run that it confirms that the drive
> > config does not come from the st.conf by putting "NOT-IN-ST-CONFIG-FILE" at
> > the end of each tape drive line . . .
> >
> > Scott Chapman
> > ICBC - Victoria, Government St.
> > Phone: 250.414.7650  Cell: 250.213.9295
> >
> >
> >
> > -----Original Message-----
> > From: Kathryn Hemness [mailto:kfhemness AT ucdavis DOT edu]
> > Sent: Tuesday, January 11, 2005 3:13 PM
> > To: K Chapman
> > Cc: Chapman, Scott; veritas-bu AT mailman.eng.auburn DOT edu; song_1977 AT yahoo DOT com
> > Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> >
> >
> >
> >
> > Turning off checkpoints was something I did early in my troubleshooting
> > attempts.
> >
> > I've just turned off a couple of Solaris storage management daemons
> > (ssdgrptd and ssagent) on my server and am running another test backup
> > now.  It should finish in about 15 more minutes.
> >
> > I'll try the NO_POSITION_CHECK after this test finishes and let you know
> > what happens.
> >
> >
> > On Tue, 11 Jan 2005, K Chapman wrote:
> >
> > > Date: Tue, 11 Jan 2005 14:54:31 -0800 (PST)
> > > From: K Chapman <tech2187 AT yahoo DOT com>
> > > To: Kathryn Hemness <kfhemness AT ucdavis DOT edu>,
> > >      "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> > > Cc: veritas-bu AT mailman.eng.auburn DOT edu, song_1977 AT yahoo DOT com
> > > Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> > >
> > > as a test, can you try with the position check turned
> > > off?
> > >
> > > touch /opt/openv/netbackup/db/config/NO_POSITION_CHECK
> > >
> > > --- Kathryn Hemness <kfhemness AT ucdavis DOT edu> wrote:
> > >
> > > > Hi, Scott -
> > > >
> > > > Here's the output of my sgscan -v:
> > > >
> > > > /dev/sg/c0t3l2: Tape (/dev/rmt/0): "IBM     ULTRIUM-TD2     4770"
> > > > /dev/sg/c0t3l3: Tape (/dev/rmt/1): "IBM     ULTRIUM-TD2     4770"
> > > > /dev/sg/c0t3l4: Tape (/dev/rmt/2): "IBM     ULTRIUM-TD2     4770"
> > > >
> > > > We got the library in October.  The drives should be
> > > > at the current FW level.
> > > >
> > > > Are you using NB51?
> > > >
> > > > On Tue, 11 Jan 2005, Chapman, Scott wrote:
> > > >
> > > > > Date: Tue, 11 Jan 2005 10:42:36 -0800
> > > > > From: "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> > > > > To: 'Kathryn Hemness' <kfhemness AT ucdavis DOT edu>,
> > > > >      veritas-bu AT mailman.eng.auburn DOT edu
> > > > > Cc: song_1977 AT yahoo DOT com
> > > > > Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> > > > >
> > > > > Kathryn are you running current firmware on the LTO2 drives?  I seem to
> > > > > remember something about old firmware doing rewinds before netbackup was
> > > > > done with the drive . . .
> > > > > From your logs:
> > > > > 01/10/2005 13:48:50 albus.ucdavis.edu albus.ucdavis.edu  FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> > > > >
> > > > > I am running IBM drives (we don't use the LSI Logic HBAs) and here is some
> > > > > output from sgscan -v conf:
> > > > > /dev/sg/c2t0l0: Tape (/dev/rmt/0): "IBM     ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
> > > > > /dev/sg/c2t1l0: Tape (/dev/rmt/1): "IBM     ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
> > > > > /dev/sg/c2t2l0: Tape (/dev/rmt/2): "IBM     ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
> > > > > ...
> > > > >
> > > > > I don't have anything in the st.conf for the drives, as they were added
> > > > > to the st driver several patches ago.  You might check your st patch
> > > > > level as well . . .
> > > > >
> > > > > Hope this helps.
> > > > >
> > > > > Scott Chapman
> > > > > ICBC - Victoria, Government St.
> > > > > Phone: 250.414.7650  Cell: 250.213.9295
> > > > >
> > > > >
> > > > >
> > > > > -----Original Message-----
> > > > > From: Kathryn Hemness [mailto:kfhemness AT ucdavis DOT edu]
> > > > > Sent: Tuesday, January 11, 2005 10:02 AM
> > > > > To: veritas-bu AT mailman.eng.auburn DOT edu
> > > > > Cc: song_1977 AT yahoo DOT com
> > > > > Subject: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> > > > >
> > > > >
> > > > > Good Morning --
> > > > >
> > > > > Was there ever a resolution to your NB5.0MP2/LTO end of tape problem?
> > > > >
> > > > > I'm currently fighting with a new installation of NB5.1 on a Solaris 9
> > > > > system using LTO2 tape drives.  My backups ALWAYS fail either at a
> > > > > checkpoint-restart WRITE or at the very last WRITE of the backup,
> > > > > regardless of how big the backup is.
> > > > >
> > > > > I've been told by my NetBackup tech support (via Sun) that it was a
> > > > > hardware configuration problem.
> > > > >
> > > > > The backups always fail, regardless of any st.conf modifications, and
> > > > > I've even taken the fiber switch out of the mix.  Here's a summary of my
> > > > > hardware and the types of errors I'm seeing (by the way, ufsdump works
> > > > > just fine....).
> > > > >
> > > > > Master: Solaris 9 version 4/04 on a Sun V240 with 2 LSI Logic FC919X HBAs
> > > > > running NB5.1 Enterprise Server.  One LSI Logic HBA is connected directly
> > > > > to the fiber/scsi bridge of a Qualstar 88264 LTO2 library, the other to a
> > > > > Brocade 32-port fiber switch attached to a Sun 3511 storage array.
> > > > >
> > > > > I have tried at least 4 different st.conf LTO2 configurations with the
> > > > > same failing results and am now not using any special LTO2 definitions.
> > > > >
> > > > > Here are the failure errors from both the NetBackup reports and from the
> > > > > bptm logs:
> > > > >
> > > > > 01/10/2005 13:48:50 albus.ucdavis.edu albus.ucdavis.edu  FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> > > > > 01/10/2005 13:48:54 albus.ucdavis.edu albus.ucdavis.edu  CLIENT albus.ucdavis.edu  POLICY IR-ISM_02  SCHED WeeklyFull  EXIT STATUS 84 (media write error)
> > > > > 01/10/2005 13:48:54 albus.ucdavis.edu albus.ucdavis.edu  backup of client albus.ucdavis.edu exited with status 84 (media write error)
> > > > >
> > > > > Here's the bptm log entry for the above error:
> > > > >
> > > > > 13:48:48.032 [1297] <2> write_backup: tp.tv_sec = 1105393728, stp.tv_sec = 1105391634, tp.tv_usec = 27455, stp.tv_usec = 544901, et = 2093483, mpx_total_kbytes[TWIN_INDEX = 0] = 21261376
> > > > > 13:48:48.075 [1297] <2> io_terminate_tape: writing empty backup header, drive index 0, copy 1
> > > > > 13:48:48.091 [1297] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.7919) on drive index 0
> > > > > 13:48:48.645 [1297] <2> io_write_back_header: drive index 0, empty_file, file num = 2, mpx_headers = 0, copy 1
> > > > > 13:48:48.650 [1297] <2> io_close: closing /usr/openv/netbackup/db/media/tpreq/040004, from bptm.c.8046
> > > > > 13:48:50.848 [1297] <2> io_terminate_tape: absolute block position prior to writing empty header is 332201, copy 1
> > > > > 13:48:50.848 [1297] <2> io_terminate_tape: block position check: actual 332201, expected 332213
> > > > > 13:48:50.848 [1297] <2> set_job_details: Sending Tfile jobid (907)
> > > > > 13:48:50.848 [1297] <2> set_job_details: LOG 1105393730 16 bptm 1297 FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> > > > >
> > > > > 13:48:50.848 [1297] <2> set_job_details: Done
> > > > > 13:48:50.880 [1297] <16> io_terminate_tape: FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> > > > > 13:48:50.898 [1297] <2> log_media_error: successfully wrote to error file - 01/10/05 13:48:50 040004 0 WRITE_ERROR
> > > > > 13:48:50.910 [1297] <2> check_error_history: called from bptm line 17870, EXIT_Status = 84
> > > > > 13:48:50.911 [1297] <2> check_error_history: drive index = 0, media id = 040004, time = 01/10/05 13:48:50, both_match = 0, media_match = 0, drive_match = 0
> > > > > 13:48:50.911 [1297] <2> tpunmount: Check_for_waiting = 0, No_tpunmount_after_restore = 0, Media_Unmount_Delay = 0, MediaOffset = 4
> > > > > 13:48:50.911 [1297] <2> tpunmount: tpunmount'ing /usr/openv/netbackup/db/media/tpreq/040004
> > > > >
> > > > >
> > > > > Since ufsdump works, this is indicating a NetBackup 5.1 problem.  Anyway, I
> > > > > notice in your post-November posts, you referred to NB4.5 servers.  Did you
> > > > > have to downgrade NetBackup in order to get your LTO drives to work properly?
> > > >
> > > === message truncated ===
> > >
> > >
> > > =====
> > > aaarrrggghhh!!!!
> > > FreeBSD rocks
> > >
> > >
> > >
> > >
> >
> > --kathy
> >
> > ===============================================================================
> > Kathryn Hemness                        kfhemness AT ucdavis DOT edu
> > System Administrator                   phone: 530.752.6547
> > Campus Data Center & Client Services   fax:   530.752.9154
> >
> 
> --kathy
> 
> ===============================================================================
> Kathryn Hemness                        kfhemness AT ucdavis DOT edu
> System Administrator                   phone: 530.752.6547
> Campus Data Center & Client Services   fax:   530.752.9154
> _______________________________________________
> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>
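
The "block position check" line in the bptm log quoted above is the
comparison that freezes the tape, so it may be worth collecting those lines
before relying on NO_POSITION_CHECK. A rough sketch, assuming each check is
logged on a single line in the format shown in the thread; the function
name is made up, and the log file path is left for the reader to supply.

```shell
#!/bin/sh
# Hypothetical helper: prints every block position check where the actual
# and expected positions differ, along with the size of the gap.
# Reads the named log files, or standard input when none is given.
position_mismatches() {
    awk '
        /block position check: actual [0-9]+, expected [0-9]+/ {
            actual = ""; expected = ""
            for (i = 1; i < NF; i++) {
                if ($i == "actual")   { actual = $(i + 1); sub(/,$/, "", actual) }
                if ($i == "expected") { expected = $(i + 1) }
            }
            if (actual != expected)
                printf "%s actual=%s expected=%s diff=%d\n", $1, actual, expected, expected - actual
        }
    ' "$@"
}

# Example call (path is a placeholder):
#   position_mismatches /path/to/bptm/log
```

For the failure quoted above this would report a gap of 12 blocks (332213
expected, 332201 actual). Once the touch file is in place these lines may no
longer be logged at all, so an empty result then proves nothing by itself.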