Veritas-bu

Subject: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
From: kfhemness AT ucdavis DOT edu (Kathryn Hemness)
Date: Tue, 11 Jan 2005 15:12:53 -0800 (PST)
Turning off checkpoints was something I did early in my troubleshooting
attempts.

I've just turned off a couple of Solaris storage management daemons (ssdgrptd and
ssagent) on my server and am running another test backup now.  It should finish
in about 15 more minutes.

I'll try the NO_POSITION_CHECK after this test finishes and let you know
what happens.
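
For anyone following the thread: the bptm excerpt quoted further down reports "block position check: actual 332201, expected 332213" just before the media is frozen. Below is a minimal Python sketch for pulling such mismatches out of a bptm log. It assumes each log entry sits on a single line (the copies in this thread are wrapped by the mail client) and that the message text matches the quoted one. The function name `position_mismatches` is made up for illustration and is not part of NetBackup.

```python
import re

# Matches the bptm message seen in the quoted log, for example:
# "io_terminate_tape: block position check: actual 332201, expected 332213"
POSITION_CHECK = re.compile(
    r"block position check: actual\s+(\d+), expected\s+(\d+)"
)


def position_mismatches(lines):
    """Return (actual, expected, expected - actual) for each failed check.

    Hypothetical helper: it only reads log text and changes nothing.
    """
    found = []
    for line in lines:
        match = POSITION_CHECK.search(line)
        if match:
            actual, expected = int(match.group(1)), int(match.group(2))
            if actual != expected:
                found.append((actual, expected, expected - actual))
    return found


# Sample entry copied from the quoted bptm log, joined onto one line.
sample = [
    "13:48:50.848 [1297] <2> io_terminate_tape: block position check: "
    "actual 332201, expected 332213",
]
print(position_mismatches(sample))
```

For the quoted entry this reports a difference of 12 blocks between the expected and actual positions.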


On Tue, 11 Jan 2005, K Chapman wrote:

> Date: Tue, 11 Jan 2005 14:54:31 -0800 (PST)
> From: K Chapman <tech2187 AT yahoo DOT com>
> To: Kathryn Hemness <kfhemness AT ucdavis DOT edu>,
>      "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> Cc: veritas-bu AT mailman.eng.auburn DOT edu, song_1977 AT yahoo DOT com
> Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
>
> as a test, can you try with the position check turned
> off?
>
> touch /opt/openv/netbackup/db/config/NO_POSITION_CHECK
>
> --- Kathryn Hemness <kfhemness AT ucdavis DOT edu> wrote:
>
> > Hi, Scott -
> >
> > Here's the output of my sgscan -v:
> >
> > /dev/sg/c0t3l2: Tape (/dev/rmt/0): "IBM ULTRIUM-TD2     4770"
> > /dev/sg/c0t3l3: Tape (/dev/rmt/1): "IBM ULTRIUM-TD2     4770"
> > /dev/sg/c0t3l4: Tape (/dev/rmt/2): "IBM ULTRIUM-TD2     4770"
> >
> > We got the library in October.  The drives should be
> > at the current FW level.
> >
> > Are you using NB51?
> >
> > On Tue, 11 Jan 2005, Chapman, Scott wrote:
> >
> > > Date: Tue, 11 Jan 2005 10:42:36 -0800
> > > From: "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> > > To: 'Kathryn Hemness' <kfhemness AT ucdavis DOT edu>,
> > >      veritas-bu AT mailman.eng.auburn DOT edu
> > > Cc: song_1977 AT yahoo DOT com
> > > Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> > >
> > > Kathryn, are you running current firmware on the LTO2 drives?  I seem to
> > > remember something about old firmware doing rewinds before NetBackup was
> > > done with the drive . . .
> > > From your logs:
> > > 01/10/2005 13:48:50 albus.ucdavis.edu albus.ucdavis.edu  FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> > >
> > > I am running IBM drives (we don't use the LSI Logic HBAs) and here is some
> > > output from sgscan -v conf:
> > > /dev/sg/c2t0l0: Tape (/dev/rmt/0): "IBM ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
> > > /dev/sg/c2t1l0: Tape (/dev/rmt/1): "IBM ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
> > > /dev/sg/c2t2l0: Tape (/dev/rmt/2): "IBM ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
> > > ...
> > >
> > > I don't have anything in st.conf for the drives, as they were added to the
> > > st driver several patches ago.  You might check your st patch level as
> > > well . . .
> > >
> > > Hope this helps.
> > >
> > > Scott Chapman
> > > ICBC - Victoria, Government St.
> > > Phone: 250.414.7650  Cell: 250.213.9295
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Kathryn Hemness [mailto:kfhemness AT ucdavis DOT edu]
> > > Sent: Tuesday, January 11, 2005 10:02 AM
> > > To: veritas-bu AT mailman.eng.auburn DOT edu
> > > Cc: song_1977 AT yahoo DOT com
> > > Subject: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> > >
> > >
> > > Good Morning --
> > >
> > > Was there ever a resolution to your NB5.0MP2/LTO end-of-tape problem?
> > >
> > > I'm currently fighting with a new installation of NB5.1 on a Solaris 9
> > > system using LTO2 tape drives.  My backups ALWAYS fail, either at a
> > > checkpoint-restart WRITE or at the very last WRITE of the backup,
> > > regardless of how big the backup is.
> > >
> > > I've been told by my NetBackup tech support (via Sun) that it was a
> > > hardware configuration problem.
> > >
> > > The backups always fail, regardless of any st.conf modifications, and I've
> > > even taken the fiber switch out of the mix.  Here's a summary of my
> > > hardware and the types of errors I'm seeing (by the way, ufsdump works
> > > just fine).
> > >
> > > Master: Solaris 9 version 4/04 on a Sun V240 with 2 LSI Logic FC919X HBAs
> > > running NB5.1 Enterprise Server.  One LSI Logic HBA is connected directly
> > > to the fiber/SCSI bridge of a Qualstar 88264 LTO2 library, the other to a
> > > Brocade 32-port fiber switch attached to a Sun 3511 storage array.
> > >
> > > I have tried at least 4 different st.conf LTO2 configurations with the
> > > same failing results and am now not using any special LTO2 definitions.
> > >
> > > Here are the failure errors from both the NetBackup reports and from the
> > > bptm logs:
> > >
> > > 01/10/2005 13:48:50 albus.ucdavis.edu albus.ucdavis.edu  FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> > > 01/10/2005 13:48:54 albus.ucdavis.edu albus.ucdavis.edu  CLIENT albus.ucdavis.edu  POLICY IR-ISM_02  SCHED WeeklyFull  EXIT STATUS 84 (media write error)
> > > 01/10/2005 13:48:54 albus.ucdavis.edu albus.ucdavis.edu  backup of client albus.ucdavis.edu exited with status 84 (media write error)
> > >
> > > Here's the bptm log entry for the above error:
> > >
> > > 13:48:48.032 [1297] <2> write_backup: tp.tv_sec = 1105393728, stp.tv_sec = 1105391634, tp.tv_usec = 27455, stp.tv_usec = 544901, et = 2093483, mpx_total_kbytes[TWIN_INDEX = 0] = 21261376
> > > 13:48:48.075 [1297] <2> io_terminate_tape: writing empty backup header, drive index 0, copy 1
> > > 13:48:48.091 [1297] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.7919) on drive index 0
> > > 13:48:48.645 [1297] <2> io_write_back_header: drive index 0, empty_file, file num = 2, mpx_headers = 0, copy 1
> > > 13:48:48.650 [1297] <2> io_close: closing /usr/openv/netbackup/db/media/tpreq/040004, from bptm.c.8046
> > > 13:48:50.848 [1297] <2> io_terminate_tape: absolute block position prior to writing empty header is 332201, copy 1
> > > 13:48:50.848 [1297] <2> io_terminate_tape: block position check: actual 332201, expected 332213
> > > 13:48:50.848 [1297] <2> set_job_details: Sending Tfile jobid (907)
> > > 13:48:50.848 [1297] <2> set_job_details: LOG 1105393730 16 bptm 1297 FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> > >
> > > 13:48:50.848 [1297] <2> set_job_details: Done
> > > 13:48:50.880 [1297] <16> io_terminate_tape: FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> > > 13:48:50.898 [1297] <2> log_media_error: successfully wrote to error file - 01/10/05 13:48:50 040004 0 WRITE_ERROR
> > > 13:48:50.910 [1297] <2> check_error_history: called from bptm line 17870, EXIT_Status = 84
> > > 13:48:50.911 [1297] <2> check_error_history: drive index = 0, media id = 040004, time = 01/10/05 13:48:50, both_match = 0, media_match = 0, drive_match = 0
> > > 13:48:50.911 [1297] <2> tpunmount: Check_for_waiting = 0, No_tpunmount_after_restore = 0, Media_Unmount_Delay = 0, MediaOffset = 4
> > > 13:48:50.911 [1297] <2> tpunmount: tpunmount'ing /usr/openv/netbackup/db/media/tpreq/040004
> > >
> > >
> > > Since ufsdump works, this points to a NetBackup 5.1 problem.  Anyway, I
> > > notice that in your post-November posts you referred to NB4.5 servers.
> > > Did you have to downgrade NetBackup in order to get your LTO drives to
> > > work properly?
> >
> === message truncated ===
>
>
> =====
> aaarrrggghhh!!!!
> FreeBSD rocks
>
>

--kathy

===============================================================================
Kathryn Hemness                        kfhemness AT ucdavis DOT edu
System Administrator                   phone: 530.752.6547
Campus Data Center & Client Services   fax:   530.752.9154