[Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)

2005-01-12 13:40:31
Subject: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
From: kfhemness AT ucdavis DOT edu (Kathryn Hemness)
Date: Wed, 12 Jan 2005 10:40:31 -0800 (PST)
Hi Tim -

I haven't done anything with tcopy yet, but I successfully stacked
3 separate ufsdumps onto one tape.  I just finished a ufsrestore from
the third ufsdump.
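
For anyone who wants to repeat the stacking test, here is a minimal sketch of the command sequence. It only builds and prints the commands and does not run them. The device path /dev/rmt/0n (a no-rewind node) and the three filesystem names are assumptions, not taken from this thread, and the exact ufsdump options used in the original test are not known.

```python
from typing import List

# Assumption: first tape drive, no-rewind device node.
NO_REWIND_DEVICE = "/dev/rmt/0n"


def stack_dump_commands(filesystems: List[str],
                        device: str = NO_REWIND_DEVICE) -> List[List[str]]:
    """Build argv lists that write one level-0 ufsdump per filesystem, back to back."""
    commands = [["mt", "-f", device, "rewind"]]
    for fs in filesystems:
        # With a no-rewind device the tape stays positioned after each dump,
        # so the next dump lands in the next tape file.
        commands.append(["ufsdump", "0f", device, fs])
    return commands


def restore_commands(dump_number: int,
                     device: str = NO_REWIND_DEVICE) -> List[List[str]]:
    """Build argv lists that position the tape at the Nth dump (1-based) and restore it."""
    if dump_number < 1:
        raise ValueError("dump_number is 1-based")
    commands = [["mt", "-f", device, "rewind"]]
    if dump_number > 1:
        # Skip forward over the earlier dumps, one filemark per dump.
        commands.append(["mt", "-f", device, "fsf", str(dump_number - 1)])
    # 'x' with no file arguments extracts the whole dump into the current directory.
    commands.append(["ufsrestore", "xf", device])
    return commands


if __name__ == "__main__":
    # Hypothetical filesystems, for illustration only.
    plan = stack_dump_commands(["/", "/var", "/export/home"]) + restore_commands(3)
    for argv in plan:
        print(" ".join(argv))
```

Restoring the third dump means skipping two tape files first, which is why the sketch emits "mt -f /dev/rmt/0n fsf 2" before the ufsrestore.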

I'm checking the firmware level of the HBAs now, but it seems like old
firmware would have affected the ufsdump and ufsrestore as well.

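The failure discussed in the quoted messages further down shows up in the bptm log as a "block position check" line where the actual and expected block numbers differ. A rough sketch for pulling those lines out of a bptm log follows. It assumes each log entry sits on a single line, as in the raw log file rather than the wrapped email quotes. The names PositionMismatch, shortfall and find_position_mismatches are made up for illustration and are not part of NetBackup.

```python
import re
from typing import Iterable, List, NamedTuple

# Matches the message text seen in the bptm excerpt quoted in this thread.
POSITION_CHECK = re.compile(
    r"block position check: actual (\d+), expected (\d+)"
)


class PositionMismatch(NamedTuple):
    actual: int
    expected: int

    @property
    def shortfall(self) -> int:
        """Expected minus actual block position, as reported by bptm."""
        return self.expected - self.actual


def find_position_mismatches(lines: Iterable[str]) -> List[PositionMismatch]:
    """Return every position check whose actual and expected values differ."""
    mismatches = []
    for line in lines:
        match = POSITION_CHECK.search(line)
        if match is None:
            continue
        actual, expected = int(match.group(1)), int(match.group(2))
        if actual != expected:
            mismatches.append(PositionMismatch(actual, expected))
    return mismatches


if __name__ == "__main__":
    sample = [
        "13:48:50.848 [1297] <2> io_terminate_tape: block position check: "
        "actual 332201, expected 332213",
    ]
    for mismatch in find_position_mismatches(sample):
        print(mismatch.actual, mismatch.expected, mismatch.shortfall)
```

For the entry quoted in this thread the difference is 12 blocks (332213 expected, 332201 actual). What that gap means physically is not something the log states, so treat the number only as a symptom to compare between test runs.
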
On Wed, 12 Jan 2005, Tim Hoke wrote:

> Date: Wed, 12 Jan 2005 09:41:47 -0600
> From: Tim Hoke <thoke AT northpeak DOT org>
> To: Kathryn Hemness <kfhemness AT ucdavis DOT edu>
> Subject: Re: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
>
> Kathryn,
>
> I don't know much about the HBA firmware/driver revisions, so can't
> help with that one.
>
> Were you able to run tcopy on one of the tapes?
>
> -Tim
>
> On Jan 12, 2005, at 9:04 AM, Kathryn Hemness wrote:
>
> >
> > I'm current up to 113277-26. I'll be looking at the HBA drivers next.
> >
> > Here's what the lsiutil reports for my HBAs:
> >
> >      Port Name         Chip Vendor/Type    MPT Rev  Firmware Rev
> >   1.  itmpt0            LSI Logic FC919X      103      01020000
> >   2.  itmpt1            LSI Logic FC919X      103      01020000
> >
> > These HBAs are also only 2 months old, so I'd expect the FW to
> > be current.
> >
> >
> >
> > On Wed, 12 Jan 2005, Tim Hoke wrote:
> >
> >> Date: Wed, 12 Jan 2005 08:43:36 -0600
> >> From: Tim Hoke <thoke AT northpeak DOT org>
> >> To: Kathryn Hemness <kfhemness AT ucdavis DOT edu>
> >> Cc: "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> >> Subject: Re: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> >>
> >> When you say VERITAS Drivers, that would typically mean Windows
> >> platforms.  The only "VERITAS Driver" that is provided for Solaris is
> >> the sg driver (scsi passthru).
> >>
> >> So, you should be using the SUN st driver (SCSI Tape).
> >>
> >> According to my records (Sunsolve), the st driver is in SUN patch
> >> 113277 and the current revision is -26.  Native support for your IBM
> >> Ultrium-TD2 drives was introduced in the -10 release.  So, as long as
> >> you are at -10 or above, you shouldn't be using any st.conf entries.
> >> However, I don't work for Sun support, so you really should verify it
> >> with them.
> >>
> >> I'd also suspect any HBA drivers/firmware or any other fiber devices
> >> too.
> >>
> >> -Tim
> >>
> >> On Jan 12, 2005, at 8:12 AM, Kathryn Hemness wrote:
> >>
> >>> I've also googled and found that article.  I'm definitely using the
> >>> veritas drivers.
> >>>
> >>>
> >>> On Tue, 11 Jan 2005, Chapman, Scott wrote:
> >>>
> >>>> Date: Tue, 11 Jan 2005 20:07:10 -0800
> >>>> From: "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> >>>> To: 'Kathryn Hemness' <kfhemness AT ucdavis DOT edu>
> >>>> Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov.
> >>>> 2004)
> >>>>
> >>>> Kathryn, I did a google search with "External event caused rewind during"
> >>>> and the first hit mentions something from your logs:
> >>>> From the 5.0mp2 patch release . . .
> >>>>    "For a checkpoint restart backup, in the rare case where bptm detects
> >>>>    that a position check problem occurred following a checkpoint because
> >>>>    of a misconfigured drive or a rewind from an external source, and the
> >>>>    backup is later resumed, the information on the tape prior to the
> >>>>    checkpoint may be invalid.
> >>>>
> >>>>    The bptm log would indicate the position check problem with one of the
> >>>>    following logs after a checkpoint:
> >>>>
> >>>>    08:39:57.969 [4393] <16> write_data: FREEZING media id 00011, too many
> >>>>    data blocks written, check tape/driver block size configuration
> >>>>
> >>>>    OR
> >>>>
> >>>>    log.041204:14:39:12.373 [6416] <16> write_data: FREEZING media id 00005,
> >>>>    External event caused rewind during write, all data on media is lost
> >>>>    <<<< here is what your logs reflect also
> >>>>
> >>>>    The problem would occur if the same backup were resumed and completed
> >>>>    with a successful status."
> >>>>
> >>>> The one thing in this piece of information is that they mention a
> >>>> misconfigured drive.
> >>>>
> >>>> Question 1) Do you have the latest st driver patch installed on the
> >>>> backup server?
> >>>> Question 2) You are using the Veritas tape drivers, right?  This is very
> >>>> important, as there don't seem to be many people having luck with
> >>>> non-veritas drivers.
> >>>>
> >>>> Here is the google search:
> >>>> http://www.google.ca/search?hl=en&q=%22External+event+caused+rewind+during%22&meta=
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Kathryn Hemness [mailto:kfhemness AT ucdavis DOT edu]
> >>>> Sent: Tuesday, January 11, 2005 4:36 PM
> >>>> To: veritas-bu AT mailman.eng.auburn DOT edu
> >>>> Cc: scott.chapman AT icbc DOT com; song_1977 AT yahoo DOT com
> >>>> Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov.
> >>>> 2004)
> >>>>
> >>>>
> >>>> Greetings --
> >>>>
> >>>> I ran a successful backup using the
> >>>> /opt/openv/netbackup/db/config/NO_POSITION_CHECK
> >>>> setting suggested by Scott Chapman.
> >>>>
> >>>> Then I google'd for NO_POSITION_CHECK and found the following Veritas
> >>>> Support patch readme which had a good explanation for the behavior I'm
> >>>> seeing:
> >>>>
> >>>> http://seer.support.veritas.com/docs/246368.htm
> >>>>
> >>>> What's really funny is that this readme is for NB3.4 in 2002.
> >>>>
> >>>> Now that I know the cause of the problem, I need to determine a solution
> >>>> which will enable me to use the checkpoint restart feature of
> >>>> NetBackup 5.1.
> >>>>
> >>>> I welcome any suggestions.  I'm hoping there are easy Solaris or LSI Logic
> >>>> HBA commands for the final solution.
> >>>>
> >>>> On Tue, 11 Jan 2005, Chapman, Scott wrote:
> >>>>
> >>>>> Date: Tue, 11 Jan 2005 15:37:29 -0800
> >>>>> From: "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> >>>>> To: 'Kathryn Hemness' <kfhemness AT ucdavis DOT edu>
> >>>>> Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov.
> >>>>> 2004)
> >>>>>
> >>>>>
> >>>>> I am running 4.5fp5 and 5.0 at a different site.  You aren't running
> >>>>> the IBM driver for the tape drives are you?  I know that has caused
> >>>>> some problems for people.
> >>>>>
> >>>>> What does "sgscan -v conf" show?  When I run that it confirms that the
> >>>>> drive config does not come from the st.conf by putting
> >>>>> "NOT-IN-ST-CONFIG-FILE" at the end of each tape drive line . . .
> >>>>>
> >>>>> Scott Chapman
> >>>>> ICBC - Victoria, Government St.
> >>>>> Phone: 250.414.7650  Cell: 250.213.9295
> >>>>>
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Kathryn Hemness [mailto:kfhemness AT ucdavis DOT edu]
> >>>>> Sent: Tuesday, January 11, 2005 3:13 PM
> >>>>> To: K Chapman
> >>>>> Cc: Chapman, Scott; veritas-bu AT mailman.eng.auburn DOT edu;
> >>>>> song_1977 AT yahoo DOT com
> >>>>> Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov.
> >>>>> 2004)
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> Turning off checkpoints was something I did early in my
> >>>>> troubleshooting attempts.
> >>>>>
> >>>>> I've just turned off a couple of Solaris storage management daemons
> >>>>> (ssdgrptd and ssagent) on my server and am running another test backup
> >>>>> now.  It should finish in about 15 more minutes.
> >>>>>
> >>>>> I'll try the NO_POSITION_CHECK after this test finishes and let you
> >>>>> know what happens.
> >>>>>
> >>>>>
> >>>>> On Tue, 11 Jan 2005, K Chapman wrote:
> >>>>>
> >>>>>> Date: Tue, 11 Jan 2005 14:54:31 -0800 (PST)
> >>>>>> From: K Chapman <tech2187 AT yahoo DOT com>
> >>>>>> To: Kathryn Hemness <kfhemness AT ucdavis DOT edu>,
> >>>>>>      "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> >>>>>> Cc: veritas-bu AT mailman.eng.auburn DOT edu, song_1977 AT yahoo DOT com
> >>>>>> Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> >>>>>>
> >>>>>> as a test, can you try with the position check turned
> >>>>>> off?
> >>>>>>
> >>>>>> touch /opt/openv/netbackup/db/config/NO_POSITION_CHECK
> >>>>>>
> >>>>>> --- Kathryn Hemness <kfhemness AT ucdavis DOT edu> wrote:
> >>>>>>
> >>>>>>> Hi, Scott -
> >>>>>>>
> >>>>>>> Here's the output of my sgscan -v:
> >>>>>>>
> >>>>>>> /dev/sg/c0t3l2: Tape (/dev/rmt/0): "IBM     ULTRIUM-TD2     4770"
> >>>>>>> /dev/sg/c0t3l3: Tape (/dev/rmt/1): "IBM     ULTRIUM-TD2     4770"
> >>>>>>> /dev/sg/c0t3l4: Tape (/dev/rmt/2): "IBM     ULTRIUM-TD2     4770"
> >>>>>>>
> >>>>>>> We got the library in October.  The drives should be
> >>>>>>> at the current FW level.
> >>>>>>>
> >>>>>>> Are you using NB51?
> >>>>>>>
> >>>>>>> On Tue, 11 Jan 2005, Chapman, Scott wrote:
> >>>>>>>
> >>>>>>>> Date: Tue, 11 Jan 2005 10:42:36 -0800
> >>>>>>>> From: "Chapman, Scott" <Scott.Chapman AT icbc DOT com>
> >>>>>>>> To: 'Kathryn Hemness' <kfhemness AT ucdavis DOT edu>,
> >>>>>>>>      veritas-bu AT mailman.eng.auburn DOT edu
> >>>>>>>> Cc: song_1977 AT yahoo DOT com
> >>>>>>>> Subject: RE: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> >>>>>>>>
> >>>>>>>> Kathryn, are you running current firmware on the LTO2 drives?  I seem to
> >>>>>>>> remember something about old firmware doing rewinds before netbackup was
> >>>>>>>> done with the drive . . .
> >>>>>>>> From your logs:
> >>>>>>>> 01/10/2005 13:48:50 albus.ucdavis.edu albus.ucdavis.edu  FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> >>>>>>>>
> >>>>>>>> I am running IBM drives (we don't use the LSI logic HBA's) and here is some
> >>>>>>>> output from sgscan -v conf:
> >>>>>>>> /dev/sg/c2t0l0: Tape (/dev/rmt/0): "IBM     ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
> >>>>>>>> /dev/sg/c2t1l0: Tape (/dev/rmt/1): "IBM     ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
> >>>>>>>> /dev/sg/c2t2l0: Tape (/dev/rmt/2): "IBM     ULTRIUM-TD2     38D0" : NOT-IN-ST-CONFIG-FILE
> >>>>>>>> ...
> >>>>>>>>
> >>>>>>>> I don't have anything in the st.conf for the drives as they were added
> >>>>>>>> to the st driver several patches ago.  You might check your st patch level
> >>>>>>>> as well . . .
> >>>>>>>>
> >>>>>>>> Hope this helps.
> >>>>>>>>
> >>>>>>>> Scott Chapman
> >>>>>>>> ICBC - Victoria, Government St.
> >>>>>>>> Phone: 250.414.7650  Cell: 250.213.9295
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Kathryn Hemness [mailto:kfhemness AT ucdavis DOT edu]
> >>>>>>>> Sent: Tuesday, January 11, 2005 10:02 AM
> >>>>>>>> To: veritas-bu AT mailman.eng.auburn DOT edu
> >>>>>>>> Cc: song_1977 AT yahoo DOT com
> >>>>>>>> Subject: [Veritas-bu] RE:Veritas-bu] End of Tape (from Nov. 2004)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Good Morning --
> >>>>>>>>
> >>>>>>>> Was there ever a resolution to your NB5.0MP2/LTO end of tape problem?
> >>>>>>>>
> >>>>>>>> I'm currently fighting with a new installation of NB5.1 on a Solaris 9
> >>>>>>>> system using LTO2 tape drives.  My backups ALWAYS fail either at a
> >>>>>>>> checkpoint-restart WRITE or at the very last WRITE of the backup,
> >>>>>>>> regardless of how big the backup is.
> >>>>>>>>
> >>>>>>>> I've been told by my NetBackup tech support (via Sun) that it was a
> >>>>>>>> hardware configuration problem.
> >>>>>>>>
> >>>>>>>> The backups always fail, regardless of any st.conf modifications, and
> >>>>>>>> I've even taken the fiber switch out of the mix.  Here's a summary of my
> >>>>>>>> hardware and the types of errors I'm seeing (by the way, ufsdump works
> >>>>>>>> just fine....).
> >>>>>>>>
> >>>>>>>> Master: Solaris 9 version 4/04 on a Sun V240 with 2 LSI Logic FC919X
> >>>>>>>> HBAs running NB5.1 Enterprise Server.  One LSI Logic HBA is connected
> >>>>>>>> directly to the fiber/scsi bridge of a Qualstar 88264 LTO2 library, the
> >>>>>>>> other to a Brocade 32-port fiber switch attached to a Sun 3511 storage
> >>>>>>>> array.
> >>>>>>>>
> >>>>>>>> I have tried at least 4 different st.conf LTO2 configurations with the
> >>>>>>>> same failing results and am now not using any special LTO2 definitions.
> >>>>>>>>
> >>>>>>>> Here are the failure errors from both the NetBackup reports and from the
> >>>>>>>> bptm logs:
> >>>>>>>>
> >>>>>>>> 01/10/2005 13:48:50 albus.ucdavis.edu albus.ucdavis.edu  FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> >>>>>>>> 01/10/2005 13:48:54 albus.ucdavis.edu albus.ucdavis.edu  CLIENT albus.ucdavis.edu  POLICY IR-ISM_02  SCHED WeeklyFull  EXIT STATUS 84 (media write error)
> >>>>>>>> 01/10/2005 13:48:54 albus.ucdavis.edu albus.ucdavis.edu  backup of client albus.ucdavis.edu exited with status 84 (media write error)
> >>>>>>>>
> >>>>>>>> Here's the bptm log entry for the above error:
> >>>>>>>>
> >>>>>>>> 13:48:48.032 [1297] <2> write_backup: tp.tv_sec = 1105393728, stp.tv_sec = 1105391634, tp.tv_usec = 27455, stp.tv_usec = 544901, et = 2093483, mpx_total_kbytes[TWIN_INDEX = 0] = 21261376
> >>>>>>>> 13:48:48.075 [1297] <2> io_terminate_tape: writing empty backup header, drive index 0, copy 1
> >>>>>>>> 13:48:48.091 [1297] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.7919) on drive index 0
> >>>>>>>> 13:48:48.645 [1297] <2> io_write_back_header: drive index 0, empty_file, file num = 2, mpx_headers = 0, copy 1
> >>>>>>>> 13:48:48.650 [1297] <2> io_close: closing /usr/openv/netbackup/db/media/tpreq/040004, from bptm.c.8046
> >>>>>>>> 13:48:50.848 [1297] <2> io_terminate_tape: absolute block position prior to writing empty header is 332201, copy 1
> >>>>>>>> 13:48:50.848 [1297] <2> io_terminate_tape: block position check: actual 332201, expected 332213
> >>>>>>>> 13:48:50.848 [1297] <2> set_job_details: Sending Tfile jobid (907)
> >>>>>>>> 13:48:50.848 [1297] <2> set_job_details: LOG 1105393730 16 bptm 1297 FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> >>>>>>>>
> >>>>>>>> 13:48:50.848 [1297] <2> set_job_details: Done
> >>>>>>>> 13:48:50.880 [1297] <16> io_terminate_tape: FREEZING media id 040004, External event caused rewind during write, all data on media is lost
> >>>>>>>> 13:48:50.898 [1297] <2> log_media_error: successfully wrote to error file - 01/10/05 13:48:50 040004 0 WRITE_ERROR
> >>>>>>>> 13:48:50.910 [1297] <2> check_error_history: called from bptm line 17870, EXIT_Status = 84
> >>>>>>>> 13:48:50.911 [1297] <2> check_error_history: drive index = 0, media id = 040004, time = 01/10/05 13:48:50, both_match = 0, media_match = 0, drive_match = 0
> >>>>>>>> 13:48:50.911 [1297] <2> tpunmount: Check_for_waiting = 0, No_tpunmount_after_restore = 0, Media_Unmount_Delay = 0, MediaOffset = 4
> >>>>>>>> 13:48:50.911 [1297] <2> tpunmount: tpunmount'ing /usr/openv/netbackup/db/media/tpreq/040004
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Since ufsdump works, this is indicating a NetBackup 5.1 problem.  Anyway,
> >>>>>>>> I notice in your post-November posts, you referred to NB4.5 servers.  Did
> >>>>>>>> you have to downgrade NetBackup in order to get your LTO drives to work
> >>>>>>>> properly?
> >>>>>>>
> >>>>>> === message truncated ===
> >>>>>>
> >>>>>>
> >>>>>> =====
> >>>>>> aaarrrggghhh!!!!
> >>>>>> FreeBSD rocks
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> --kathy
> >>>>>
> >>>>> ======================================================================
> >>>>> Kathryn Hemness                        kfhemness AT ucdavis DOT edu
> >>>>> System Administrator                   phone: 530.752.6547
> >>>>> Campus Data Center & Client Services   fax:   530.752.9154
> >>>>>
> >>>>
> >>>> --kathy
> >>>>
> >>>> ======================================================================
> >>>> Kathryn Hemness                        kfhemness AT ucdavis DOT edu
> >>>> System Administrator                   phone: 530.752.6547
> >>>> Campus Data Center & Client Services   fax:   530.752.9154
> >>>>
> >>>
> >>> --kathy
> >>>
> >>> ======================================================================
> >>> Kathryn Hemness                        kfhemness AT ucdavis DOT edu
> >>> System Administrator                   phone: 530.752.6547
> >>> Campus Data Center & Client Services   fax:   530.752.9154
> >>> _______________________________________________
> >>> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> >>> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
> >>>
> >>
> >
> > --kathy
> >
> > ======================================================================
> > Kathryn Hemness                        kfhemness AT ucdavis DOT edu
> > System Administrator                   phone: 530.752.6547
> > Campus Data Center & Client Services   fax:   530.752.9154
> >
>

--kathy

===============================================================================
Kathryn Hemness                        kfhemness AT ucdavis DOT edu
System Administrator                   phone: 530.752.6547
Campus Data Center & Client Services   fax:   530.752.9154