[Veritas-bu] EXITING with status 84

2002-11-05 22:57:33
Subject: [Veritas-bu] EXITING with status 84
From: gdrew AT deathstar DOT org (George Drew)
Date: Tue, 5 Nov 2002 22:57:33 -0500 (EST)
Actually, the error is spelled out for you. "External event caused rewind"
is telling you that something other than NetBackup caused this tape to be
rewound. You need to ensure that NetBackup has exclusive access to the
device for the duration of a backup. In a SAN this can be a problem, as
many machines may have visibility to the drives.
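
If you want to see how often this is happening, here is a rough Python
sketch that scans bptm log text for the "block position check" message
shown in the quoted log below and reports any case where the actual
position differs from the expected one. The function name and the regular
expression are my own and assume the message format seen in that excerpt;
other NetBackup versions may word it differently.

```python
import re

# Matches the "block position check" message from the quoted bptm log.
# \s+ tolerates the line wrapping that mail clients add to long log lines.
# Assumption: the message format is the one seen in this thread's excerpt.
BLOCK_CHECK = re.compile(
    r"block position check:\s+actual\s+(\d+),\s+expected\s+(\d+)"
)


def find_position_mismatches(log_text):
    """Return (actual, expected) pairs where the two positions differ."""
    mismatches = []
    for match in BLOCK_CHECK.finditer(log_text):
        actual = int(match.group(1))
        expected = int(match.group(2))
        if actual != expected:
            mismatches.append((actual, expected))
    return mismatches


# Sample taken from the log excerpt in this thread (quote markers removed).
sample = (
    "14:17:32 [13089] <2> write_backup: block position check: actual 0, expected\n"
    "4280668319\n"
)
print(find_position_mismatches(sample))
```

An actual position of 0 in the middle of a long write means the tape is
back at the beginning, which is what you would see if something else
rewound it.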

You need to do a couple of things. First, turn off all external monitoring
software. NetBackup keeps pretty close track of the status of the drives,
and anything else would be redundant.

To ensure exclusive access you have a couple of options. You can configure
the operating system on every machine that sees these drives to set a
reservation every time the device is opened. In Solaris, this involves
configuring /kernel/drv/st.conf. `man st` will give you a list of the
settings; the relevant options bit is 0x20000, which should not be set.
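
As a quick way to check an options value from st.conf, here is a small
Python sketch that tests whether the 0x20000 bit is set. The two sample
values are made up for illustration and are not taken from any real
st.conf entry. I believe `man st` lists this bit as ST_NO_RESERVE_RELEASE,
but verify that name against the man page on your own release.

```python
# Options bit named in this message. The constant name is my assumption;
# check `man st` on your Solaris release for the official flag name.
NO_RESERVE_BIT = 0x20000


def reserves_on_open(options):
    """Return True if the 0x20000 bit is clear in the given options value."""
    return (options & NO_RESERVE_BIT) == 0


# Hypothetical options values, for illustration only.
print(reserves_on_open(0x1863D))  # True: bit 0x20000 is clear
print(reserves_on_open(0x3863D))  # False: bit 0x20000 is set
```

If this returns False for the options value in your drive's entry, that is
the bit to clear.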

The second option is to let NetBackup handle setting the reservation.
To do this, apply a patch later than 34_3 to all master/media servers in
your environment, and touch /usr/openv/netbackup/db/config/ENABLE_SCSI_RESERVE
on every machine that touches the drives. There are some caveats with
this approach, so read the README for the patch first.
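
If you would rather script the touch step than run it by hand on each
server, a minimal Python sketch follows. The function name and the
config_dir parameter are my own; the default path is the one given above.
It only creates the empty file and does nothing about the patch itself.

```python
from pathlib import Path


def enable_scsi_reserve(config_dir="/usr/openv/netbackup/db/config"):
    """Create the empty ENABLE_SCSI_RESERVE touch file and return its path.

    config_dir is a parameter only so the function can be tried against a
    scratch directory first; the default is the path from this message.
    """
    flag = Path(config_dir) / "ENABLE_SCSI_RESERVE"
    flag.touch(exist_ok=True)  # same effect as `touch`; leaves contents alone
    return flag


# Example (run as a user with write access to the config directory):
#     enable_scsi_reserve()
```

This has to be run on every machine that touches the drives, the same as
the manual touch.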

George


On Tue, 5 Nov 2002, Steven L. Sesar wrote:

> Most of the time when I see these, it indicates a SCSI error on one of our
> drives. Do you see anything suspect in /var/adm/messages? grep for the string
> "sense key" and see if it returns anything.
>
>
>
>
> On Tuesday 05 November 2002 06:47 pm, Duncan Boccio wrote:
> > Hello Netbackup Gurus,
> >
> > I'm a Legato Networker admin who has been drafted to help
> > fix a Netbackup problem here at my company. My knowledge
> > of and experience with Netbackup is very limited so please
> > bear with me and my ignorance of this product. I have a situation
> > where backups will start and happily back up for a while and then exit
> > with a write error. The environment is:
> >
> > Netbackup 3.4 (no patches as far as I can tell)
> > STK 9710  library with 9840 drives
> > OS is Solaris 7
> > E420R server
> > Tape drives connected to the server via Crosspoint 4200
> >       Fiber channel to SCSI routers
> >
> > Below is an excerpt from the bptm log that starts with the line that says
> > it's starting to write through to the point where it exits with a write
> > error. If anyone can shed some light on what is happening I'd appreciate
> > it. The tapes seem to be randomly rewinding for some reason. I have read
> > that the error I'm seeing can be caused when multiple servers are sharing
> > the drives in a SAN or SSO environment but I don't believe this is the
> > problem. The only thing the server is trying to back up is itself and the
> > only apps that are using the drives are Netbackup and Storage Migrator.
> >
> > Thanks a lot,
> > Duncan
> >
> >
> > 13:06:30 [13089] <2> write_data: received first buffer (262144 bytes),
> > begin writing data
> > 14:17:32 [13089] <2> write_backup: write_data() returned, exit_status = 0,
> > CINDEX = 0, backup_status = -3
> > 14:17:32 [13089] <2> write_backup: tp.tv_sec = 1036448252, stp.tv_sec =
> > 1036443990, tp.tv_usec = 321332, stp.tv_usec = 809594, et = 4261512,
> > mpx_total_kbytes = 35708416
> > 14:17:32 [13089] <2> signal_parent: sending SIGUSR1 to bpbrm (pid = 13085)
> > 14:17:32 [13089] <2> io_close: closing
> > /usr/openv/netbackup/db/media/tpreq/HF0231, from bptm.c.13083
> > 14:17:32 [13089] <2> write_backup: block position check: actual 0, expected
> > 4280668319
> > 14:17:32 [13089] <2> getsockconnected: host=merlin.incyte.com service=bpdbm
> > address=10.33.0.54 protocol=tcp non-reserved port=13721
> > 14:17:32 [13089] <2> bind_on_port_addr: bound to port 46085
> > 14:17:32 [13089] <2> check_authentication: no authentication required
> > 14:17:33 [13089] <16> write_backup: FREEZING media id HF0231, External
> > event caused rewind during write, all data on media is lost
> > 14:17:33 [13089] <2> log_media_error: successfully wrote to error file -
> > 11/04/02 14:17:33 HF0231 7 WRITE_ERROR
> > 14:17:33 [13089] <2> check_error_history: called from bptm line 13122,
> > EXIT_Status = 84
> > 14:17:34 [13089] <2> check_error_history: drive index = 7, media id =
> > HF0231, time = 11/04/02 14:17:33, both_match = 0, media_match = 0,
> > drive_match = 0
> > 14:17:34 [13089] <2> tpunmount: tpunmount'ing
> > /usr/openv/netbackup/db/media/tpreq/HF0231
> > 14:17:34 [13089] <2> TpUnmountWrapper: SCSI RELEASE
> > 14:17:34 [13089] <2> bptm: EXITING with status 84 <----------
> >
> >
> >
> > _______________________________________________
> > Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> > http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu