[Veritas-bu] EXITING with status 84

On Tuesday 05 November 2002 10:57 pm, George Drew wrote:
> Actually, the error is spelled out for you. "external event caused rewind"
> is telling you that something other than netbackup caused this tape to be
> rewound. You need to ensure that netbackup has exclusive access to the
> device during the course of a backup. In a SAN this can be a problem, as
> many machines may have visibility to the drives.
>
> You need to do a couple of things. First, turn off all external monitoring
> software. Netbackup keeps pretty close track of the status of the drives,
> and anything else would be redundant.
>
> To ensure exclusive access you have a couple of options. You can configure
> the operating system on every machine that sees these drives to set a
> reservation every time the device is opened. In Solaris, this involves
> configuring /kernel/drv/st.conf. `man st` will give you a list of the
> settings, and the relevant options bit is 0x20000 (this should not be set).
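To check whether that bit is set in a given options value, a bitwise AND is enough. Below is a minimal Python sketch. The constant name, the function names, and the sample value are made up for illustration and are not taken from any real st.conf entry.

```python
# The options bit mentioned above. The constant name is hypothetical.
OPTIONS_BIT = 0x20000


def options_bit_set(options: int) -> bool:
    """Return True if the 0x20000 bit is set in an st.conf options value."""
    return bool(options & OPTIONS_BIT)


def clear_options_bit(options: int) -> int:
    """Return the same options value with the 0x20000 bit cleared."""
    return options & ~OPTIONS_BIT


if __name__ == "__main__":
    sample = 0x21639  # hypothetical options value, for illustration only
    print(hex(sample), "bit set:", options_bit_set(sample))
    print("with bit cleared:", hex(clear_options_bit(sample)))
```

If the bit turns out to be set for your drive entry, the cleared value is what would go back into the options field, per the advice above that it should not be set.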
>
> The second option is to let netbackup handle setting the reservation.
> To do this, apply a patch later than 34_3 to all master/media servers in
> your environment, and touch /usr/openv/netbackup/db/config/ENABLE_SCSI_RESERVE
> on every machine that touches the drives. There are some caveats with
> this approach; read the README for the patch.
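The touch step can be scripted if there are many machines to visit. A minimal Python sketch follows, assuming the path quoted above. The `root` parameter and the function name are illustrative additions so the sketch can be tried against a scratch directory first, and it deliberately refuses to create the config directory if it is missing.

```python
from pathlib import Path

# Relative form of the path given in the message above.
TOUCH_FILE = Path("usr/openv/netbackup/db/config/ENABLE_SCSI_RESERVE")


def enable_scsi_reserve(root: Path = Path("/")) -> Path:
    """Create the empty ENABLE_SCSI_RESERVE touch file under root.

    Raises FileNotFoundError if the config directory does not exist,
    rather than creating a directory tree on a machine without NetBackup.
    """
    target = root / TOUCH_FILE
    if not target.parent.is_dir():
        raise FileNotFoundError(f"missing config directory: {target.parent}")
    target.touch(exist_ok=True)  # same effect as the touch command
    return target
```

Calling `enable_scsi_reserve()` with no argument acts on the real path and needs the same privileges as running touch by hand. It does not replace reading the patch README.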
>
> George
>
> On Tue, 5 Nov 2002, Steven L. Sesar wrote:
> > Most of the time when I see these, it indicates a SCSI error on one of our
> > drives. Do you see anything suspect in /var/adm/messages? grep for the
> > string "sense key" and see if it returns anything.
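The same search can be done from a script, which helps when several hosts need checking. This is a minimal Python sketch of it. The function names are illustrative, and the case-insensitive default (like `grep -i`) is an assumption in case the driver capitalizes the phrase.

```python
from typing import Iterable, List


def find_sense_key_lines(lines: Iterable[str],
                         needle: str = "sense key",
                         ignore_case: bool = True) -> List[str]:
    """Return the lines that contain needle, with trailing newlines removed."""
    if ignore_case:
        needle = needle.lower()
    hits = []
    for line in lines:
        haystack = line.lower() if ignore_case else line
        if needle in haystack:
            hits.append(line.rstrip("\n"))
    return hits


def scan_messages(path: str = "/var/adm/messages") -> List[str]:
    """Scan a syslog-style file for sense key lines."""
    # errors="replace" keeps stray non-UTF-8 bytes from aborting the scan.
    with open(path, errors="replace") as fh:
        return find_sense_key_lines(fh)
```

An empty result does not rule out a hardware problem. It only means nothing matching was logged in that file.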
> >
> > On Tuesday 05 November 2002 06:47 pm, Duncan Boccio wrote:
> > > Hello Netbackup Gurus,
> > >
> > > > I'm a Legato Networker admin who has been drafted to help
> > > > fix a Netbackup problem here at my company. My knowledge
> > > > of and experience with Netbackup is very limited so please
> > > > bear with me and my ignorance of this product. I have a situation
> > > > where backups will start and happily back up for a while and then exit
> > > > with a write error. The environment is:
> > >
> > > Netbackup 3.4 (no patches as far as I can tell)
> > > STK 9710  library with 9840 drives
> > > OS is Solaris 7
> > > E420R server
> > > Tape drives connected to the server via Crosspoint 4200
> > >       Fiber channel to SCSI routers
> > >
> > > > Below is an excerpt from the bptm log, from the line where it starts
> > > > writing through to the point where it exits with a write error. If
> > > > anyone can shed some light on what is happening I'd appreciate it. The
> > > > tapes seem to be randomly rewinding for some reason. I have read that
> > > > the error I'm seeing can be caused when multiple servers are sharing
> > > > the drives in a SAN or SSO environment, but I don't believe this is the
> > > > problem. The only thing the server is trying to back up is itself, and
> > > > the only apps that are using the drives are Netbackup and Storage
> > > > Migrator.
> > >
> > > > Thanks a lot,
> > > Duncan
> > >
> > >
> > > 13:06:30 [13089] <2> write_data: received first buffer (262144 bytes),
> > > begin writing data
> > > 14:17:32 [13089] <2> write_backup: write_data() returned, exit_status =
> > > 0, CINDEX = 0, backup_status = -3
> > > 14:17:32 [13089] <2> write_backup: tp.tv_sec = 1036448252, stp.tv_sec =
> > > 1036443990, tp.tv_usec = 321332, stp.tv_usec = 809594, et = 4261512,
> > > mpx_total_kbytes = 35708416
> > > 14:17:32 [13089] <2> signal_parent: sending SIGUSR1 to bpbrm (pid =
> > > 13085) 14:17:32 [13089] <2> io_close: closing
> > > /usr/openv/netbackup/db/media/tpreq/HF0231, from bptm.c.13083
> > > 14:17:32 [13089] <2> write_backup: block position check: actual 0,
> > > expected 4280668319
> > > 14:17:32 [13089] <2> getsockconnected: host=merlin.incyte.com
> > > service=bpdbm address=10.33.0.54 protocol=tcp non-reserved port=13721
> > > 14:17:32 [13089] <2> bind_on_port_addr: bound to port 46085
> > > 14:17:32 [13089] <2> check_authentication: no authentication required
> > > 14:17:33 [13089] <16> write_backup: FREEZING media id HF0231, External
> > > event caused rewind during write, all data on media is lost
> > > 14:17:33 [13089] <2> log_media_error: successfully wrote to error file
> > > - 11/04/02 14:17:33 HF0231 7 WRITE_ERROR
> > > 14:17:33 [13089] <2> check_error_history: called from bptm line 13122,
> > > EXIT_Status = 84
> > > 14:17:34 [13089] <2> check_error_history: drive index = 7, media id =
> > > HF0231, time = 11/04/02 14:17:33, both_match = 0, media_match = 0,
> > > drive_match = 0 14:17:34 [13089] <2> tpunmount: tpunmount'ing
> > > /usr/openv/netbackup/db/media/tpreq/HF0231
> > > 14:17:34 [13089] <2> TpUnmountWrapper: SCSI RELEASE
> > > 14:17:34 [13089] <2> bptm: EXITING with status 84 <----------
> > >
> > >
> > >
> > > _______________________________________________
> > > Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> > > http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
> >
> > _______________________________________________
> > Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> > http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>
> _______________________________________________
> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu