[Veritas-bu] Media Open error (83) having upgraded the OS

2003-07-17 06:47:43
Subject: [Veritas-bu] Media Open error (83) having upgraded the OS
From: paul.redman AT hp DOT com (Redman, Paul)
Date: Thu, 17 Jul 2003 11:47:43 +0100
Folks,

I think the word of the day is .... "hurrumph".

I followed Daniel's advice and removed/re-installed the device files; btw, 
here's the URL at SunSolve (remember to log in though, I couldn't find the 
InfoDoc when roaming as an anonymous person) :-

http://uk.sunsolve.sun.com/private-cgi/retrieve.pl?doc=infodoc%2F18579&zone_110=18579&wholewords=on

I fired off a bpbackup from cli before leaving the office and it completed 
successfully.  So I danced and skipped away from work to enjoy a night out in 
London, safe in the knowledge that all was well in the world.  Except ....

Last night's first scheduled class worked okay, but the second failed with the 
media open error.  My initial disappointment evaporated when I realised I 
hadn't refreshed the netbackup daemons after the device file re-install.  
However, a stop and re-start of the daemons hasn't made a jot of difference; I 
still can't get a class to work this morning.

As per David's reply, here is some juicy output from bpsched and bptm :-

========
bpsched
========
09:00:10 [28772] <2> start_bpbrm: /usr/openv/netbackup/bin/bpbrm bpbrm -backup 
-mt 2 -to 0 -S plonker -c apptest -hostname apptest -ru root -cl 
TFO_sys_apptest -sched daily -bt 1058428808 -dt 0 -st 0 -secure 1 -rl 1 -rp 
1209600 -b apptest_1058428808 -mediasvr plonker -jobid 134 -maxfrag 0 -v -D 13 
-rt 0 -rn -1 -stunit robot1 -cj 1 -pool NetBackup-kl 28 -fso -ct 0
09:00:10 [28772] <2> start_bpbrm: Received BPCD success message
09:00:10 [28772] <2> recv_runQ_msg: msgrcv(nodelay) stat -1  errno 35 (No 
message of desired type).  sigcld=0 sigalrm=0 sigusr1=0 sigusr2=0
09:00:10 [28772] <8> read_bpbrm_stderr: PID of bpbrm = 28780
09:00:11 [28772] <8> read_bpbrm_stderr: CONNECTING TO CLIENT FOR 
apptest_1058428808
09:00:11 [28772] <8> read_bpbrm_stderr: CONNECTED TO CLIENT FOR 
apptest_1058428808
09:00:13 [28772] <8> read_bpbrm_stderr: MOUNT CCT937
09:00:15 [28772] <8> read_bpbrm_stderr: POSITION CCT937 3
09:01:31 [28772] <2> readstring: EXIT STATUS 83
09:01:31 [28772] <4> read_bpbrm_stderr: bpbrm exit status 83 for birthtime 0
09:01:31 [28772] <2> clean_up_all: all active backups exiting with 83
09:01:31 [28772] <2> send_runQ_msg: dpid=1 spid=28772 rqtyp=14 status=83 
birth=0 stu=NULL cls=NULL clnt=NULL sched=NULL
09:01:31 [28744] <2> recv_reqQ_msg: msgrcv(delay) interrupted by signal --> 
sigcld=0 sigalrm=0 sigusr1=1 sigusr2=0
09:01:31 [28744] <2> recv_runQ_msg: msgrcv(nodelay) stat 532  errno 0 (Error 
0).  sigcld=0 sigalrm=0 sigusr1=1 sigusr2=0
09:01:31 [28744] <4> process_job_complete: clientjob with birthtime 1058428808 
completed with status 83
09:01:31 [28744] <4> process_job_complete: clientjob with birthtime of 
1058428808 was found in active list
09:01:31 [28744] <4> ?: ------------------------------------
09:01:31 [28744] <4> ?: <<  CLIENT: apptest (root) BIRTH: 1058428808  PRIORITY: 0
09:01:31 [28744] <4> ?:     CLASS: TFO_sys_apptest  SCHED: daily(1)   
09:01:31 [28744] <4> ?:     STUNIT: robot1 (plonker)
09:01:31 [28744] <4> ?:     PID: 28772  STATUS: 83  TRIES: 1  SPID: 28718 (REG)
09:01:31 [28744] <4> ?: ------------------------------------
09:01:31 [28744] <4> ?: apptest exited with status 83 (media open error)
09:01:32 [28744] <2> ?: /usr/openv/netbackup/bin/backup_exit_notify apptest 
TFO_sys_apptest daily FULL 83 0 &
09:01:32 [28744] <2> add_to_failure_history: successfully wrote to error 
history file--> 07/17/03 09:01:32 apptest TFO_sys_apptest daily 83 *NULL* 0 
1058428892
09:01:32 [28744] <2> get_failure_count: 1 matches in past 12 hours for 
apptest/TFO_sys_apptest/daily
09:01:32 [28744] <16> log_in_errorDB: backup of client apptest exited with 
status 83 (media open error)
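
For anyone who would rather not eyeball the bpsched output, here is a rough Python sketch that pulls out the job-complete line and turns the birthtime into a readable UTC time. It is not a NetBackup tool; the function name `job_results` and the regular expression are assumptions based purely on the lines quoted above.

```python
import re
from datetime import datetime, timezone

# Pattern is an assumption derived from the bpsched lines quoted above, e.g.
# "clientjob with birthtime 1058428808 completed with status 83".
JOB_DONE = re.compile(r"clientjob with birthtime (\d+)\s+completed with status (\d+)")

def job_results(log_text):
    """Return (birthtime_utc, status) pairs found in bpsched debug output."""
    # The mail client wraps long lines, so collapse all whitespace before matching.
    flat = " ".join(log_text.split())
    results = []
    for birth, status in JOB_DONE.findall(flat):
        when = datetime.fromtimestamp(int(birth), tz=timezone.utc)
        results.append((when, int(status)))
    return results

sample = """09:01:31 [28744] <4> process_job_complete: clientjob with birthtime 1058428808
completed with status 83"""

for when, status in job_results(sample):
    print(when.isoformat(), "status", status)
```

The birthtime appears to be a plain Unix epoch value, which is why it lines up with the 09:00 (BST) start of the job.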


======
bptm
======
09:00:11 [28783] <2> bptm: INITIATING: -w -c apptest -den 13 -rt 0 -rn -1 
-stunit robot1 -cl TFO_sys_apptest -bt 1058428808 -b apptest_1058428808 -st 0 
-cj 1 -p NetBackup -ru root -rclnt apptest -rclnthostname apptest -rl 1 -rp 
1209600 -sl daily -ct 0 -v -mediasvr plonker -jobid 134 
09:00:11 [28783] <2> io_set_recvbuf: setting receive network buffer to 32032 
bytes
09:00:11 [28783] <2> io_init: using 32768 data buffer size
09:00:11 [28783] <2> io_init: CINDEX 0, sched Kbytes for monitoring = 10000
09:00:11 [28783] <2> io_init: using 8 data buffers
09:00:11 [28783] <2> io_init: child delay = 20, parent delay = 30 (milliseconds)
09:00:11 [28783] <2> io_init: shm_size = 262340, buffer address = 0xff110000, 
buf control= 0xff150000, ready ptr = 0xff1500c0
09:00:11 [28783] <2> bptm: VSMInit () failed: 2d
09:00:11 [28783] <2> add_to_vmhost_list: added plonker to vmhost list

09:00:13 [28783] <2> vmdb_query_byID_getpool: server returned:  1 CCT937 ------ 
11 -------- -------- 0 -1 NONE --- 0 0 0 0 0 root root 1 NetBackup - 1058370253 
1058370247 1058370255 1058395198 0 0 3 0 0 - 0 0 32 0 0 0 0 0 - - Added by 
NetBackup

09:00:13 [28783] <2> vmdb_query_byID_getpool: server returned:  1 21 NetBackup 
ANYHOST 0 -2 the NetBackup pool

09:00:13 [28783] <2> standalone_select_media: found RVSN CCT937 in device 0
09:00:13 [28783] <2> vmdb_query_byID_getpool: server returned:  1 CCT937 ------ 
11 -------- -------- 0 -1 NONE --- 0 0 0 0 0 root root 1 NetBackup - 1058370253 
1058370247 1058370255 1058395198 0 0 3 0 0 - 0 0 32 0 0 0 0 0 - - Added by 
NetBackup

09:00:13 [28783] <2> vmdb_query_byID_getpool: server returned:  1 21 NetBackup 
ANYHOST 0 -2 the NetBackup pool
09:00:13 [28783] <2> db_byid: search for media id CCT937
09:00:13 [28783] <2> db_byid: CCT937 found at offset 5
09:00:13 [28783] <2> mount_open_media: Waiting for mount of media id CCT937 on 
server plonker.
09:00:13 [28783] <2> io_open: file /usr/openv/netbackup/db/media/tpreq/CCT937 
successfully opened
09:00:13 [28783] <2> io_close: closing 
/usr/openv/netbackup/db/media/tpreq/CCT937, from standalone.c.340
09:00:13 [28783] <2> select_media: selected media id CCT937 for backup, 
apptest(rl = 1)<----------
09:00:13 [28787] <2> fill_buffer: VSMInit () failed: 2d
09:00:13 [28783] <2> write_backup: backup child process is pid 28787
09:00:13 [28783] <2> io_open: file /usr/openv/netbackup/db/media/tpreq/CCT937 
successfully opened
09:00:13 [28783] <2> write_backup: media id CCT937 mounted on drive index 0, 
drivepath /dev/rmt/0cbn, drivename Drive0
09:00:13 [28783] <2> io_read_media_header: drive index 0, reading media header, 
buflen = 32768, buff = 0x3d4d20
09:00:13 [28783] <2> io_ioctl: command (5)MTREW 1 from (bptm.c.4691) on drive 
index 0
09:00:15 [28783] <2> io_ioctl: command (1)MTFSF 1 from (bptm.c.4863) on drive 
index 0
09:00:15 [28783] <2> io_position_for_write: position media id CCT937, current 
number images = 2
09:00:15 [28783] <2> io_close: closing 
/usr/openv/netbackup/db/media/tpreq/CCT937, from bptm.c.3965
09:00:15 [28783] <2> io_position_for_write: locating to absolute block number 
90630
09:01:14 [28783] <2> io_position_for_write: locate block is done
09:01:17 [28783] <2> io_open: retrying open, errno = I/O error
09:01:25 [28783] <2> io_open: retrying open, errno = I/O error
09:01:31 [28783] <16> io_open: cannot open file 
/usr/openv/netbackup/db/media/tpreq/CCT937, I/O error
09:01:31 [28783] <2> log_media_error: successfully wrote to error file - 
07/17/03 09:01:31 CCT937 0 OPEN_ERROR
09:01:31 [28783] <2> check_error_history: called from bptm line 13294, 
EXIT_Status = 83
09:01:31 [28783] <2> tpunmount: tpunmount'ing 
/usr/openv/netbackup/db/media/tpreq/CCT937
09:01:31 [28783] <2> bptm: EXITING with status 83 <----------
09:01:31 [28783] <2> wait_for_sigcld: released SIGCLD, oldmask = 0x0
09:01:32 [28809] <2> bptm: INITIATING: -count -cmd -rt 0 -rn 0 -stunit robot1 
-den 13 -mt 2 
09:01:32 [28809] <2> bptm: EXITING with status 0 <----------
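
To save wading through all of the above, a small Python sketch can rejoin the wrapped lines and list only the entries logged at severity 16 or higher. This assumes that the number in angle brackets is a severity level and that `<16>` marks an error, which is what the output suggests; the names `parse_entries` and `errors` are made up for the illustration.

```python
import re

# Header format assumed from the bptm lines above:
# "HH:MM:SS [pid] <severity> function: message"
HEADER = re.compile(r"^(\d\d:\d\d:\d\d) \[(\d+)\] <(\d+)> (\S+): ?(.*)$")

def parse_entries(log_text):
    """Rejoin wrapped lines and return one dict per log entry."""
    entries = []
    for raw in log_text.splitlines():
        line = raw.rstrip()
        m = HEADER.match(line)
        if m:
            time, pid, sev, func, msg = m.groups()
            entries.append({"time": time, "pid": int(pid), "sev": int(sev),
                            "func": func, "msg": msg})
        elif line and entries:
            # No timestamp header: treat it as the wrapped tail of the previous entry.
            entries[-1]["msg"] += " " + line.strip()
    return entries

def errors(entries, min_sev=16):
    """Keep only entries at or above the given severity."""
    return [e for e in entries if e["sev"] >= min_sev]

sample = """09:01:25 [28783] <2> io_open: retrying open, errno = I/O error
09:01:31 [28783] <16> io_open: cannot open file
/usr/openv/netbackup/db/media/tpreq/CCT937, I/O error
09:01:31 [28783] <2> bptm: EXITING with status 83 <----------"""

for e in errors(parse_entries(sample)):
    print(e["time"], e["func"], e["msg"])
```

Run against the full bptm output, this should leave just the `io_open` failure that follows the locate to block 90630.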


Hopefully I haven't bludgeoned you all with too much output there; I know what 
a pain it can be to look through.  If anyone would like more detailed output to 
analyse, feel free to badger me off-list.

Here are some more interesting points :-

(i) When I put in a different tape that I haven't used previously, it seems to 
back up two or three classes onto it okay; but I cannot back up a fourth one at 
all.  This seems to happen for *ANY* tape I put in.

(ii) Any classes that I do get onto tape can be restored from, no problem.

Further guidance would be very much appreciated.

Super-cheers,
Paul


-----Original Message-----
From: Teklu, Daniel [mailto:daniel.teklu AT tfn DOT com]
Sent: 16 July 2003 15:50
To: 'David Chapa'; Redman, Paul; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Media Open error (83) having upgraded the OS


Paul,

I had the same problem on 3.2 while upgrading from 2.6 to 8.0. I found the
fix on SunSolve (InfoDoc ID 18579). For some reason I can't find the link
now but here are the steps I did after the O/S upgrade:

Remove the rmt devices: rm /dev/rmt/*
Remove the sg devices: rm /dev/sg/*
Remove the sg and sg.conf drivers: rm /kernel/drv/sg ; rm /kernel/drv/sg.conf
Uninstall the sg driver: rem_drv sg
Reconfigure the rmt devices: drvconfig;tapes
Install the sg devices and driver: /usr/openv/volmgr/bin/driver/sg.install

That should do it.
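
The steps above can be sketched as a small Python wrapper that only prints the commands unless told otherwise. The commands are copied from the list as written; the wrapper itself (the `STEPS` list and `run_steps`) is purely illustrative, and actually executing it would need root on the media server.

```python
import subprocess

# Commands copied from the steps above, in the same order; nothing else is added.
STEPS = [
    ("remove the rmt devices", "rm /dev/rmt/*"),
    ("remove the sg devices", "rm /dev/sg/*"),
    ("remove the sg and sg.conf drivers", "rm /kernel/drv/sg ; rm /kernel/drv/sg.conf"),
    ("uninstall the sg driver", "rem_drv sg"),
    ("reconfigure the rmt devices", "drvconfig;tapes"),
    ("install the sg devices and driver", "/usr/openv/volmgr/bin/driver/sg.install"),
]

def run_steps(execute=False):
    """Print each step; run it only when execute=True."""
    for label, cmd in STEPS:
        print(f"# {label}\n{cmd}")
        if execute:
            # shell=True so the globs and ';' behave as they would at a prompt.
            subprocess.run(cmd, shell=True, check=True)

run_steps()  # dry run: prints the commands without touching the system
```

Leaving `execute` off by default is deliberate, since the first three steps delete device files.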

Good Luck.
-Daniel


-----Original Message-----
From: David Chapa [mailto:david.chapa AT adic DOT com]
Sent: Wednesday, July 16, 2003 10:28 AM
To: Redman, Paul; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Media Open error (83) having upgraded the OS


Did you preserve your st.conf file before applying the jumbo patch?

I would start there, since the jumbo patches are notorious for laying
down a "new" copy of the st.conf file.

After re-reading one of the sections of your email...it seems that this
may not be the case.  You mention that manual submissions seem to work,
but scheduled ones do not.  

I would create bpsched, bpdbm and bptm directories to start in
/usr/openv/netbackup/logs to see what information you may get from them. 

I would pay particular attention to bptm and bpsched.
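
As a minimal sketch of the directory setup suggested above, the following Python creates the three debug log directories under a base path. The helper name `make_log_dirs` is made up for the illustration, and the base directory is a parameter so it can be tried somewhere harmless first.

```python
import os

def make_log_dirs(base="/usr/openv/netbackup/logs",
                  names=("bpsched", "bpdbm", "bptm")):
    """Create one debug log directory per daemon name and return the paths."""
    created = []
    for name in names:
        path = os.path.join(base, name)
        # exist_ok=True makes the call safe to repeat.
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created
```

Calling `make_log_dirs()` with no arguments targets the real logs directory and would need to be run as root.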

David Chapa

-----Original Message-----
From: Redman, Paul [mailto:paul.redman AT hp DOT com] 
Sent: Wednesday, July 16, 2003 4:41 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] Media Open error (83) having upgraded the OS

Folks,

Hope all is well today.

Netbackup Business Server 3.2, Solaris 8 on a Sun Ultra-5 system, no MPs
or FPs, just the Solaris jumbo patch for Netbackup (108261-11).

I have been testing 3.2 and the effect of upgrading Solaris from 2.6 to
8 on a test system with a standalone tape drive.  Sun said all I would
need to do having upgraded was to install the jumbo patch mentioned
above.

One thing I did have to do having upgraded/patched the OS was to
re-install the sg driver as it was missing
(/opt/openv/volmgr/bin/driver/sg.install).

Backups are now a fairly unpredictable affair.  I think any manual
backup I kick off via bpadm works successfully, but anything that's
scheduled or run via cli (bpbackup) can fail with an error code of 83
(media open error).

I've searched the Veritas archive and uncovered nothing; the archive for
veritas-bu did have an entry from about 2001, which said to check
classes.  I have .... they're fine.

I'm now running the ltid daemon in verbose mode, but this is as good as
it gets :-

10:28:19 [19382] <2> bptm: INITIATING: -mlist -cmd 
10:28:19 [19382] <2> bptm: EXITING with status 0 <----------
10:29:17 [19358] <2> io_position_for_write: locate block is done
10:29:19 [19358] <2> io_open: retrying open, errno = I/O error
10:29:28 [19358] <2> io_open: retrying open, errno = I/O error
10:29:33 [19358] <16> io_open: cannot open file
/usr/openv/netbackup/db/media/tpreq/DD0036, I/O error
10:29:33 [19358] <2> log_media_error: successfully wrote to error file -
07/16/03 10:29:33 DD0036 0 OPEN_ERROR
10:29:33 [19358] <2> check_error_history: called from bptm line 13294,
EXIT_Status = 83 

The above is a small section, I didn't want to stuff this e-mail with
output.

Have I got some kind of drive configuration problem post-upgrade or do I
need to look at the batch of media that I am using?  Do I need a
Netbackup patch of some kind?

Thank you in advance & have a splendid day,
Paul

> Paul Redman
> Unix Systems Admin - Hilton TFO Helpdesk
> Customer Support & Solutions Centre (CSSC)
> HP Services
> Hewlett Packard
> 
> Tel: External +44 (0)118 916 2075
> Email: Paul.Redman AT hp DOT com
http://www.hp.com



_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
