[Networker] Problems at Networker 6.1

2003-08-22 00:48:48
Subject: [Networker] Problems at Networker 6.1
From: Irwan Kurniawan <Irwan.Kurniawan AT SIEMENS DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Fri, 22 Aug 2003 11:47:24 +0700
Hello Scott (and all, especially those from Legato.com),

We currently run several NetWorker servers.
We found the following problems on one of them (Sun Fire V880, Solaris 8,
L20 jukebox, DLT8000 drives with DLT IV media, SCSI connection).

* Several groups exited with return code 1. Here is an excerpt from 'daemon.log'.

"..
08/21/03 01:30:00 nsrd: savegroup info: starting Srv_paym1-br_DB_grp_arc (with 
1 client(s))
08/21/03 01:30:00 nsrd: savegroup info: starting Srv_paym2-br_DB_grp_arc (with 
1 client(s))
08/21/03 01:30:04 nsrd: savegroup info: paym1-br:paym1-br_ARC:: No full backups 
of this save set were found in the media database;
 performing a full backup
08/21/03 01:30:33 savegrp: group Srv_paym1-br_DB_grp_arc aborted.
08/21/03 01:30:33 savegrp: killing pid 12075
08/21/03 01:30:33 nsrd: savegroup alert: Srv_paym1-br_DB_grp_arc aborted, 1 
client(s) (paym1-br Failed)
08/21/03 01:30:34 nsrd: runq: NSR group Srv_paym1-br_DB_grp_arc exited with 
return code 1.
08/21/03 01:30:35 savegrp: group Srv_paym2-br_DB_grp_arc aborted.
08/21/03 01:30:35 savegrp: killing pid 12079
08/21/03 01:30:36 nsrd: savegroup alert: Srv_paym2-br_DB_grp_arc aborted, 1 
client(s) (paym2-br Failed)
08/21/03 01:30:36 nsrd: runq: NSR group Srv_paym2-br_DB_grp_arc exited with 
return code 1.
.."

Can you explain what really happened here and how to solve it?
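
For a longer daemon.log, the failing groups can be pulled out with a short script. The following is only a sketch: it assumes every log entry starts with an MM/DD/YY HH:MM:SS timestamp, as in the excerpt above, and the function names `unwrap` and `failed_groups` are made up for this example. The `unwrap` step is only needed for mail-wrapped copies of the log such as the one quoted here.

```python
import re

# An entry starts with a timestamp such as "08/21/03 01:30:34"
# (some entries are written as " 8/21/03  1:16:40").
TIMESTAMP = re.compile(r"^\s*(\d{1,2}/\d{2}/\d{2}\s+\d{1,2}:\d{2}:\d{2})\s")
EXITED = re.compile(r"NSR group (\S+) exited with return code (\d+)")


def unwrap(lines):
    """Join wrapped continuation lines back onto the entry they belong to."""
    entries = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if TIMESTAMP.match(line) or not entries:
            entries.append(line.strip())
        else:
            entries[-1] += " " + line.strip()
    return entries


def failed_groups(lines):
    """Return (timestamp, group, return_code) for every non-zero exit."""
    failures = []
    for entry in unwrap(lines):
        exited = EXITED.search(entry)
        if exited and int(exited.group(2)) != 0:
            stamp = TIMESTAMP.match(entry)
            failures.append((stamp.group(1) if stamp else "",
                             exited.group(1), int(exited.group(2))))
    return failures


# Three lines copied from the excerpt above, including one wrapped entry.
sample = [
    "08/21/03 01:30:33 savegrp: group Srv_paym1-br_DB_grp_arc aborted.",
    "08/21/03 01:30:34 nsrd: runq: NSR group Srv_paym1-br_DB_grp_arc exited with ",
    "return code 1.",
]
print(failed_groups(sample))
# [('08/21/03 01:30:34', 'Srv_paym1-br_DB_grp_arc', 1)]
```

This only lists which groups failed and when; the reason for the abort still has to be read from the surrounding entries.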

* When we inventory or label media while backup groups are running, the
  inventory or labelling operation is stopped.
  By my analysis, a backup group/task has higher priority than inventory or
  labelling.
  Is that true? Is there anything we can do to avoid this, so that a backup
  task waits until the inventory or labelling operation has completed?

  In our case, this seems to have caused something odd to happen to the tape
  drive: while labelling was still incomplete (on the last tape), NetWorker
  tried to load the tape needed by a running group. That caused a 'read open
  error' on the drive and made it request cleaning (as displayed on the L20
  panel), while NetWorker kept trying to load and unload tapes.
  These errors repeated many times (with a few differently labelled tapes
  being loaded and unloaded).

  Unfortunately, to resolve this we had to stop the NetWorker service once
  the write on the other drive had finished. FYI, at that time we had just
  replaced all the media (18 tapes, still new) and both tape drives were new
  as well, since we had some activities that night.
  So I can assure you this was not caused by a media problem. :D
  After that we had to clean the drive 3 times.

  I suspect this problem was caused by the situation above (labelling stopped
  by running groups), which confused the tape drive about how to handle the
  media.
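
As a rough timing check of that suspicion, the eject timestamps for /dev/rmt/1cbn in the log below can be used to estimate when the three-tape labelling job would have finished. This is a sketch under assumptions: it treats each "Eject operation in progress" entry as the boundary between two tapes, it assumes the third tape would take about as long as the first two, and `estimate_label_end` is a name made up for this example.

```python
from datetime import datetime, timedelta

FMT = "%m/%d/%y %H:%M:%S"  # timestamp layout used in the daemon.log excerpt


def estimate_label_end(boundaries, tapes_to_label):
    """Estimate when a labelling job ends from the tape boundaries seen so far.

    boundaries: timestamps marking the start of the job and the end of each
    tape labelled so far (assumption: one "Eject operation" per boundary).
    """
    times = [datetime.strptime(t, FMT) for t in boundaries]
    per_tape = [later - earlier for earlier, later in zip(times, times[1:])]
    average = sum(per_tape, timedelta()) / len(per_tape)
    remaining = tapes_to_label - len(per_tape)
    return average, times[-1] + remaining * average


# Eject timestamps for /dev/rmt/1cbn, copied from the log in this message.
ejects = ["08/21/03 00:35:16", "08/21/03 00:44:39", "08/21/03 00:54:34"]
first_group_start = datetime.strptime("08/21/03 01:00:00", FMT)

average, estimated_end = estimate_label_end(ejects, tapes_to_label=3)
print("average per tape:", average)                      # 0:09:39
print("estimated end:   ", estimated_end.strftime(FMT))  # 08/21/03 01:04:13
print("overrun:         ", estimated_end - first_group_start)  # 0:04:13
```

On these numbers the last tape would still have been in the drive when the 01:00:00 groups started, which fits the first "opening: I/O error" at 01:00:16. It is an estimate only and does not prove the cause.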

"..
--------
08/21/03 00:35:15 nsrd: nsrjb notice: nsrjb -j L20 -O9 -L -g 
-bSrv_pay-backup_FD_pool -S 15-17
-------- Labelling 3 tapes for a specific pool (and label template)

08/21/03 00:35:16 nsrd: /dev/rmt/1cbn  Eject operation in progress

08/21/03 00:42:44 nsrd: media warning: /dev/rmt/1cbn reading: I/O error
08/21/03 00:43:10 nsrmmd #2: /dev/rmt/1cbn: err code 70, sns key 08, asc 00, 
ascq 05
08/21/03 00:43:10 nsrmmd #2: information bytes             : 00 00 02 00
08/21/03 00:43:10 nsrd: media warning: /dev/rmt/1cbn reading: Tape label read: 
sense error: Blank Tape - End-of-data Detected
08/21/03 00:43:12 nsrd: /dev/rmt/1cbn  Label without mount operation in progress
08/21/03 00:43:12 nsrd: media info: dlt8000 tape  will be over-written
08/21/03 00:44:39 nsrd: /dev/rmt/1cbn  Eject operation in progress

08/21/03 00:51:32 nsrd: media warning: /dev/rmt/1cbn reading: I/O error
08/21/03 00:51:59 nsrmmd #2: /dev/rmt/1cbn: err code 70, sns key 08, asc 00, 
ascq 05
08/21/03 00:51:59 nsrmmd #2: information bytes             : 00 00 02 00
08/21/03 00:51:59 nsrd: media warning: /dev/rmt/1cbn reading: Tape label read: 
sense error: Blank Tape - End-of-data Detected
08/21/03 00:52:00 nsrd: /dev/rmt/1cbn  Label without mount operation in progress
08/21/03 00:54:34 nsrd: /dev/rmt/1cbn  Eject operation in progress
-------- 2 tapes already labelled (the last tape was not completed)

08/21/03 00:56:06 nsrd: /dev/rmt/1cbn  Verify label operation in progress

08/21/03 01:00:00 nsrd: savegroup info: starting Srv_pay11-backup_DB_grp_arc 
(with 1 client(s))
08/21/03 01:00:00 nsrd: savegroup info: starting Srv_pay12-backup_DB_grp_arc 
(with 1 client(s))
08/21/03 01:00:04 nsrd: savegroup info: pay12-backup:pay12-backup_ARC:: No full 
backups of this save set were found in the media database; performing a full 
backup

--------
08/21/03 01:00:16 nsrd: media warning: /dev/rmt/1cbn opening: I/O error
08/21/03 01:00:17 nsrd: media warning: /dev/rmt/1cbn reading: read open error, 
I/O error
08/21/03 01:00:17 nsrd: /dev/rmt/1cbn  Label without mount operation in progress
08/21/03 01:04:27 nsrd: media warning: /dev/rmt/1cbn opening: I/O error
-------- Unwanted 'I/O error', which then repeats (see below).. :D

08/21/03 01:04:28 nsrd: media info: suggest mounting ArchiveLog.013 (BJ7205) on 
advcom for writing  to pool 'ArchiveLog'
08/21/03 01:04:30 nsrd: /dev/rmt/1cbn  Eject operation in progress
08/21/03 01:05:00 nsrd: savegroup info: starting Com_advcom-br_DB_grp_arc (with 
1 client(s))
08/21/03 01:05:04 nsrd: savegroup info: advcom-br:advcom-br_ARC:: No full 
backups of this save set were found in the media database; performing a full 
backup
08/21/03 01:05:10 nsrd: media waiting event: Waiting for 1 writable volumes to 
backup pool 'ArchiveLog' tape(s) on advcom
08/21/03 01:07:52 nsrd: savegroup info: starting Srv_pay-backup_DB_grp (with 1 
client(s))
08/21/03 01:07:59 nsrd: savegroup info: starting Srv_paym-br_DB_grp (with 1 
client(s))
08/21/03 01:08:13 nsrd: media waiting event: Waiting for 1 writable volumes to 
backup pool 'Srv_pay-backup_FD_pool' tape(s) on advcom
08/21/03 01:08:18 nsrd: media waiting event: Waiting for 1 writable volumes to 
backup pool 'Srv_paym-br_FD_pool' tape(s) on advcom
08/21/03 01:08:40 nsrd: media warning: /dev/rmt/1cbn opening: I/O error
08/21/03 01:08:43 nsrd: media info: suggest mounting Srv_pay_backup.006 
(BJ7219) on advcom for writing  to pool 'Srv_pay-backup_FD_pool'
08/21/03 01:08:44 nsrd: media info: loading volume Srv_pay_backup.006 into 
/dev/rmt/0cbn
08/21/03 01:08:46 nsrd: Jukebox 'L20' failed: All of the devices are in use by 
nsrmmd
08/21/03 01:08:48 nsrd: media info: suggest mounting Srv_paym_br.009 (BJ7216) 
on advcom for writing  to pool 'Srv_paym-br_FD_pool'
08/21/03 01:09:14 nsrd: Jukebox 'L20' failed: All of the devices are in use by 
nsrmmd
08/21/03 01:09:19 nsrd: /dev/rmt/0cbn  Verify label operation in progress
08/21/03 01:10:00 nsrd: savegroup info: starting Srv_paym2-br_DB_grp_arc (with 
1 client(s))
08/21/03 01:10:00 nsrd: savegroup info: starting Srv_paym1-br_DB_grp_arc (with 
1 client(s))
08/21/03 01:10:04 nsrd: savegroup info: paym1-br:paym1-br_ARC:: No full backups 
of this save set were found in the media database; performing a full backup
08/21/03 01:10:44 nsrd: /dev/rmt/0cbn  Mount operation in progress
08/21/03 01:10:49 nsrd: media event cleared: Waiting for 1 writable volumes to 
backup pool 'ArchiveLog' tape(s) on advcom
08/21/03 01:11:00 nsrd: media notice: Need a volume from pool `ArchiveLog', not 
pool `Srv_pay-backup_FD_pool'
08/21/03 01:11:01 nsrd: media waiting event: Waiting for 1 writable volumes to 
backup pool 'ArchiveLog' tape(s) on advcom
08/21/03 01:11:09 nsrd: media event cleared: Waiting for 1 writable volumes to 
backup pool 'Srv_pay-backup_FD_pool' tape(s) on advcom
08/21/03 01:11:20 nsrd: pay-backup:db_T001_level_full.orev9tbo_1_1 saving to 
pool 'Srv_pay-backup_FD_pool' (Srv_pay_backup.006)
08/21/03 01:11:37 nsrd: Jukebox 'L20' failed: All of the devices are in use by 
nsrmmd
08/21/03 01:11:39 nsrd: Jukebox 'L20' failed: All of the devices are in use by 
nsrmmd
08/21/03 01:11:47 nsrd: media info: loading volume ArchiveLog.013 into 
/dev/rmt/1cbn
08/21/03 01:12:28 nsrd: /dev/rmt/1cbn  Verify label operation in progress
08/21/03 01:16:20 nsrd: pay-backup:db_T001_level_full.orev9tbo_1_1 done saving 
to pool 'Srv_pay-backup_FD_pool' (Srv_pay_backup.006) 427 MB
08/21/03 01:16:39 nsrd: media warning: /dev/rmt/1cbn opening: I/O error
08/21/03 01:16:39 nsrd: media warning: /dev/rmt/1cbn reading: read open error, 
I/O error
 8/21/03  1:16:40 nsrjb #8743: Diagnostic: Ignoring error reading label for 
operation Verify label on ArchiveLog.013: 0(read open error, I/O error)
08/21/03 01:16:40 nsrd: /dev/rmt/1cbn  Eject operation in progress
08/21/03 01:16:51 nsrd: write completion notice: Writing to volume 
Srv_pay_backup.006 complete
08/21/03 01:19:04 nsrd: Jukebox 'L20' failed: expected volume 'ArchiveLog.013' 
got 'NULL'.
08/21/03 01:19:04 nsrd: media info: loading volume Srv_paym_br.009 into 
/dev/rmt/1cbn
08/21/03 01:19:43 nsrd: /dev/rmt/1cbn  Verify label operation in progress
08/21/03 01:19:44 nsrd: /dev/rmt/0cbn  Eject operation in progress
08/21/03 01:20:00 nsrd: savegroup alert: Group Srv_pay11-backup_DB_grp_arc 
aborted, savegrp is already running
08/21/03 01:20:00 nsrd: savegroup alert: Group Srv_pay12-backup_DB_grp_arc 
aborted, savegrp is already running
08/21/03 01:20:55 nsrd: media info: loading volume Cleaning Tape (7 uses left) 
CLN445 into /dev/rmt/0cbn
08/21/03 01:21:22 nsrd: device cleaned notice: device `/dev/rmt/0cbn' (0) in 
jukebox `L20' cleaned at `Thu Aug 21 01:21:22 2003'
08/21/03 01:23:20 nsrd: media critical event: Waiting for 1 writable volumes to 
backup pool 'Srv_paym-br_FD_pool' tape(s) on advcom
08/21/03 01:23:51 savegrp: group Srv_pay-backup_DB_grp aborted.
08/21/03 01:23:51 savegrp: killing pid 8433
08/21/03 01:23:51 nsrd: savegroup alert: Srv_pay-backup_DB_grp aborted, 1 
client(s) (pay-backup Failed)
08/21/03 01:23:52 nsrd: runq: NSR group Srv_pay-backup_DB_grp exited with 
return code 1.
08/21/03 01:23:54 nsrd: media warning: /dev/rmt/1cbn opening: I/O error
08/21/03 01:23:54 nsrd: media warning: /dev/rmt/1cbn reading: read open error, 
I/O error
 8/21/03  1:23:55 nsrjb #10535: Diagnostic: Ignoring error reading label for 
operation Verify label on Srv_paym_br.009: 0(read open error, I/O error)
08/21/03 01:23:55 nsrd: /dev/rmt/1cbn  Eject operation in progress
08/21/03 01:24:08 nsrd: media info: loading volume ArchiveLog.013 into 
/dev/rmt/0cbn
08/21/03 01:24:09 savegrp: group Srv_pay12-backup_DB_grp_arc aborted.
08/21/03 01:24:09 savegrp: killing pid 7608
08/21/03 01:24:09 nsrd: savegroup alert: Srv_pay12-backup_DB_grp_arc aborted, 1 
client(s) (pay12-backup Failed)
08/21/03 01:24:10 nsrd: runq: NSR group Srv_pay12-backup_DB_grp_arc exited with 
return code 1.
08/21/03 01:24:12 savegrp: group Srv_pay11-backup_DB_grp_arc aborted.
08/21/03 01:24:12 savegrp: killing pid 7607
08/21/03 01:24:12 nsrd: savegroup alert: Srv_pay11-backup_DB_grp_arc aborted, 1 
client(s) (pay11-backup Failed)
08/21/03 01:24:13 nsrd: runq: NSR group Srv_pay11-backup_DB_grp_arc exited with 
return code 1.
08/21/03 01:24:15 savegrp: group Srv_paym2-br_DB_grp_arc aborted.
08/21/03 01:24:15 savegrp: killing pid 8657
08/21/03 01:24:15 nsrd: savegroup alert: Srv_paym2-br_DB_grp_arc aborted, 1 
client(s) (paym2-br Failed)
08/21/03 01:24:16 nsrd: runq: NSR group Srv_paym2-br_DB_grp_arc exited with 
return code 1.
08/21/03 01:24:18 savegrp: group Srv_paym1-br_DB_grp_arc aborted.
08/21/03 01:24:18 savegrp: killing pid 8658
08/21/03 01:24:18 nsrd: savegroup alert: Srv_paym1-br_DB_grp_arc aborted, 1 
client(s) (paym1-br Failed)
08/21/03 01:24:19 nsrd: runq: NSR group Srv_paym1-br_DB_grp_arc exited with 
return code 1.
08/21/03 01:24:32 savegrp: group Srv_paym-br_DB_grp aborted.
08/21/03 01:24:32 savegrp: killing pid 8505
08/21/03 01:24:32 nsrd: savegroup alert: Srv_paym-br_DB_grp aborted, 1 
client(s) (paym-br Failed)
08/21/03 01:24:33 nsrd: runq: NSR group Srv_paym-br_DB_grp exited with return 
code 1.
08/21/03 01:24:51 nsrd: /dev/rmt/0cbn  Verify label operation in progress
08/21/03 01:25:00 nsrd: savegroup alert: Group Com_advcom-br_DB_grp_arc 
aborted, savegrp is already running
08/21/03 01:25:01 nsrd: Jukebox 'L20' failed: device '/dev/rmt/1cbn' is busy
08/21/03 01:26:01 nsrd: media critical event: Waiting for 1 writable volumes to 
backup pool 'ArchiveLog' tape(s) on advcom
08/21/03 01:26:18 nsrd: Jukebox 'L20' failed: expected volume 'Srv_paym_br.009' 
got 'NULL'.
08/21/03 01:27:45 nsrd: /dev/rmt/0cbn  Mount operation in progress
08/21/03 01:27:51 nsrd: media event cleared: Waiting for 1 writable volumes to 
backup pool 'ArchiveLog' tape(s) on advcom
08/21/03 01:27:52 nsrd: media info: loading volume Srv_paym_br.009 into 
/dev/rmt/1cbn
08/21/03 01:28:01 nsrd: media waiting event: Waiting for 1 writable volumes to 
backup pool 'ArchiveLog' tape(s) on advcom
08/21/03 01:28:23 nsrd: /dev/rmt/1cbn  Verify label operation in progress
08/21/03 01:28:23 nsrd: Jukebox 'L20' failed: device '/dev/rmt/1cbn' is busy
08/21/03 01:28:36 nsrd: media event cleared: Waiting for 1 writable volumes to 
backup pool 'ArchiveLog' tape(s) on advcom
08/21/03 01:28:36 nsrd: pay-backup:archivelog_T001.opev9stj_1_1 saving to pool 
'ArchiveLog' (ArchiveLog.013)
08/21/03 01:28:36 nsrd: pay-backup:archivelog_T001.oqev9stp_1_1 saving to pool 
'ArchiveLog' (ArchiveLog.013)
08/21/03 01:28:36 nsrd: advcom:archivelog_OAMDB.3sev9t6t_1_1 saving to pool 
'ArchiveLog' (ArchiveLog.013)
08/21/03 01:28:36 nsrd: paym-br:archivelog_T001.e0ev9tl0_1_1 saving to pool 
'ArchiveLog' (ArchiveLog.013)
08/21/03 01:28:37 nsrd: paym-br:archivelog_T001.e1ev9tl0_1_1 saving to pool 
'ArchiveLog' (ArchiveLog.013)
08/21/03 01:28:39 nsrd: pay-backup:archivelog_T001.opev9stj_1_1 done saving to 
pool 'ArchiveLog' (ArchiveLog.013)
08/21/03 01:28:39 nsrd: pay-backup:archivelog_T001.oqev9stp_1_1 done saving to 
pool 'ArchiveLog' (ArchiveLog.013)
08/21/03 01:28:39 nsrd: paym-br:archivelog_T001.e0ev9tl0_1_1 done saving to 
pool 'ArchiveLog' (ArchiveLog.013)
08/21/03 01:28:39 nsrd: paym-br:archivelog_T001.e1ev9tl0_1_1 done saving to 
pool 'ArchiveLog' (ArchiveLog.013)
08/21/03 01:28:44 nsrd: advcom:archivelog_OAMDB.3sev9t6t_1_1 done saving to 
pool 'ArchiveLog' (ArchiveLog.013) 44 MB
08/21/03 01:28:53 nsrd: advcom:archivelog_OAMDB.3tev9uik_1_1 saving to pool 
'ArchiveLog' (ArchiveLog.013)
08/21/03 01:29:00 nsrd: advcom:archivelog_OAMDB.3tev9uik_1_1 done saving to 
pool 'ArchiveLog' (ArchiveLog.013) 44 MB
08/21/03 01:29:19 nsrd: advcom:advcom-br_ARC: saving to pool 'ArchiveLog' 
(ArchiveLog.013)
08/21/03 01:29:21 nsrd: advcom:advcom-br_ARC: done saving to pool 'ArchiveLog' 
(ArchiveLog.013) 6 KB
08/21/03 01:29:22 nsrd: savegroup info: starting indexDB (with 1 client(s))
08/21/03 01:29:22 nsrd: savegroup notice: Com_advcom-br_DB_grp_arc completed, 1 
client(s) (All Succeeded)
08/21/03 01:29:26 nsrd: advcom:/dev/null saving to pool 'ArchiveLog' 
(ArchiveLog.013)
08/21/03 01:29:29 nsrd: advcom:/dev/null done saving to pool 'ArchiveLog' 
(ArchiveLog.013) 6 KB
08/21/03 01:29:30 nsrd: advcom:index:advcom saving to pool 'ArchiveLog' 
(ArchiveLog.013)
08/21/03 01:29:41 nsrd: advcom:index:advcom done saving to pool 'ArchiveLog' 
(ArchiveLog.013) 78 MB
08/21/03 01:29:42 nsrd: advcom:bootstrap saving to pool 'ArchiveLog' 
(ArchiveLog.013)
08/21/03 01:29:42 nsrmmdbd: media db is saving its data.  This may take a while.
08/21/03 01:29:43 nsrmmdbd: media db is open for business.
08/21/03 01:29:46 nsrd: advcom:bootstrap done saving to pool 'ArchiveLog' 
(ArchiveLog.013) 3871 KB
08/21/03 01:29:46 nsrd: savegroup notice: indexDB completed, 1 client(s) (All 
Succeeded)
08/21/03 01:30:00 nsrd: savegroup info: starting Srv_paym1-br_DB_grp_arc (with 
1 client(s))
08/21/03 01:30:00 nsrd: savegroup info: starting Srv_paym2-br_DB_grp_arc (with 
1 client(s))
08/21/03 01:30:04 nsrd: savegroup info: paym1-br:paym1-br_ARC:: No full backups 
of this save set were found in the media database; performing a full backup
08/21/03 01:30:17 nsrd: write completion notice: Writing to volume 
ArchiveLog.013 complete
08/21/03 01:30:33 savegrp: group Srv_paym1-br_DB_grp_arc aborted.
08/21/03 01:30:33 savegrp: killing pid 12075
08/21/03 01:30:33 nsrd: savegroup alert: Srv_paym1-br_DB_grp_arc aborted, 1 
client(s) (paym1-br Failed)
08/21/03 01:30:34 nsrd: runq: NSR group Srv_paym1-br_DB_grp_arc exited with 
return code 1.
08/21/03 01:30:35 savegrp: group Srv_paym2-br_DB_grp_arc aborted.
08/21/03 01:30:35 savegrp: killing pid 12079
08/21/03 01:30:36 nsrd: savegroup alert: Srv_paym2-br_DB_grp_arc aborted, 1 
client(s) (paym2-br Failed)
08/21/03 01:30:36 nsrd: runq: NSR group Srv_paym2-br_DB_grp_arc exited with 
return code 1.
08/21/03 01:31:12 nsrd: /dev/rmt/0cbn  Eject operation in progress
08/21/03 01:32:33 nsrd: media warning: /dev/rmt/1cbn opening: I/O error
08/21/03 01:32:33 nsrd: media warning: /dev/rmt/1cbn reading: read open error, 
I/O error
 8/21/03  1:32:33 nsrjb #11499: Diagnostic: Ignoring error reading label for 
operation Verify label on Srv_paym_br.009: 0(read open error, I/O error)
08/21/03 01:32:33 nsrd: /dev/rmt/1cbn  Eject operation in progress
08/21/03 01:35:05 nsrd: Jukebox 'L20' failed: expected volume 'Srv_paym_br.009' 
got 'NULL'.
08/21/03 01:35:12 nsrd: media info: loading volume Srv_paym_br.009 into 
/dev/rmt/1cbn
08/21/03 01:35:44 nsrd: /dev/rmt/1cbn  Verify label operation in progress
08/21/03 01:36:04 nsrd: shutting down (LANG=en_US, LC_MESSAGES=en_US)
08/21/03 01:36:04 nsrd: media event cleared: Waiting for 1 writable volumes to 
backup pool 'Srv_paym-br_FD_pool' tape(s) on advcom
08/21/03 01:37:30 nsrd: Jukebox 'L20' failed: Load operation failed on 
signal(15)
08/21/03 01:37:30 nsrd: media alert: Load operation failed on signal(15)
nsrexecd: Unable to roll out old /nsr/logs/daemon.log, continuing...
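
To see which drive the warnings cluster on, the "media warning" entries in a log like the one above can be tallied per device. This is a minimal sketch: the pattern is inferred from the entries quoted in this message, not from any NetWorker documentation, and `warnings_per_device` is a name made up for this example. It matches on the first physical line of an entry, so it also works on a mail-wrapped copy.

```python
import re
from collections import Counter

# Matches e.g. "media warning: /dev/rmt/1cbn opening: I/O error"
WARNING = re.compile(r"media warning: (\S+) (opening|reading): ")


def warnings_per_device(lines):
    """Count media warnings per (device, operation) pair."""
    counts = Counter()
    for line in lines:
        hit = WARNING.search(line)
        if hit:
            counts[(hit.group(1), hit.group(2))] += 1
    return counts


# Four lines copied from the log above.
sample = [
    "08/21/03 01:00:16 nsrd: media warning: /dev/rmt/1cbn opening: I/O error",
    "08/21/03 01:00:17 nsrd: media warning: /dev/rmt/1cbn reading: read open error, ",
    "08/21/03 01:04:27 nsrd: media warning: /dev/rmt/1cbn opening: I/O error",
    "08/21/03 01:08:44 nsrd: media info: loading volume Srv_pay_backup.006 into ",
]
for (device, operation), count in sorted(warnings_per_device(sample).items()):
    print(device, operation, count)
# /dev/rmt/1cbn opening 2
# /dev/rmt/1cbn reading 1
```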

Thank you for your support and cooperation.

Wassalam,
_________________________________________
Irwan Kurniawan
PT Siemens Indonesia
Jl. MT Haryono Kav. 58-60
Jakarta 12780 - Indonesia
Phone : +62 21 2750 8 183
Mobile : +62 816 199 6920

"Better team-works can lead us to better results"
_________________________________________

--
Note: To sign off this list, send a "signoff networker" command via email
to listserv AT listmail.temple DOT edu or visit the list's Web site at
http://listmail.temple.edu/archives/networker.html where you can
also view and post messages to the list.
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
