[Networker] unstable networker 7.1.1 (running on Sun Enterprise 250 Solaris 8)

2004-04-01 07:24:53
Subject: [Networker] unstable networker 7.1.1 (running on Sun Enterprise 250 Solaris 8)
From: David De Maeyer <ddm AT RUC DOT DK>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Thu, 1 Apr 2004 14:25:22 +0200
Hi all,

For the past few days our backup server has been quite "unstable". We
have 4 SDLT drives connected to one SCSI bus on a Sun Enterprise 250
running Solaris 8, with NetWorker 7.1.1 installed. We have tried to keep
the SCSI chain as short as possible (~1.5 m). For the past few days
tapes have been declared full when in reality only a few GB have been
written to each tape. The problem occurs randomly: once on device 0cbn,
once on device 6cbn, etc.

We have tried to shorten the SCSI chain by removing one drive, but the
problem still occurs. We then tried some hardware permutations to
isolate the problem (without success so far). Tapes are declared full
and can't be ejected.

Found in the logs, concerning device 6cbn, which has since been
physically removed:
NetWorker media: (waiting) Waiting for 2 writable volumes to backup
pool 'Default' tape(s) on bart
NetWorker media: (notice) sdlt tape Datalogi 847 on /dev/rmt/6cbn is
full
NetWorker media: (notice) sdlt tape Datalogi 847 used 5914 MB of 100 GB
capacity
NetWorker media: (warning) /dev/rmt/6cbn moving: fsr 2194: drive status
is The end of data was reached
NetWorker media: (emergency) could not position Datalogi 847 to file 7,
record 2195
NetWorker media: (notice) Volume "Datalogi 847" on device
"/dev/rmt/6cbn": Cannot decode block. Verify the device configuration.
Tape positioning by record is disabled.
NetWorker media: (warning) /dev/rmt/6cbn moving: fsr 2194 (read): drive
status is Drive reports no error - but state is unknown
NetWorker media: (warning) /dev/rmt/6cbn reading: Error 0
NetWorker media: (warning) verification of volume "Datalogi 847", volid
4267980813 failed, can not read record 2195 of file 7 on sdlt tape
Datalogi 847
NetWorker media: (notice) verification of volume "Datalogi 847", volid
4267980813 failed, volume is being marked as full.

Later on, for other devices, once 6cbn had been removed:
NetWorker media: (warning) /dev/rmt/1cbn moving: fsf 96: drive status
is Not Ready, Logical Unit Has Not Self-Configured Yet
NetWorker media: (warning) /dev/rmt/1cbn reading: I/O error
NetWorker media: (warning) /dev/rmt/5cbn moving: fsf 8: drive status is
The end of data was reached
NetWorker media: (warning) /dev/rmt/1cbn opening: DRIVE_STATUS_LOADING
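
To check whether the errors cluster on one drive or are spread over the
whole bus, the messages can be tallied per device. Below is a minimal
Python sketch, assuming the log lines have been saved as plain text in
the same format as the excerpts above. The names SAMPLE, unwrap and
tally are made up for this illustration and are not part of NetWorker.

```python
import re
from collections import Counter

# Captures the severity in parentheses and the first tape device path.
ENTRY_RE = re.compile(r"NetWorker media: \((\w+)\) .*?(/dev/rmt/\w+)")

# Lines copied from the second log excerpt, including the mail wrapping.
SAMPLE = """\
NetWorker media: (warning) /dev/rmt/1cbn moving: fsf 96: drive status
is Not Ready, Logical Unit Has Not Self-Configured Yet
NetWorker media: (warning) /dev/rmt/1cbn reading: I/O error
NetWorker media: (warning) /dev/rmt/5cbn moving: fsf 8: drive status is
The end of data was reached
NetWorker media: (warning) /dev/rmt/1cbn opening: DRIVE_STATUS_LOADING
"""


def unwrap(lines):
    """Join wrapped continuation lines back onto their log entry."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("NetWorker media:") or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def tally(entries):
    """Count messages per (device, severity); skip entries with no device."""
    counts = Counter()
    for entry in entries:
        match = ENTRY_RE.search(entry)
        if match:
            severity, device = match.groups()
            counts[(device, severity)] += 1
    return counts


if __name__ == "__main__":
    for (device, severity), n in sorted(tally(unwrap(SAMPLE.splitlines())).items()):
        print(device, severity, n)
```

Messages that do not name a device (for example the "could not
position" line) are skipped here. If nearly all drives show up with
similar counts, that points more towards something shared (bus, cable,
terminator, HBA) than towards a single drive.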


This behavior seems to be consistent with NetWorker's policy, which is
to declare a tape full as soon as an error is found on it... but it
happens too often, no matter which drive is involved and no matter how
old or used the tapes are (e.g. recently recycled tapes).

As a first step we would like to test the SCSI chain in order to make
sure we are not dealing with a hardware problem. Are there any utilities
out there for such a test? We could of course try a probe-scsi-all
(Solaris box).

Has anybody experienced the same unstable behavior?

Any ideas about what is going on here?

Regards,
David




___________________________________________________
David De Maeyer
Roskilde University Center, Department of Computer Science
Box 260, Hus 42.1, 4000 Roskilde, Denmark
voice (+45) 46 74 38 29 / fax (+45) 46 74 30 72
