Re: [Networker] Problem reading tape labels?

2009-03-10 17:08:55
From: George Sinclair <George.Sinclair AT NOAA DOT GOV>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Tue, 10 Mar 2009 17:01:03 -0400
Teresa Biehler wrote:
We also see this problem.  It seems to indicate one of several problems:

- Data loss.  When we had Windows storage nodes that would
intermittently rewind the tape while the NW server was writing, we'd
have data loss.  Disabling the removable media service helped to lessen
this, but in the end this is one of the reasons we no longer share tape
drives across Windows storage nodes.


- The tape is bad and needs to be retired.  Sometimes it's one drive
that cannot read the tape, other times no drives can read the tape.
Either way, when this happens, the tape gets retired.

I've seen this a few times on newer tapes, too, that were just recently written to.


- The drive is having problems and needs to be cleaned or replaced.
When we get a lot of these errors on a single drive, we replace the
drive and they go away.

Quantum seemed reluctant to replace the drives until after we upgraded the firmware. I got the impression that if I upgraded the firmware and still had problems, they'd be more willing to do that, and I can see their point, so maybe that's worth a shot. But the drives are getting old, too, so your point is quite valid.


- The tape needs to be retensioned.  We're using AIT3 tapes and they
seem to be pretty sensitive.  Running "mt -f <device> retension" on the
tape often resolves the problem.

I've never used that option with 'mt', but that's worth a shot. Would be interesting to see if it improves things after, say, an 'nsrjb' fails to read the label using '-I' or '-p'.


Generally, when we get this error, we start the troubleshooting by
running scanner -v on the tape.  If scanner can still read the label and
the first few files/records, then the tape is probably ok.
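
The troubleshooting order described above (a read check with scanner first, then a retension pass) can be sketched as a small dry-run helper. This is only an illustration: the device path /dev/nst0 is a hypothetical placeholder, the function name is made up for this example, and the script prints the commands without running them.

```python
import shlex


def build_checks(device):
    """Return the two commands mentioned in this thread, in the order suggested.

    'device' is a placeholder for the tape device path on your storage node;
    nothing here is executed, the commands are only assembled for review.
    """
    return [
        # Read check first: can scanner still see the label and first records?
        ["scanner", "-v", device],
        # Retension pass, reported in this thread to help with sensitive media.
        ["mt", "-f", device, "retension"],
    ]


if __name__ == "__main__":
    # /dev/nst0 is an assumed example path, not taken from the thread.
    for cmd in build_checks("/dev/nst0"):
        print(shlex.join(cmd))
```

Printing the commands instead of running them keeps the sketch safe to try on a host that has live backups in progress; run the printed lines by hand once the device path has been confirmed.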

As I recall, I've never had problems with 'scanner' seeing the labels.

Will investigate.

Good luck.
Teresa


-----Original Message-----
From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU] On
Behalf Of George Sinclair
Sent: Tuesday, March 10, 2009 1:14 PM
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: [Networker] Problem reading tape labels?

Hi,

We have a sporadic problem on our SDLT-600 drives (Quantum M1800 tape library) wherein NW will complain that it cannot read the label on the tape (SDLT-2 media) when it goes to unload the volume, and sometimes when it loads it. I have 'Verify label on unload' set to 'yes'.

It will then issue something like this if the error occurs following a backup when it goes to unmount:

03/10/09 10:17:09 nsrd: media info: About to start checking label
03/10/09 10:17:09 nsrd: media info: Not a valid networker label
03/10/09 10:17:09 nsrd: media info: expected volume 'vol1' got '-'.
03/10/09 10:17:10 nsrd: media info: Label check failed, marking savesets suspect and volume vol1 full.

or maybe:
03/09/09 23:32:50 nsrd: media info: expected volume 'vol1' got 'NULL'.
03/09/09 23:32:50 nsrd: media info: Label check failed, marking savesets suspect and volume vol1 full.
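
Since the failures are sporadic, it may help to tally them per volume from the daemon log. The following is a minimal sketch that assumes the log lines look exactly like the "expected volume" messages quoted above; other NetWorker versions may word them differently, and the function names are invented for this example.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern modeled on the "expected volume" lines quoted in this message.
# Assumption: timestamps are MM/DD/YY HH:MM:SS, as in the sample.
LABEL_MISMATCH = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) nsrd: media info: "
    r"expected volume '(?P<expected>[^']*)' got '(?P<got>[^']*)'"
)


def find_label_mismatches(lines):
    """Return (timestamp, expected_volume, label_read) for each mismatch line."""
    events = []
    for line in lines:
        m = LABEL_MISMATCH.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%m/%d/%y %H:%M:%S")
            events.append((ts, m.group("expected"), m.group("got")))
    return events


def count_by_volume(events):
    """Count label mismatches per expected volume name."""
    return Counter(expected for _, expected, _ in events)


# Sample input copied from the log excerpts in this message.
sample = [
    "03/10/09 10:17:09 nsrd: media info: About to start checking label",
    "03/10/09 10:17:09 nsrd: media info: Not a valid networker label",
    "03/10/09 10:17:09 nsrd: media info: expected volume 'vol1' got '-'.",
    "03/09/09 23:32:50 nsrd: media info: expected volume 'vol1' got 'NULL'.",
]

events = find_label_mismatches(sample)
for ts, expected, got in sorted(events):
    print(f"{ts:%Y-%m-%d %H:%M:%S}  expected={expected}  got={got}")
print(dict(count_by_volume(events)))
```

The log above does not name the drive, so correlating failures with a particular drive would need the corresponding load or mount messages from the same log.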

The odd thing, though, is that I never see this problem when I label tapes, and I can generally write to the tape fine for a while before I see this. The error seems random, however, and it might take a while to show up, so there could be a number of loads/unloads before it occurs. It might only occur on certain tapes, too, and never on others.

This has happened on a number of tapes, though. We'll be fine for a while, and then suddenly one day it hits us. NW will then mark all the save sets 'suspect' and the tape 'full'. Nasty! Anyway, I might test inventorying a group of tapes, including the culprit tape, and it will succeed for the first several and then maybe fail with the 'NULL label found' error, or it might fail on a tape that I had not previously seen the problem on. On the other hand, it might succeed in inventorying all the volumes, but then fail on a subsequent re-test. It seems to affect all the drives, but a 'scanner' command will read the label correctly. Hmm ...

There's no indication on the GUI panel that the drives need to be cleaned, and I don't want to over-clean them. I called Quantum to discuss the problem with them some time ago, and they recommended upgrading the firmware.

My question, and it might seem silly, is: would upgrading the firmware really resolve an issue like this? Would a newer version of the firmware somehow try harder to read a label? Has anyone seen firmware issues that caused this mischief?

Thanks for any input.

George




--
George Sinclair
Voice: (301) 713-3284 x210
- The preceding message is personal and does not reflect any official or unofficial position of the United States Department of Commerce -
- Any opinions expressed in this message are NOT those of the US Govt. -
