Veritas-bu

[Veritas-bu] Tape Alert automated cleaning not working

2001-02-13 14:39:46
Subject: [Veritas-bu] Tape Alert automated cleaning not working
From: Dale, Daniel P DaleDP AT NORTHAMERICA.Stortek DOT com
Date: Tue, 13 Feb 2001 12:39:46 -0700
Andrew,
I have looked into this issue in StorageTek's incident database. The
problem was reported to us by Sun and by some of our own customers. One
of our labs recreated the failure with a SCSI analyzer attached to the
L700. In the lab, when a cleaning cartridge was mounted on a drive and a
data cartridge mount was requested to the same drive, the L700 returned
the correct 2h 30h 03h response. I believe the 5h 3Ah 00h status you see
in the NetBackup log when the drive is downed comes from NetBackup's
attempt to eject the cleaning cartridge after the L700 presents the
Not Ready/Cleaning Cartridge Installed status. Look at the text of the
message: it is a move medium error for the cleaning cartridge, from the
drive to the outport.
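
For readers comparing the two responses, the sense data can be read as a (sense key, ASC, ASCQ) triple. The following is a minimal Python sketch and not part of any StorageTek or NetBackup tooling: the names describe_sense and is_cleaning_in_progress are made up for illustration, and the lookup tables cover only the two triples mentioned in this thread.

```python
# Hypothetical helper for reading the two sense triples discussed in this thread.
# This is NOT a complete SCSI decoder; unknown values are reported as such.
SENSE_KEYS = {
    0x2: "NOT READY",
    0x5: "ILLEGAL REQUEST",
}

ADDITIONAL_SENSE = {
    (0x30, 0x03): "CLEANING CARTRIDGE INSTALLED",
    (0x3A, 0x00): "MEDIUM NOT PRESENT",
}


def describe_sense(key, asc, ascq):
    """Return a readable description of a (sense key, ASC, ASCQ) triple."""
    key_text = SENSE_KEYS.get(key, "UNKNOWN SENSE KEY")
    asc_text = ADDITIONAL_SENSE.get((asc, ascq), "UNKNOWN ASC/ASCQ")
    return f"{key_text} / {asc_text} ({key:X}h {asc:02X}h {ascq:02X}h)"


def is_cleaning_in_progress(key, asc, ascq):
    """True only for the response the library is expected to give while cleaning."""
    return (key, asc, ascq) == (0x2, 0x30, 0x03)


if __name__ == "__main__":
    print(describe_sense(0x2, 0x30, 0x03))  # expected response during cleaning
    print(describe_sense(0x5, 0x3A, 0x00))  # response seen in the NetBackup log
```

Only the first triple indicates that a cleaning cartridge is in the drive; the second is the status logged when the drive is downed.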

Veritas's position on this is that they do not support robot-controlled
drive cleaning. This is stated very clearly in a Tech Tip available on
their support web page (keyword 'TapeAlert'). This was also their response
to StorageTek when we contacted them about this issue. I cannot say
whether Veritas is working on supporting this in the future.

If TapeAlert is not working as you think it should, I suggest that you
open an incident with Veritas and with the vendor who supports your tape
drive. Both the hardware and the software have to work together for
TapeAlert to function, and from what I've seen it is unclear why some
drives get cleaned and not others. I know this places a burden on you,
but it is the best way to get this problem addressed.

If you use frequency-based cleaning, StorageTek recommends setting it to
100 hours for DLT7000s, with the caveat that this may not be right for
everyone.

If you are using robot-controlled cleaning on the L700, I recommend having
your maintenance provider install 2.2 level code. It includes a fix which
addresses drive cleaning reliability. I do not know if Sun has made this
available for their branded L700s.

I have heard that, as a workaround, some customers (probably in an SSO
environment) have left robot-controlled cleaning enabled and created a
script which parses the NetBackup log looking for the error text you
described. The script then ups the downed drive. I'm not necessarily
endorsing this, but I would like to know if anyone on this list has tried
it and whether it worked or not.
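
To make the idea concrete, here is a rough Python sketch of the log-parsing half of such a script. It is only an illustration under stated assumptions: each syslog entry is on one unwrapped line, cleaning tapes have barcodes starting with "CLN" (as in the example in this thread), and the function name drives_downed_during_cleaning is made up. The command that actually brings the drive back up is deliberately left out; the sketch only reports which drives it would consider.

```python
import re

# Patterns modelled on the tldcd/tldd messages quoted in this thread.
MOVE_FAIL = re.compile(
    r"TLD\((\d+)\) could not move barcoded tape (\S+) from drive (\d+) to"
)
DOWNED = re.compile(r"TLD\((\d+)\) drive (\d+) \(device \d+\) is being DOWNED")


def drives_downed_during_cleaning(lines, cleaning_prefix="CLN"):
    """Return (robot, drive) pairs that were downed after a failed attempt
    to move a cleaning tape out of that same drive."""
    failed_moves = set()
    downed = []
    for line in lines:
        match = MOVE_FAIL.search(line)
        if match and match.group(2).startswith(cleaning_prefix):
            failed_moves.add((int(match.group(1)), int(match.group(3))))
            continue
        match = DOWNED.search(line)
        if match:
            key = (int(match.group(1)), int(match.group(2)))
            if key in failed_moves and key not in downed:
                downed.append(key)
    return downed


# Sample entries taken from the error text in this thread, unwrapped.
SAMPLE_LOG = [
    "Jul 24 12:46:41 un1x tldcd[10561]: TLD(3) key = 0x5, asc = 0x3a, ascq = 0x0, MEDIUM NOT PRESENT",
    "Jul 24 12:46:41 un1x tldcd[10561]: TLD(3) Move_medium error: CHECK CONDITION",
    "Jul 24 12:46:41 un1x tldcd[10561]: TLD(3) could not move barcoded tape CLN442 from drive 7 to outport",
    "Jul 24 12:46:41 un1x tldd[22667]: TLD(3) drive 7 (device 6) is being DOWNED, status: Robotic dismount failure",
]

if __name__ == "__main__":
    for robot, drive in drives_downed_during_cleaning(SAMPLE_LOG):
        print(f"TLD({robot}) drive {drive} was downed while a cleaning tape was loaded")
```

Requiring both the failed move of a cleaning tape and the DOWNED message for the same drive keeps the script from touching drives that were downed for some other reason.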

Thanks,
Dan Dale
System Specialist
StorageTek



-----Original Message-----
From: Andrew Shinkarev [mailto:shinkara AT pprd.abbott DOT com]
Sent: Tuesday, February 13, 2001 6:19 AM
To: spe08 AT co.henrico.va DOT us
Cc: Veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Tape Alert automated cleaning not working


On Tue, Feb 13, 2001 at 08:59:31AM -0500, spe08 AT co.henrico.va DOT us wrote:

Code version 
2.00.00

Code Built on
Mar 29 2000 17.00.58

Hardware Version
MPC 02201482


> A quick question, what version of the L700 firmware are we talking about?
> 
> dds
> 
> -----Original Message-----
> From: Andrew Shinkarev [mailto:shinkara AT pprd.abbott DOT com]
> Sent: Tuesday, February 13, 2001 08:54
> To: Michael Traves
> Cc: Veritas-bu AT mailman.eng.auburn DOT edu
> Subject: Re: [Veritas-bu] Tape Alert automated cleaning not working
> 
> 
> On Tue, Feb 13, 2001 at 08:15:58AM -0500, Michael Traves wrote:
> 
> Michael, 
> 
> 
> You might as well say that every mount will be delayed for 10
> minutes. It was suggested that I increase this value to
> 3 minutes, and even 3 minutes is too much.
> 
> Here is the description of the problem I received from Sun
> (since our L700s carry the Sun label):
> 
> The L700 manual states;
> 
> "When a drive requires cleaning, the library interleaves the cleaning
> cartridge mount with normal host operations. The "Fast Load" option is
> always enabled for cleaning cartridges, so the mount occurs within
> seconds. Typically, the cleaning mount occurs directly after a data
> cartridge dismount. Host applications see minimal processing
> interruptions (less than ten seconds) during the cleaning mount.  While
> the cleaning cartridge remains in the drive, the library processes host
> commands normally. If a host requests a data mount to the drive being  
> cleaned, then the library rejects the command and sends the Not Ready  
> sense key (02), with ASC 30 and ASCQ 03 (Cleaning Cartridge
> Installed).  The host receives the data mount error for the duration of
> the cleaning time."
> 
> HOWEVER,
> 
> There appears to be a bug with the current L700 firmware release. What
> we are seeing on customer sites is that the L700 is returning the
> following response during a mount to a drive that is currently being
> cleaned;
> 
> Medium Not Present, Drive Not Unloaded 5h 3Ah 00h
> 
> Instead of;
> 
> Not Ready, Cleaning Cartridge Installed. The library is performing an
> Auto Clean function on the data transfer element (tape drive). 2h 30h 03h
> 
> Here is an example of the error you'd see;
> 
> Jul 24 12:46:41 un1x tldcd[10561]: TLD(3) key = 0x5, asc = 0x3a, ascq = 0x0, MEDIUM NOT PRESENT
> Jul 24 12:46:41 un1x tldcd[10561]: TLD(3) Move_medium error: CHECK CONDITION
> Jul 24 12:46:41 un1x tldcd[10561]: TLD(3) could not move barcoded tape CLN442 from drive 7 to outport
> Jul 24 12:46:41 un1x tldd[22667]: TLD(3) drive 7 (device 6) is being DOWNED, status: Robotic dismount failure
> Jul 24 12:46:41 un1x tldd[22667]: Check integrity of the drive, drive path, and media
> 
> So, timing has nothing to do with the problem.
> 
> 
> Thanks,
> Andrey
> 
> > 
> > That's not entirely correct -- if you set your mount timeouts
> > appropriately (say 10 mins or greater), then NetBackup will wait that
> > long before giving up and assuming something is wrong with the drive.
> > Setting this value gives the robot enough time to perform the cleaning
> > without NetBackup timing out and downing the drive.
> > 
> > 
> > ----- Original Message -----
> > From: "Everett, Craig" <Craig_Everett AT intuit DOT com>
> > To: <fabrice.brochart AT wanadoo DOT fr>; "Andrew Shinkarev"
> > <shinkara AT pprd.abbott DOT com>; "Everett, Craig" <Craig_Everett AT 
> > intuit DOT com>
> > Cc: <Veritas-bu AT mailman.eng.auburn DOT edu>
> > Sent: Monday, February 12, 2001 11:37 PM
> > Subject: RE: [Veritas-bu] Tape Alert automated cleaning not working
> > 
> > 
> > > If you use the "autoclean" feature it will cause NetBackup to down
> > > drives that are being cleaned by the library. That's why we are
> > > converting to NetBackup automated cleaning. If you talk to StorageTek
> > > about an unusual rate of drives being downed, they will first ask
> > > whether you are using StorageTek autocleaning with NetBackup. If so,
> > > you will get an unusually high rate of drives being downed because
> > > NetBackup doesn't know when STK is cleaning the drives. Converting to
> > > NetBackup cleaning will help you resolve this problem. Thanks,
> > >
> > > Craig
> > >
> > > -----Original Message-----
> > > From: f@b [mailto:fabrice.brochart AT wanadoo DOT fr]
> > > Sent: Monday, February 12, 2001 2:54 PM
> > > To: Andrew Shinkarev; Everett, Craig
> > > Cc: Veritas-bu AT mailman.eng.auburn DOT edu
> > > Subject: RE: [Veritas-bu] Tape Alert automated cleaning not working
> > >
> > >
> > > Hi ,
> > >
> > > I suggest you implement the "autoclean" feature on the STK library.
> > >
> > > This is the best way to ensure that drive cleaning is done.
> > >
> > > Hth , bye.
> > >
> > > Fabrice
> > >
> > > -----Original Message-----
> > > From: veritas-bu-admin AT Eng.Auburn DOT EDU
> > > [mailto:veritas-bu-admin AT Eng.Auburn DOT EDU] On Behalf Of Andrew Shinkarev
> > > Sent: Monday, February 12, 2001 22:13
> > > To: Everett, Craig
> > > Cc: Veritas-bu AT mailman.eng.auburn DOT edu
> > > Subject: Re: [Veritas-bu] Tape Alert automated cleaning not working
> > >
> > >
> > > On Mon, Feb 12, 2001 at 12:44:07PM -0800, Everett, Craig wrote:
> > >
> > > Have them check the front panel.
> > > It shows whether a tape drive needs cleaning.
> > >
> > > If you are using SSO, time-based cleaning
> > > doesn't work very well, and you will still
> > > need to check the libraries sometimes.
> > >
> > > A simple script that uses ssh to the robot host and
> > > runs the tpclean command should help your operators
> > > handle this on a daily basis.
> > >
> > > Thanks,
> > > Andrew
> > > > Andrew,
> > > >
> > > > That's strange, because NetBackup Support doesn't seem to be aware
> > > > of the severity levels of cleaning requests or that some may be
> > > > ignored.
> > > >
> > > > I would have our operators do manual cleaning if I could, but I've
> > > > got 100+ drives, which would become cumbersome at best. I guess if I
> > > > can't get this to work I'll switch to frequency-based cleaning.
> > > >
> > > > Thanks for your help,
> > > >
> > > > Craig Everett
> > > > Technology Operations-Storage Team
> > > > Intuit Inc
> > > > craig_everett AT intuit DOT com
> > > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Andrew Shinkarev [mailto:shinkara AT pprd.abbott DOT com]
> > > > Sent: Monday, February 12, 2001 12:31 PM
> > > > To: Everett, Craig
> > > > Cc: Veritas-bu AT mailman.eng.auburn DOT edu
> > > > Subject: Re: [Veritas-bu] Tape Alert automated cleaning not working
> > > >
> > > >
> > > > On Mon, Feb 12, 2001 at 11:09:03AM -0800, Everett, Craig wrote:
> > > >
> > > > You are not alone....
> > > >
> > > > The DLT7000 Tape System Product Manual, p. 5-56, contains a list of
> > > > TapeAlert flags, severity levels, and meanings for the LOG SENSE
> > > > command. There are 9 flags, and 6 of them correspond to
> > > > cleaning requests: 3 have severity level Critical
> > > > and 3 are just Warnings. I think NetBackup ignores Warnings
> > > > and takes care of Critical errors, because I've seen NetBackup
> > > > cleaning a tape drive.
> > > >
> > > > You can ignore them too ....
> > > > I check my library every day and clean a tape drive
> > > > if I see a cleaning request on the front panel.
> > > >
> > > > Thanks,
> > > > Andrew
> > > >
> > > >
> > > > > Background:
> > > > > Sun E-450/Solaris 2.6
> > > > > Netbackup 3.2
> > > > > Storagetek L700 and 9740 Libraries
> > > > > DLT 7000 Drives w/ 245F and 2565 microcode levels
> > > > > Problem: Drives are not self cleaning using tape alert
> > > > >
> > > > >
> > > > > Issue:
> > > > > > I recently implemented the TapeAlert method of cleaning on all of
> > > > > > my master servers. I set up barcode rules and inserted cleaning
> > > > > > tapes as required. The master server is aware of the cleaning tape:
> > > > >
> > > > > > Masterserver# vmquery -b -mt dlt_clean
> > > > > > media   media     robot  robot  robot  side/  optical  # mounts/  last
> > > > > > ID      type      type     #    slot   face   partner  cleanings  mount time
> > > > > > --------------------------------------------------------------------------------
> > > > > > CLN002  DLT_CLN   NONE     -      -     -       -           0     00/00/00 00:00
> > > > > > CLN003  DLT_CLN   NONE     -      -     -       -           0     00/00/00 00:00
> > > > > > CLN001  DLT_CLN   NONE     -      -     -       -           0     00/00/00 00:00
> > > > > > CLN008  DLT_CLN   NONE     -      -     -       -           0     00/00/00 00:00
> > > > > > CLN016  DLT_CLN   NONE     -      -     -       -           0     00/00/00 00:00
> > > > > > CLN000  DLT_CLN   TLD      0    248     -       -          20     00/00/00 00:00
> > > > >
> > > > > > So I know I have the cleaning tape available and mounts are at 20
> > > > > > (it's a new tape). Manual cleaning works. Drive cleaning frequency
> > > > > > is set to 0. I keep finding tape drives with their cleaning light
> > > > > > on (cleaning required), but I don't see any cleaning occurring, as
> > > > > > would be indicated by decremented cleaning tape mounts. I've set
> > > > > > VERBOSE in my vm.conf so I can see the request for cleaning entered
> > > > > > in the /var/adm/messages log, but nothing ever comes through. Has
> > > > > > anyone ever seen this? Is there a command to simulate a drive
> > > > > > requesting a cleaning besides doing a manual cleaning? Which
> > > > > > NetBackup log would be a good place to find a hidden problem that
> > > > > > is perhaps preventing the cleaning? Any input would be greatly
> > > > > > appreciated.
> > > > >
> > > > > Thanks,
> > > > > Craig Everett
> > > > > Technology Operations-Storage Team
> > > > > Intuit Inc
> > > > > craig_everett AT intuit DOT com
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> > > > > http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
> > > >
> > > > --
> > > > ----
> > > > Andrew Shinkarev
> > > > Network Systems Specialist
> > > > Abbott Laboratories
> > > > (847) 9387559
> > > > shinkara AT pprd.abbott DOT com
> > >

-- 
----
Andrew Shinkarev
Network Systems Specialist
Abbott Laboratories
(847) 9387559
shinkara AT pprd.abbott DOT com
_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu