ADSM-L

Re: volume in use error

1999-10-25 17:02:18
Subject: Re: volume in use error
From: bbullock <bbullock AT MICRON DOT COM>
Date: Mon, 25 Oct 1999 15:02:18 -0600
Hmm, that is slightly different from our situation.

We too are seeing the tapes erroring out and the volumes being marked as
read-only. However, we are only seeing the tape being stuck in the "in-use"
state when it is an ADSM database backup tape that errors out with the
ANR8830E error. These are easy to pick out because of the missing tape# in
the error message. All of the other tapes that get the 8830 error are put
back into the normal state.
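
For anyone who wants to pick these out of a saved copy of the activity log, here is a rough sketch in Python. It assumes the message text looks like the ANR8830E example quoted in the APAR below, with the volume name either present or missing right after "for volume". The function name and the regular expression are my own illustration, not anything shipped with ADSM.

```python
import re

# Assumption: the volume name, when present, sits between "for volume"
# and the next period, as in the ANR8830E text quoted in this thread.
ANR8830E_VOLUME = re.compile(r"ANR8830E\b.*?\bfor\s+volume\s*(\w*)\s*\.")

def classify_anr8830e(message):
    """Return the volume name, 'missing' if the name is blank,
    or None if the text is not an ANR8830E message."""
    text = " ".join(message.split())  # undo activity-log line wrapping
    match = ANR8830E_VOLUME.search(text)
    if match is None:
        return None
    return match.group(1) or "missing"

if __name__ == "__main__":
    sample = """ANR8830E Internal 3590 drive diagnostics detect excessive
                media failures for volume . Access mode is now set
                to "read-only"."""
    print(classify_anr8830e(sample))
```

A result of "missing" would correspond to the case described above, where the volume name is absent from the message.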

To resolve this problem without having to halt the ADSM server, I use a
'mtlib -l /dev/lmcp0 -C -VA00001 -tFF10' command to get the tape ejected from
the library and then do an "audit library tapelibrary" to have it clean up
the ADSM server inventory. This makes it look nice until I try to check the
tape back into the library, when ADSM once again tells me that the tape is
still in use. At this point I have to recycle the server, but if I save up a
few of these "odd tapes" while they are outside of the library, I only have
to reboot once a month rather than every few days.
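
The two steps could be wrapped up along these lines. This is only a sketch: it builds the command lines and prints them rather than running anything. The mtlib arguments are the ones from the command above. The dsmadmc invocation, the admin ID and password, and the function name are assumptions for illustration, so adjust them for your site.

```python
import shlex

def workaround_commands(volser, lmcp_device="/dev/lmcp0",
                        library="tapelibrary", category="FF10",
                        admin_id="admin", admin_password="secret"):
    """Build (but do not run) the eject and audit commands for one volume.

    admin_id and admin_password are placeholders, and running the audit
    through dsmadmc is an assumption about how the command is issued.
    """
    # Step 1: change the volume's category so the library ejects it.
    mtlib = ["mtlib", "-l", lmcp_device, "-C", "-V" + volser, "-t" + category]
    # Step 2: have the server reconcile its inventory with the library.
    audit = ["dsmadmc", "-id=" + admin_id, "-password=" + admin_password,
             "audit library " + library]
    return [mtlib, audit]

if __name__ == "__main__":
    # Dry run: print the commands for review instead of executing them.
    for argv in workaround_commands("A00001"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing the commands first makes it easy to check the volume serial before anything gets ejected.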

If you check through the ADSM-L archive, you will see some related messages
posted by me and some others about these ANR8830E errors being logged. In my
case, Imation has found that perhaps half (the sample was small) of the
tapes logging the ANR8830E error still meet their specs and are logging very
few errors. They are still good tapes. :-(

Ben

> -----Original Message-----
> From: Ernie Lim [mailto:fm AT ERN-E DOT ORG]
> Sent: Monday, October 25, 1999 5:19 PM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: volume in use error
>
>
> Hi.
>
> We recently had the microcode upgraded on our 3590Bs and have noticed a few
> things. First, the code on the drive now seems to spit back errors to ADSM
> when it sees a tape with media errors. This is all fine and dandy, but
> every time it does this (recognizes a bad tape, then marks it read-only), it
> seems ADSM loses its mind and puts the vol in the "IN USE" state. This
> happens to be a known issue with IBM. They recommend recycling the server and
> have not as yet made a fix for it. We have over 700 volumes in our
> library, and more and more tapes are being marked bad. Quite frankly, I am
> disgusted with having to bring down all our apps and have operations screech
> to a halt just because of this *retarded* situation. To us this is a VERY big
> problem. I am just curious to know how many other people have come across
> this. Is it just us??
>
>
> Ernie Lim
> Sr. UNIX Consultant
> Highway 407 ETR
> http://www.407etr.com
>
> Here is the APAR in all its glory.
>
>
> APAR NUMBER: IY02438           RESOLVED AS: PROGRAM ERROR
>
> ABSTRACT:
> IY02438: WHEN AN ERROR IS DETECTED ON A TAPE VOLUME (3570 & 3590 DRIVES),
> ADSM MAY NOT RELEASE THAT TAPE: THE VOLUME WILL REMAIN "IN USE"
>
> ORIGINATING DETAILS:
> When ADSM reports that data on a volume needs to be moved to a
> new cartridge due to volume errors, ADSM will continue to hold
> that volume in a state of "in use".  This, in turn, can cause
> any other operations or processes requesting this volume to
> fail.  This problem only affects 3570 and 3590 drives.  The
> following error messages will be seen in the activity log when
> this condition is encountered:
>    ANR8830E Internal 3590 drive diagnostics detect excessive
>             media failures for volume . Access mode is now set
>             to "read-only".
>    ANR8831W Because of media errors for volume , data should
>             be removed as soon as possible.
>    ANR9999D mmslib.c(5108): Entry not found in activity list;
>             lib=ADSM3494, vol=.
>    ANR8468I 3590 volume  dismounted from drive DRIVE1
>             (/dev/rmt8) in library ADSM3494.
>
> LOCAL FIX AS REPORTED BY ORIGINATOR:
> Recycling the ADSM Server will resolve this situation.  It is
> recommended, however, that the data contained on any of the tape
> volumes generating these errors be moved to another cartridge.
>
> RESPONDER SUMMARY:
> ****************************************************************
> * USERS AFFECTED:                                              *
> ADSM AIX, NT, SUN and HP servers with 3570/3590 drives.
> ****************************************************************
> * PROBLEM DESCRIPTION:                                         *
> On 3570/3590 drives, ADSM marks a volume read-only to
> prevent further writes if the internal drive diagnostics
> have detected excessive media failures.  Any other server
> operations that use this volume will not start because
> ADSM believes the volume is still in use.
> If the activity log contains ANR8830E and ANR8831W messages,
> then the customer has encountered this problem.
> ****************************************************************
> * RECOMMENDATION:                                              *
> Apply the PTF when available.
> ****************************************************************
> ADSM allows operations that use a cartridge marked read-only
> to start.  It does not believe the volume is still in use.
>
> RESPONDER CONCLUSION:
> None
>
> TEMPORARY FIX:
> Halt and restart the ADSM server.
>