ADSM-L

Re: TSM server problem

2002-06-21 04:34:44
Subject: Re: TSM server problem
From: Alexander Verkooyen <alex AT SARA DOT NL>
Date: Fri, 21 Jun 2002 10:33:00 +0200
Thanks for the explanation! We'll probably upgrade to get
rid of this problem.
You're right, restarting the server solves the problem
temporarily but that is not something we want to do
on a regular basis. (We had to restart dsmserv again this morning
to free two tapes.)

Thanks again,

Alexander

(The original message was written by me but posted by Henk)

> I have seen similar things.  You have to put on 4.2.2 to correct this
> problem.  However, a recycle of the dsmserv will fix the problem.  The issue
> manifests itself in several ways.  I have only seen the problem once.  I am
> still on 4.2.1.15 which partially fixes the problem.  The problem is caused
> by a bunch of idle drives dismounting at the same time.  Apparently, there
> was some threading of updates to in-storage control blocks that gets hosed.
>
> Paul D. Seay, Jr.
> Technical Specialist
> Naptheon, INC
> 757-688-8180
>
>
> -----Original Message-----
> From: Henk ten Have [mailto:hthta AT SARA DOT NL]
> Sent: Thursday, June 20, 2002 11:17 AM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: TSM server problem
>
>
> I couldn't find an APAR that describes what we have been seeing on our
> 4.2.1.11 (AIX 4.3.3) server during the last few weeks so I was wondering if
> we have discovered a new bug.
>
> First we got this message in the activity log:
>
> 06/09/02   10:39:38  ANR1229W Volume 000591 cannot be backed up - volume is
>                       offline or access mode is "unavailable" or "destroyed".
>
> We did a 'q vol f=d'. The access mode of the volume was 'Available' so we
> dismissed it as a freak incident until the message repeated itself the next
> day.
>
> This time we opened the (3494) library and verified that the volume was in
> the correct cell.
>
> We tried a 'restore volume' followed by a 'delete volume'.
> The delete failed:
>
> ANR2405E DELETE VOLUME: Volume 000591 is currently in use by clients and/or
> data management operations.
> ANS8001I Return code 14.
>
> At that time there were no processes or sessions that used
> that particular volume. Also we had four other volumes
> that displayed the same behaviour.
> We halted the server and restarted it again which solved
> the problem. The volumes were no longer 'in use'.
> Today we noticed two other volumes that seem to have
> the same problem so I'm beginning to suspect that I've found
> a bug in the server.
>
> All these volumes have one thing in common: When I search
> the activity log for their mounts and dismounts I can't
> find a dismount message after their last mount (before they become
> 'unavailable'). It is as if the volumes are unmounted by the library but TSM
> isn't being notified of this.
>
> Anyone seen this before?
>
> Cheers,
> Henk.
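
The check Henk describes (searching the activity log for a volume's last
mount and finding no dismount after it) could be automated roughly as
follows. This is only a sketch: it assumes the activity log has been
exported to plain text, one message per line, and that mount and dismount
messages contain the phrases "volume <name> mounted" and "volume <name>
dismounted". The sample lines, the "ANRnnnnI" placeholder message numbers,
the drive and library names, and the function name are all made up for
illustration; check the real message text on your own server first.

```python
import re

# Assumed message shape: "... volume <VOLSER> mounted ..." or
# "... volume <VOLSER> dismounted ...". Adjust to the real log text.
EVENT = re.compile(r"\bvolume\s+(\S+)\s+(mounted|dismounted)\b", re.IGNORECASE)


def volumes_missing_dismount(log_lines):
    """Return volumes whose most recent mount/dismount event is a mount."""
    last_event = {}
    for line in log_lines:
        match = EVENT.search(line)
        if match:
            volume = match.group(1)
            last_event[volume] = match.group(2).lower()
    # Dicts keep insertion order, so volumes come back in order of first sighting.
    return [vol for vol, event in last_event.items() if event == "mounted"]


# Hypothetical sample lines, not taken from a real activity log.
sample = [
    "06/08/02 09:00:01 ANRnnnnI 3590 volume 000591 mounted in drive DRIVE1.",
    "06/08/02 09:30:12 ANRnnnnI 3590 volume 000591 dismounted from drive DRIVE1 in library LIB1.",
    "06/08/02 10:00:00 ANRnnnnI 3590 volume 000602 mounted in drive DRIVE2.",
    "06/08/02 10:45:40 ANRnnnnI 3590 volume 000602 dismounted from drive DRIVE2 in library LIB1.",
    "06/08/02 22:15:07 ANRnnnnI 3590 volume 000591 mounted in drive DRIVE1.",
]

for vol in volumes_missing_dismount(sample):
    print(f"{vol}: last mount has no matching dismount")
```

A volume that is legitimately still mounted will also be reported, so the
result is a list of candidates to compare against 'q mount', not proof of
the bug.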


-----------------------------------------------
Alexander Verkooijen        (alexander AT sara DOT nl)
Senior Systems Programmer
SARA High Performance Computing