ADSM-L

Re: Expiration problem with TSM 5.1.1.1 on AIX 4.3.3

2002-08-03 21:21:20
Subject: Re: Expiration problem with TSM 5.1.1.1 on AIX 4.3.3
From: Fred Johanson <fred AT MIDWAY.UCHICAGO DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sat, 3 Aug 2002 19:27:36 -0500
I should have posted this weeks ago, but I've been fighting the losing battle
against the shrinking scratch pool for a month.

I called this in on July 3.  I've since gone from 5.1.1.0 to 5.1.1.1 to
5.1.1.2.  I've spent hours on the phone with Level 2 and burned lots of
innocent electrons in intemperate e-mail.  I was told last week not to be too
upset about 60-90 minute delays in waiting for the cancel to work because others are
waiting up to 3 hours or more.  It's now been more than a month since I've seen
the whole machine expire.  If I can sneak a few hours in a day, I may be able
to reclaim 2 or 3 tapes to scratch.  The worst thing is that no other process
is able to start while things are hung.  Even DB queries may hang forever,
while other users time out with the "storage inaccessible" messages.  If you
"show locks" it certainly looks like a lock conflict.

Without a doubt the worst code I've seen in 7 years of dealing with
ADSM/TSM/ITSM/whatever.  My suggestion from the previous week was to find the
person who introduced this last spring and make him fix it.  As of Friday it
looks like that is what is going to happen.

In the meantime, another long weekend of trying to get by with a full robot
and very few scratch tapes.



Quoting Gerhard Rentschler <g.rentschler AT RUS.UNI-STUTTGART DOT DE>:

> Hello,
> after one day I got the first more severe problem. I started expiration and
> afterward migration for a storage pool. Later I started migration for
> another storage pool. Afterwards I noticed that the expiration and the other
> migration processes have stopped advancing their counters. Cancelling the
> processes doesn't help. For the expiration I get ANR0819I Cancel in
> progress. For the last migration process the cancel was accepted, but there
> was no effect. Everything else (Backup sessions, admin commands) seems to
> work. Colleagues from a neighbouring university have this problem as well. They
> say after one hour they get an error message and the processes resume work.
>
> Has anyone else seen this? Is there a circumvention for this?
>
> I created PMR 88563, branch 070, country 724.
> Best regards
> Gerhard
> ---
> Gerhard Rentschler            email:g.rentschler AT rus.uni-stuttgart DOT de
> Regional Computing Center     tel.   ++49/711/685 5806
> University of Stuttgart       fax:   ++49/711/682357
> Allmandring 30a
> D 70550
> Stuttgart
> Germany
>


Fred Johanson
