ADSM-L

Re: Expiration Woes

2002-09-15 18:22:50
Subject: Re: Expiration Woes
From: "Seay, Paul" <seay_pd AT NAPTHEON DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sun, 15 Sep 2002 18:21:13 -0400
It looks like the bad performance for the expiration is a one time event.

However, there is another problem that I am having difficulty isolating.
This problem has manifested itself in two ways.  It centers around some kind
of serialization and lock.  Everything basically stops.

Originally, the problem was caused by an administrator task being terminated
after running some commands, often ones related to the stgpools.  Once you
cancelled the administrator task that was causing the issue, the problem
cleared.

Last night, 4 days after putting on 4.2.2.12, the problem manifested itself
differently.  An expiration process had the stuff locked tight as a drum.
The expiration process had been running for a long time.  I cancelled it and
waited the timeout period for our environment (60 minutes) hoping this would
clear.  This time it appears there was a deadly embrace between a backup
stgpool command and the expiration process.  I may not have waited 60
minutes for the backup stgpool command to clear.  I had had enough and
halted the server.  That did the trick.  Down and back up in 10 minutes.

This problem was there at 4.2.1.11, and is not fixed.  The point I am making
is the fix to expiration processing in 4.2.2.12 may expose this problem more
frequently.  It probably has been around a while.

I wish AIX had an SVCDUMP command like MVS does.  At least I could send them
a dump of the problem.  I have opened a problem on the slooooooow
expiration.  They should have documented that in the 4.2.2.12 patch.  Now, I
am going to have to open a problem on the expiration/backup stgpool hang.

Paul D. Seay, Jr.
Technical Specialist
Naptheon Inc.
757-688-8180


-----Original Message-----
From: Guillaume Gilbert [mailto:guillaume.gilbert AT DESJARDINS DOT COM] 
Sent: Sunday, September 15, 2002 2:21 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Ref: Re: Expiration Woes


Expiration ran as normal on Saturday and Sunday, so I am guessing it was the
3 week backlog that caused the long run. As for cache hit, I am seeing
an improvement. I never could get it above 98% (it always stayed at 97.XX) but
now it's staying in the 98% range. My db is 24 GB at 78% utilized and my
buffer pool is 524288.
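
To put those figures side by side, here is a minimal Python sketch of the arithmetic. It assumes the 524288 buffer pool value is expressed in kilobytes, which the message does not state; the function names are hypothetical and exist only for this illustration.

```python
# Figures quoted in the message above.
DB_SIZE_GB = 24           # total database size
DB_UTILIZATION = 0.78     # fraction of the database in use
BUFPOOL_SETTING = 524288  # ASSUMPTION: value is in kilobytes


def db_used_gb(size_gb, utilization):
    """Space actually occupied in the database, in GB."""
    return size_gb * utilization


def kb_to_mb(kb):
    """Convert kilobytes to megabytes (1 MB = 1024 KB)."""
    return kb / 1024


def bufpool_share_of_used_db(bufpool_kb, used_gb):
    """Fraction of the used database that would fit in the buffer pool."""
    bufpool_gb = bufpool_kb / (1024 * 1024)
    return bufpool_gb / used_gb


if __name__ == "__main__":
    used = db_used_gb(DB_SIZE_GB, DB_UTILIZATION)
    print(f"database in use: {used:.2f} GB")
    print(f"buffer pool:     {kb_to_mb(BUFPOOL_SETTING):.0f} MB")
    share = bufpool_share_of_used_db(BUFPOOL_SETTING, used)
    print(f"buffer pool covers about {share:.1%} of the used database")
```

Under that kilobyte assumption the buffer pool is 512 MB against roughly 18.7 GB of used database, so a 98% hit rate would be coming from a cache that holds under 3% of the data.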

These bugs shouldn't be there. I ran at 4.1.3.0 for a year and it was smooth
as silk. When we decided to migrate to 4.2, I started my tests at 4.2.1.9. I
postponed the migration for 2 months just because each fix brought new
problems. I hope this last one is stable (it looks like it after 2
days). 5.1 doesn't look any better.

Guillaume Gilbert
CGI Canada




"Seay, Paul" <seay_pd AT NAPTHEON DOT COM>@VM.MARIST.EDU> on 2002-09-13 19:36:28

Please respond to "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>

Sent by:   "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>


To:       ADSM-L AT VM.MARIST DOT EDU
cc:
Subject:  Re: Expiration Woes

I hope this slow expiration is a one time event to correct the inadequacies
of the past.  It is pitiful compared to what 4.2.2.7 was doing.

I also had to increase my buffer pool significantly because the cache hits
went wacko after putting on 4.2.2.12.  I know my cache was not big enough
before, but hit rates below 70% are a little bit crazy versus the 95%+ that I
was getting before.
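
To show why a drop from 95% to 70% matters so much, here is a minimal Python sketch using the generic definition of a cache hit percentage. It is not taken from the server's own accounting; the function names and the per-million framing are assumptions made for illustration.

```python
def cache_hit_pct(requests, misses):
    """Generic cache hit percentage: share of requests served from cache."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return 100.0 * (requests - misses) / requests


def misses_per_million(hit_pct):
    """Requests out of every million that miss the cache at a given hit rate."""
    return round((100.0 - hit_pct) / 100.0 * 1_000_000)


if __name__ == "__main__":
    # Hit rates mentioned in this thread.
    for pct in (70.0, 95.0, 98.0):
        print(f"{pct:.0f}% hits -> {misses_per_million(pct)} misses per million requests")
```

At 70% there are six times as many misses as at 95% (300000 versus 50000 per million requests), which is why a moderate-looking change in the percentage can mean a large change in the work that has to be done outside the cache.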

Paul D. Seay, Jr.
Technical Specialist
Naptheon Inc.
757-688-8180


-----Original Message-----
From: Guillaume Gilbert [mailto:guillaume.gilbert AT DESJARDINS DOT COM]
Sent: Friday, September 13, 2002 5:00 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Expiration Woes


Hey All

Since I installed server 4.2.2.10 on AIX (3 weeks ago), I have noticed that
my TDP Domino backups were not expiring. I check the log every day to see if
the expiration process runs OK, and it always had objects deleted and
finished with SUCCESS. I checked a few other nodes and noticed the same
thing. I opened a PMR Tuesday but haven't gotten any response from Tivoli.
While checking the readme for server 4.2.2.12 today, I noticed APAR IC34465,
"Expiration deleting 0 objects". I installed it and after 3 hours of
expiration (double the usual time) I am now in the process of reclaiming
over 40 tapes. The expiration also scratched 10 tapes.

I don't know when this bug appeared, but I suggest that those who are having
scratch tape problems look into this.

Guillaume Gilbert
CGI Canada
