ADSM-L

Re: TSM Operational Reporting just stops functioning

2005-09-15 06:27:14
Subject: Re: TSM Operational Reporting just stops functioning
From: "Schaub, Steve" <Steve_Schaub AT BCBST DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 15 Sep 2005 06:26:34 -0400
I would suspect IC44976, since Todd noted that TOR would work for a day
or two before entering the Twilight Zone.  I have also been bitten by
the expired admin pwd bug, but that stopped TOR cold in my case, so I
didn't think it fit this particular problem.
-steve

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
E Mike Collins
Sent: Thursday, September 15, 2005 12:57 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] TSM Operational Reporting just stops functioning

Hi Todd, Steve, All,

Two likely causes are documented in APARs IC43649 and IC44976.  You can
go to ibm.com and search for those for more information.  In a nutshell,
IC43649 documents a restriction for TOR where it doesn't surface an
expired admin password for the account it uses to communicate with the
server.  An expired password can cause all TOR worker threads to be
consumed resulting in no further scheduled reports being sent out.  The
recommendation is to reset any expired passwords and then set the
password to not expire.  IC44976 is fixed in 5.2.6 and will be available
in 5.3.2 when that ships.  It fixes a resource leak that will also cause
this behavior.  If you need additional information please send me a note
directly.

Best Regards,
Mike Collins, emcollin AT us.ibm DOT com
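
The IC43649 workaround described above (reset the expired password, then set it to not expire) could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the administrative command-line client `dsmadmc` is on the PATH and that the server accepts `UPDATE ADMIN ... PASSEXPIRY=0` to mean "never expires". The admin IDs and passwords are placeholders, and the helper names `build_passexpiry_command` and `apply_fix` are hypothetical, not part of any TSM tooling.

```python
import subprocess


def build_passexpiry_command(run_as, run_as_password, tor_admin, new_password):
    """Build a dsmadmc call that resets tor_admin's password and turns
    off password expiry for that account (PASSEXPIRY=0).

    All four arguments are placeholders supplied by the caller.
    """
    server_cmd = f"update admin {tor_admin} {new_password} passexpiry=0"
    return [
        "dsmadmc",
        f"-id={run_as}",
        f"-password={run_as_password}",
        server_cmd,
    ]


def apply_fix(cmd):
    # Runs the command; raises CalledProcessError on a non-zero return code.
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Calling `build_passexpiry_command("admin", "secret", "tor_admin", "newpw")` only assembles the argument list; nothing touches the server until `apply_fix` is called, so the command can be reviewed first.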

Ref:
Subject: TSM Operational Reporting just stops functioning

TOR is 5.3.1.0, running on W2K3.  The hardware also runs ISC/AC, but we
don't use that much.

TOR is set to monitor two TSM servers, one at 5.3, and one at 5.2, both
running AIX, if it matters.

For each TSM server, I have the standard hourly report, and two daily
reports; the standard daily report split up into two - a summary report
and a detail report.

By looking at the current reports in MMC, it appears that TOR simply
stopped running the reports.  The tsmrept service is still running.
There is nothing in the application or system event logs to indicate an
issue.
Looking at the date/time on the last reports run, all of the daily
reports appeared to run at the prescribed time for that day.  The hourly
reports ended up running for their last time at different times (TSM
server1's last hourly report was several hours later than TSM server2's
last hourly report).  Once the hourly reports stop running, subsequent
daily reports will not run, either.

Stopping and starting the tsmrept service "fixes" this for a while.
Sometimes it will work fine for only a day, sometimes for three days.  I
have verified that the "select" commands are not getting to the TSM
server, even though the service is still running.
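
Until a fix is applied, the stop/start workaround above could be automated with a small watchdog. The sketch below is an illustration only: it assumes the caller can supply the timestamp of the last hourly report (for example from a report file's modification time), the 30-minute slack is arbitrary, and `reports_stalled` and `restart_service` are hypothetical helper names. Only the service name `tsmrept` comes from this thread.

```python
import subprocess
from datetime import datetime, timedelta


def reports_stalled(last_report, now,
                    interval=timedelta(hours=1),
                    slack=timedelta(minutes=30)):
    """Return True when the last hourly report is older than one
    reporting interval plus some slack."""
    return now - last_report > interval + slack


def restart_service(name="tsmrept"):
    # Windows service control; each call raises CalledProcessError on failure.
    subprocess.run(["net", "stop", name], check=True)
    subprocess.run(["net", "start", name], check=True)
```

A scheduled task could call `reports_stalled(last_report, datetime.now())` and invoke `restart_service()` only when it returns True. This masks the symptom rather than addressing the cause.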

I deactivated the "detail" level report, but that hasn't helped anything.
I found a file "tecinfo.txt" in the Console\TEC folder of TSM.  The log
information there supports my observations; hourly reports running fine,
daily reports running fine, then hourly reports stopping for one server, and
then stopping for the other server, and no additional reports after
that, until I stop and start the service.

Any thoughts?

Todd