Re: [ADSM-L] TSM server appears to hang
2014-07-17 10:50:59
Hi,
AIX 6.1 TL6 and up and AIX 7.1 TL3 and up are known to cause this kind
of problem:
http://www-01.ibm.com/support/docview.wss?uid=swg21587513
Pierre Billaudeau
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Matthew McGeary
Sent: July 16, 2014 20:19
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] TSM server appears to hang
We're having the exact same problem and have been for quite a few months now.
It occurred on both 6.3.4.100 and 7.1, running on AIX 6.1 TL7 SP6 hosted on a
P740. It gets so bad on ours that I have to halt the dsmserv process,
perform a db2stop force and then restart TSM. Because it happens at random
times and only infrequently, I've written a quick and dirty script to make
sure that TSM is running and to do the shutdown/restart if the non-responsive
behaviour kicks in again.
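The script itself was not posted to the list. As a rough illustration only, a
watchdog of this kind could look like the Python sketch below. It assumes the
hang can be detected by running an administrative query with a timeout; the
function names, the timeout value and the way the commands are passed in are
all hypothetical and not taken from the original script.

```python
import subprocess


def command_responds(cmd, timeout_s):
    """Return True if cmd exits within timeout_s seconds, False if it hangs.

    Only responsiveness is checked here; the exit code is ignored.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed if this expires
        )
        return True
    except subprocess.TimeoutExpired:
        return False


def check_and_recover(probe_cmd, recovery_cmds, timeout_s=120):
    """Run the probe; if it hangs, run each recovery command in order.

    probe_cmd and recovery_cmds are supplied by the caller (assumption:
    each is an argument list suitable for subprocess.run).
    """
    if command_responds(probe_cmd, timeout_s):
        return "ok"
    for cmd in recovery_cmds:
        # check=False: keep going even if one recovery step fails
        subprocess.run(cmd, check=False)
    return "restarted"
```

In such a setup the probe would be a query that is known to hang (for example
"q vol" issued through the administrative client), and the recovery list would
hold the site-specific steps described above: halting dsmserv, db2stop force,
and restarting TSM. The exact command lines depend on the installation and are
deliberately left out.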
I don't have a solution for you but we've been all the way up the developer
chain without much success. What hardware are you running your server on?
Matthew McGeary
Technical Specialist
PotashCorp - Saskatoon
306.933.8921
From: "Rhodes, Richard L." <rrhodes AT FIRSTENERGYCORP DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: 07/16/2014 09:08 AM
Subject: [ADSM-L] TSM server appears to hang
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
Hi Everyone,
For the past couple of days we've had a strange problem with one of our TSM
instances (v6.2.5). At times it appears to hang.
Last night (and the previous night) many client servers each ended up with a
dozen or more sessions. This is really strange! This morning as I was looking
at this, commands like "q vol" and "q stgpool" hung with no response, while
commands like "q node" and "q proc" worked. The server was doing very little I/O.
All of a sudden the hung commands all ran through and the server I/O jumped to
200-400 MB/s. Something was locking I/O. I think the many sessions are clients
that retry because the server is not responding.
In the TSM actlog there are no unusual messages around the time it came unstuck.
The only strange entry in the actlog is an ANR9999D with a
lockwait error early the previous evening. There are no AIX errors.
Any thoughts?
Rick