ADSM-L

Re: dsmserv process hung.

2006-01-30 16:02:46
Subject: Re: dsmserv process hung.
From: Richard Sims <rbs AT BU DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 30 Jan 2006 16:02:02 -0500
On Jan 30, 2006, at 3:44 PM, Ochs, Duane wrote:

AIX 5.3
TSM 5.3.1.2
This weekend one of my three TSM servers had the DSMSERV process hang.
The machine was accessible, the DSMSERV process still existed. It was
still accepting connections but not talking to them. ...

Duane - One cause of a problem of this type is a thread failure: some
key thread fails while the rest of the process lives on, though rather
crippled. In any case there should be evidence in your Activity Log,
typically an ANR9999 message. Where a thread failure has occurred,
there will likely be a dsmserv.err file in the server directory giving
details.

Does anybody have a method in place, or an idea, to monitor whether the
TSM server is actually capable of communication?

The most standard method is to test the responsiveness of the TSM
server's Web admin port (usually 1580). Various HTTP-based packages
can be used to do this. Here is a fragment from a run of an HTTP
prober which I wrote, to illustrate:

 http_check: Connected to HTTP server.  Now sending data...
 http_check: Request 'GET / HTTP/1.1^M^JHost: ourhost.bu.edu^M^J^M^J'
             has been sent to HTTP server '1111.222.333.444'.  Now
             awaiting reply...
 http_check: Response took 0.009691 seconds to arrive.
 http_check: Received 2907 bytes of data from HTTP server:
 'HTTP/1.0 200 OK
 Server: ADSM_HTTP/0.1
 Content-type: text/html

 <HEAD>
 <TITLE>
 Server Administration
 </TITLE>
 ...
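
A probe of the kind shown above can be sketched in a few lines of
Python. This is not the http_check program itself, only a minimal
illustration of the same idea: connect, send a bare GET, and time how
long the first bytes of the reply take to arrive. The function names
probe_http and is_healthy and the 10-second timeout are assumptions
made for the example; 1580 is the usual Web admin port mentioned above.

```python
import socket
import time


def probe_http(host, port=1580, timeout=10.0):
    """Send a minimal GET and return (status_line, elapsed_seconds).

    Raises OSError (socket.timeout is a subclass) if the server refuses
    the connection or does not answer within `timeout` seconds. The
    latter is the "accepting connections but not talking" case.
    """
    request = "GET / HTTP/1.1\r\nHost: {}\r\n\r\n".format(host).encode("ascii")
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        data = sock.recv(4096)  # the first chunk is enough for the status line
    elapsed = time.monotonic() - start
    lines = data.splitlines()
    status_line = lines[0].decode("latin-1") if lines else ""
    return status_line, elapsed


def is_healthy(status_line):
    """True if the status line looks like 'HTTP/x.y 200 ...'."""
    parts = status_line.split()
    return len(parts) >= 2 and parts[0].startswith("HTTP/") and parts[1] == "200"
```

A monitoring job would call probe_http() with the server's host name
inside a try/except OSError, and raise an alert either on an exception
or when is_healthy() returns False. A server that accepts the
connection but never replies shows up as a timeout rather than as a
refused connection.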

Or you could run a TSM administrative client session in console mode
(dsmadmc -consolemode) under a Perl script, for example, to follow
the Activity Log and call out any irregularities.
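
As a rough sketch of that second approach, here is a small filter,
written in Python rather than Perl, that reads Activity Log lines and
yields the ones worth attention. It assumes that server messages
carry an ANR prefix, four digits, and a trailing severity letter (as
in ANR9999D), and it treats the letters E, S and D as alert-worthy;
both the pattern and that choice of severities are assumptions to
adjust for your site. The name find_irregularities is hypothetical.

```python
import re

# Assumed message shape: "ANR", four digits, then one severity letter.
MESSAGE_RE = re.compile(r"\b(ANR\d{4})([IWESD])\b")

# Severities treated as irregular in this sketch (an assumption).
ALERT_SEVERITIES = {"E", "S", "D"}


def find_irregularities(lines):
    """Yield (message_id, line) for each line with an alert-worthy message."""
    for line in lines:
        match = MESSAGE_RE.search(line)
        if match and match.group(2) in ALERT_SEVERITIES:
            yield match.group(1) + match.group(2), line.rstrip("\n")
```

Fed from the standard output of a console-mode session, the loop would
simply iterate over find_irregularities(sys.stdin) and mail or page on
each result. Note that silence from the console session is itself a
symptom in the hung-server case, so a check of this kind complements
the port probe rather than replacing it.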

   Richard Sims
