ADSM-L

Re: Problem - server will not start

1997-08-23 19:28:24
Subject: Re: Problem - server will not start
From: Michael Kaczmarski <kacz AT US.IBM DOT COM>
Date: Sat, 23 Aug 1997 19:28:24 -0400
If you start the server in the foreground, it will be easier to determine what
is happening by looking at the start-up messages.

Mike Kaczmarski
IBM Corporation
ADSM Development
kacz AT us.ibm DOT com



        ADSM-L AT VM.MARIST DOT EDU
        08/22/97 06:41 PM
Please respond to ADSM-L AT VM.MARIST DOT EDU @ internet


To: ADSM-L AT VM.MARIST DOT EDU @ internet
cc:
Subject: Problem - server will not start

This morning I came in to find that our ADSM server would not come up
following an ordinary system reboot.  Clients attempting to connect to it get a
TCP/IP connection failure (meaning that the server is not yet ready to receive
external communication), and a look at the server shows less than half the
number of dsmserv processes that should be running on this AIX 3.2.5 system.
What is happening is that the primary dsmserv process starts some child thread
processes, but then the primary process goes into a period of seemingly
endless CPU activity.  At first, all that activity was in "system" time; but
then it switched over to being all "user" time.  I verified that the 3494 and
its tape drives are fully responsive.  There are no AIX error log entries, and
nothing in /dsmerror.log; nor are there any messages when starting it manually.
System monitor does not show high I/O activity on any disk.  In short, there
are no external indications of what is wrong.  Here is a look at the processes
as they are running at the moment, showing the primary still churning:

    pid   ppid  s w pri  ni tix    utime    stime      tty   user     command
   5218   73310 W E  60  60   0  0:00:00  0:00:05   pts/41 root     dsmserv
  35945   73310 W E  60  60   0  0:00:00  0:00:00   pts/41 root     dsmserv
  45665   73310 W E  60  60   0  0:00:00  0:00:00   pts/41 root     dsmserv
  50784   73310 W E  60  60   0  0:00:00  0:00:06   pts/41 root     dsmserv
  58983   73310 W E  60  60   0  0:00:00  0:00:00   pts/41 root     dsmserv
  62559   73310 W E  60  60   0  0:00:00  0:00:00   pts/41 root     dsmserv
  73310   71254 R C  92  60   5  1:35:44  0:20:14   pts/41 root     dsmserv
  76392   73310 W E  60  60   0  0:00:00  0:00:00   pts/41 root     dsmserv
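
For anyone wanting to pick the busy process out of a listing like the one
above without reading it by eye, here is a minimal sketch in Python.  It
assumes the listing has been saved as text, that the columns are
whitespace-separated, and that utime/stime are in h:mm:ss form; the names
parse_ps and cpu_seconds are hypothetical helpers, not part of ADSM or AIX.

```python
# Sketch only: parse a saved process listing and find the process that has
# accumulated the most CPU time. Column layout is assumed from the listing
# quoted in this post; parse_ps and cpu_seconds are hypothetical names.

LISTING = """\
    pid   ppid  s w pri  ni tix    utime    stime      tty   user     command
   5218   73310 W E  60  60   0  0:00:00  0:00:05   pts/41 root     dsmserv
  35945   73310 W E  60  60   0  0:00:00  0:00:00   pts/41 root     dsmserv
  45665   73310 W E  60  60   0  0:00:00  0:00:00   pts/41 root     dsmserv
  50784   73310 W E  60  60   0  0:00:00  0:00:06   pts/41 root     dsmserv
  58983   73310 W E  60  60   0  0:00:00  0:00:00   pts/41 root     dsmserv
  62559   73310 W E  60  60   0  0:00:00  0:00:00   pts/41 root     dsmserv
  73310   71254 R C  92  60   5  1:35:44  0:20:14   pts/41 root     dsmserv
  76392   73310 W E  60  60   0  0:00:00  0:00:00   pts/41 root     dsmserv
"""


def cpu_seconds(hms):
    """Convert an h:mm:ss string (assumed format) to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_ps(text):
    """Return one dict per process, keyed by the header column names."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def total_cpu(row):
    """User plus system CPU time for one process, in seconds."""
    return cpu_seconds(row["utime"]) + cpu_seconds(row["stime"])


rows = parse_ps(LISTING)
busiest = max(rows, key=total_cpu)
children = [row for row in rows if row["ppid"] == busiest["pid"]]

print("busiest pid:", busiest["pid"], "state:", busiest["s"])
print("user seconds:", cpu_seconds(busiest["utime"]))
print("system seconds:", cpu_seconds(busiest["stime"]))
print("children of busiest:", len(children))
```

Running it against two listings taken a few minutes apart and comparing the
user and system seconds would show which kind of CPU time is still growing.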

Has anyone seen this before?  Any idea how to either get the server working or
force diagnostic information out?

       thanks, Richard Sims  Boston University