ADSM-L

AIX server hang?

1996-07-12 13:02:00
Subject: AIX server hang?
From: J Holdren <JFH5 AT PSUVM.PSU DOT EDU>
Date: Fri, 12 Jul 1996 13:02:00 EDT

This is the scenario...

I found our server (2.1.0.7) with 6 checkout processes and one migration
process sitting waiting for about 100 minutes.  So I started throwing
admin commands at the server to see what I could find out.  No tapes were
mounted.  I looked at the volume waiting to be mounted for the migration
(it is in a collocated storage pool).  There were no errors there, no errors
in the activity log, the disk storage pool was 66% full (migration starts
at 10%), only two sessions running at the time (my admin session and one
big node).  Then I queried the content of the volume waiting to mount
(with count=1) and got nothing back (as I write this, still nothing).  I
tried to start another admin session, and it hasn't prompted me for the
password yet.
This is when I decided to check the tape library.  I found the automatic
output station door left open, so the server could not complete the
checkouts (this is a 3494 ATL).  Once the door was closed everything took
off nicely: the checkouts finished and the migration finally got its
tape.  But when I tried to query the processes from another admin session
at the library, it hung too.  No response from the query.

I also took a look at the processes on the server (RS/6000, AIX 3.2.5) and
found one dsmserv process running, and as far as I can tell, only one
running; and it is running and running and running.

Any ideas?

My guess is that the migration was waiting for the tape in order to
migrate files that belong to the node currently doing the backup, and that
my query content locked everything up until the migration finishes with
the tape.  Trouble is, with the library I can't see whether the tape drive
is actually doing anything.  Well, I just paused the library to look at
the tape drive, and I could see no activity.

(Oh, btw, these are 3590 tapes) and this particular tape was only about 4%
full when I queried the volume.  OK, that is about 480 MB, but the 3490s,
even when full, didn't take that long when a query content was issued
against them.  As it looks now, the 'query content' command itself might
be the problem process.

I haven't restarted the server yet, but I will have to very soon.  I
cannot even get a connection to the server from my desktop.  Any suggestions
that might save me from recycling the server?
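
For what it's worth, one way to tell "hung" from "just slow" without
tying up yet another terminal is to run each admin query under a time
limit.  The following is only a rough sketch, not something tested
against this server: the probe() helper is made up for illustration, and
the dsmadmc invocation in the comment (admin ID, password, 60-second
limit) is an assumed example, not taken from this setup.

```python
import subprocess


def probe(cmd, timeout_s):
    """Run cmd and return its stdout, or None if it does not finish in time.

    subprocess.run() kills the child process when the timeout expires, so a
    hung query does not leave another stuck client behind.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return None  # no answer within the limit: treat the query as hung
    return done.stdout


# Hypothetical usage; the admin ID and password are placeholders:
# out = probe(["dsmadmc", "-id=admin", "-password=secret", "query process"], 60)
# if out is None:
#     print("query process did not answer within 60 seconds")
```

A None result only says that the query did not come back in time; it
does not say which server process is holding things up.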

Thanks,  J

J Holdren                   (~              Center for Academic Computing
jfh5 AT psu DOT edu            PENN_)TATE          224A Computer Bldg
1 (814) 865-2964                            University Park, PA  16802