ADSM-L

Solaris ADSM server crashes

1997-10-13 09:43:37
Subject: Solaris ADSM server crashes
From: Tony Brancato <brancato AT LLE.ROCHESTER DOT EDU>
Date: Mon, 13 Oct 1997 09:43:37 -0400
Hi folks!  I'm running two ADSM 2.1.0.6 servers on a SPARCserver 1000
running Solaris 2.5.1.  The system is SCSI-connected to a 3494E Dataserver
with two 3590 drives.  Both servers have about 9GB of disk volumes as
primary storage pools, which migrate to the 3590 tapes.  Both servers have
about 3GB of database space.  This problem started on the server which
stored more data, but has recently shown up on the other server as well.
The problem is that both servers seem to crash intermittently with the
following messages:

ANR7821S Thread XX (tid YY) terminating on signal 10 (Bus error).
ANR9999D Trace-back of called functions:
ANR9999D   0x000444C8  AbortServer
ANR9999D   0x0004497C  TrapHandler
ANR9999D   0xDF6E0080  *UNKNOWN*
ANR9999D   0x0003A184  DoBatch
ANR9999D   0x00038570  DiskServerThread
ANR9999D   0x000446B4  StartThread
ANR9999D   0xDF6E25F0  *UNKNOWN*
ANR9999D   0x000445DC  StartThread

The thread number and tid vary each time, but the stack trace is the same.
This is occurring at least once a day, so backups are not very reliable
right now.  On the "larger" server, I can force a crash by starting a
migration or a move data process; however, the server has also crashed
when neither of these processes was running.  I have not been able to
correlate the crashes on the "smaller" server with any particular activity.
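
For anyone who wants to check whether their own crashes show the same
trace-back, here is a minimal Python sketch that pulls the function names
out of the ANR9999D lines so two crash logs can be compared while ignoring
the varying thread number and tid.  The helper names (extract_frames,
same_trace) and the regular expression are illustrative assumptions based
only on the message layout shown above; they are not part of ADSM.

```python
import re

# Assumed layout of a frame line, taken from the excerpt above:
#   ANR9999D   0x000444C8  AbortServer
FRAME_RE = re.compile(r"^ANR9999D\s+0x[0-9A-Fa-f]+\s+(\S+)\s*$")

SAMPLE = """\
ANR7821S Thread XX (tid YY) terminating on signal 10 (Bus error).
ANR9999D Trace-back of called functions:
ANR9999D   0x000444C8  AbortServer
ANR9999D   0x0004497C  TrapHandler
ANR9999D   0xDF6E0080  *UNKNOWN*
ANR9999D   0x0003A184  DoBatch
ANR9999D   0x00038570  DiskServerThread
ANR9999D   0x000446B4  StartThread
ANR9999D   0xDF6E25F0  *UNKNOWN*
ANR9999D   0x000445DC  StartThread
"""


def extract_frames(log_text):
    """Return the function names from the trace-back frames, in log order."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(1))
    return frames


def same_trace(log_a, log_b):
    """True when both logs contain the same sequence of function names."""
    return extract_frames(log_a) == extract_frames(log_b)


if __name__ == "__main__":
    for name in extract_frames(SAMPLE):
        print(name)
```

Only the function names are compared, so the ANR7821S line and the header
line are skipped, and differing thread numbers or tids do not affect the
result.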

Another problem I've been seeing is that the system does not seem to
recognize end-of-tape properly.  ADSM sees a write error at EOT and marks
the volume read-only.  I don't know if this is related or not.

Can anyone give me a clue as to what's going on here, or at least tell me
how to get ADSM to provide more information?  Thanks in advance.


Tony Brancato
University of Rochester - Laboratory for Laser Energetics
brancato AT lle.rochester DOT edu