HACMP 5.2 on AIX 5.3

dalami

Hi all,

My HA cluster consists of two p550 servers. I created a logical drive on a DS4300 Turbo for a concurrent VG that serves as the heartbeat device.

Today I ran into this problem:

On node 1 (errpt):
LABEL: OPMSG
IDENTIFIER: AA8AB241
Date/Time: Wed Jun 20 06:43:44 CUT 2007
Sequence Number: 2392
Machine Id: 00C0BE7A4C00
Node Id: SRVRADEEF1
Class: O
Type: TEMP
Resource Name: OPERATOR
Description
OPERATOR NOTIFICATION
User Causes
ERRLOGGER COMMAND
Recommended Actions
REVIEW DETAILED DATA
Detail Data
MESSAGE FROM ERRLOGGER COMMAND
clexit.rc : Unexpected termination of clstrmgrES
--------------------------------------------------------------------------
LABEL: SRC_SVKO
IDENTIFIER: BC3BE5A3

Date/Time: Wed Jun 20 06:43:44 CUT 2007
Sequence Number: 2391
Machine Id: 00C0BE7A4C00
Node Id: SRVRADEEF1
Class: S
Type: PERM
Resource Name: SRC

Description
SOFTWARE PROGRAM ERROR

Probable Causes
APPLICATION PROGRAM

Failure Causes
SOFTWARE PROGRAM

Recommended Actions
MANUALLY RESTART SUBSYSTEM IF NEEDED

Detail Data
SYMPTOM CODE
720907
SOFTWARE ERROR CODE
-9017
ERROR CODE
0
DETECTING MODULE
'srchevn.c'@line:'350'
FAILING MODULE
clstrmgrES

On node 2 (errpt):

LABEL: TS_NIM_ERROR_STUCK_
IDENTIFIER: 864D2CE3

Date/Time: Wed Jun 20 06:44:31 CUT 2007
Sequence Number: 1917
Machine Id: 00C0BE6A4C00
Node Id: SRVRADEEF2
Class: S
Type: PERM
Resource Name: topsvcs

Description
NIM thread blocked

Probable Causes
A thread in a Topology Services Network Interface Module (NIM) process
was blocked
Topology Services NIM process cannot get timely access to CPU

User Causes
Excessive memory consumption is causing high memory contention
Excessive disk I/O is causing high memory contention

Recommended Actions
Examine I/O and memory activity on the system
Reduce load on the system
Tune virtual memory parameters
Call IBM Service if problem persists

Failure Causes
Excessive virtual memory activity prevents NIM from making progress
Excessive disk I/O traffic is interfering with paging I/O

Recommended Actions
Examine I/O and memory activity on the system
Reduce load on the system
Tune virtual memory parameters
Call IBM Service if problem persists

Detail Data
DETECTING MODULE
rsct,nim_control.C,1.39.1.1,5459
ERROR ID
6XnGH40DnAS4/v6s00..e.1...................
REFERENCE CODE

Thread which was blocked
send thread
Interval in seconds during which process was blocked
26
Interface name
rhdisk5



Thanks in advance
 
It looks like the server either lost its connection to the heartbeat disk or was too busy to keep the heartbeat alive (the node 2 entry shows the Topology Services send thread on rhdisk5 blocked for 26 seconds). Once the heartbeat was lost, the topology service died, which caused the cluster service to shut down.

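You should be able to restart the cluster service and the topology service. I would first find out whether the server was too busy or whether there was some other error. If it was too busy, you can make the heartbeat more tolerant by lengthening the heartbeat timeout (a slower failure detection rate), so that a short stall does not bring the cluster services down.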
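
To work out which node hit trouble first, it can help to put the errpt entries from both nodes on one timeline. Below is a minimal Python sketch of that idea, assuming the `errpt -a` text from each node has been saved as a string. The function names (`parse_errpt`, `entry_time`, `timeline`) are made up for illustration, and the comparison only means something if the clocks on both nodes are in sync.

```python
from datetime import datetime

# Fields worth keeping for a timeline; the rest of each entry is ignored.
FIELDS = ("LABEL", "Date/Time", "Node Id", "Resource Name")


def parse_errpt(text):
    """Split `errpt -a` style text into one dict per entry."""
    entries, current = [], None
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        key = key.strip()
        if not sep or key not in FIELDS:
            continue
        if key == "LABEL":  # every entry starts with its LABEL line
            current = {}
            entries.append(current)
        if current is not None:
            current[key] = value.strip()
    return entries


def entry_time(entry):
    """Parse 'Wed Jun 20 06:43:44 CUT 2007', dropping the time zone token."""
    parts = entry["Date/Time"].split()
    stamp = " ".join(parts[:4] + parts[5:])
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


def timeline(*reports):
    """Merge entries from several nodes and sort them by timestamp."""
    merged = [e for text in reports for e in parse_errpt(text)]
    return sorted(merged, key=entry_time)


if __name__ == "__main__":
    # Shortened copies of the entries posted above.
    node1 = """LABEL: SRC_SVKO
Date/Time: Wed Jun 20 06:43:44 CUT 2007
Node Id: SRVRADEEF1
Resource Name: SRC
"""
    node2 = """LABEL: TS_NIM_ERROR_STUCK_
Date/Time: Wed Jun 20 06:44:31 CUT 2007
Node Id: SRVRADEEF2
Resource Name: topsvcs
"""
    for e in timeline(node1, node2):
        print(entry_time(e), e["Node Id"], e["LABEL"], e["Resource Name"])
```

Comparing the sorted entries with I/O and memory statistics from the same time window should show whether the node was simply overloaded or whether something else went wrong first.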

-Aaron (been WAY too long since I've worked with HACMP)
 