HACMP 5.2 AIX 5.3

dalami

ADSM.ORG Member
Hi all,

My HACMP cluster consists of two p550 servers. I have created a logical disk on a DS4300 Turbo for a concurrent VG that serves as the heartbeat device.

Today I have this problem:

On node 1 (errpt):
LABEL: OPMSG
IDENTIFIER: AA8AB241
Date/Time: Wed Jun 20 06:43:44 CUT 2007
Sequence Number: 2392
Machine Id: 00C0BE7A4C00
Node Id: SRVRADEEF1
Class: O
Type: TEMP
Resource Name: OPERATOR
Description
OPERATOR NOTIFICATION
User Causes
ERRLOGGER COMMAND
Recommended Actions
REVIEW DETAILED DATA
Detail Data
MESSAGE FROM ERRLOGGER COMMAND
clexit.rc : Unexpected termination of clstrmgrES
--------------------------------------------------------------------------
LABEL: SRC_SVKO
IDENTIFIER: BC3BE5A3

Date/Time: Wed Jun 20 06:43:44 CUT 2007
Sequence Number: 2391
Machine Id: 00C0BE7A4C00
Node Id: SRVRADEEF1
Class: S
Type: PERM
Resource Name: SRC

Description
SOFTWARE PROGRAM ERROR

Probable Causes
APPLICATION PROGRAM

Failure Causes
SOFTWARE PROGRAM

Recommended Actions
MANUALLY RESTART SUBSYSTEM IF NEEDED

Detail Data
SYMPTOM CODE
720907
SOFTWARE ERROR CODE
-9017
ERROR CODE
0
DETECTING MODULE
'srchevn.c'@line:'350'
FAILING MODULE
clstrmgrES

On node 2 (errpt):

LABEL: TS_NIM_ERROR_STUCK_
IDENTIFIER: 864D2CE3

Date/Time: Wed Jun 20 06:44:31 CUT 2007
Sequence Number: 1917
Machine Id: 00C0BE6A4C00
Node Id: SRVRADEEF2
Class: S
Type: PERM
Resource Name: topsvcs

Description
NIM thread blocked

Probable Causes
A thread in a Topology Services Network Interface Module (NIM) process
was blocked
Topology Services NIM process cannot get timely access to CPU

User Causes
Excessive memory consumption is causing high memory contention
Excessive disk I/O is causing high memory contention

Recommended Actions
Examine I/O and memory activity on the system
Reduce load on the system
Tune virtual memory parameters
Call IBM Service if problem persists

Failure Causes
Excessive virtual memory activity prevents NIM from making progress
Excessive disk I/O traffic is interfering with paging I/O

Recommended Actions
Examine I/O and memory activity on the system
Reduce load on the system
Tune virtual memory parameters
Call IBM Service if problem persists

Detail Data
DETECTING MODULE
rsct,nim_control.C,1.39.1.1,5459
ERROR ID
6XnGH40DnAS4/v6s00..e.1...................
REFERENCE CODE

Thread which was blocked
send thread
Interval in seconds during which process was blocked
26
Interface name
rhdisk5



Thanks in advance
 

heada

ADSM.ORG Moderator
It looks like the server either lost its connection to the disk or was too busy to keep the heartbeat connection alive. Once the heartbeat was lost, the topology service died, which caused the cluster service to shut down.

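You should be able to restart the cluster service and the topology service. I would then find out whether the server was too busy or whether there was some other error. If it was too busy, you can relax the heartbeat failure detection setting (that is, allow a longer interval before a missed heartbeat counts as a failure) so that a short stall does not take the cluster down.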
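To check whether the two nodes' errors line up in time, it can help to pull the key fields out of the detailed errpt output (for example, saved with `errpt -a > file`). The following is only a rough sketch in Python, assuming the text layout shown in the post above; the names `parse_errpt`, `entry_time`, and `blocked_seconds` are made up for illustration, and `SAMPLE` is a trimmed copy of two entries from this thread.

```python
import re
from datetime import datetime

# Trimmed copy of two entries from the post above (not a full errpt dump).
SAMPLE = """\
LABEL: SRC_SVKO
IDENTIFIER: BC3BE5A3
Date/Time: Wed Jun 20 06:43:44 CUT 2007
Node Id: SRVRADEEF1
Type: PERM
Resource Name: SRC
--------------------------------------------------------------------------
LABEL: TS_NIM_ERROR_STUCK_
IDENTIFIER: 864D2CE3
Date/Time: Wed Jun 20 06:44:31 CUT 2007
Node Id: SRVRADEEF2
Type: PERM
Resource Name: topsvcs
Detail Data
Interval in seconds during which process was blocked
26
Interface name
rhdisk5
"""

# Header fields of the form "Name: value" that this sketch keeps.
FIELD_RE = re.compile(
    r"^(LABEL|IDENTIFIER|Date/Time|Node Id|Type|Resource Name):\s*(.*)$"
)


def parse_errpt(text):
    """Split detailed errpt text into one dict per entry (starts at LABEL:)."""
    lines = [ln.strip() for ln in text.splitlines()]
    entries = []
    current = None
    for i, line in enumerate(lines):
        if line.startswith("LABEL:"):
            current = {}
            entries.append(current)
        if current is None:
            continue  # ignore anything before the first entry
        match = FIELD_RE.match(line)
        if match:
            current[match.group(1)] = match.group(2).strip()
        elif line.startswith("Interval in seconds"):
            # In the layout above, the value sits on the next non-empty line.
            for nxt in lines[i + 1:]:
                if nxt:
                    if nxt.isdigit():
                        current["blocked_seconds"] = int(nxt)
                    break
    return entries


def entry_time(entry):
    """Parse the Date/Time field, ignoring the timezone token (e.g. CUT)."""
    parts = entry["Date/Time"].split()
    del parts[4]  # drop the timezone name, which strptime may not recognise
    return datetime.strptime(" ".join(parts), "%a %b %d %H:%M:%S %Y")


if __name__ == "__main__":
    for e in parse_errpt(SAMPLE):
        print(entry_time(e), e["Node Id"], e["LABEL"], e.get("blocked_seconds"))
```

Printing the entries side by side makes it easier to compare the time of the blocked NIM thread on one node with the clstrmgrES termination on the other, and then to look at I/O and memory activity around those times.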

-Aaron (been WAY too long since I've worked with HACMP)
 
