Suddenly Stopping dsmserv instance tsminst1

dyna

ADSM.ORG Member
Joined
Mar 17, 2006
Messages
46
Reaction score
3
Points
0
Website
Visit site
Since the beginning of this year, occasionally two Sp servers are being stopped.

Stopped, not crashed (so no coredumps or messages in the actlog).
Nothing in db2diag.log or dsmffdc.log.

just messages stating:

Apr 2 21:26:01 grnap250n02 systemd: Started Session 19097 of user root.
Apr 2 21:27:01 grnap250n02 systemd: Started Session 19098 of user root.
Apr 2 21:27:05 grnap250n02 tsminst1: Stopping dsmserv instance tsminst1 ...
Apr 2 21:27:05 grnap250n02 tsminst1: /etc/rc.d/init.d/tsminst1: line 174: kill: (3234) - No such process
Apr 2 21:27:05 grnap250n02 tsminst1: dsmserv instance tsminst1 stopped Successfully
Apr 2 21:28:01 grnap250n02 systemd: Started Session 19099 of user root.
Apr 2 21:28:05 grnap250n02 systemd: Removed slice User Slice of tsminst1.
Apr 2 21:29:01 grnap250n02 systemd: Started Session 19100 of user root.
Apr 2 21:30:01 grnap250n02 systemd: Started Session 19102 of user root.

to me this looks like an external process doing a sytemctl stop tsminst.service , but that would not give the kill message.

Does anyone has an idea what may be the cause of this?

Redhat 7.9, SP 8.1.11
 
I can't help you with why the instance is stopped. But the kill command comes from the script that's invoked when doing sytemctl stop tsminst.service. If you read /etc/rc.d/init.d/tsminst1 and go to line 174, you will see it is using the kill command to stop the server, but it looks like it was already stopped.
 
Is it possible db2 manager or servermon triggers this ?
 
Is it possible db2 manager or servermon triggers this ?
I can't see it. Both DB2 and Servermon are started by Spectrum Protect, not the other way around.
 
Did you check the activity log at Apr 2 21:27:05 to see if someone issued the HALT command?
Check the dsmffdc.log in the instance directory at that same time to see if there was an issue with the server.
You could check the db2diag.log as well at that time to be safe.
 
actlog shows nothing:
------------------------
04/02/2021 21:23:47 ANR0403I Session 320168 ended for node GRNAP1112 (Linux
x86-64). (SESSION: 320168)
04/03/2021 09:19:01 ANR7814I Using instance directory
/home/tsminst1/tsminst1.

dsmffdc.log shows:
---------------------
[04-02-2021 21:23:40.687][ FFDC_COMMUNICATIONS ]: (utcpcomm.c:417) TIMEOUT on request to connect to address 10.145.4.7, port 1503
[04-02-2021 21:23:51.032][ FFDC_GENERAL_SERVER_ERROR ]: (admnode.c:3957) csGetResults... rc=2351
[04-02-2021 21:23:51.035][ FFDC_GENERAL_SERVER_ERROR ]: (admnode.c:4196) csCheckIfNodeScheduled rc=2351
[04-02-2021 21:23:51.038][ FFDC_GENERAL_SERVER_ERROR ]: (sddelete.c:6471) Failure 2351 queuing refCount update request for object 7305853591 in pool 4
[04-02-2021 21:23:51.086][ FFDC_GENERAL_SERVER_ERROR ]: (bfcreate.c:3615) Error 9973 encountered while attempting to delete object 7305853591.
[04-02-2021 21:23:51.129][ FFDC_GENERAL_SERVER_ERROR ]: (admnode.c:4392) Error getting last replication server name rc=1012
[04-02-2021 21:23:51.359][ FFDC_GENERAL_SERVER_ERROR ]: (pvrclass.c:13857) ssOpenQueryPool error, rc: 9976.
[04-02-2021 21:23:51.359][ FFDC_GENERAL_SERVER_ERROR ]: (pvrclass.c:1305) Error 9976 building dcList
[04-02-2021 21:23:51.360][ FFDC_GENERAL_SERVER_ERROR ]: (pvrclass.c:1700) Unable to open device class query, rc: 9976.
[04-02-2021 21:23:51.360][ FFDC_GENERAL_SERVER_ERROR ]: (pvrclass.c:1851) Unable to open volume query for devclass DISK rcTemp=9976.
[04-02-2021 21:23:51.361][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:12204) bfOpenQueryPool error, rc: 9979.
[04-02-2021 21:23:51.362][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:11721) Status monitor thread for Group 1 grids ended.
[04-02-2021 21:23:51.631][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:22252) Info, status monitor thread ended for gridIndex=9 current total threads=12.
[04-02-2021 21:23:52.937][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:22252) Info, status monitor thread ended for gridIndex=20 current total threads=11.
[04-02-2021 21:23:54.620][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:13659) Error 1116 pushing status
[04-02-2021 21:24:12.911][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:22252) Info, status monitor thread ended for gridIndex=5 current total threads=10.
[04-02-2021 21:24:14.329][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:22252) Info, status monitor thread ended for gridIndex=12 current total threads=9.
[04-02-2021 21:24:25.425][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:22252) Info, status monitor thread ended for gridIndex=10 current total threads=8.
[04-02-2021 21:26:14.911][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:22252) Info, status monitor thread ended for gridIndex=14 current total threads=7.
[04-02-2021 21:26:39.274][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:22252) Info, status monitor thread ended for gridIndex=17 current total threads=6.
[04-02-2021 21:26:52.655][ FFDC_GENERAL_SERVER_ERROR ]: (monstats.c:22252) Info, status monitor thread ended for gridIndex=7 current total threads=5.
[04-03-2021 09:20:59.416][ FFDC_GENERAL_SERVER_ERROR ]: (srvmon.c:3236) Error 27 run command INSTR END

db2diag.log shows:
---------------------

2021-04-02-21.23.55.897839+120 E14615066E526 LEVEL: Warning
PID : 4057 TID : 140317974062848 PROC : db2sysc 0
INSTANCE: tsminst1 NODE : 000 DB : TSMDB1
APPHDL : 0-11194 APPID: *LOCAL.tsminst1.210402204332
AUTHID : TSMINST1 HOSTNAME: grnap250n02
EDUID : 50353 EDUNAME: db2agent (TSMDB1) 0
FUNCTION: DB2 UDB, base sys utilities, sqleDatabaseUnquiesce, probe:1
MESSAGE : ADM7510W Database unquiesce has been requested.

2021-04-02-21.23.56.048987+120 E14615593E558 LEVEL: Error
PID : 4057 TID : 140317974062848 PROC : db2sysc 0
INSTANCE: tsminst1 NODE : 000 DB : TSMDB1
APPHDL : 0-11194 APPID: *LOCAL.tsminst1.210402204332
AUTHID : TSMINST1 HOSTNAME: grnap250n02
EDUID : 50353 EDUNAME: db2agent (TSMDB1) 0
FUNCTION: DB2 UDB, base sys utilities, sqleDatabaseUnquiesce, probe:2
MESSAGE : ADM7511E Database unquiesce request has failed. The SQLCODE is
"1373".

2021-04-02-21.23.56.049402+120 E14616152E588 LEVEL: Warning
PID : 4057 TID : 140317974062848 PROC : db2sysc 0
INSTANCE: tsminst1 NODE : 000 DB : TSMDB1
APPHDL : 0-11194 APPID: *LOCAL.tsminst1.210402204332
AUTHID : TSMINST1 HOSTNAME: grnap250n02
EDUID : 50353 EDUNAME: db2agent (TSMDB1) 0
FUNCTION: DB2 UDB, base sys utilities, sqleDatabaseQuiesce, probe:1
MESSAGE : ADM7506W A request was made to quiesce a database with the following
quiesce mode: "QUIESCE DATABASE"

2021-04-02-21.23.56.097801+120 E14616741E1129 LEVEL: Error
PID : 4057 TID : 140306682996480 PROC : db2sysc 0
INSTANCE: tsminst1 NODE : 000 DB : TSMDB1
APPHDL : 0-950 APPID: *LOCAL.tsminst1.210402200449
AUTHID : TSMINST1 HOSTNAME: grnap250n02
EDUID : 51755 EDUNAME: db2agent (TSMDB1) 0
FUNCTION: DB2 UDB, relation data serv, sqlrr_rds_common_post, probe:870
MESSAGE : ZRC=0x8012006D=-2146303891=SQLR_CA_BUILT
"SQLCA has already been built"
DATA #1 : String, 28 bytes
UCstate terminate bit is set
DATA #2 : Hexdump, 4 bytes
0x00007F9B21C64E90 : 4001 0000 @...
DATA #3 : SQLCA, PD_DB2_TYPE_SQLCA, 136 bytes
sqlcaid : SQLCA sqlcabc: 136 sqlcode: -1224 sqlerrml: 0
sqlerrmc:
sqlerrp : SQLRRRCP
sqlerrd : (1) 0x80100003 (2) 0x00000003 (3) 0x00000000
(4) 0x00000000 (5) 0xFFFFFE0C (6) 0x00000000
sqlwarn : (1) (2) (3) (4) (5) (6)
(7) (8) (9) (10) (11)
sqlstate:

....................


2021-04-02-21.27.03.751253+120 I15623375E334 LEVEL: Event
PID : 18121 TID : 140737352263552 PROC : db2stop
INSTANCE: tsminst1 NODE : 000
HOSTNAME: grnap250n02
FUNCTION: DB2 UDB, base sys utilities, sqleIssueStartStop, probe:12166
DATA #1 : <preformatted>
Single node stopped [-1064].

2021-04-02-21.27.03.751567+120 I15623710E391 LEVEL: Warning
PID : 18121 TID : 140737352263552 PROC : db2stop
INSTANCE: tsminst1 NODE : 000
HOSTNAME: grnap250n02
FUNCTION: DB2 UDB, base sys utilities, sqleReleaseStStLockFile, probe:16076
MESSAGE : Released lock on the file:
DATA #1 : String, 39 bytes
/home/tsminst1/sqllib/ctrl/db2strst.lck
 
Back
Top