HSM 6.1 segfault on Linux

mabugaev

Newcomer
Joined
Dec 8, 2009
Hello everyone.

I'm having issues with the TSM client, specifically the HSM portion of it;
backup is working fine.
We recently replaced this client completely with a more powerful machine
(16GB RAM, dual quad-core), changed the OS (from AIX to CentOS) and
switched to a newer version of TSM, 6.1. It has a fairly big filesystem (32TB)
attached to it, which we would like to be "space managed".
The problem is that after adding the filesystem to HSM, dsmscoutd starts crashing
every 30 seconds (it repeats mainly because dsmwatchd keeps restarting it). It puts
the following in the OS log over and over:
kernel: dsmscoutd[6364]: segfault at 0000000000000010 rip 000000000821e7aa
rsp 00000000f6590f00 error 4
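
For readers trying to interpret that kernel line, here is a minimal Python sketch that decodes it. It assumes the standard x86 page-fault error-code bits (bit 0 = page present, bit 1 = write access, bit 2 = user mode) and that the kernel prints the error code in hexadecimal; the function name decode_segfault is made up for this illustration and is not part of TSM.

```python
import re

# Pattern for the kernel segfault line format quoted in the post.
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)

def decode_segfault(line):
    """Hypothetical helper: split a segfault log line into readable fields."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # assumption: error code is printed in hex
    return {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "instruction_pointer": int(m.group("rip"), 16),
        "page_present": bool(err & 1),            # bit 0: 0 = page not mapped
        "access": "write" if err & 2 else "read",  # bit 1
        "mode": "user" if err & 4 else "kernel",   # bit 2
    }

# The log line from the post, joined back onto one line.
line = ("kernel: dsmscoutd[6364]: segfault at 0000000000000010 "
        "rip 000000000821e7aa rsp 00000000f6590f00 error 4")
info = decode_segfault(line)
print(info)
```

Under those assumptions, "error 4" reads as a user-mode read of an unmapped page, and the fault address 0x10 is only 16 bytes above zero. That would be consistent with dsmscoutd reading a field through a null pointer, although the log line alone cannot confirm it.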

In dsmerror.log there is a whole bunch of "catched exception" messages,
starting with:
ANS9577E An exception "(HashEntryFile::ReadHashFileHeader): File was not
savely written to disk! Has to be redone!"!
Unable to use meta file!

and then dozens of messages similar to:
HashController.cpp ( 426): (HashController::insertEntry): Catched exception.
HashController.cpp ( 332): (HashController::insertFileEntry): Catched
exception.
ScannerThread.cpp ( 374): (ScannerThread::ThreadFunc): Entry for file:
(quotas) not entered in hashtable

The only difference between them is the name in the "Entry for file: ()" line.
(If needed, I can provide the full list.)
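
In case it helps anyone comparing notes, the list of affected names can be pulled out of dsmerror.log with a few lines of Python. This is only a sketch: the regular expression is inferred from the messages quoted above, the function name unentered_files is invented for the example, and it assumes file names do not contain a closing parenthesis.

```python
import re
from collections import Counter

# Inferred from the quoted messages; \s* tolerates the line wrap seen in the post.
ENTRY_RE = re.compile(
    r"Entry for file:\s*\((?P<name>[^)]*)\)\s*not entered in hashtable"
)

def unentered_files(log_text):
    """Hypothetical helper: count how often each file name is reported."""
    return Counter(m.group("name") for m in ENTRY_RE.finditer(log_text))

# Sample text copied from the messages in the post.
sample = """\
HashController.cpp ( 426): (HashController::insertEntry): Catched exception.
HashController.cpp ( 332): (HashController::insertFileEntry): Catched
exception.
ScannerThread.cpp ( 374): (ScannerThread::ThreadFunc): Entry for file:
(quotas) not entered in hashtable
"""
counts = unentered_files(sample)
for name, n in counts.most_common():
    print(name, n)
```

To run it against a real log, pass the contents of dsmerror.log to unentered_files instead of the sample string.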

What I tried and looked at:
- the configuration was taken from a working instance under the older setup, and
the parameters seem to be OK
- there is enough free RAM and disk space
- tweaked some kernel parameters (some limits), without any apparent effect
- uninstalled the HSM RPM and reinstalled it with the latest fix pack
- the only reference I found to the ANS9577E message was about
TSM 5.4 (not 6.1) and a filesystem being too big (more than 5TB), but it said
the issue was resolved (nothing about the new size limit, though).

I was wondering whether anyone has seen anything similar and knows what is causing it and how to fix it.
I would appreciate any pointers or suggestions.

Thanks in advance,
Michael
 
Quick update for anyone interested:
IBM's TSM support notified us about the latest update of the TSM client software, 6.1.3,
which fixed the problem.
Unfortunately, no details were provided about what was causing it or how it was fixed.

Michael
 