ADSM-L

TSM Cluster Resource won't come online. - Fix

2005-01-20 18:56:44
Subject: TSM Cluster Resource won't come online. - Fix
From: TSM_User <tsm_user AT YAHOO DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 20 Jan 2005 15:56:29 -0800
I thought I saw a question or comment in the past that suggested not using the 
feature to replicate the TSM password registry key when configuring TSM on a 
MSCS (Microsoft Cluster Server).  If you don't, you are forced to have a static 
password for a node, which in many cases is not desirable.  We ran through a 
number of tests and found out what can cause problems and how to resolve them.  
My apologies if this has been explained before.

Situation:
The cluster resources won't come online.  You check the dsmsched.log and find 
that the password needs to be set.  You set the password.  You can start the 
service using the local control panel without issue.  However, when you start 
the cluster resource for the service, it fails.

Problem:
When you add the TSM password key to be replicated, a *.cpt file is created on 
the quorum drive.  This file is used to replicate that key from one side of the 
cluster to the other.  What has happened is that the local registry and this 
file are now out of sync.

Fix:
Right click the cluster resource and go to properties. Select the "Registry 
Replication" tab.  Click Modify so you can copy the key information.  Then 
remove the key and select OK.  This will delete the *.cpt file from the quorum 
drive.

Next you need to set the password correctly.  Sign on with the client GUI, or 
use the command line and run q sess.  Any operation that contacts the server 
will do, as long as it lets you set the password correctly.

Now go back to the cluster resource and add the key replication back.  Once you 
are done, a new *.cpt file will be created, which will again be in sync with 
the registry.
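
For anyone who would rather script the three steps above than click through 
Cluster Administrator, here is a rough sketch.  It only builds the command 
lines and prints them; it does not run anything.  It assumes the cluster.exe 
checkpoint switches (/removecheckpoints and /addcheckpoints) behave the same 
way as the Registry Replication tab, which was not tested in this post.  The 
resource name and the registry key path are hypothetical placeholders; copy 
the real key from the Registry Replication tab before removing it.

```python
# Sketch only: builds the fix sequence as command strings, executes nothing.
# The resource name and key path used below are hypothetical placeholders.

def fix_commands(resource, key):
    """Return the fix steps, in order, as cluster.exe / dsmc command lines."""
    return [
        # 1. Remove the replicated key (the post reports that this deletes
        #    the *.cpt file from the quorum drive).
        f'cluster resource "{resource}" /removecheckpoints:"{key}"',
        # 2. Any client operation that contacts the server will prompt for
        #    the password; "q sess" is short for "query session".
        "dsmc query session",
        # 3. Add the key back so a new *.cpt file is created from the
        #    now-correct local registry.
        f'cluster resource "{resource}" /addcheckpoints:"{key}"',
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example_key = r"SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Nodes\NODE1"
    for cmd in fix_commands("TSM Scheduler", example_key):
        print(cmd)
```

Run the printed commands on the node that currently owns the resource.  On a 
cluster you will probably need to point dsmc at the option file used by the 
clustered scheduler so that the password is stored for the right node name.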

Note that you do not have to fail over to any other nodes in the cluster.  
So long as you have it fixed on the node that is active, the *.cpt file will 
ensure the other nodes in the cluster have their registries updated if a 
failover happens.

How to break it to prove these steps:
If you want, you can simply update the node's password on the TSM server.  Then 
bring the clustered resources offline.  Then try to bring them back online.  
You will now have a situation where, even if you get the password correctly set 
in the local registry, the cluster resources will not come back online.  Again, 
this is assuming you had the registry key replication set up in the first place.
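
The break test can be written down the same way.  Again this is only a sketch 
that prints command lines, on the assumption that you change the password with 
the administrative client (dsmadmc, UPDATE NODE) and cycle the resource with 
cluster.exe.  Every argument value is a placeholder you would have to supply.

```python
# Sketch only: builds the "break it" sequence as command strings.
# All argument values are placeholders supplied by the caller.

def break_commands(admin_id, admin_pw, node, new_pw, resource):
    """Return the steps that reproduce the out-of-sync condition, in order."""
    return [
        # 1. Change the node's password on the TSM server.
        f'dsmadmc -id={admin_id} -password={admin_pw} '
        f'"update node {node} {new_pw}"',
        # 2. Bring the clustered resource offline.
        f'cluster resource "{resource}" /offline',
        # 3. Try to bring it back online; per the post, this is where it fails.
        f'cluster resource "{resource}" /online',
    ]


if __name__ == "__main__":
    for cmd in break_commands("ADMIN", "ADMINPW", "NODE1", "NEWPW",
                              "TSM Scheduler"):
        print(cmd)
```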


                