2008 / 2008 R2 Failover Cluster - Complete Cluster Environment Recovery

domw001

Has anyone managed to completely recover a 2008 / 2008 R2 Failover Cluster?

I'm talking about a whole-environment recovery, as in a truck has just crashed through your computer room wall and taken out your cluster nodes (it hasn't happened to me personally but, hey, you never know).

A lot of docs focus on recovery of a single node, which is fine, but I am having trouble recovering a 2-node cluster.
My process is as follows:

- On new node hardware, install vanilla operating system, service pack, etc.
- Create any local storage
- Install correct TSM BA Client (6.2.3.1)
- When all OK, restore system drive (C:\) and local storage (D:\)
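
For reference, the file-level restore step could be scripted roughly like this. It is only a sketch: the `-subdir=yes` and `-replace=all` options and the drive list are assumptions about how the restore is invoked, and `build_restore_commands` is a hypothetical helper name. It only builds the command lines so they can be reviewed before anything is run.

```python
def build_restore_commands(drives=("C:", "D:")):
    """Return dsmc command lines for the file-level restore, one per drive."""
    commands = []
    for drive in drives:
        # Assumed options: recurse into subdirectories and overwrite existing files.
        commands.append(
            ["dsmc", "restore", drive + "\\*", "-subdir=yes", "-replace=all"]
        )
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in build_restore_commands():
        print(" ".join(cmd))
```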

So far, so good.....

The problem comes with the whole SystemState restore in that it won’t proceed unless the Cluster Service has started.

I have not joined my vanilla nodes to the domain or installed Failover Cluster (I never needed to do this on 2000/2003 clusters).
I can restore SystemState Bootable to get the OS back, but the cluster service will not start because the CLUSDB file (part of SystemState Cluster Database) has not been restored and is therefore not present.

Does anyone have any experience of this, and is there a workaround?

Many Thanx

P.S. The only workaround I have is to shut down the Cluster Service on the inactive node, copy the CLUSDB file to an empty C:\ drive folder and restart the cluster service. This file can be successfully backed up during the data backup and restored during the data restore. This can then be used once the SystemState bootable restore has completed.

The only problem with this solution is that I don’t want to take my cluster service offline especially if it’s an Active-Active cluster.
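
The workaround above can be sketched as a small script. Treat it as an illustration only: `save_clusdb_copy` and `dest_dir` are hypothetical names, and it assumes the Cluster Service is registered under its default short name `clussvc` and that CLUSDB lives in the `Cluster` folder under the Windows directory. The `run` parameter exists so the service commands can be swapped out for a dry run.

```python
import os
import shutil
import subprocess


def save_clusdb_copy(dest_dir, system_root=None, run=subprocess.run):
    """Stop the Cluster Service, copy CLUSDB to dest_dir, restart the service.

    Returns the path of the copied file.
    """
    if system_root is None:
        # Assumption: fall back to the usual Windows directory.
        system_root = os.environ.get("SystemRoot", r"C:\Windows")
    source = os.path.join(system_root, "Cluster", "CLUSDB")
    os.makedirs(dest_dir, exist_ok=True)

    # Per the workaround, the service is stopped before the file is copied.
    run(["net", "stop", "clussvc"], check=True)
    try:
        return shutil.copy2(source, dest_dir)
    finally:
        # Always try to bring the service back, even if the copy failed.
        run(["net", "start", "clussvc"], check=True)
```

The copy in `dest_dir` is then picked up by the normal data backup, which is the whole point of the workaround. It does nothing about the outage itself: the service on that node is still down for the duration of the copy.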
 
Hi
If you are performing a non-authoritative restore, rejoin the node to the cluster. If you are performing an authoritative restore, run the command 'restore systemstate clusterdb' after restoring the system state and drives.
Efim
 
Unfortunately it's not possible.


non authoritative restore - rejoin node to the cluster

There is no cluster; it's sitting under the front wheels of a big rig and has gone to motherboard heaven.

I am talking about a complete cluster recovery…. fresh node hardware, fresh storage, the works


authoritative restore - use command 'restore systemstate clusterdb' after restore systemstate and drives

'restore systemstate clusterdb' will only work if the cluster service has started; otherwise you get the message:

ANS5211E The cluster service is offline. The cluster service must be online to perform an authoritative cluster database restore operation.


For the cluster service to come online, the CLUSDB file must be present in the Windows\Cluster directory.
Unfortunately, CLUSDB is contained in systemstate clusterdb, so each step is waiting on the other.
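
To make the circular dependency concrete, here is a rough pre-flight check. It is a sketch under assumptions: `clusterdb_restore_blockers` is a hypothetical helper, the caller is expected to find out for itself whether the service is running, and the CLUSDB location is taken from the description above.

```python
import os


def clusterdb_restore_blockers(system_root, service_running):
    """List reasons an authoritative cluster database restore cannot start yet."""
    blockers = []
    clusdb = os.path.join(system_root, "Cluster", "CLUSDB")
    if not os.path.isfile(clusdb):
        # Without this file the Cluster Service will not come online.
        blockers.append("CLUSDB missing from " + os.path.dirname(clusdb))
    if not service_running:
        # This is the condition dsmc reports as ANS5211E.
        blockers.append("Cluster Service offline")
    return blockers
```

On a freshly rebuilt node both blockers are present at once, and clearing the second requires clearing the first, which is exactly what the clusterdb restore was supposed to do.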
 
Yep tried that.

I had to use dsmc restore systemstate bootable to recover the OS and log in to the Domain.

When I try and run dsmc restore systemstate clusterdb afterwards I get...

dsmc restore systemstate clusterdb


IBM Tivoli Storage Manager
Command Line Backup-Archive Client Interface
Client Version 6, Release 2, Level 3.1
Client date/time: 07/06/2012 13:19:26
(c) Copyright by IBM Corporation and other(s) 1990, 2011. All Rights Reserved.

Node Name: X1201SQLNODE12
Session established with server L01: AIX-RS/6000
Server Version 5, Release 5, Level 6.0
Server date/time: 07/06/2012 13:18:48 Last access: 07/05/2012 14:56:46

Restore System State using shadow copy...
ANS1676W You are doing an authoritative cluster database restore. The process
may seem to be hang before and after the file is restored. This is because it
may need to start the cluster service if it is not up and take all the resources
offline. After the cluster database is restored, the cluster service will be
restarted for changes to be in effect. The cluster service on all other nodes
also have been shutdown. They will be restarted. This may take a few minutes.


ANS5211E The cluster service is offline. The cluster service must be online to
perform an authoritative cluster database restore operation.
 