Move Spectrum Protect 8.1.12 instance to new hardware

jabuzzard

ADSM.ORG Member
Joined
Dec 12, 2008
Messages
11
I am trying to move a TSM instance to new hardware. The old hardware was running RHEL 7.9 and the new is running RHEL 8.4. The old server is getting on in life (it's circa 12 years old), and while it is currently fine, two thirds of the servers of the same model we had are now dead; the display goes and just shows multicoloured snow. So new hardware is called for.

The data is on a bunch of 24-drive 4U disk shelves and the server does software RAID6 with shelf-level redundancy. There are basically 23 arrays mounted at /backup/disk00, /backup/disk01 and so on (one disk per shelf per array), and I store the data on preallocated sequential file volumes. The last drive in each shelf is a hot spare, and disk22 is used both for backing up the DB etc. and as an NFS share for Spectrum Protect Plus to back up our VMs.
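For context, a preallocated sequential-file layout like this boils down to a FILE device class, a pool with no scratch volumes, and preallocated volumes, roughly like the following from a dsmadmc session (device class, pool and volume names and the sizes are illustrative, not the real definitions):
Code:
define devclass FILEDEV devtype=file maxcapacity=50G mountlimit=64 directory=/backup/disk00,/backup/disk01
define stgpool FILEPOOL FILEDEV maxscratch=0
define volume FILEPOOL /backup/disk00/vol0001.bfs formatsize=51200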

Yesterday I shut down the old server after doing a DB backup, saving the volume history, device configuration etc., then swapped the server out and cabled the disk shelves up to the new server. Full set of new cables too, as we needed new SAS cards for RHEL 8 and they have SFF-8644 ports. I am also doing dm-multipath this time, with dual connections to each shelf.

I can see all the RAID6 arrays, they are correctly assembled through dm-multipath and mounted. I have a 1TB NVMe RAID1 disk for the database up and running. I have installed Spectrum Protect 8.1.12 and set up an instance with the same name.
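These are roughly the checks I used to convince myself everything came across (the array name in the mdadm line is illustrative):
Code:
multipath -ll                   # both SAS paths to each shelf visible
cat /proc/mdstat                # all 23 RAID6 arrays assembled and clean
mdadm --detail /dev/md/disk00   # per-array detail
df -h /backup/disk*             # filesystems mounted at the expected paths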

However I cannot see how one restores the DB into the new instance using DSMSERV RESTORE DB. How does it know where the volumes are? I realize that in 16 years of running TSM servers I have never had to do this post-6.x, and Googling around produces no useful hits. I have a feeling now that one should have created a DRM plan? I guess I could start the instance on the old server and dump one out, even though the disks holding the actual data are no longer attached? I guess I might be able to NFS-export them back to the original server?

I still have access to the old server, though it doesn't have the disks attached anymore.
 
If you plan this, I always recommend running "run prepare" for a DRM plan. If you have the same server version, kept all the same directories, and have the DB backup, volhist and devconfig from the old server, you should be able to restore.

The prepare plan text file then contains the CLI syntax to be used for the restore db command.
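A rough sketch of running it from the admin CLI (admin ID, password and plan prefix are placeholders); the plan file then lands under the prefix you set:
Code:
dsmadmc -id=admin -password=xxxxx "set drmplanprefix /home/tsminst1/drmplan/plan."
dsmadmc -id=admin -password=xxxxx "prepare"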

good luck.
 
Having the DRM plan is very handy. Even if you don't use it to do the full restore.
Having not done what you are attempting to do, I can only guess.
As long as the tsminst user/group are set on the filesystems where /backup/disk00 etc. live, and you have them mounted correctly (i.e. you didn't rename /backup/disk00 to /backup/disk000), you should be in good shape.
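Something like this should confirm it (the instance user name is a placeholder):
Code:
id tsminst1               # instance user/group exist
ls -ld /backup/disk*      # directories owned by the instance user/group
findmnt /backup/disk00    # mount point kept its old name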

Before doing anything, a good read would be:

And

I'm assuming you copied the volhist and devconfig files over, and you don't need to manually edit them (database backup location is the same mount point/filesystem). Note the 2nd link states this:
DBBackup Specifies that the database is restored as follows:
  1. Reads the volume history file to locate the database full and incremental backup volumes that are needed.

Rule 0. Disable all sessions, disable all admin tasks. You don't want data being moved about during this.
Rule 1. Take Database backups. Take a full, and then a snapshot. Always have a database backup.
Rule 2. Run prepare, and save the planfile. Make a copy of it before you start to edit.
Rule 3. Before doing anything else make sure you have a devconfig and volhist.
Rule 4. Since you are swinging storage to a new host, make sure permissions are correct on the TSM file systems. Your backup directories, your log directories.
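Rules 0 through 3 boil down to a handful of admin commands on the old server, roughly like this (the devclass name and file paths are placeholders):
Code:
dsmadmc -id=admin -password=xxxxx "disable sessions"
dsmadmc -id=admin -password=xxxxx "backup db devclass=DBBACKFILE type=full"
dsmadmc -id=admin -password=xxxxx "backup db devclass=DBBACKFILE type=dbsnapshot"
dsmadmc -id=admin -password=xxxxx "backup volhistory filenames=/home/tsminst1/volhist.dat"
dsmadmc -id=admin -password=xxxxx "backup devconfig filenames=/home/tsminst1/devconf.dat"
dsmadmc -id=admin -password=xxxxx "prepare"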

Devconfig will help bring in tape drives, if in use.
Volhist is where all your volume information lies. This is where TSM says your volumes live: on disk, tape or other.

Here's my 'guess':
So, you have the new tsminst up and running, and it stood up a database by default, if I recall correctly. You will need to halt that new instance and remove the 'blank' TSM database with:
Code:
dsmserv removedb TSMDB1
I like to build a file containing the directories for the database to be restored to. In my example, I have /home/tsminst1/dbdirs.txt. As my install is different from yours, adjust as needed to match your system.
Content of dbdirs.txt:
Code:
/tsminst1/tsmdb000
/tsminst1/tsmdb001
/tsminst1/tsmdb002
/tsminst1/tsmdb003
/tsminst1/tsmdb004
/tsminst1/tsmdb005
/tsminst1/tsmdb006
/tsminst1/tsmdb007
/tsminst1/tsmdb008
/tsminst1/tsmdb009
/tsminst1/tsmdb010
/tsminst1/tsmdb011

Edit dsmserv.opt to point to your new log locations for archive/active logs.
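The relevant dsmserv.opt lines look roughly like this (the archive log directory names here just follow my naming pattern; adjust to your layout):
Code:
ACTIVELOGDIRECTORY /tsminst1/tsmactlog000
ARCHLOGDIRECTORY   /tsminst1/tsmarchlog000
ARCHFAILOVERLOGDIRECTORY /tsminst1/tsmfailarch000
ACTIVELOGSIZE      131072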

As the tsminst1 user, I ran this:
Code:
/opt/tivoli/tsm/server/bin/dsmserv -i /home/tsminst1 restore db todate=<date of database backup>  totime=<time of database backup> source=dbb on=/home/tsminst1/dbdirs.txt activelogdir=/tsminst1/tsmactlog000 RESTOREKEYS=YES PASSWORD=<password> prompt=no


todate= is the date I ran the database backup. totime= is the time the actlog shows the database backup finishing.

Hope this helps.
 
I coaxed the old server back into life and did a "run prepare", then examined that. Basically I just needed to copy dsmserv.opt, devconf.dat and volhist.dat into the TSM server's instance directory, and then I was able to restore the DB using

dsmserv -i /opt/tsma restore db todate=today totime=now source=dbb RESTOREKEYS=YES

It chugged away for a few minutes, rather slower than I had hoped, which I now realize was because it was not using /opt as it prepared itself, and then it went "bam" as it actually populated the DB, which is what I expected with /opt being on RAID1 NVMe :) The system disk is not shabby either, 600GB 15k RPM RAID1 as well, but it's nowhere near the performance of the NVMe disk.

There was a slight issue with the security key, but given I only have one client (it backs up a large DSS-G and nothing else) I decided to update the client rather than mess with the server. Kicked off a dsmc incr /gpfs; normally I use mmbackup, but visibility into what it is doing is poor, so I decided a vanilla dsmc, which takes a couple of days, would be best for the first backup on the new system.

All was going swimmingly till Sunday morning, when Linux decided to kick off a consistency check on all 23 of my RAID arrays at once. Performance predictably cratered :eek: On the old system I had disabled that and ran a script from cron that rotated through the arrays, doing one array every two days at a low priority, and I had forgotten to carry that over to the new system.
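The rotation script is nothing clever; a cut-down sketch of the idea (paths, throttle value and the cron schedule are illustrative, and the stock RHEL raid-check cron job needs disabling in /etc/sysconfig/raid-check first):
Code:
#!/bin/bash
# Check one md array per run, throttled, so they never all run at once.
# Run from cron every couple of days, e.g. in /etc/cron.d:
#   0 3 */2 * * root /usr/local/sbin/md-check-rotate
STATE=/var/lib/md-check-rotate.state
ARRAYS=( /sys/block/md*/md )                      # all md arrays on the system
LAST=$(cat "$STATE" 2>/dev/null || echo -1)
NEXT=$(( (LAST + 1) % ${#ARRAYS[@]} ))
echo 5000 > /proc/sys/dev/raid/speed_limit_max    # keep the check at a low rate (KB/s)
echo check > "${ARRAYS[$NEXT]}/sync_action"       # kick off the consistency check
echo "$NEXT" > "$STATE"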

The whole system needs some performance tuning I think, as it's not as fast as I would have hoped. Both the client and the server seem to sit around doing not a lot: >80% CPU idle, loads of free RAM, disks and network nowhere near their benchmarked limits. However, that is a task for another day.
 