DR site planning + VTL replication

avinim8

PREDATAR Control23

Hi,
Today we move tapes from the primary site to the secondary one manually, by truck. We bought two VTL machines so that the critical data will be backed up directly to the VTL (or to disk and then migrated to the VTL) and then replicated by the primary VTL to the secondary one. I have several questions regarding this configuration:
1. The TSM server is not aware of the data replication process between the two VTLs, so it does not know anything about the data stored in the secondary VTL. Right?
2. What should I do once the primary site fails? Restore the TSM DB to the secondary TSM server at the DR site? How will it know about all the data stored in the VTL?

Is there anything else I should think about for this configuration to work?

Thanks,

Avi
 
PREDATAR Control23

I'm not an expert, and I don't know if this will be supported by IBM. Correct me if I'm wrong.

1. The TSM server is not aware of the data replication process between the two VTLs, so it does not know anything about the data stored in the secondary VTL. Right?

Yes, TSM is not aware of the replication process between the two VTLs if you use the replication provided by the VTL manufacturer.

2. What should I do once the primary site fails? Restore the TSM DB to the secondary TSM server at the DR site? How will it know about all the data stored in the VTL?

Not sure it will work. Just think of the space reclamation process: if replication between the two VTLs is running and you lose your primary site, you start your DB recovery and the data in the other VTL is not in sync with it.

What I can think of, assuming you have a good link between your primary site and your DR site: replicate your backup VTL so that it is exactly the same as your primary VTL (data, volume names, etc.). If possible, define two sets of disks on the server, one holding the main DB at the primary site and the other at the DR site, and keep them in sync (e.g. DEFINE DBCOPY in TSM). With a slow link, this can impact your TSM server.


This link is a little bit old...
http://www.redbooks.ibm.com/redbooks/pdfs/sg246844.pdf

But I think this could be a good start for the idea of a Tier 4/5 setup: two sites, two-phase commit.

How to technically implement it... I'm still looking for the info.

We have a zero-data-loss setup for our critical apps (real-time synchronous SAN replication between our two data centers). But for the other, non-critical applications we are still in a cold-restore scenario: restore the TSM DB at a DR site, then restore from tape.

Adding a VTL is really appealing, and since we have a really big pipe between our sites, we are looking at having our primary pool at one site (VTL1) and the offsite pool on VTL2, without using the replication software provided by the manufacturer.

So, using a copy storage pool, we copy the data from VTL1 to VTL2, and it is TSM that moves all of the nightly backup. (If you have a 10 TB nightly backup, you copy 10 TB.)
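
To put a rough number on that, here is a minimal back-of-the-envelope sketch of the sustained link rate needed to copy the nightly backup to the other site. The 10 TB figure is from above; the 8-hour copy window and the function name `required_gbps` are only assumptions for illustration, and protocol overhead and compression are ignored.

```python
def required_gbps(nightly_tb: float, window_hours: float) -> float:
    """Sustained link rate in Gbit/s needed to copy nightly_tb within window_hours."""
    bits = nightly_tb * 1e12 * 8      # decimal terabytes -> bits
    seconds = window_hours * 3600     # copy window in seconds
    return bits / seconds / 1e9


if __name__ == "__main__":
    # 10 TB comes from the post; the 8-hour window is an assumed value.
    print(f"{required_gbps(10, 8):.2f} Gbit/s")  # prints 2.78 Gbit/s
```

If the copy can take a full 24 hours, the same 10 TB needs a little under 1 Gbit/s sustained.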

Hope this helps.
 
PREDATAR Control23

To put it simply, as long as the replication is exact, then when you bring up TSM at the DR site, you do not have to mark the VTL storage pool as 'destroyed'; just start restoring from it! (I did something similar with datadomain - worked like a charm.)
 
PREDATAR Control23

Can you please elaborate on the process that needs to be done?
I am not that familiar with it.

1. Usually, once the primary site fails, I need to restore the TSM DB onto my secondary TSM server and mark the primary storage pool as destroyed?
This way TSM knows that it should now use the copy pool?
2. Now, with the VTLs, I have to keep both VTLs synced, and only then will I be able to recover from a disaster?
3. If the VTLs are not synced, the primary TSM DB will contain data about files that are not in the secondary VTL? (Assuming the DB has been replicated to the secondary site.)
4. One more scenario is that the data is replicated between the VTLs but the DB isn't, so the secondary server is unaware of changes in the VTL (expired files, for instance).



What is usually done?
I am sure that this configuration is implemented somewhere and that they have already thought of all those issues.

Please help...
 
PREDATAR Control23

You are basically correct on all counts. To make sure that the TSM database and VTL are in synch, write your database backup to the VTL! You use VTL with replication to speed up the process of recovery, but if you have to wait for your tapes to load the TSM database, then you're spending $$$ for nothing!
 
PREDATAR Control23

1. Usually, once the primary site fails, I need to restore the TSM DB onto my secondary TSM server and mark the primary storage pool as destroyed?
This way TSM knows that it should now use the copy pool?

It all depends on how you implement your VTL. If you use the replication software provided by the manufacturer, it's imperative that TSM DOES NOT SEE the second VTL. Since you replicate the VTL without TSM, you do not need to run a DR scenario: the VTL is an exact copy of your primary pool. At your DR site, you just need to restore your server the same way you would restore it if it crashed.

2. Now, with the VTLs, I have to keep both VTLs synced, and only then will I be able to recover from a disaster?

Yes, the VTLs need to be in sync. What you can do is, when your DB backup is finished:
1- Disable all sessions, schedules, expiration, reclamation... all TSM activities.
2- Sync your VTLs with the replication software of your VTL.
3- Make sure the replication is successful.
4- Re-enable all your activities on your TSM server (primary site).
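
The ordering above can be sketched roughly as follows. This is only an illustration of the sequence, not a real TSM or VTL API: `quiesce`, `replicate`, `verify` and `resume` are hypothetical callables that you would have to wire up to your own admin commands and to your VTL vendor's replication tool.

```python
from typing import Callable


def run_sync_window(
    quiesce: Callable[[], None],
    replicate: Callable[[], None],
    verify: Callable[[], bool],
    resume: Callable[[], None],
) -> bool:
    """Quiesce TSM, replicate the VTL, verify the result, and always resume TSM.

    Returns True only if the replication was verified as successful.
    All four callables are hypothetical hooks supplied by the caller.
    """
    quiesce()
    try:
        replicate()
        return verify()
    finally:
        # Resume even when replication or verification raises, so the
        # primary server is never left with its activities disabled.
        resume()


if __name__ == "__main__":
    # Stand-in hooks that only record the order in which they are called.
    log = []
    ok = run_sync_window(
        quiesce=lambda: log.append("quiesce"),
        replicate=lambda: log.append("replicate"),
        verify=lambda: True,
        resume=lambda: log.append("resume"),
    )
    print(ok, log)
```

The point of the `finally` is that a failed replication should not leave the primary TSM server disabled.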

As DanGiles said in his post, "when you bring up TSM at the DR site, you do not have to mark the VTL storage pool as 'destroyed'; just start restoring from it!" The VTL is an exact copy of your primary pool.

3. If the VTLs are not synced, the primary TSM DB will contain data about files that are not in the secondary VTL? (Assuming the DB has been replicated to the secondary site.)

If you sync your VTLs once a day after the database backup and make sure it worked, you shouldn't have that problem.
 
PREDATAR Control23

2. Now, with the VTLs, I have to keep both VTLs synced, and only then will I be able to recover from a disaster?

Yes, the VTLs need to be in sync. What you can do is, when your DB backup is finished:
1- Disable all sessions, schedules, expiration, reclamation... all TSM activities.
2- Sync your VTLs with the replication software of your VTL.
3- Make sure the replication is successful.
4- Re-enable all your activities on your TSM server (primary site).

Why should I disable all activity on the server? Once replication starts, it should replicate all changes made up until that point, not including data that is written while the replication runs. So I see no real reason to stop all activity on the TSM server.
Correct me if I'm wrong.
 
PREDATAR Control23

As I said, I'm not an expert in VTL replication. I'm just being cautious, working from what I know about how TSM works, and if I'm off track feel free to correct me!

At our site we only have one VTL and are planning to buy another one, but without the replication software. We have three data centers, all linked by a MAN with enough bandwidth, so our TSM database is copied to another data center with DEFINE DBCOPY: every change the DB makes on the primary is instantly written to the copy volume. In this case, real-time VTL replication could be an option. The drawback: only one pool and no copy pool, so if one volume is corrupted the chance of restoring it is practically zero. If you have a copy pool, you have a greater chance of restoring your volume.

The heart is the TSM DB. If you cannot have your DB synced in real time at your DR site, you need a break point. Example:

1- DB backup starts at 13H00 and finishes at 13H30.
2- VTL replication starts at 13H45 and finishes at 13H50.
3- Reclamation starts at 14H00.
4- Fire in the main data center at 16H45.
5- DR starts at 17H00 at the DR site.
6- You restore your DB from 13H30...

So if your VTL kept replicating from 14H00 to 16H45 and your TSM DB is not in sync with it, you will have a nasty surprise.
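
To make the nasty surprise concrete, here is a small sketch that lists which replicated volumes changed after the DB backup finished, i.e. the ones the restored DB may describe incorrectly. The 13H30 time is from the example above; the volume names and their change times are made up for illustration, and `volumes_out_of_sync` is a hypothetical helper, not something TSM provides.

```python
from datetime import time


def volumes_out_of_sync(db_backup_done: time, volume_changes: dict[str, time]) -> list[str]:
    """Return the volumes whose replicated content changed after the DB backup finished."""
    return sorted(vol for vol, changed in volume_changes.items() if changed > db_backup_done)


if __name__ == "__main__":
    db_backup_done = time(13, 30)      # from the example timeline
    # Hypothetical last-change times on the replicated VTL.
    volume_changes = {
        "VOL001": time(12, 0),         # written before the DB backup
        "VOL002": time(14, 5),         # emptied by reclamation
        "VOL003": time(15, 10),        # rewritten with new data
    }
    print(volumes_out_of_sync(db_backup_done, volume_changes))
    # prints ['VOL002', 'VOL003']
```

With a break point (replication stopped at 13H50), no volume would show up in that list.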

Maybe I'm over-paranoid, getting old school, or maybe I need two more Pan Galactic Gargle Blasters to understand the whole concept of VTL replication from the manufacturer...

This is my 2 cents!
 
PREDATAR Control23

TSM is fairly flexible and forgiving of most minor transgressions, so you probably don't need to disable everything. I would let everything continue to run (which, at this stage, shouldn't be much), but what I would do is put a tape reuse delay of a couple of days on your storage pool. This is important if the VTL is being replicated more or less in real time but you only take a DB snapshot every 24 hours!
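
As a rough rule of thumb for sizing that delay (the REUSEDELAY setting on the storage pool), here is a minimal sketch: cover the gap between DB backups, rounded up to whole days, plus a safety margin. The one-day default margin and the helper name `min_reuse_delay_days` are assumptions, not an IBM recommendation.

```python
import math


def min_reuse_delay_days(db_backup_interval_hours: float, margin_days: int = 1) -> int:
    """Smallest whole-day reuse delay that covers the gap between DB backups, plus a margin."""
    return math.ceil(db_backup_interval_hours / 24) + margin_days


if __name__ == "__main__":
    # A DB snapshot every 24 hours with a one-day margin gives "a couple of days".
    print(min_reuse_delay_days(24))  # prints 2
```

The idea is that any volume the restored DB still points to has not yet been overwritten at the time you restore.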
 