DR - use copy storage rule for hydrated Data to Tape ?

dietmar

Hi,

Has anyone already looked into making hydrated copies of container data to tape for DR (via the copy storage rule, which offers direct restore from tape)?

I am not sure yet whether this is a good approach. My idea is to do it only for "some" important data/hosts and to additionally keep a local tape copy (protect) of the container pool.

My thoughts:

* Restores will be very slow, because every daily incremental tape has to be mounted (and the 4300 is ... slow at mounting).
* Could a repair of the container pool from tape and restores of some important hosts run in parallel (limited by the number of tapes and the performance of the server?), and would that be useful?
* Container repair is very slow, and in a DR situation that is a real problem. It does not reach the theoretical tape speed, because it depends heavily on the server, Db2, the block size used and the non-deduplicated small files.
* If fast DR is critical, I think the best option is multiple ISP servers with library sharing, with a very small ISP server for the data that is needed first. Splitting into multiple container pools helps a bit, but the problem is still the size of the Db2 tables, as they are not split/partitioned per container pool (restore DB, audit of the tables).
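For reference, the kind of copy storage rule I have in mind would be something along these lines. The pool, device class and rule names are made up, and I am writing the syntax from memory for a current 8.1.x level, so please check HELP DEFINE STGRULE on your level before using it:

/* TAPE_HYDR = tape copy pool for the hydrated data, CONTAINER_1 = directory-container pool */
define stgpool tape_hydr lto_class pooltype=copy maxscratch=100
define stgrule hydr_copy tape_hydr srcpools=container_1 actiontype=copy starttime=20:00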

As an example, this is the current local tape copy (protect) "performance" at the customer site (CONTAINER_1 = VMware TDP data, CONTAINER_2 = DB dumps):

ANR4000I The protect storage pool process for CONTAINER_1 on server SERVER1 to TAPE on server SERVER1 is complete. Extents protected: 8329851 of 8329851. Extents failed to protect: 0. Amount protected: 1,301 GB of 1,301 GB. Amount failed to protect: 0 bytes. Extents deleted: 7845627 of 7845627. Extents failed to delete: 0. Extents moved: 0 of 0. Extents failed to move: 0. Amount moved: 0 bytes of 0 bytes. Amount failed to move: 0 bytes. Elapsed time: 0 Days, 1 Hours, 50 Minutes

ANR4000I The protect storage pool process for CONTAINER_2 on server SERVER1 to TAPE on server SERVER1 is complete. Extents protected: 12645224 of 12645224. Extents failed to protect: 0. Amount protected: 2,283 GB of 2,283 GB. Amount failed to protect: 0 bytes. Extents deleted: 8774194 of 8774194. Extents failed to delete: 0. Extents moved: 0 of 0. Extents failed to move: 0. Amount moved: 0 bytes of 0 bytes. Amount failed to move: 0 bytes. Elapsed time: 0 Days, 1 Hours, 29 Minutes

Here you see that how fast or slow everything runs depends entirely on the data, and of course the same applies to a repair if you need DR.
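For context, the output above comes from the nightly protect to a local tape pool, set up roughly like this (the tape pool name is just an example):

/* TAPE_PROT holds the protect copies (container format, not directly client-restorable) */
update stgpool container_1 protectlocalstgpool=tape_prot
protect stgpool container_1 type=local maxprocess=2 wait=yes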

An admin schedule window and tape drives (6 in total) on the replication target are available, but I am not sure whether we should try it out or not.

So if anyone has tried it, I would be glad to hear from you :).

Br, Dietmar
 
If you are going to have multiple servers, why not do protect stgpool to another server along with replication? You have instant DR that way.

While there are some cases where having more than one container pool makes sense, it's generally better to have just one. You lose dedup reduction when you have multiple pools, because dedup only works within a pool, not across pools, so you can have duplicate extents in different pools. It also adds complexity in managing housekeeping tasks like protect or repair, because you have to consider the number of tape drives: either run them back to back with all drives, or concurrently with half the drives each (rough sketch below).
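To illustrate the drive point with the 6 drives from the original post (example values only, not a recommendation):

/* option 1: back to back, all 6 drives for each pool */
protect stgpool container_1 type=local maxprocess=6 wait=yes
protect stgpool container_2 type=local maxprocess=6 wait=yes
/* option 2: concurrently, 3 drives each */
protect stgpool container_1 type=local maxprocess=3
protect stgpool container_2 type=local maxprocess=3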

And you didn't mention it, and you may already know this, but in case you don't: a rehydrated copy to tape doesn't replace protect to tape, they serve different purposes (see the sketch after this list):
- a rehydrated copy to tape enables you to do client restores directly from tape like you said, but you cannot use this copy to repair the container pool
- in contrast, protect to tape enables you to repair the container pool, but clients cannot restore from it
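As a rough sketch of the difference (pool name is a placeholder):

/* REPAIR STGPOOL can only use the protect copy, either local tape or the replication server */
repair stgpool container_1 srclocation=local
repair stgpool container_1 srclocation=replserver
/* the rehydrated copy written by the copy storage rule is what a normal client restore can read directly from tape; REPAIR STGPOOL cannot use it */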

In the end, you will have to do a DR test to see which option works best for you.
 
Hi,

There is already a primary and a replication target server plus the 4300. The DR scenario is that only tapes are available. We have already done DR tests from "scratch", not the complete all-in-one, but we PIT-restored the server to another hardware server, physically removed 100% of the containers from one production pool, and repaired it from tape.

I decided to configure multiple pools, because repairing 400+ TB from tape takes weeks, not days, before you can restore a single bit. Once one pool is finished we could run restores while the next pool is being repaired (roughly as sketched below). But as said, it still takes very long, because the Db2 database is one piece.
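Roughly the sequence per pool in that tape-only case, heavily simplified and with placeholder names, so treat it as a sketch rather than a procedure:

/* 1. restore the server database from the DB backup tapes */
dsmserv restore db
/* 2. per container pool: mark the missing extents damaged, then repair from the protect tapes */
audit container stgpool=container_1 action=markdamaged
repair stgpool container_1 srclocation=local
/* 3. start client restores for pool 1 while container_2 is still being audited/repaired */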

The pools do not share the same kind of "data", so the reduced dedup should not be a problem (VMware goes to one pool, file services and databases into others).

In theory, for a "normal" failure we could rebuild the containers from disk.

I have been through an ISP repair from tape only, and I wish that on nobody. That is why I am always looking to improve the environment in case it should ever happen again ...

I also think this is NOT an investment problem: if you have to tell the customer he will wait weeks to get a single bit back in such a case, then having more tapes or drives/slots is NOT the point in my opinion. We also increased the minimum dedup tier on some nodes to be faster in backend processing, knowing this leads to more backend licenses.

So the question is how we could use this rehydrate-to-tape function in a way that makes sense for DR.

I am just curious what the best way is to get "back" from a DR with tapes only when using container pools.

In general, handling very small files fully indexed is a pain, and not only for Db2. Those mini I/Os are also a problem when you back them up to tape or get them back (random write I/O of 60 KB, do the math with a Storwize V5030 and SATA drives ...). I don't know if this would work, but I think it would be better to group such files into bigger pieces and, if one changes, expire the old piece and write a complete new bigger "block" to tape. Those .ncf files ...
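Just to put rough numbers on the "do the math" part: a 7.2k SATA/NL-SAS spindle does somewhere around 75-100 random write IOPS, so at ~60 KB per I/O that is only about 4-6 MB/s per drive. Even a few dozen such drives behind a V5030 stay far below what a single LTO drive wants to stream, which is why those restores crawl.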

thx, Dietmar
 
I can chime in on what I was planning on doing if budgets allowed.

That said, I was going to have a primary TSM server replicate to a secondary TSM server for the quick recovery/whatever. Then I was also going to have hydrated tape as the worst case, to restore clients after a total failure.

Basically a disk-to-disk-to-tape setup. The primary server would be in charge of hydrating data for tape, and sending the replication data to the secondary server.
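In ISP terms that would be roughly the following on the primary (names invented, and the hydrated tape copy would be the copy storage rule from the first post):

/* primary server: protect and replicate to the secondary */
protect stgpool cont_pool type=replserver
replicate node *
/* plus the copy storage rule to tape for the hydrated, client-restorable copy */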

However, as time progresses it seems everyone is becoming comfortable with disk to disk replication. I'm of the old saying that replication is not backup, and that having a fairly immutable copy of data on tape is a win.

The above has been my pipe dream for years. :)
 
Has anyone thought about or asked for a feature of hydrated retention sets to tape? I think it would be much better to have a hydrated set of active data written serially to tape and to expire those sets monthly/yearly, rather than running daily incrementals to tape. I could imagine running a different set of hosts/filespaces each day that way. The customer could then decide to air-gap it together with the DBB. I am not sure the stgrule to tape is practical as it is (reclamation of volumes, restore speed)?
 