ADSM-L

Disaster Recovery Experiences,

1997-03-26 09:40:31
Subject: Disaster Recovery Experiences,
From: Paul D Brown <pdbrown AT ASHLAND DOT COM>
Date: Wed, 26 Mar 1997 10:40:31 -0400
Paul Brown here with Disaster Recovery experiences.
Again, I run an MVS/ADSM server at the 2.1.12 level.  Clients are of just
about every possible flavor.
I have had very good success at making DR happen.  I use TCP/IP for all
connections.

I do not have the Disaster Recovery Manager (DRM) option, but I have been
looking at it to see whether it could eliminate the ADSM server downtime
needed for the ADSM system files.  For now I take ADSM down once a day for
the system file backups.  I run Expire once a day at a scheduled time.  I am
having no problems meeting our window for server backups at this time, with
300+ GB of server disk being looked at nightly for incremental backup.

The primary storage pool is of course on MVS disk, and the next pool (BFS
files) is also on MVS disk.  I back up the BFS files to tape with MVS
storage management software only.  This way I make sure all my offsite
tapes are full.  The tape vaulting is taken care of with MVS tape
management software (CA-1) only.  My storage management software is
Sterling Software's SAMS:Disk.  This could of course be HSM as well.

Collocation is a must for quick server rebuilds.  I use the ADSM command
'SHOW VOLUMEUSAGE nodename' to identify the BFS files required for a
specific server restore and pre-restore them with Sterling Software's
SAMS:Disk.  This makes the server rebuilds very fast, since they all come
from MVS disk.
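
A minimal sketch of the volume-list step, assuming the output of
'SHOW VOLUMEUSAGE nodename' has been saved as text with one volume name per
line; the real output layout may differ.  The function name, the heading
convention, and the sample volume names are hypothetical.  It only produces
a de-duplicated list of volumes to hand to whatever pre-restore job is in
use; it does not call ADSM or SAMS:Disk.

```python
def volumes_for_node(report_text):
    """Return the unique volume names in a saved report, in first-seen order.

    Assumption: one volume name per line; blank lines and heading lines
    ending in ':' are ignored. Adjust to match the actual report layout.
    """
    seen = set()
    volumes = []
    for raw_line in report_text.splitlines():
        name = raw_line.strip()
        # Skip blank lines and anything that looks like a heading.
        if not name or name.endswith(":"):
            continue
        if name not in seen:
            seen.add(name)
            volumes.append(name)
    return volumes


if __name__ == "__main__":
    # Hypothetical sample; these are not real volume names.
    sample = """
    Volumes for node EXAMPLE:
    ADSM.BFS.VOL0001
    ADSM.BFS.VOL0002
    ADSM.BFS.VOL0001
    """
    for volume in volumes_for_node(sample):
        print(volume)
```

The resulting list is what would be fed to the pre-restore step, so that
the needed BFS files are already on MVS disk before the server rebuild
starts.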

To give you a little background on how our MVS DR restore works: it is all
handled with Sterling's SAMS:Disk.  We determined a long time ago that all
applications are needed in case of a disaster, and we do not spend time
worrying about which application is more important than the others.  We
back up all MVS disk with SAMS:Disk and do not allow end users or
application programmers to write data directly to tape.  We control all
allocations with SAMS:Allocate.  This allows us to keep all tapes full and
the number of tapes needed to a minimum.  We also migrate or archive data
fairly aggressively with SAMS:Disk, and keep the recall rate at 10%.  ADSM
data is treated no differently than any other system's.

At the DR hotsite all data is made auto restorable by using MVS catalogs to
show all data as being archived, even the backup copies.  This is done by
changing all catalog entries and merging the SAMS:Disk files data sets from
archives and backups together.  We do run pre-restore jobs for major MVS
software subsystems such as DB2, IMS, CICS, ADSM, and so on.  All user
files and databases are auto restorable and appear to be archived through
catalog lookup.  Any files that require forward recovery are brought back
prior to system availability.  All catalog changes and files data set
merges are completed with copies of the files and catalogs during nightly
backups, to save time at the hotsite.  At the hotsite all systems are up
and available within 24 hours of the start of the rebuild, including
servers.  We test monthly on an MVS test LPAR and once or twice a year at
IBM BRS centers.

The biggest key has been to let one system control all offsite tape
management and data recoveries, in our case the MVS platform.  We also do
AS/400 and RISC/6000 backup control through MVS by tying these systems into
STK Robotics libraries and then letting MVS CA-1 handle tape vaulting for
these systems.  ADSM was not a good candidate for the larger midrange
systems because of the recovery speeds needed; it is just too slow for
larger amounts of data.

System may sound complex, but it is truly not.  Would be happy to answer
any questions.
Thanks,
Paul





To:       ADSM-L @ VM.MARIST.EDU
cc:        (bcc: Paul D Brown)
From:     kphan1 @ TANDY.COM
Date:     03-25-97 01:09:45 PM CST
Subject:  Re: Julie,




Hi Paul,
Do you have DRM installed with ADSM?  I'm planning to install DRM and will
want to do DR testing in the future.  Please share your experiences with us.
Thank you in advance.
Khiem Phan 817 870-0460

> ----------
> From:         Paul D Brown[SMTP:pdbrown AT ASHLAND DOT COM]
> Sent:         Tuesday, March 25, 1997 11:35 AM
> To:   ADSM-L AT VM.MARIST DOT EDU
> Subject:      Re: Julie,
>
> Julie,
>   I have recovered several machines in DR tests with no problems.
> We have MVS/ADSM Server 2.1.12 and many flavors of clients.
> Collocation is a must for quick restores!
> The 'SHOW VOLUMEUSAGE nodename' command is very helpful for reviewing the
> tapes that will be needed.
>   If you need any other information, give me a call.
> Thanks,
> Paul Brown
> Storage Systems Manager
> Ashland Inc.
> Lexington, Kentucky
> 606-357-7585
>
> To:       ADSM-L @ VM.MARIST.EDU
> cc:        (bcc: Paul D Brown)
> From:     julphinn @ EMPHESYS.E-MAIL.COM
> Date:     03-25-97 12:34:40 PM
> Subject:  Julie,
>
>
>
>
> Hi Ted,
> Thanks!  Yes, we've already tested recovery from onsite tapes.
> I'm being asked to do a server and client test during the big
> disaster test the company has scheduled.  When they heard the horror
> story about how long it took to recover a client from offsite tapes,
> they asked me to do that part of the test ahead of time, as a trial.
> Are you saying that in the event of a real disaster, I would recover
> the onsite tapes from the offsite tapes, before restoring any clients?
> Will ADSM put them back in collocated order?
> Julie
>
> *** Original Author:  I1014833 @ IBMMAIL - ** Remote User **; 03/25/97
> 11:25am
>
> Date:         Tue, 25 Mar 1997 10:20:18 -0700
> From:         Ted Spendlove <SPENDEE AT THIOKOL DOT COM>
> Subject:      Julie,
> To:           ADSM-L AT VM.MARIST DOT EDU
>
> Julie,
>
> It seems to me that you are testing TWO things here.
> 1. loss of a client.
> 2. loss of a server.
>
> Perhaps it would be easier to test them separately.  For example, test
> recovery of a lost client by doing a recovery with the 41 tapes in your
> onsite pool.  At a different test, recover the 41 onsite tapes from the
> 158 offsite tapes.
>
> Ted Spendlove
> Thiokol Corp.
>
>
>
> >>> Julie Phinney <julphinn AT EMPHESYS.E-MAIL DOT COM> 03/25/97 10:05am >>>
> I need to do a restore test of our biggest ADSM client, from offsite
> tapes (I've been asked to do a test; someone heard a horror story about
> how long it took).
> I ran a Q CON batch job on the Offsite tapepool that ran about 7 hours
> and wrote out 1000+ cylinders of mainframe DASD.  The client I'd need
> to restore has files spread across 158 tapes.  (As opposed to 41 tapes
> in the collocated onsite pool.)
> So it seems to me, I'll need to:
> 1) Request those 158 tapes be brought back from the vault.
> 2) Mark every onsite tape belonging to this node DESTROYED.
> 3) Mark the 158 tapes ACC=READW
> 4) Restore the client.
> My questions:
> 1) If I do this, can I take them back when I'm done and return the
>    destroyed onsite tapes to not destroyed status?
>    (If I have to leave them onsite, then I no longer have offsite
>     copies of all the data on those tapes belonging to OTHER nodes.)
> 2) Do I really need all 158?  Is it possible that some have only
>    inactive versions of files on them, and ADSM wouldn't call for them?
> 3) Is there an easier way to find out which tapes ADSM will call for?
> 4) There must be a better way to test a recovery of a client from
>    offsite tapes.  Does anyone know of a way?
> Thanks for any help!!!!!
> Julie Phinney
> JULPHINN @ EMPHESYS.E-MAIL.COM
>
> ---- End of mail text
>