ADSM-L

Re: DR/TSM Policy Matrix

2002-09-13 10:32:32
Subject: Re: DR/TSM Policy Matrix
From: William Rosette <Bill_Rosette AT PAPAJOHNS DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 13 Sep 2002 10:18:15 -0400
Hi John,

I am currently going through a major Policy Domain change to do some of the
things you and Salak are talking about.
We want to set up 3 Domains by their Recoverability.  Class1pool holds the DR
clients, class2pool the high-restore but non-DR clients, and class3pool
all the rest.  Class1pool will colocate onsite (primary pool
(tapepool)) and offsite (secondary pool (copypool)), class2pool will
colocate onsite and nocolocate offsite, and class3pool will nocolocate both
onsite and offsite.  My major concern will be DASD for database/recovery logs
and for storage pools during this changeover of the Domain.  We have
piecemealed our system to what it is now (2 years old), and we want to set
this up right for the future, so we will have to do some defragging of our
current SSA, DASD, and tape drive layout.  I am looking for any good, bad,
ugly or any other kind of advice on moving this toward a future TSM utopia,
if there is such a thing.

Thank You,
Bill Rosette
Data Center/IS/Papa Johns International
WWJD
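
The three classes described above can be written down as a small lookup
table. This is only a sketch in Python: the class names and colocation
choices come from the message, while the type and function names
(Colocation, CLASSES, describe) are made up for illustration and are not
TSM commands or settings.

```python
from typing import NamedTuple


class Colocation(NamedTuple):
    """Colocation choice for one recoverability class (illustrative type)."""
    onsite: bool   # primary pool (tapepool)
    offsite: bool  # secondary pool (copypool)


# Class names and settings as described in the message above.
CLASSES = {
    "class1pool": Colocation(onsite=True, offsite=True),    # DR clients
    "class2pool": Colocation(onsite=True, offsite=False),   # high restore, not DR
    "class3pool": Colocation(onsite=False, offsite=False),  # all the rest
}


def describe(name: str) -> str:
    """Return a one-line summary of the colocation plan for a class."""
    c = CLASSES[name]
    onsite = "colocate" if c.onsite else "nocolocate"
    offsite = "colocate" if c.offsite else "nocolocate"
    return f"{name}: onsite={onsite}, offsite={offsite}"


if __name__ == "__main__":
    for class_name in CLASSES:
        print(describe(class_name))
```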


From:     J M <jm_seattle@HOTMAIL.COM>
Sent by:  "ADSM: Dist Stor Manager"
Date:     09/13/2002 09:35 AM
Please respond to "ADSM: Dist Stor Manager"

To:       ADSM-L AT VM.MARIST DOT EDU
cc:
Subject:  Re: DR/TSM Policy Matrix




Has anyone developed a policy matrix that relates DR requirements
(RTO/RPO) to backup and archive policies (frequency, versions, retention)
for an enterprise environment? I'd be interested in sharing ideas with
anyone in or out of the healthcare industry, as I'm in the process of
building this kind of tool. Feel free to contact me directly if this is of
interest.

Cheers,  John
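
One possible shape for such a matrix, sketched in Python. Every tier name
and number below is a hypothetical placeholder, not a recommendation and
not taken from any real site. The only rule the sketch checks is the
general one that the interval between backups cannot exceed the RPO, since
the RPO is the largest amount of data loss the business will accept.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PolicyRow:
    """One row of a DR policy matrix (all field names are illustrative)."""
    tier: str
    rto_hours: float              # time allowed to get the service back
    rpo_hours: float              # maximum tolerable data loss
    backup_interval_hours: float  # backup frequency
    versions: int                 # versions kept per file
    retention_days: int           # how long inactive versions are kept

    def violations(self) -> List[str]:
        """Return the reasons this row cannot meet its own requirements."""
        problems = []
        if self.backup_interval_hours > self.rpo_hours:
            problems.append(
                f"{self.tier}: backup every {self.backup_interval_hours}h "
                f"cannot meet an RPO of {self.rpo_hours}h"
            )
        if self.versions < 1:
            problems.append(f"{self.tier}: at least one version must be kept")
        return problems


# Hypothetical example values only.
MATRIX = [
    PolicyRow("critical", rto_hours=24, rpo_hours=24,
              backup_interval_hours=24, versions=14, retention_days=90),
    PolicyRow("important", rto_hours=72, rpo_hours=48,
              backup_interval_hours=24, versions=7, retention_days=60),
    PolicyRow("standard", rto_hours=168, rpo_hours=168,
              backup_interval_hours=24, versions=3, retention_days=30),
]

if __name__ == "__main__":
    for row in MATRIX:
        print(row.tier, row.violations() or "ok")
```

A fuller tool would presumably also tie each tier to storage choices such
as colocation, which this sketch leaves out.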


>From: Salak Juraj <j.salak AT ASAMER DOT AT>
>Reply-To: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
>To: ADSM-L AT VM.MARIST DOT EDU
>Subject: Re: Strategies for DR recovery of large clients
>Date: Wed, 11 Sep 2002 12:07:11 +0200
>
>Just 2 cents:
>
>Assuming that
>searching for files spread across many tapes
>consumes much of your restore time,
>so that the many small files account for a significant share of restore time
>while the large files account for most of your tape capacity
>but not for most of the restore time,
>
>you could speed things up
>by applying to small files the same kind of management
>that is commonly used for directories.
>
>You could define a 2-level hierarchy within your disk storage pools:
>the first level with its "maximum size threshold" set to admit your
>small files only, pointing to a second disk storage pool
>without a file size limitation.
>This way many of the small files would be restored from the disk storage
>pool with no tape search penalty.
>
>Whether this really suits you depends strongly on the
>statistical distribution of your file sizes.
>
>regards
>juraj SALAK
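
A quick way to test the file-size question raised above is to take a
listing of file sizes from a client and see what a given size threshold
would capture. The Python sketch below assumes you already have the sizes
in bytes; the function name and the sample data are invented for
illustration, and the code does not talk to TSM at all.

```python
from typing import Iterable, Tuple


def small_file_share(sizes: Iterable[int], max_size: int) -> Tuple[float, float]:
    """Return (share of files, share of bytes) at or below max_size.

    A high file share combined with a low byte share suggests that a small
    first-level disk pool could hold most files cheaply.
    """
    sizes = list(sizes)
    if not sizes:
        return 0.0, 0.0
    small = [s for s in sizes if s <= max_size]
    total_bytes = sum(sizes)
    file_share = len(small) / len(sizes)
    byte_share = sum(small) / total_bytes if total_bytes else 0.0
    return file_share, byte_share


if __name__ == "__main__":
    # Invented sample: 900 files of 4 KB and 100 files of 2 GB.
    sample = [4_000] * 900 + [2_000_000_000] * 100
    files, data = small_file_share(sample, max_size=1_000_000)
    print(f"{files:.1%} of files, {data:.4%} of bytes fit under the threshold")
```

With a distribution like the invented one, nearly all files would stay on
disk while using a tiny fraction of the space. A flatter distribution
would make the idea much less attractive.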
>
> > -----Original Message-----
> > From: Werner Kliewer [mailto:VKliewer AT MPI.MB DOT CA]
> > Sent: Tuesday, September 10, 2002 5:30 PM
> > To: ADSM-L AT VM.MARIST DOT EDU
> > Subject: Strategies for DR recovery of large clients
> >
> >
> > I am working on our first TSM based DR plan for our data
> > centre. We currently have a plan, successfully tested several
> > times, that uses system-specific tools, such as Sysback/6000 for
> > AIX, BRM for AS/400, ArcServe for Novell and WinNT. We have
> > been asked to convert to TSM to consolidate all backups in one tool.
> >
> > We have recently installed an NSM running TSM version
> > 4.1.3.0, soon to be upgraded to 4.2.2.1 because that is the
> > newest version certified for the NSM. It is attached to an
> > LTO library. There is no HSM activated, but the TSM Server is
> > backing up numerous clients that have to be restored in a DR
> > scenario. Many of them are NT4.0, Windows 2000 and soon
> > Windows XP servers of various sizes. There are also several
> > AIX 4.3.3.ML09 servers.
> >
> > I have the TSM Server recovery down to probably the bare
> > minimum time of 4-6 hours, depending on the power of the
> > machine it is being recovered on. This does not include
> > creating the disk pools, which can take another 12-18 hours
> > but is not part of the critical path.
> >
> > The biggest Windows servers run Exchange or SQL Server, which
> >  tend to back up large blobs of data that are relatively easy
> > to restore.
> >
> > Two of the AIX servers are p680s with 750 logical volumes,
> > 500 filesystems and 1.5-2 terabytes of total data each, with
> > 250-350 GB of data backed up nightly. For my current test, the
> > DR media pool is 107 cartridges. I have to restore the data
> > in stages. Neither the command line client nor the GUI will
> > allow me to select all the filespaces I need at once for the
> > first pass. The command line complains that the command is too
> > long and the GUI simply fails before starting if I choose too
> > many filesystems. Even if one of them worked properly, I
> > would have to pass at least 75 of those tapes to do the
> > restore. This initial pass includes great chunks of the /home
> > filesystem where things are changing every day.
> >
> > For my DR test, I am running on a p610 with a single,
> > stand-alone LTO drive. It takes about 12 hours to pass those
> > 75 tapes once. I will have to pass them 3-5 times for each
> > restore. For the real DR test, I will have an F50 and 3 LTO
> > drives, but I will be restoring at the same time as all the
> > other critical servers, so I will be lucky to get a single
> > drive to myself, and it is the LOAD, UNLOAD and LOCATE parts
> > of the process that take up the bulk of the time. Actual data
> > transfer is quite well optimized, once the data is located.
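
The figures in the paragraph above can be put together as rough
arithmetic. The Python sketch below uses only numbers quoted in this post
(75 tapes, about 12 hours per pass, 3-5 passes, 4-6 hours for the TSM
server rebuild, and the 48-hour test window mentioned further down). It
assumes the server rebuild and the tape passes run one after the other on
the single-drive test setup, which is a simplification, and the function
name is made up.

```python
def restore_hours(hours_per_pass: float, passes: int,
                  server_rebuild_hours: float = 0.0) -> float:
    """Total elapsed hours if the rebuild and every tape pass run serially."""
    return server_rebuild_hours + hours_per_pass * passes


TAPES_PER_PASS = 75
HOURS_PER_PASS = 12.0
WINDOW_HOURS = 48.0

# Average handling time per cartridge, dominated by load/locate/unload.
minutes_per_tape = HOURS_PER_PASS * 60 / TAPES_PER_PASS

# Best case: 3 passes after a 4-hour rebuild. Worst case: 5 passes after 6 hours.
best = restore_hours(HOURS_PER_PASS, passes=3, server_rebuild_hours=4)
worst = restore_hours(HOURS_PER_PASS, passes=5, server_rebuild_hours=6)

if __name__ == "__main__":
    print(f"about {minutes_per_tape:.1f} minutes per tape")
    print(f"best case {best:.0f} h, worst case {worst:.0f} h, "
          f"window {WINDOW_HOURS:.0f} h")
```

Under these assumptions even the best case leaves little of the window for
the database rebuilds and application testing, so the lever that matters
is the number of passes rather than the transfer rate.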
> >
> > I am currently looking at 2 possible ways to improve this.
> > None of the servers will have direct attached backup/recovery
> > devices of sufficient capacity, throughput, or reliability to
> > be useful. We cannot afford enough drives to cover all the
> > servers. All restores must be done via the TSM server.
> >
> > One possibility is to use BACKUP SETs. But I am concerned
> > that BACKUP SETs are oriented to local restore scenarios and
> > am not sure how easy they are to manage and restore from a
> > central storage (NSM/TSM) point of view. There is also some
> > concern about the additional TSM activity creating the BACKUP
> > SETs would cause on an already fairly close to capacity NSM.
> >
> > The second option is to do full system ARCHIVES, but this
> > would cause activity on both the NSM and the client, neither
> > of which has an available window for this activity.
> >
> > Because either of these possibilities would, of necessity, be
> > occasional (at best once a week), there is the additional
> > issue of how easy it is to bring the system up to the most
> > current backup after the restore. Would a multi-filespace
> > simple restore be intelligent enough to pass only the last 7
> > days of tape or would it pass all tapes with those filespaces on them?
> >
> > A third possibility I have thought of recently is to isolate
> > these very large servers in their own COPY POOLs, effectively
> > co-locating only these servers, but I am not convinced this
> > would reduce the number of tapes passed by the DR restores,
> > and it would certainly increase the total number of tapes in
> > the DR set and increase off-site reclamation activity, which
> > already takes the better part of the day shift most days.
> >
> > Sorry for the long post. And thanks for any hints more
> > experienced TSM users can provide.
> >
> > I have a 48 hour DR Test window in which to build the TSM
> > Server, restore data to clients, hand the clients to the
> > DBAs to rebuild the databases, and test the end user
> > applications. In the past, we ran SYSBACK/6000 on the AIX
> > servers and were usually home and cooled in 36 hours. But the
> > AIX servers have been consolidated from 7 large servers to 2
> > enormous servers and SYSBACK/6000 would also struggle to
> > complete in the 48 hour window. The point of going with TSM
> > was to try to improve on this.
> >
> > P.S. Where are the AS/400 clients? When we were sold the
> > product, we were told it ran on AS/400. We foolishly assumed
> > this meant the client first, server second ...
> >
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > Werner (Vern) Kliewer
> > Sr. ITS Analyst
> > Mid-Range Support
> > Manitoba Public Insurance
> > (204)-985-7745
> > vkliewer AT mpi.mb DOT ca
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >



