ADSM-L

Re: DISASTER Client Restores Slow

2002-05-17 16:53:44
From: Jim Kirkman <jmk AT EMAIL.UNC DOT EDU>
Date: Fri, 17 May 2002 16:47:42 -0400
In North Carolina we've been advised that 7 miles is an 'adequate' distance for
off-site storage for DR.

And Wanda, be careful about how you view a 'hurricane zone'! Here in Chapel
Hill, NC we're some 200 miles inland and got whomped by Hurricane Fran in '96.
Power was out on campus for 3 days. I would think Md. might have similar
susceptibility.

But I am in total agreement, separate the TSM server and the tape library and
eliminate so much physical tape movement.

"Prather, Wanda" wrote:

> Yes, they are on the same campus.
> And I agree, if we were in an earthquake zone, that wouldn't be far enough
> away.
>
> But that's really a management decision - what disaster coverage do you
> require?
>
> Currently, our copy pool tapes are stored in the other building, and
> management believes that provides sufficient distance.  We are covered for
> fire, flood, explosion.  We are not in an earthquake or hurricane zone.
>
> We are NOT covered for a regional power outage such as would occur in a
> Florida-like hurricane zone.
> But in that case we can take our tapes and move them elsewhere, and what we
> lose is time, not data.
> If this were a commercial enterprise, that time would be a larger concern.
>
> As with any DR situation, you have to figure out YOUR exposures and your
> requirements for resolving them.
>
> With our traditional methods of creating primary pool tapes onsite and
> moving copy pool tapes offsite, you have the slow/uncollocated restore
> problem.
>
> All I am suggesting here is that if we can use newer technology to overcome
> the communication limits, and think outside the box a bit, we should put the
> TSM library offsite, and keep the copy pool tapes onsite.  That would provide
> equal coverage and eliminate the slow/uncollocated restore problem.  (AND
> you eliminate the up-front time required to rebuild the TSM server.)
>
> Just something to think about.
>
> -----Original Message-----
> From: David Longo [mailto:David.Longo AT HEALTH-FIRST DOT ORG]
> Sent: Friday, May 17, 2002 1:18 PM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: DISASTER Client Restores Slow
>
> I would say that technically, that sounds good.  Except for one
> thing.  How close are the two buildings?  I wouldn't want all
> my eggs on the same campus.
>
> David Longo
>
> >>> Wanda.Prather AT JHUAPL DOT EDU 05/17/02 12:49PM >>>
> An idea to think about:
>
> We are looking at the possibility of changing our hardware config to take
> care of this issue.  We are considering MOVING our TSM server out of the
> data center into a different building.  With today's fiber connections the
> technology exists to do that.
>
> The primary pool tapes would stay in the tape robot, in bldg 2.
> Racks in the data center would be used as the "vault" for the copy pool
> tapes.
>
> If we lose the data center, the TSM server stays up and is ready to go, with
> collocated tapes available for restores.  If we lose bldg 2, who cares, our
> data center is OK.  We can take our time rebuilding a TSM server.
>
> The technology is out there, so we should all probably start thinking about
> solutions like this.
>
> If all you have to work with is a DR site where you have to rebuild from
> scratch, this idea won't help you, I know.  In that case I think backupsets
> + incremental restore-by-date to current is probably the fastest way to go.
> Creating backupsets periodically can be automated.  It's not reasonable to
> do for ALL your servers, but for your most critical ones it's a fine idea.
> ************************************************************************
> Wanda Prather
> The Johns Hopkins Applied Physics Lab
> 443-778-8769
> wanda_prather AT jhuapl DOT edu
>
> "Intelligence has much less practical application than you'd think" -
> Scott Adams/Dilbert
> ************************************************************************
>
> -----Original Message-----
> From: Walker, Thomas [mailto:Thomas.Walker AT EMICAP DOT COM]
> Sent: Friday, May 17, 2002 11:08 AM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: DISASTER Client Restores Slow
>
> And using another backup solution won't result in many tape mounts as well?
> TSM might be more mounts than others, but you only have to do one restore.
> Remember, not using incremental forever means that you must restore a
> machine at least two times. How about using backup sets if time is that much
> of an issue? If you have a couple hundred servers, I assume you have enough
> tape drives to make this feasible? Using, say 5 drives, to restore 100
> clients is probably a pipe dream. Also, do you run multiple servers? You can
> easily pass the bus throughput of most machines when trying to restore this
> much data. I guess what I'm saying is that people argue that tsm mounts a
> lot of tapes and appears slow during DR restores, but in reality,  the
> people that complain are usually trying to restore a lot of clients on a
> severely under-sized configuration. I think matching your DR hardware setup
> to your production setup is not a good idea. Most production setups are for
> speed in backing up. This usually means they are not optimized for restore
> speed. Also, prioritizing restores is key. I've cut DR times from originally
> 2 1/2 days of nightmare when I came here to about 17 hours with a fairly big
> setup. In other words, don't just start all restores at once and let 100
> clients fight for 6 drives. Of course it's gonna look slow!
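
The prioritization point above can be sketched in a few lines of Python.
This is only an illustration, not anything TSM provides: the function name,
client names, priorities, and hour figures are all made up, and it assumes
each client restore holds exactly one drive for its whole duration.

```python
import heapq


def schedule_restores(clients, drives):
    """Assign restores to drives in priority order.

    clients: list of (name, priority, hours); a lower priority number
             means a more critical machine (hypothetical convention).
    drives:  number of tape drives available at the DR site.
    Returns {name: hour at which that client's restore finishes}.
    """
    # Each heap entry is the hour at which one drive becomes free.
    free_at = [0.0] * drives
    heapq.heapify(free_at)
    finish = {}
    # sorted() is stable, so equal-priority clients keep their input order.
    for name, _priority, hours in sorted(clients, key=lambda c: c[1]):
        start = heapq.heappop(free_at)   # earliest available drive
        end = start + hours
        finish[name] = end
        heapq.heappush(free_at, end)
    return finish


if __name__ == "__main__":
    # Illustrative figures only.
    clients = [
        ("file1", 3, 4),
        ("db1", 1, 10),
        ("mail1", 2, 6),
        ("web1", 3, 2),
    ]
    for name, hour in schedule_restores(clients, drives=2).items():
        print(f"{name}: done at hour {hour:g}")
```

The critical machines get drives first and the low-priority ones queue
behind them, instead of every client competing for mounts at the same time.
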
>
> -----Original Message-----
> From: Talafous, John G. [mailto:Talafous AT TIMKEN DOT COM]
> Sent: Friday, May 17, 2002 10:34 AM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: DISASTER Client Restores Slow
>
> I am sure TSM will wait. And while we're on this subject, we are looking at
> Disaster Recovery plans and the path we must take using TSM to recover a
> couple hundred servers.  It looks bleak.
>
> We are finding that, due to incremental forever backups, recovery times are
> extremely long because of tape mount after tape mount after tape mount. In a
> real disaster, we expect to take an entire day or more to recover a single
> server. With a limited number of tape drives the recovery time required for
> 100 servers could take weeks.
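
A back-of-the-envelope version of that arithmetic, as a rough Python sketch.
The one-day-per-server and 100-server figures come from the message above;
the drive counts are picked for illustration, the function name is
hypothetical, and the model assumes one drive per server restore with
restores running in fixed waves (no bus or network bottleneck).

```python
import math


def estimate_restore_days(servers, drives, hours_per_server):
    """Rough wall-clock estimate for restoring every server.

    Assumes restores run in waves of `drives` servers at a time and each
    restore takes `hours_per_server` hours on a single drive.
    """
    waves = math.ceil(servers / drives)
    return waves * hours_per_server / 24


if __name__ == "__main__":
    for drives in (5, 6, 20):
        days = estimate_restore_days(servers=100, drives=drives,
                                     hours_per_server=24)
        print(f"{drives} drives: about {days:.0f} days")
```

Under these assumptions, 100 servers at a day each on 6 drives comes to 17
days, which matches the "weeks" estimate; cutting the per-server time helps
as much as adding drives does.
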
>
> Has anyone else run into this dilemma? What is TSM's direction? How can I
> speed up the recovery process?
>
> John G. Talafous              IS Technical Principal
> The Timken Company            Global Software Support
> P.O. Box 6927                 Data Management
> 1835 Dueber Ave. S.W.         Phone: (330)-471-3390
> Canton, Ohio USA  44706-0927  Fax  : (330)-471-4034
> talafous AT timken DOT com           http://www.timken.com
>

--
Jim Kirkman
AIS - Systems
UNC-Chapel Hill
966-5884