Re: [ADSM-L] Fw: DISASTER: How to do a LOT of restores?

Subject: Re: [ADSM-L] Fw: DISASTER: How to do a LOT of restores?
From: Maria Ilieva <mariika AT GMAIL DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 22 Jan 2008 10:21:40 -0800
The procedure for setting up an active-data pool (assuming you have TSM
version 5.4 or later) is the following; a command-level sketch appears after
the steps:
1. Create a sequential-access storage pool (a FILE device class on disk, or
tape), specifying POOLTYPE=ACTIVEDATA
2. Update the node's policy domain(s), specifying ACTIVEDESTINATION=<the
active-data pool>
3. Issue COPY ACTIVEDATA <primary pool> <active-data pool> for the primary
pool(s) holding the nodes' data
This process incrementally copies the nodes' active backup data, so it can be
restarted if needed. HSM-migrated and archived data is not copied into the
active-data pool!
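
For example, the whole sequence might look roughly like this (the device
class, pool, and domain names are only placeholders here):

   DEFINE DEVCLASS filedevc DEVTYPE=FILE DIRECTORY=/tsmfile
   DEFINE STGPOOL activepool filedevc POOLTYPE=ACTIVEDATA MAXSCRATCH=100
   UPDATE DOMAIN standard ACTIVEDESTINATION=activepool
   COPY ACTIVEDATA tapepool activepool WAIT=YES

If the COPY ACTIVEDATA process is cancelled or the server goes down, reissuing
the same command picks up roughly where it left off, since only active
versions not already in the pool are copied.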

Maria Ilieva

> ---
> W. Curtis Preston
> Backup Blog @ www.backupcentral.com
> VP Data Protection, GlassHouse Technologies
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of James R Owen
> Sent: Tuesday, January 22, 2008 9:32 AM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: [ADSM-L] Fw: DISASTER: How to do a LOT of restores?
>
>
> Roger,
> You certainly want to get a "best guess" list of likely priority #1 restores.
> If your tapes really are mostly uncollocated, you will probably experience
> lots of tape volume contention when you attempt to use MAXPRocess > 1 or to
> run multiple simultaneous restore, move nodedata, or export node operations.
>
> Use Query NODEData to see how many tapes might have to be read for each
> node to be restored.
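>
> For example, something like the following (NODE1 is just a placeholder name)
> lists each sequential volume holding that node's data and how much of it sits
> on each volume - roughly the number of mounts one restore will need:
>
>     Query NODEData node1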
>
> To minimize tape mounts, if you can wait for this operation to complete,
> I believe you should try to move or export all of the nodes' data in a
> single operation.
>
> Here are possible disadvantages with using MOVe NODEData (a sample command
> is sketched below the list):
>   - it does not enable you to select to move only the Active backups for
>     these nodes [so you might have to move lots of extra inactive backups]
>   - you probably cannot effectively use MAXPRocess=N (>1) nor run multiple
>     simultaneous MOVe NODEData commands because of contention for your
>     uncollocated volumes.
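>
> A single-node move might look something like this (the pool names are
> placeholders):
>
>     MOVe NODEData node1 FROMstgpool=tapepool TOstgpool=collocpool MAXPRocess=1
>
> but, as above, with uncollocated input volumes a higher MAXPRocess or several
> of these running in parallel will mostly just fight over the same tapes.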
>
> If you have or can set up another TSM server, you could do a Server-Server
> EXPort:
>         EXPort Node node1,node2,... FILEData=BACKUPActive TOServer=... [Preview=Yes]
> moving only the nodes' active backups to a disk pool on the other TSM server.
> Using this technique, you can move only the minimal necessary data.  I don't
> see any way to multithread or run multiple simultaneous commands to read more
> than one tape at a time, but given your drive constraints and uncollocated
> volumes, you will probably discover that you cannot effectively restore, move,
> or export from more than one tape at a time, no matter which technique you
> try.  Your Query NODEData output should show you which nodes, if any, do
> *not* have backups on the same tapes.
>
> Try running a preview EXPort Node command for single or multiple nodes to get
> some idea of what tapes will be mounted and how much data you will need to
> export.
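>
> For instance (node names are placeholders):
>
>     EXPort Node node1,node2 FILEData=BACKUPActive Preview=Yes
>
> reports how many files and bytes would be exported, without moving any data;
> the totals show up as messages in the server activity log.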
>
> Call me if you want to talk about any of this.
> --
> Jim.Owen AT Yale DOT Edu   (w#203.432.6693, Verizon c#203.494.9201)
>
> Roger Deschner wrote:
> > MOVE NODEDATA looks like it is going to be the key. I will simply move
> > the affected nodes into a disk storage pool, or into our existing
> > collocated tape storage pool. I presume it should be possible to restart
> > MOVE NODEDATA, in case it has to be interrupted or if the server
> > crashes, because what it does is not very different from migration or
> > reclamation. This should be a big advantage over GENERATE BACKUPSET,
> > which is not even as restartable as a common client restore. A possible
> > strategy is to do the long, laborious, but restartable, MOVE NODEDATA
> > first, and then do a very quick, painless, regular client restore or
> > GENERATE BACKUPSET.
> >
> > Thanks to all! Until now, I was not fully aware of MOVE NODEDATA.
> >
> > B.T.W. It is an automatic tape library, Quantum P7000. We graduated from
> > manual tape mounting back in 1999.
> >
> > Roger Deschner      University of Illinois at Chicago     rogerd AT uic DOT edu
> >
> >
> > On Tue, 22 Jan 2008, Nicholas Cassimatis wrote:
> >
> >> Roger,
> >>
> >> If you know which nodes are to be restored, or at least have some that are
> >> good suspects, you might want to run some "move nodedata" commands to try
> >> to get their data more contiguous.  If you can get some of that DASD that's
> >> coming "real soon," even just to borrow it, that would help out
> >> tremendously.
> >>
> >> You say "tape" but never "library" - are you on manual drives? (Please say
> >> No, please say No...)  Try setting the mount retention high on them, and
> >> kick off a few restores at once.  You may get lucky and already have the
> >> needed tape mounted, saving you a few mounts.  If that's not working (it's
> >> impossible to predict which way it will go), drop the mount retention to 0
> >> so the tape ejects immediately, so the drive is ready for a new tape
> >> sooner.  And if you are, try to recruit the people who haven't approved
> >> spending for the upgrades to be the "picker arm" for you - I did that to an
> >> account manager on a DR Test once, and we got the library approved the next
> >> day.
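> >>
> >> Mount retention is a device-class setting, so the knob is something like
> >> this (LTOCLASS is a placeholder device class name; the value is minutes):
> >>
> >>     UPDATE DEVCLASS ltoclass MOUNTRETENTION=60
> >>     UPDATE DEVCLASS ltoclass MOUNTRETENTION=0
> >>
> >> The first keeps an idle tape mounted for an hour in case another restore
> >> wants it; the second dismounts idle tapes immediately to free the drive.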
> >>
> >> The thoughts of your fellow TSMers are with you.
> >>
> >> Nick Cassimatis
> >>
> >> ----- Forwarded by Nicholas Cassimatis/Raleigh/IBM on 01/22/2008 08:08 AM -----
> >>
> >> "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 01/22/2008
> >> 03:40:07 AM:
> >>
> >>> We like to talk about disaster preparedness, and one just happened here
> >>> at UIC.
> >>>
> >>> On Saturday morning, a fire damaged portions of the UIC College of
> >>> Pharmacy Building. It affected several laboratories and offices. The
> >>> Chicago Fire Department, wearing hazmat moon suits due to the highly
> >>> dangerous contents of the laboratories, put it out efficiently in about
> >>> 15 minutes. The temperature was around 0F (-18C), which compounded the
> >>> problems - anything that took on water became a block of ice.
> >>> Fortunately nobody was hurt; only a few people were in the building on a
> >>> Saturday morning, and they all got out safely.
> >>>
> >>> Now, both the good news and the bad news is that many of the damaged
> >>> computers were backed up to our large TSM system. The good news is that
> >>> their data can be restored.
> >>>
> >>> The bad news is that their data can be restored. And so now it must be.
> >>>
> >>> Our TSM system is currently an old-school tape-based setup from the ADSM
> >>> days. (Upgrades involving a lot more disk coming real soon!) Most of the
> >>> nodes affected are not collocated, so I have to plan to do a number of
> >>> full restores of nodes whose data is scattered across numerous tape
> >>> volumes each. There are only 8 tape drives, and they are kept busy since
> >>> this system is in a heavily-loaded, about-to-be-upgraded state. (Timing
> >>> couldn't be worse; Murphy's Law.)
> >>>
> >>> TSM was recently upgraded to version 5.5.0.0. It runs on AIX 5.3 with a
> >>> SCSI library. Since it is a v5.5 server, there may be new facilities
> >>> available that I'm not aware of yet.
> >>>
> >>> I have the luxury of a little bit of time in advance. The hazmat guys
> >>> aren't letting anyone in to assess damage yet, so we don't know which
> >>> client node computers are damaged or not. We should know in a day or
> >>> two, so in the meantime I'm running as much reclamation as possible.
> >>>
> >>> Given that this is our situation, how can I best optimize these
> >>> restores? I'm looking for ideas to get the most restoration done for
> >>> this disaster, while still continuing normal client-backup, migration,
> >>> expiration, reclamation cycles, because somebody else unrelated to this
> >>> situation could also need to restore...
> >>>
> >>> Roger Deschner      University of Illinois at Chicago     rogerd AT uic DOT edu
>