Subject: Re: [ADSM-L] Fw: DISASTER: How to do a LOT of restores? [like Steve H said, but...]
From: James R Owen <Jim.Owen AT YALE DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 22 Jan 2008 21:36:55 -0500
DR strategy using an ACTIVEdata STGpool is like Steve H said, but
with minor additions and a major (but temporary) caveat:

COPY ACTIVEdata is not quite ready for this DR strategy yet:

See APAR PK59507:  COPy ACTIVEdata performance can be significantly degraded
(until TSM 5.4.3/5.5.1) unless *all* nodes are enabled for the ACTIVEdata 
STGpool.

http://www-1.ibm.com/support/docview.wss?rs=663&context=SSGSG7&dc=DB550&uid=swg1PK59507&loc=en_US&cs=UTF-8&lang=en&rss=ct663tivoli

Here's a slightly improved description of how it should work:

DEFine STGpool actvpool ... POoltype=ACTIVEdata -
        COLlocate=[No/GRoup/NODe/FIlespace] ...
COPy DOmain old... new...
UPDate DOmain new... ACTIVEDESTination=actvpool
ACTivate POlicy new... somePolicy
Query SCHedule old... * NOde=node1,...,nodeN    [note old... sched. assoc's]

UPDate NOde nodeX DOmain=new...                 [for each node[1-N]]
DEFine ASSOCiation new... [someSched] nodeX     [as previously associated]
COpy ACTIVEdata oldstgpool actvpool     [for each oldstgpool w/active backups]

[If no other DOmain except new... has ACTIVEDESTination=actvpool,
the COpy ACTIVEdata command(s) will copy the Active backups from specified
nodes node[1-N] into the ACTIVEdata STGpool actvpool to expedite DR for...]

[But, not recommended until TSM 5.4.3/5.5.1 fixes APAR PK59507!]
--
Jim.Owen AT Yale DOT Edu   (203.432.6693)

Steven Harris wrote:
Nick

I may well have a flawed understanding here but....

Set up an active-data pool
clone the domain containing the servers requiring recovery
set the ACTIVEDATAPOOL parameter on the cloned domain
move the servers requiring recovery to the new domain,
Run COPY ACTIVEDATA on the primary tape pool

Since only the nodes we want are in the domain with the ACTIVEDATAPOOL
parameter specified, won't only the data from those nodes be copied?

Regards

Steve

Steven Harris
TSM Admin, Sydney, Australia

"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 23/01/2008
11:38:17 AM:

For this scenario, the problem with Active Storagepools is it's a
pool-to-pool relationship.  So ALL active data in a storagepool would be
copied to the Active Pool.  Not knowing what percentage of the nodes on the
TSM Server will be restored, but assuming they're all in one storage pool,
you'd probably want to "move nodedata" them to another pool, then do the
"copy activedata."  Two steps, and needs more resources.  Just doing "move
nodedata" within the same pool will semi-collocate the data (see Note
below).  Obviously, a DASD pool, for this circumstance, would be best, if
it's available, but even cycling the data within the existing pool will
have benefits.
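
For illustration, a rough sketch of that two-step approach in admin-command
form (node and pool names here are made up, and the active-data pool must
already be an ACTIVEDESTINATION of the nodes' domain):

    MOVE NODEDATA node1,node2 FROMSTGPOOL=tapepool TOSTGPOOL=diskpool
    COPY ACTIVEDATA diskpool actvpool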

Note:  Semi-collocated, as each process will make all of the named node's
data contiguous, even if it ends up on the same media with another node's
data.  Turning on collocation before starting the jobs, and marking all
filling volumes read-only, will give you separate volumes for each node,
but requires a decent scratch pool to try.
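
If it helps, those two knobs look roughly like this (the pool name is a
placeholder):

    UPDATE STGPOOL tapepool COLLOCATE=NODE
    UPDATE VOLUME * ACCESS=READONLY WHERESTGPOOL=tapepool WHERESTATUS=FILLING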

Nick Cassimatis

----- Forwarded by Nicholas Cassimatis/Raleigh/IBM on 01/22/2008 07:25 PM -----

"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 01/22/2008
01:58:11 PM:

Are files that are no longer active automatically expired from the
activedata pool when you perform the latest COPY ACTIVEDATA?  This would
mean that, at some point, you would need to do reclamation on this pool,
right?

It would seem to me that this would be a much better answer to the OP's
question.  Instead of doing a MOVE NODEDATA (which requires moving ALL of
the node's files), or doing an EXPORT NODE (which requires a separate
server), he can just create an ACTIVEDATA pool, then perform a COPY
ACTIVEDATA into it while he's preparing for the restore.  Putting said
pool on disk would be even better, of course.

I was just discussing this with another one of our TSM experts, and he's
not as bullish on it as I am.  (It was an off-list convo, so I'll let him
go nameless unless he wants to speak up.)  He doesn't like that you can't
use a DISK type device class (disk has to be listed as FILE type).  He
also has issues with the resources needed to create this "3rd copy" of
the data.  He said, "Most customers have trouble getting backups complete
and creating their offsite copies in a 24 hour period and would not be
able to complete a third copy of the data."  Add to that the possibility
of doing reclamation on this pool and you've got even more work to do.

He's more of a fan of group collocation and the multisession restore
feature.  I think this has more value if you're restoring fewer clients
than you have tape drives.  Because if you collocate all your active
files, then you'll only be using one tape drive per client.  If you've
got 40 clients to restore and 20 tape drives, I don't see this slowing
you down.  But if you've got one client to restore, and 20 tape drives,
then the multisession restore would probably be faster than a collocated
restore.
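
For reference, group collocation is set up roughly like this (group, node,
and pool names are invented); how many sessions a multisession restore can
open is governed on the client side by the RESOURCEUTILIZATION option:

    DEFINE COLLOCGROUP critnodes
    DEFINE COLLOCMEMBER critnodes node1,node2,node3
    UPDATE STGPOOL tapepool COLLOCATE=GROUP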

I still think it's a strong feature whose value should be investigated
and discussed -- even if you only use it for the purpose we're discussing
here.  If you know you're in a DR scenario and you're going to be
restoring multiple systems, why wouldn't you create an ACTIVEDATA pool
and do a COPY ACTIVEDATA instead of a MOVE NODEDATA?

OK, here's another question.  Is it assumed that the ACTIVEDATA pool has
node-level collocation on?  Can you use group collocation instead?
Then maybe my friend and I could both get what we want?

Just throwing thoughts out there.

---
W. Curtis Preston
Backup Blog @ www.backupcentral.com
VP Data Protection, GlassHouse Technologies

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of Maria Ilieva
Sent: Tuesday, January 22, 2008 10:22 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Fw: DISASTER: How to do a LOT of restores?

The procedure for creating active data pools (assuming you have TSM
version 5.4 or later) is the following:
1. Create a FILE type disk pool or a sequential TAPE pool, specifying
POOLTYPE=ACTIVEDATA
2. Update the node's domain(s), specifying ACTIVEDESTINATION=<created active
data pool>
3. Issue COPY ACTIVEDATA <node_name>
This process incrementally copies the node's active data, so it can be
restarted if needed. HSM-migrated and archived data is not copied into
the active data pool!
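
A tentative sketch of those steps as admin commands (device class, pool,
and domain names are placeholders; note that COPY ACTIVEDATA is actually
issued against the primary pool holding the data, per the pool-to-pool
relationship mentioned earlier in the thread):

    DEFINE STGPOOL actvpool filedevc POOLTYPE=ACTIVEDATA MAXSCRATCH=200
    UPDATE DOMAIN standard ACTIVEDESTINATION=actvpool
    COPY ACTIVEDATA tapepool actvpool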

Maria Ilieva

---
W. Curtis Preston
Backup Blog @ www.backupcentral.com
VP Data Protection, GlassHouse Technologies

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of James R Owen
Sent: Tuesday, January 22, 2008 9:32 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Fw: DISASTER: How to do a LOT of restores?


Roger,
You certainly want to get a "best guess" list of likely priority#1 restores.
If your tapes really are mostly uncollocated, you will probably experience
lots of tape volume contention when you attempt to use MAXPRocess > 1 or to
run multiple simultaneous restore, move nodedata, or export node operations.

Use Query NODEData to see how many tapes might have to be read for each
node to be restored.
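
A quick way to check (node name is hypothetical):

    QUERY NODEDATA node1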

To minimize tape mounts, if you can wait for this operation to complete,
I believe you should try to move or export all of the nodes' data in a
single operation.

Here are possible disadvantages with using MOVe NODEData:
  - does not enable you to select to move only the Active backups for
        these nodes [so you might have to move lots of extra inactive backups]
  - you probably can not effectively use MAXPRocess > 1 nor run multiple
        simultaneous MOVe NODEData commands because of contention for your
        uncollocated volumes.

If you have or can set up another TSM server, you could do a
Server-Server EXPort:
        EXPort Node node1,node2,... FILEData=BACKUPActive TOServer=... [Preview=Yes]
moving only the nodes' active backups to a diskpool on the other TSM
server.  Using this technique, you can move only the minimal necessary
data.  I don't see any way to multithread or run multiple simultaneous
commands to read more than one tape at a time, but given your drive
constraints and uncollocated volumes, you will probably discover that you
can not effectively restore, move, or export from more than one tape at a
time, no matter which technique you try.  Your Query NODEData output
should show you which nodes, if any, do *not* have backups on the same
tapes.

Try running a preview EXPort Node command for single or multiple nodes to
get some idea of what tapes will be mounted and how much data you will
need to export.
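
For example, a preview run might look like this (node and server names are
placeholders):

    EXPORT NODE node1,node2 FILEDATA=BACKUPACTIVE TOSERVER=drserver PREVIEW=YES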

Call me if you want to talk about any of this.
--
Jim.Owen AT Yale DOT Edu   (w#203.432.6693, Verizon c#203.494.9201)

Roger Deschner wrote:
MOVE NODEDATA looks like it is going to be the key. I will simply move
the affected nodes into a disk storage pool, or into our existing
collocated tape storage pool. I presume it should be possible to restart
MOVE NODEDATA, in case it has to be interrupted or if the server crashes,
because what it does is not very different from migration or reclamation.
This should be a big advantage over GENERATE BACKUPSET, which is not even
as restartable as a common client restore. A possible strategy is to do
the long, laborious, but restartable, MOVE NODEDATA first, and then do a
very quick, painless, regular client restore or GENERATE BACKUPSET.
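
As a rough sketch of that strategy (node, pool, and device class names
are invented):

    MOVE NODEDATA node1 FROMSTGPOOL=oldtapepool TOSTGPOOL=collocpool
    GENERATE BACKUPSET node1 fireset DEVCLASS=filedevc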

Thanks to all! Until now, I was not fully aware of MOVE NODEDATA.

B.T.W. It is an automatic tape library, Quantum P7000. We graduated from
manual tape mounting back in 1999.

Roger Deschner      University of Illinois at Chicago
rogerd AT uic DOT edu

On Tue, 22 Jan 2008, Nicholas Cassimatis wrote:

Roger,

If you know which nodes are to be restored, or at least have some that
are good suspects, you might want to run some "move nodedata" commands to
try to get their data more contiguous.  If you can get some of that DASD
that's coming "real soon," even just to borrow it, that would help out
tremendously.

You say "tape" but never "library" - are you on manual drives?
(Please say
No, please say No...)  Try setting the mount retention high on
them,
and
kick off a few restores at once.  You may get lucky and already
have
the
needed tape mounted, saving you a few mounts.  If that's not
working
(it's
impossible to predict which way it will go), drop the mount
retention
to 0
so the tape ejects immediately, so the drive is ready for a new
tape
sooner.  And if you are, try to recruit the people who haven't
approved
spending for the upgrades to be the "picker arm" for you - I did
that
to an
account manager on a DR Test once, and we got the library approved
the next
day.
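
For reference, mount retention is a device-class parameter; a sketch with
an invented device class name:

    UPDATE DEVCLASS ltoclass MOUNTRETENTION=60
    UPDATE DEVCLASS ltoclass MOUNTRETENTION=0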

The thoughts of your fellow TSMers are with you.

Nick Cassimatis

----- Forwarded by Nicholas Cassimatis/Raleigh/IBM on 01/22/2008 08:08 AM -----

"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 01/22/2008
03:40:07 AM:

We like to talk about disaster preparedness, and one just happened here
at UIC.

On Saturday morning, a fire damaged portions of the UIC College of
Pharmacy Building. It affected several laboratories and offices. The
Chicago Fire Department, wearing hazmat moon suits due to the highly
dangerous contents of the laboratories, put it out efficiently in about
15 minutes. The temperature was around 0F (-18C), which compounded the
problems - anything that took on water became a block of ice.
Fortunately nobody was hurt; only a few people were in the building on a
Saturday morning, and they all got out safely.

Now, both the good news and the bad news is that many of the damaged
computers were backed up to our large TSM system. The good news is that
their data can be restored.

The bad news is that their data can be restored. And so now it must be.
Our TSM system is currently an old-school tape-based setup from the ADSM
days. (Upgrades involving a lot more disk coming real soon!) Most of the
nodes affected are not collocated, so I have to plan to do a number of
full restores of nodes whose data is scattered across numerous tape
volumes each. There are only 8 tape drives, and they are kept busy since
this system is in a heavily-loaded, about-to-be-upgraded state. (Timing
couldn't be worse; Murphy's Law.)

TSM was recently upgraded to version 5.5.0.0. It runs on AIX 5.3 with a
SCSI library. Since it is a v5.5 server, there may be new facilities
available that I'm not aware of yet.

I have the luxury of a little bit of time in advance. The hazmat guys
aren't letting anyone in to assess damage yet, so we don't know which
client node computers are damaged or not. We should know in a day or
two, so in the meantime I'm running as much reclamation as possible.
Given that this is our situation, how can I best optimize these
restores? I'm looking for ideas to get the most restoration done for
this disaster, while still continuing normal client-backup, migration,
expiration, reclamation cycles, because somebody else unrelated to this
situation could also need to restore...

Roger Deschner      University of Illinois at Chicago
rogerd AT uic DOT edu

