ADSM-L

Re: question on configuring large NT client for optimum restore processing

2002-08-17 17:13:41
Subject: Re: question on configuring large NT client for optimum restore processing
From: Don France <DFrance-TSM AT ATT DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sat, 17 Aug 2002 14:03:49 -0700
Interesting approach, Zlatko. I agree, this should work if one is very
serious about restore SLA: your suggestion uses node-level collocation to
simulate Unix filespace-level collocation. A simpler approach could
accomplish the same effect: cap drive-letter size at 100 or 250 GB, or maybe
even 500 GB, then start a new drive letter. (The drive letter is a filespace
on Windows platforms... until/unless you get into the DFS or NTFS virtual
volume game, and then it's not that much different.)

If the customer restore SLA can "tolerate" 10-20 GB/hr, a 300 GB cap with
DIRMC "tricks" will be sufficient; just use high-level directories for
"GrpData" and "UsrData" with multiple restore threads. Using classic (rather
than no-query) restore causes tape mounts to be sorted, at the cost of more
think time on the server. It's another performance trade-off: lots of tapes,
collocation, lots of versions, and a large number of files being restored
all interact to affect restore speed.
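
As a rough sanity check on the filespace cap, restore time is just size
divided by sustained throughput. The Python sketch below is illustrative
only: the function names are made up for this note, the 10-20 GB/hr range
and the 300 GB cap are the figures quoted above, and the 20-hour limit is
the target from the original question.

```python
def restore_hours(size_gb, rate_gb_per_hr):
    """Elapsed hours to restore size_gb at a sustained rate."""
    return size_gb / rate_gb_per_hr


def max_cap_gb(sla_hours, rate_gb_per_hr):
    """Largest filespace that can be restored within sla_hours."""
    return sla_hours * rate_gb_per_hr


if __name__ == "__main__":
    # 300 GB cap and 10-20 GB/hr come from the message; 20 h is the
    # restore target stated in the original question.
    for rate in (10, 20):
        print(f"300 GB at {rate} GB/hr: {restore_hours(300, rate):.0f} h; "
              f"20 h SLA allows up to {max_cap_gb(20, rate):.0f} GB")
```

At the slow end of the range a 300 GB filespace takes about 30 hours, so
whether a given cap is enough depends on where in that range the real
restore rate lands.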

I've supported several emergency server recovery situations. A recent
customer had DLT, no collocation, and DIRMC (worked great); 316 GB was
restored in 30 elapsed hours. That would have been more like 20 hours if
they hadn't over-committed the silo, requiring tape mounts for over 40 tapes
in a 29-slot silo.

Tom ==> BTW, if you are going to use a NAS filer from IBM, you can run the
backups from the filer itself. With a weekly (or monthly) full image, in
concert with a daily (or weekly) differential image, plus the normal daily
progressive incremental (for file-level granularity), you'd get the fastest
file restore *and* server recovery possible... probably saturating the
network at 30 or 300 GB/hr (100 Mbps vs. 1 Gbps).
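
The 30 and 300 GB/hr figures can be checked against raw line rate. This is
a back-of-the-envelope Python sketch, not a measurement: the function name
is hypothetical, it uses decimal units (1 GB = 10^9 bytes), and it ignores
protocol overhead, disk speed, and server think time.

```python
def line_rate_gb_per_hr(mbps):
    """Theoretical maximum GB/hr for a link of the given Mbps (decimal units)."""
    bytes_per_sec = mbps * 1e6 / 8
    return bytes_per_sec * 3600 / 1e9


if __name__ == "__main__":
    # (link speed in Mbps, GB/hr figure quoted in the message)
    for mbps, quoted in ((100, 30), (1000, 300)):
        ceiling = line_rate_gb_per_hr(mbps)
        print(f"{mbps} Mbps: ceiling {ceiling:.0f} GB/hr, "
              f"quoted {quoted} GB/hr = {quoted / ceiling:.0%} of line rate")
```

The theoretical ceilings are 45 and 450 GB/hr, so the quoted figures
correspond to roughly two thirds of line rate.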

Also, have you looked at Snapshot and/or SnapMirror support?!?  IBM NAS
comes with TSM Agent at 4.2 level;  IBMSnap, PSM & DoubleTake components
allow you to protect the NAS-based data on the NAS (so you could mirror each
drive-letter or network share) and run backups (image and file-level)
directly from the NAS. This kind of online/nearline recovery could totally
mitigate your restore SLA concerns: if your SLA states that recovery must be
done in less than 4 hours 99% of the time, you are covered. You only need
tape restores for a site-level or drive-level disaster, which becomes less
than 1% of the failure instances over time (after the first year).  See the
latest RedBook info about Snapshots & Replication with PSM & Double-Take --
built-in components of the IBM NAS, along with TSM client.

Typical NAS customers get a bunch of snap/mirror capacity (sufficient for a
number of days' worth of Snapshots plus some RAID-1 protection of critical
data, RAID-like striping for faster performance, etc.)... so, you probably
won't need TSM tapes as much as in the old days, when all the backup data
resided (exclusively) on tapes.  JBOD devices have gotten soooo cheap, it's
not terribly expensive to keep 7 or 14 days of incremental snapshots online
(in the ~snapshot directory, if using NetApp Snapshot, for example);  once
the end user is told where his snapshot data is stored, he stops calling in
trouble tickets for restores that fall within the snapshot retention period!

Check out the RedBooks on IBM NAS... up and coming, cheap (JBOD) solutions
to large file servers.  See this one, to start
http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg246831.html

Looks like you are in for some *actual* fun, with this project!!!


Don France
Technical Architect -- Tivoli Certified Consultant
Tivoli Storage Manager, WinNT/2K, AIX/Unix, OS/390
San Jose, Ca
(408) 257-3037
mailto:don_france AT att DOT net

Professional Association of Contract Employees
(P.A.C.E. -- www.pacepros.com)



-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Zlatko Krastev
Sent: Saturday, August 17, 2002 9:03 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: question on configuring large NT client for optimum restore
processing


Tom,

Try to emulate virtualmountpoint through separate node names:
-       For each huge directory (acting as a "virtualmountpoint") define a
node. In its dsm.opt file define
--              exclude X:\...\*
--              include X:\<the_dir>\...\*
--              exclude.dir X:\<dir_1>
--              exclude.dir X:\<dir_2>
--              exclude.dir X:\<dir_3>
-       Define an other_dirs node with exclude.dir entries for all
"virtualmountpoints" and without the first exclude and the include.
Thus only the one directory is included; the existing known directories are
exclude.dir-ed and not traversed. If a new directory is created and someone
forgets to exclude it, it will be traversed, but only its structure will be
backed up and not its files. The last node will back up everything except
the "virtualmountpoint" directories.
You can create several scheduler services and add them to the MSCS resource
group for that (one-and-only) drive.
Collocation will come by itself.
Disclaimer: I have never solved such a puzzle nor tested this, but it ought
to work.
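
The per-node option lists described above are mechanical to produce, so
here is a small Python sketch that generates them. It is untested against a
real TSM client, like the suggestion itself. The function names are
hypothetical, and the directory names GrpData and UsrData are borrowed from
the reply purely as examples; only the exclude / include / exclude.dir
statements come from the recipe.

```python
def node_options(drive, own_dir, other_big_dirs):
    """Option lines for the node that backs up only own_dir."""
    lines = [
        f"exclude {drive}:\\...\\*",             # exclude everything on the drive
        f"include {drive}:\\{own_dir}\\...\\*",  # then re-include this node's directory
    ]
    # Do not even traverse the directories owned by the other nodes.
    lines += [f"exclude.dir {drive}:\\{d}" for d in other_big_dirs]
    return lines


def other_dirs_options(drive, big_dirs):
    """Option lines for the catch-all node: everything except the big dirs."""
    return [f"exclude.dir {drive}:\\{d}" for d in big_dirs]


def build_all(drive, big_dirs):
    """Map each (hypothetical) node label to its list of option lines."""
    opts = {}
    for d in big_dirs:
        opts[d] = node_options(drive, d, [o for o in big_dirs if o != d])
    opts["other_dirs"] = other_dirs_options(drive, big_dirs)
    return opts


if __name__ == "__main__":
    for node, lines in build_all("X", ["GrpData", "UsrData"]).items():
        print(f"* options for node {node}")
        print("\n".join(lines))
```

Generating the lists from a single list of directories keeps the nodes
consistent with each other when a new large directory is added.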

Zlatko Krastev
IT Consultant




Please respond to "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
Sent by:        "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To:     ADSM-L AT VM.MARIST DOT EDU
cc:

Subject:        question on configuring large NT client for optimum restore
processing

What's the preferred method of configuring a large NT fileserver for optimum
data recovery speed?

Can I do something with co-location at the filesystem (what IS a filesystem
in NT/2000?) level?

We're bringing in an IBM NAS to replace four existing NT servers and our
recovery time for the existing environment stinks. The main server currently
has something over 800,000 files using 167 GB (current box actually uses NT
file compression, so it's showing as 80 GB on NT). We had to do a recovery
last year (raid array died) and it ran to 40+ hours; I'm getting the
feedback that over 20 hours will be unacceptable.

The TSM server and the client code are relatively recent 4.2 versions and
will be staying at 4.2 for the rest of this year (so any neat features of
TSM 5 would be nice to know but otherwise unusable :-)

To add to the fun and games, this will be an MS cluster environment with
1.2 TB of disk on it. We do have a couple of weeks to play around and try
things out before getting serious. One advantage of MSCS is that disk
compression is not allowed, so that should speed things up a bit on the
restore.

Tom Kauffman
NIBCO, Inc
