ADSM-L

Re: using JBOD as a copy pool?

From: "Mark D. Rodriguez" <mark AT MDRCONSULT DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 15 Nov 2004 22:54:39 -0600
Mike,

I have been following this thread and was a little confused at first.  I
could not tell what it was you wished to accomplish.  It appears you are
asking us whether a solution is valid without really telling us what the
problem is!  If we knew what the problem was, we might have a good
solution that you have not thought of yet.

Now, from your last post it appears that your concern is the speed at
which you could restore 225GB of data.  Well, there are lots of ways to
skin that cat.  First of all, we need to know a little about the data:

- Is it lots of little files, or big files?
- Is it spread across multiple filespaces (on Windows, that means
  drives)?
- Is this all standard backup data, or is some of it backed up using
  the add-on programs for mail or database protection (formerly known
  as TDPs)?
- Is the vast majority of the data static from, say, week to week, or
  are most of the files changing on a weekly basis?
- Does this 225GB include restoring the OS, or is it all just data?

All of these things can have an impact on how you design your
backup/restore strategies.
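
(A quick way to get a picture of the data layout is from an
administrative client; the node name here is just a placeholder:

   query filespace MIKESNODE
   query occupancy MIKESNODE

QUERY FILESPACE shows the filespaces the node has backed up, and QUERY
OCCUPANCY shows how many files and how much space each filespace takes
in each storage pool.)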

A couple of techniques you could look at (assuming the data is spread
across several filespaces) start with doing regular image backups of
your filespaces.  If they don't change much from week to week, then
doing it monthly would be fine; if a large part of the data changes
within any given week, then you need to do it weekly.  Remember, image
backups don't work for the C: (boot) drive.  Image restores (actually
you would use an image-plus-incremental restore) of an entire filespace
are much faster than file-level restores, especially if there is a
large number of small files.  And since each filespace is handled
separately, you can restore several filespaces from several tape drives
concurrently.
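
For example, on the client the commands look something like this (the
drive letters are just placeholders for your filespaces):

   dsmc backup image E:
   dsmc backup image F:

and at restore time you lay down the image and roll the incrementals
forward on top of it:

   dsmc restore image E: -incremental -deletefiles

The -incremental option applies the file-level backups taken since the
image, and -deletefiles removes files that were deleted after the image
was made.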
Also, when you do your regular incremental backups, send the data into
a storage hierarchy that does collocation on a filespace basis.  This
will make sure that there is no tape contention during the restore.
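
On the server side that is a single setting (the pool name is just a
placeholder):

   update stgpool TAPEPOOL collocate=filespace

Keep in mind that collocation only affects data written after you turn
it on; existing tapes do not get regrouped until the data is rewritten,
e.g. by reclamation or MOVE DATA.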
In theory, you should be able to keep a 100Mb Ethernet link fully
loaded with just 2, and definitely with 3, tape drives (assuming modern
drives like LTO2 or 3592), even allowing for mounts and seeks on the
different drives.  At an average throughput of 10MB/sec over Ethernet,
225GB works out to roughly 22,500 seconds, i.e. a 6.25-hour restore.
If you had Gigabit Ethernet you would probably need about 4 or 5 drives
to keep it busy, and you could get your restore done in under an hour!

Of course, all of this is based on you having no other contention for
bandwidth on the network or for the tape drives.  One other limiting
factor, and probably the most important one, is whether the client's
disk subsystem can swallow the data fast enough.  Unfortunately, that
is usually the problem!  We can almost always get the data to the
client much faster than the client can put it onto its disks, and there
is nothing ITSM can do about that.  If the client needs the fastest
possible restores, then it will need the hardware necessary to keep up.

--
Regards,
Mark D. Rodriguez
President MDR Consulting, Inc.

===============================================================================
MDR Consulting
The very best in Technical Training and Consulting.
IBM Advanced Business Partner
SAIR Linux and GNU Authorized Center for Education
IBM Certified Advanced Technical Expert, CATE
AIX Support and Performance Tuning, RS6000 SP, TSM/ADSM and Linux
Red Hat Certified Engineer, RHCE
===============================================================================


Mike wrote:

Hi Paul!

On Mon, 15 Nov 2004, Paul Zarnowski wrote:



At 11:11 AM 11/15/2004, Stapleton, Mark wrote:


From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On
Behalf Of Mike


We have a few servers where the TSM GUI estimates nearly 3.5
days to restore all the files. We're talking now about getting
something like a JBOD as an additional copypool (is that the
right term?). The data flow would be:

   node -> storage pool -> tape pool
                        -> JBOD copy pool

The JBOD copy pool would self-expire after 3 days (depending
on how much is actually stored to the JBOD).

With the nightly backups copied to the JBOD, would this
help/speed up a catastrophic restore (catastrophic of the
node, not of TSM)?


I'm curious as to how you plan to "self-expire" the JBOD copy pool.
Remember that there will be as many files in the copy pool as there
are in the primary pool.


Mike,

I don't see how what you are proposing would work - I have the same
question that Mark has.  Also, TSM will not restore directly from the copy
pool UNLESS volumes in the primary pool have been marked destroyed.  I
don't think this is what you're looking for.  However, you might consider
the following instead:

Data flow: node -> JBOD storage pool -> tape pool
                                     -> tape copy pool

For the JBOD storage pool, use a MIGDELAY of 3 days.  This will ensure that
objects stay on the JBOD storage pool for at least 3 days before being
migrated.
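
For example, a definition along these lines (the names are
placeholders, and the disk volumes would be created first, e.g. with
dsmfmt):

   define stgpool JBODPOOL disk nextstgpool=TAPEPOOL migdelay=3

One caveat: with the default MIGCONTINUE=YES, files can still migrate
sooner than 3 days if the pool fills up, so the JBOD needs to be sized
to hold at least 3 days' worth of backups.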

Be careful - if your JBOD storage pool is random access and gets large,
then if you should ever lose your database, an audit of all of the JBOD
volumes could take a very long time to complete.  As with many things,
there are tradeoffs to consider.



Honestly I don't yet know how or if it will work. The data flow
I'm thinking of is:

node -> primary storage pool -> tape pool
                             -> tape copy pool
                             -> JBOD pool

Once the node has sent its data to the storage pool, the storage
pool migrates everything to the other three pools.  The JBOD just
expires; it does not migrate to another pool.  Is there some way to
create a priority or hierarchy of restoration methods such that the
JBOD is checked first for the files, and then the tape pools are
checked?

What started this is an estimate from a Windows GUI that restoring
~225GB of data would take ~105 hours.  The Windows admin finds this
unacceptable.  The goal is to restore most of the data in a short
amount of time, say restore the fileset in 5-6 hours (fourth point
of contact), with incremental updates to follow as needed.

Mike



