ADSM-L

Re: [ADSM-L] TSM vs. Legato Networker Comparison

2007-07-25 19:23:17
From: Stuart Lamble <adsm AT CAROUSEL.ITS.MONASH.EDU DOT AU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 26 Jul 2007 08:57:09 +1000
On 26/07/2007, at 2:54 AM, Schneider, John wrote:

Greetings,
        We have been a TSM shop for many years, but EMC came to our
management with a proposal to replace our TSM licenses with Legato
Networker, at a better price than what we are paying for TSM today.
This came right on the heels of paying our large TSM license bill, and
so it got management's attention.
        We have an infrastructure of 15 TSM servers and about 1000
clients, so this would be a large and painful migration.  It would
also require a great deal of new hardware and consultant costs during
the migration, which would detract from the cost savings.
        So instead of jumping from one backup product to another based
on price alone, we have been asked to do an evaluation between the two
products.  Do any of you have any feature comparisons between the two
products that would give me a head start?

Funny you say this. Monash was a Legato (now owned by EMC) shop until
the TSM migration around 3-4 years ago; I suspect that a large number
of the problems that were perceived to be Networker's fault were
actually the fault of the aging DLT silos and drives that underlay
Networker. I still have fond memories of those silos; they gave me a
great deal of callout pay every time they had a stuck cartridge or
similar. :-)

I also suspect that the greater reliability we've had since putting
in TSM is more because we also got in new tape silos (LTO2, now half
LTO2 and half LTO3, and soon to be half LTO3 and half LTO4) - if we'd
stuck with the DLT silos, we'd still be in a world of pain,
regardless of the software.

There are plusses and minuses to both products. Some points to consider:

  * Networker uses the traditional "full plus incrementals, or dump
levels" system. Monash used a pattern of "full once a month;
incremental every other day; and a dump level interwoven" - so, for
example, it might go "full, incremental, level 8, incremental, level
7, incremental, level 9, level 2, incremental, level 8, incremental,
etc." - the idea being to minimise the number of backups needed to
restore a system.
  * Networker indexes are somewhat analogous to the TSM database. In
theory, you can scan each tape to rebuild the indexes if they're
lost; in practice, if you lose the indexes, you're pretty much dead -
there's just too much data to scan if the system is more than
moderately sized. Yes, Networker backs up the indexes each day. :)
  * At least the versions of Networker we used (up to 7.x) don't
support the idea of staging to disk - everything goes directly to
tape. However, data streams from multiple clients are multiplexed
onto tape to get the write speeds up. This is good for backups, but
does make recovery slower (since the data read will include a lot of
data for other clients.)
  * No more reclamation or copy pools to deal with (because of the
traditional full/incremental/dump level system). So the burden placed
on the tape drives is probably going to be significantly lower
(although you will be backing up more data each night than you would
with TSM.)
  * I don't think Networker has anything analogous to TSM's scratch
pool: volumes belong to a pool of tapes, and there's no shuffling
between pools. So if the "standard" pool has a hundred tapes
available for use, but the "database" pool is out of tapes and needs
one more, you need to manually intervene. This *may* have been
because of the way we configured Networker, though, and it may also
have changed in the interim. Note that you *have* to have a separate
pool of tapes for index backups.
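To make the dump-level point in the first bullet concrete, here is a rough Python sketch that works out which backups you would have to read to restore a system to its most recent backup. It assumes the classic dump-level rule (a level N backup holds everything changed since the most recent backup at a lower level) and treats an incremental as depending on whatever backup came immediately before it. Those rules, and the names FULL, INCR and restore_chain, are my own illustration, not anything from Networker itself, so check the behaviour of your version before relying on it.

```python
FULL = 0        # a full backup, i.e. dump level 0
INCR = "incr"   # an incremental: changes since the previous backup of any kind


def restore_chain(schedule):
    """Return the positions (0-based) of the backups needed to restore
    to the state captured by the last backup in `schedule`.

    `schedule` is a list of FULL, INCR, or an integer dump level 1-9,
    in the order the backups were taken.
    """
    if not schedule:
        return []
    chain = []
    i = len(schedule) - 1
    while True:
        chain.append(i)
        level = schedule[i]
        if level == FULL:
            break
        if level == INCR:
            # Assumed rule: an incremental builds on the backup just before it.
            base = i - 1
        else:
            # Assumed rule: a level N backup builds on the most recent
            # earlier backup with a strictly lower level.
            base = next(
                (j for j in range(i - 1, -1, -1)
                 if schedule[j] != INCR and schedule[j] < level),
                -1,
            )
        if base < 0:
            raise ValueError("no full backup to restore from")
        i = base
    return chain[::-1]


# The example pattern from the bullet above.
pattern = [FULL, INCR, 8, INCR, 7, INCR, 9, 2, INCR, 8, INCR]
needed = restore_chain(pattern)
```

Under those assumptions, restoring after the eleventh backup in that pattern needs only four of them (the full, the level 2, the following level 8, and the final incremental), where a plain "full plus daily incrementals" scheme would need all eleven.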

My honest assessment mirrors that of the other people who have
replied: use this as an opportunity to negotiate better pricing from
IBM, and point out to the powers that be that there are risks
involved with moving to a different backup product. There's nothing
wrong with Networker, it's a good system, but you aren't familiar
with it; it takes time with any new product to learn the tricks of
the trade. It's only in the past year or two that we've started to
feel more competent with TSM, as we've found and dealt with problems
in the production system which never showed up (and would never show
up) in the smaller scale proof of concept.

You also should note that it took Monash a couple of years to finish
the migration from Networker to TSM; I would expect a migration in
the other direction would take at least a year. I definitely would
not advise a dramatic cut-over - do a small number of servers at a
time to make sure you're not pushing the server too hard (and
besides, you want to stagger the full backups so they don't all take
place on the same day ...)

Oh, one other point that comes directly from Monash's experience with
Networker (assuming you do go down that path): we had a number of
large servers (mail in particular) that would take a very long time
to do a complete full backup. We ended up setting Networker up to
stagger the full backups on their filesystems: system filesystems on
day 1; mailbox filesystem 1 on day 3; mailbox filesystem 2 on day 5;
etc. Whether this is likely to be an issue for you depends on your
system; the mail servers in question used an mh-like directory tree:
one file per email message, which is pretty much the pathological
case for any backup system (especially since the mail volume is
massive: I don't know how many thousands of staff and students there
are across our campuses, but it would not be exaggerating to say tens
of thousands, if not more.)
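The staggering described above amounts to giving each filesystem its own full-backup day, spaced a fixed number of days apart. A minimal Python sketch of that idea follows; the function name, the two-day gap, and the 28-day cycle are assumptions chosen to match the "day 1, day 3, day 5" example, not Networker configuration.

```python
def stagger_fulls(filesystems, start_day=1, gap_days=2, days_in_cycle=28):
    """Assign each filesystem a day of the cycle for its full backup.

    Days are numbered from 1. The first filesystem gets `start_day`, and
    each later one is pushed `gap_days` further on, wrapping around at
    the end of the (assumed) cycle.
    """
    schedule = {}
    for n, fs in enumerate(filesystems):
        schedule[fs] = (start_day - 1 + n * gap_days) % days_in_cycle + 1
    return schedule


# Hypothetical filesystem labels for a mail server like the one described.
plan = stagger_fulls(["system", "mailbox1", "mailbox2"])
# plan == {"system": 1, "mailbox1": 3, "mailbox2": 5}
```

With more filesystems than fit in one cycle the days wrap around and start to collide, at which point you would want a smaller gap or a longer cycle.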

It's not a direct comparison, but hopefully there's some information
in there that you'll find useful. Whatever happens, I wish you luck.