[Veritas-bu] Datacenter versus TSM

2003-03-12 18:22:07
Subject: [Veritas-bu] Datacenter versus TSM
From: vaxzilla AT jarai DOT org (Brian Chase)
Date: Wed, 12 Mar 2003 15:22:07 -0800 (PDT)
On Wed, 12 Mar 2003, David A. Chapa wrote:

> Well Dan I did a comparative analysis of ADSM and NBU a few years ago for a
> client.  I haven't looked at TSM much recently, but I remember from meetings
> with IBM that they were going to make ADSM more like NetBackup with regards
> to the scalability and master/media concept.  This may be completely wrong
> with today's product, but this is what I remember from doing this work.

I've run both ADSM v3 and NetBackup in reasonably sized multi-terabyte
environments.  There are pluses and minuses to each.  I have some
observations and corrections to offer:

> In the past ADSM (pre-TSM) had some very significant problems with scaling
> after it was installed.
>
> 1.  Use of a proprietary Database meant that you had to pre-allocate disk
> space to be used for your database, if you outgrew this space, it was not an
> easy task to re-size

This statement, as a whole, is false.  Although TSM does use a proprietary
database, it can be easily expanded.  The database consists of DB
volumes, which can be either raw disk volumes or special files
created on a regular filesystem.  To expand the database,
you add more DB volumes.  In my setup, I used DB volume files on a
regular filesystem.

> 2.  Performance tests that were run in their NY facility claimed some very
> large amount of data (~1TB) was backed up in a very small window (~1hr?).
> Here's the important thing to realize: they were able to achieve those
> speeds because they had 16 individual ADSM servers backing up an ORACLE
> database.  These 16 servers didn't know about each other and maintained
> their own database, no single repository for meta data as with NBU.
>
> The other thing that I noted was their paradigm for backup was different,
> Incremental forever.  Their spin is why should you back up the data over and
> over again if you don't need to.

Their backup model, if it's compatible with one's environment, is really
/really/ nice.  You not only reduce network traffic with their scheme,
but you end up saving considerable amounts of tape.
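
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python.  The figures (1000 GB of client data, 20 GB changing per day, a
28 day window) are made up for illustration, and the model ignores
retention and expiry entirely; it only compares how much data each
scheme moves over the network and onto tape.

```python
def data_moved_weekly_full(total_gb, daily_change_gb, days):
    """One full backup every 7th day (starting on day 0), incrementals otherwise."""
    fulls = (days + 6) // 7
    return fulls * total_gb + (days - fulls) * daily_change_gb


def data_moved_incremental_forever(total_gb, daily_change_gb, days):
    """One initial full backup, then only the changed data on every later day."""
    return total_gb + (days - 1) * daily_change_gb


# Hypothetical environment: 1000 GB total, 20 GB/day of change, 28 days.
print(data_moved_weekly_full(1000, 20, 28))          # 4480
print(data_moved_incremental_forever(1000, 20, 28))  # 1540
```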

> Good point, and I kind of like this
> idea...however, when I added it all up I became increasingly more concerned
> about database corruption and recoverability.  What if your backup database
> had problems?  What was the course of action to restore/recover?  While I
> liked the idea, it was too questionable for client/server technologies,
> perhaps not as much in the mainframe world, which is where TSM/ADSM
> originally came from (AD*Star).

The TSM database supports mirroring of the database within TSM itself.
It's also a /true/ database with log volumes and rollback capabilities.
In my setup, I mirrored the database and log volumes within TSM and
across two physical disk arrays.  If either disk array died, TSM would
keep trucking along.  It was pretty nice.  Additionally, there were
backups of this database made to tape.  The backups could be triggered
at certain times of day, or after some percentage of change to the
database.  Also, because it is a true database with log volumes, the
TSM database backup could be run concurrently with your regular
backups.  There were no limits on the number of database backups you
could make.
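
The two kinds of trigger can be pictured with a small Python sketch.
This is only a model of the behaviour described above, not TSM's
configuration syntax or its internals; the function name, the schedule
and the 10% figure are all assumptions made for the example.

```python
from datetime import datetime, time


def db_backup_due(now, scheduled_times, pct_changed, trigger_pct):
    """Return True if a database backup should start now.

    A backup is due either because the clock has reached one of the
    scheduled times (compared to the minute), or because the share of
    the database that has changed reached the trigger percentage.
    """
    on_schedule = any(
        (now.hour, now.minute) == (t.hour, t.minute) for t in scheduled_times
    )
    return on_schedule or pct_changed >= trigger_pct


schedule = [time(6, 0), time(18, 0)]  # hypothetical twice-daily schedule
print(db_backup_due(datetime(2003, 3, 12, 6, 0), schedule, 1.0, 10.0))
print(db_backup_due(datetime(2003, 3, 12, 9, 30), schedule, 12.0, 10.0))
print(db_backup_due(datetime(2003, 3, 12, 9, 30), schedule, 3.0, 10.0))
```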

IBM/Tivoli did a really good job of providing facilities to protect the
database from corruption, far better than VERITAS does.  However, if that
database were to be corrupted, you are S.O.L.  More significant than
the proprietary database format is TSM's proprietary tape format.
The biggest concern to have is that, without a valid TSM database in
place, your backup tapes are absolutely worthless.

> Another issue I found to be quite bothersome, is this bit called
> reclamation.  TSM has two modes of backup, co-location or non co-location.
> Basically it will co-locate all data without regards to how long it will be
> retained and non-co-locate doesn't.  Much like the NBU default behaviour for
> not mixing retention levels.  However, I believe in the interest of
> performance and time, co-location is recommended, which means there must be
> another process somewhere that manages these mixed retention levels
> otherwise you'll have "holes" on your tapes and very inefficient use of
> expensive media.  So this is called reclamation.  This reclamation process
> reclaims all of the space on the tape by moving the data with similar
> retentions to like media.  Nice process, very cool, but if a reclamation
> process is not finished yet, your backups cannot run, they must wait for the
> reclamation process to finish before a backup can start.

TSM's reclamation and data management facilities are second to none.
As data expires from tapes, leaving them "fragmented" with holes where
the expired data used to be, TSM keeps track of what percentage of space
is in use on each tape.  You could set thresholds such that, when a tape
was only 50% full, its remaining data would be moved off onto new active
media.  The existing tape would then be reclaimed.  You could also move
data off of a tape onto another in the same pool with a simple "move
data" command.  This was very handy for dealing with a tape that had a
media error.
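
The threshold check is simple enough to sketch in Python.  This is a
toy model of the idea, not TSM's actual algorithm; the Tape class, the
volume labels and the percentages are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tape:
    label: str          # hypothetical volume label
    pct_in_use: float   # percentage of the tape still holding unexpired data


def reclaim_candidates(tapes, threshold_pct=50.0):
    """Return tapes whose live data is at or below the threshold.

    The emptiest tapes come first, since they hold the least data to
    copy onto new media before the old tape can be reused.
    """
    eligible = [t for t in tapes if t.pct_in_use <= threshold_pct]
    return sorted(eligible, key=lambda t: t.pct_in_use)


tapes = [Tape("A00001", 92.0), Tape("A00002", 48.5), Tape("A00003", 12.0)]
print([t.label for t in reclaim_candidates(tapes)])  # ['A00003', 'A00002']
```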

Other really nifty features included the ability to reclaim and
consolidate data from your offsite tapes.  If say an offsite tape hit
the reclamation threshold you've set, the local onsite copy of that data
would be used to create a new consolidated offsite tape.  The offsite
tape would then be indicated as one that could be returned.  Very slick.

I also don't ever recall the reclamation process preventing backups from
occurring.  It did cause the server to slow down as the process scanned
through the database looking for data to expire.  If this happened at an
inconvenient time, it was easy enough to abort the process from the
admin console.

And come to think of it, you can also do things like temporarily suspend
a running backup if you need a drive for something else that's more
urgent. You've a lot of control available to you, if you need it.

> Before they bought/integrated with Tivoli, there was no way to push
> out the backup client you had to manually install it on each server.
> With the integration of Tivoli you can use NetCourier to carry the
> installation package to the client and have it done that way.

That's not entirely true--you did have to write your own configuration
tool to do it.  Mine was a perl script that unpacked and configured the
client.  (On Unix systems, admittedly.  For Windows or older Mac
clients, it was done manually).

> I do not remember their interface, both gui and command line, to be easy.
> There is a significant learning curve.

The setup of TSM is a lot more involved, but day to day management of
things was not much more complex than NetBackup.  I tended to avoid the
TSM GUI, as I'm a command-line sort of guy, but they did have a really
nice text based admin console.  The command line tools were about as
cumbersome as those of NetBackup.  I like the volume manager ("vm"
prefixed) tools a lot.  They're very consistent and sensible.  The
NetBackup "nb" prefixed tools are miserably inconsistent and gross
by comparison.

Another thing I found really useful with TSM was the ability to create
custom queries of the TSM database.  Using the admin command line, you
could pass SQL queries to TSM to generate reports.  You could also
compose complex queries in macro files, ones capable of accepting
parameters.  For those familiar with SQL programming, this was a plus.
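
To give a flavour of that kind of parameterized report, here is a
Python sketch that uses the standard sqlite3 module as a stand-in for
the TSM admin command line.  The table, its columns and the pool name
are invented for the example and should not be read as TSM's real
schema or macro syntax.

```python
import sqlite3

# In-memory stand-in for the backup server's database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE volumes (volume_name TEXT, stgpool_name TEXT, pct_utilized REAL)"
)
conn.executemany(
    "INSERT INTO volumes VALUES (?, ?, ?)",
    [
        ("A00001", "TAPEPOOL", 92.0),
        ("A00002", "TAPEPOOL", 48.5),
        ("A00003", "TAPEPOOL", 12.0),
        ("B00001", "OFFSITE", 30.0),
    ],
)


def underused_volumes(conn, pool, max_pct):
    """Report volumes in one pool at or below a utilization percentage.

    The pool name and the percentage play the role of macro parameters.
    """
    rows = conn.execute(
        "SELECT volume_name, pct_utilized FROM volumes "
        "WHERE stgpool_name = ? AND pct_utilized <= ? "
        "ORDER BY pct_utilized",
        (pool, max_pct),
    )
    return rows.fetchall()


print(underused_volumes(conn, "TAPEPOOL", 50))
```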

> The client that had me do this analysis decided to go against my
> recommendation to stay with NBU and went with TSM.  They spent lots of
> money and have recently called me up to help them move back to
> NetBackup not more than 18 months later.
>
> Having said that...there are some very cool things about TSM.  The
> coolest thing is that it caches backup to disk before streaming to
> tape...or should I say SCREAM to tape.  This is something that I would
> love to see VERITAS do, but I think I have an idea that will achieve
> the same thing without having VERITAS' code being changed.

The staging pools of TSM are something I miss very, very, very much.
I could have hundreds of clients streaming data into my TSM server and
it'd all be consolidated in the staging pools before being streamed onto
tape.  If you've a lot of little backups coming from different clients, the
TSM disk staging features will keep the data flowing at a very high
rate.
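
The effect is easy to see in a toy Python model: many small client
writes land on disk, and the tape drive only sees a few large
sequential writes.  The class, the 80% migration point and the sizes
are assumptions made for the sketch, not TSM's implementation.

```python
class StagingPool:
    """Toy disk staging area: collect small client writes, flush to tape in bulk."""

    def __init__(self, capacity_mb, migrate_at_pct=80):
        self.capacity_mb = capacity_mb
        self.migrate_at_pct = migrate_at_pct
        self.pending = []      # (client, size_mb) chunks currently on disk
        self.tape_writes = []  # size in MB of each bulk write sent to tape

    def used_mb(self):
        return sum(size for _, size in self.pending)

    def receive(self, client, size_mb):
        """Accept a chunk from a client; migrate once the pool is full enough."""
        self.pending.append((client, size_mb))
        if self.used_mb() * 100 >= self.capacity_mb * self.migrate_at_pct:
            self.migrate()

    def migrate(self):
        """Stream everything on disk to tape as one large write."""
        if self.pending:
            self.tape_writes.append(self.used_mb())
            self.pending = []


pool = StagingPool(capacity_mb=1000)
for i in range(100):                # 100 hypothetical clients, 10 MB each
    pool.receive(f"client{i}", 10)
pool.migrate()                      # flush whatever is left at the end
print(pool.tape_writes)             # [800, 200]
```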

My overall opinion is that TSM is a much better product than NetBackup
from a technical standpoint, but they made a few significant design
decisions which I find unacceptable from the practical standpoint.
The first one is that they store tape data in a proprietary format; the
second is that you can't recover any data without having TSM running and
your TSM database loaded.  One consequence of this is that TSM isn't
helpful in cases where you need to exchange data tapes with external
organizations;  it also makes using TSM as part of an archiving solution
not terribly workable.

-brian.

