ADSM-L

Re: Migrating an HSM filesystem

2007-02-12 21:36:46
Subject: Re: Migrating an HSM filesystem
From: Anker Lerret <ADSM-L AT LERRET DOT US>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 12 Feb 2007 21:22:48 -0500
> I believe that the best HSM transfer method is to first perform a
> recursive dsmmigrate of all its file system data, which leaves only
> stubs or small files in the file system.  This is to make the later
> restoral faster.  On the new system, you would establish a file
> system having the name of the original, then put that under HSM
> control.  Now, perform a restoral with RESToremigstate Yes in effect,
> to fully populate that newly-established HSM file system with a
> directory structure and stub files and files which were too small to
> participate in HSM migration.  (Note that the stub files will not
> contain any leading file data.)

Thank you very much for the quick, thorough response.  This is, I believe,
a *perfect* solution, as it will leave the old filesystem in place in case
of questions.

Since I posted yesterday, I've learned that the new copy of the HSM
filesystem will not be placed on a new TSM client, but on a client that's
already defined to the TSM server.  That means that the RENAME NODE
technique won't work.  In fact, I can't see a way to tell TSM that a
filespace is moving to another TSM client.  It doesn't matter, though,
with the RESTOREMIGSTATE technique.
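
For reference, the quoted procedure can be written down as an ordered command plan. This is only a sketch under assumptions: the mount point /hsmfs is hypothetical, the exact spelling of the options (-Recursive, -subdir=yes, -restoremigstate=yes) should be checked against the client documentation for the installed level, and how the new client is given access to the old node's data is not covered. The script prints the commands and runs nothing.

```python
from typing import List, Tuple

# Each step is (where to run it, command as an argument list).
Step = Tuple[str, List[str]]


def hsm_transfer_plan(fs_path: str) -> List[Step]:
    """Build the three-step plan from the quoted procedure.

    fs_path is the HSM-managed mount point; the same name is used on
    both systems, as the procedure requires. Option spellings are
    assumptions to verify before use.
    """
    # Restore the whole tree under the mount point.
    restore_spec = fs_path.rstrip("/") + "/"
    return [
        # 1. Migrate everything so only stubs and small files remain.
        ("old client", ["dsmmigrate", "-Recursive", fs_path]),
        # 2. Put the newly created file system under HSM control.
        ("new client", ["dsmmigfs", "add", fs_path]),
        # 3. Restore directories, stubs, and small files only.
        ("new client", ["dsmc", "restore", restore_spec,
                        "-subdir=yes", "-restoremigstate=yes"]),
    ]


if __name__ == "__main__":
    # /hsmfs is a placeholder mount point, not a real one.
    for host, cmd in hsm_transfer_plan("/hsmfs"):
        print(f"{host}: {' '.join(cmd)}")
```

Keeping the plan as data rather than executing it makes it easy to review the sequence with a second administrator before anyone types a command.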


>> Our only wholesale loss of TSM data involved HSM;
>
> I'm curious about the circumstances of such a data loss, particularly
> where there should be backups available.  I've been running HSM for
> over a decade without data loss.

This happened before I was deeply involved with TSM and it was a clear
case of administrative assault, so we can't really blame HSM--except that
HSM was so flaky when this happened that folks were issuing random
commands instead of following well-understood procedures.

As I understand it, HSM was having some sort of problem (a frequent
occurrence then) and two TSM administrators were working on the problem,
each unaware of the other.  One administrator unmounted the HSM-enabled
filesystem, leaving the stub files exposed.  Soon afterward, the other
administrator issued a DSMC INCR (thinking to get a good baseline).  The
DSMC INCR started replacing the full-sized files with stubs.  As soon as
the two realized what had happened, they killed the DSMC INCR, but not
before numerous multi-gigabyte files were replaced with stubs.  I was
called in at this point and we couldn't figure out how to recover without
restoring the TSM database, which would have had other unhappy
repercussions.  We took a deep breath and told the users.  They had
application problems for months afterwards as random files came up short.
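
A guard of roughly the following shape might have prevented the incident described above: refuse to start an incremental backup unless the HSM layer is mounted over the filesystem. This is a sketch, not something we ran. The "fsm" filesystem type name, the /hsmfs path, and the (mount point, type) input format are all assumptions; the mount table would have to be gathered from the local mount command in whatever form the platform provides.

```python
from typing import Iterable, List, Tuple

# One mount table entry: (mount point, filesystem type).
Mount = Tuple[str, str]


def hsm_layer_mounted(fs_path: str, mounts: Iterable[Mount],
                      hsm_type: str = "fsm") -> bool:
    """Return True if an HSM-type mount exists at fs_path.

    hsm_type defaults to "fsm", which is an assumption about how the
    HSM overlay appears in the mount table on the platform in use.
    """
    return any(mp == fs_path and fstype == hsm_type
               for mp, fstype in mounts)


def guard_incremental(fs_path: str, mounts: Iterable[Mount]) -> List[str]:
    """Return the backup command only when the HSM layer is present.

    Raises RuntimeError otherwise, because backing up exposed stubs
    would replace the full-sized files in TSM storage with stubs.
    """
    if not hsm_layer_mounted(fs_path, mounts):
        raise RuntimeError(
            f"HSM layer not mounted on {fs_path}; refusing to back up stubs")
    return ["dsmc", "incremental", fs_path]
```

For example, a table of [("/hsmfs", "jfs"), ("/hsmfs", "fsm")] would pass the check, while [("/hsmfs", "jfs")] alone would raise.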

So there's my confession (in more detail than you probably wanted).  The
overall narrative is correct, but I may not have the details perfect.

> I understand your fears: HSM is probably the most complex and Rube
> Goldberg-ish facility in TSM, with lots to go wrong.

Thank you; I'm glad we're not the only ones.  I'll let the list know how
this procedure works (after we've pulled the arrows out of our backs).
Don't hold your breath; we're just starting to plan now, so it will be
weeks or months before we actually do it.

Thanks!
anker
