[Networker] HSM Modules (was Re: Migrating from Networker to Commvault galaxy)

2008-07-09 12:02:43
From: John Stoffel <john.stoffel AT TAEC.TOSHIBA DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Wed, 9 Jul 2008 12:00:17 -0400
>>>>> "Bruce" == Bruce Breidall <Bruce.Breidall AT CONCUR DOT COM> writes:

Bruce> I would be cautious installing any third party HSM module,
Bruce> especially if the number of files reaches into the millions.

Are you talking about Networker combined with third-party HSM tools?

Bruce> You will find that it will be impossible to finish your sweeps
Bruce> of the objects, and then you really have a mess. The management
Bruce> side will be a nightmare (unable to clean up orphaned objects,
Bruce> unable to finish migrations, etc...).

I can sort of see what you're saying, but I'd like more context here,
especially since I think HSM is a valid way to improve the
manageability of data and backups.

For example, we have quite a number of NFS file systems on NetApps
which we'd love to do HSM on, because it would allow us to migrate
unused data off to cheaper storage.  It would also mean we would NOT
have to do full backups every month of data that never changes.

Maybe synthetic fulls are the answer here, so that we only run
incrementals each night and then once a month we build a synthetic
full from them.  This could be a big help if it works properly.
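
To make the synthetic full idea concrete, here is a rough Python
sketch of how I picture it.  This is NOT how Networker or any other
product implements it; the catalog layout (a dict of path to version
id, with None marking a deletion) and the function name are made up
purely for illustration.

```python
from typing import Dict, Iterable, Optional

# Hypothetical catalog: path -> version id. In an incremental, a value of
# None means "this file was deleted since the previous backup".
Catalog = Dict[str, Optional[str]]


def build_synthetic_full(last_full: Dict[str, str],
                         incrementals: Iterable[Catalog]) -> Dict[str, str]:
    """Replay incrementals (oldest first) on top of the last full backup."""
    synthetic = dict(last_full)  # work on a copy, leave the old full alone
    for inc in incrementals:
        for path, version in inc.items():
            if version is None:
                synthetic.pop(path, None)  # file was deleted
            else:
                synthetic[path] = version  # file is new or changed
    return synthetic


if __name__ == "__main__":
    full = {"/vol/a.txt": "v1", "/vol/b.txt": "v1"}
    nightly = [
        {"/vol/a.txt": "v2"},                     # a.txt changed
        {"/vol/b.txt": None, "/vol/c.txt": "v1"}, # b.txt deleted, c.txt added
    ]
    print(build_synthetic_full(full, nightly))
```

The point is that the new full is assembled entirely from data the
backup server already holds, so the client never has to re-send files
that haven't changed.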

Bruce> If the data is currently indexed and owned by an application,
Bruce> it is there that policies should be enforced and data movement
Bruce> is controlled.  Easier said than done....but start planting the
Bruce> seed now.

What if the applications are just plain dumb?  Or what if the users
who manage the data don't *have* any type of indexing tool that works
with their applications to help manage that data?

I find your statement confusing and I'd like to know more. 

Bruce> Archiving and backups should be completely separate entities,
Bruce> and they should never touch - especially when you are trying to
Bruce> address large file systems with 100's of millions of files. The
Bruce> vendors will have you thinking otherwise, because they know how
Bruce> difficult it is to get this accomplished.

Let's get our terminology straight, because we can't discuss stuff
properly if we don't agree on the basics.  *grin*


Backup:

Copying the data in a filesystem to some other media.  Can be a full
copy or an incremental copy of just the changed files/dirs.  The idea
is that this data is used for restoration of entire filesystems,
directories or single files.

Archive:

Copying data from the source to another form of media and then
DELETING it from the source.  Can be indexed to the file level or not.
Restoring to the source file system requires manual intervention.
This process grabs all files/dirs and their contents when run.

HSM:

The movement of files/dirs from one storage device to another while
presenting a consistent and unified interface to the end user, so that
they do not know that anything has changed.  This process is designed
to move files/dirs not accessed in a configurable time frame to
cheaper media.
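
As a rough illustration of the "not accessed in a configurable time
frame" part, here is a minimal Python sketch that walks a tree and
lists files whose last access time is older than a cutoff.  The
function name and parameters are my own invention, not any HSM
product's API, and a real HSM would also have to leave a stub behind
and recall the file transparently, which this does not attempt.

```python
import os
import stat
import time
from typing import Iterator, Optional


def migration_candidates(root: str, max_idle_days: float,
                         now: Optional[float] = None) -> Iterator[str]:
    """Yield regular files under root not accessed in max_idle_days."""
    if now is None:
        now = time.time()
    cutoff = now - max_idle_days * 86400  # 86400 seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_atime < cutoff:
                yield path
```

Note that this relies on access times being trustworthy, which they
are not on filesystems mounted with noatime, and that a full walk like
this is exactly the kind of sweep that gets painful once you are into
many millions of files.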

Ideally HSM integrates with the Backup and Archive processes above, so
that when you move files from primary to secondary storage, the backup
does NOT need to keep backing up the secondary storage.  But on
restore, the end-user visible filesystem can still be restored easily
and quickly.

John
