New HSM configuration for large files

amsterdam

I'm setting up a new HSM filesystem on a new server. This is RedHat 5, TSM server 6.2.1 with GPFS 3.4. My questions are about how to configure policies for migration and backup of the files. My HSM instance will archive lots of very large video files that, once they've been written to disk and migrated to tape, will never change or be deleted. Therefore, I don't want to do a traditional backup of the files that would create multiple versions, since this would eat up lots of tape and be unnecessary. However, I still need to back up the filesystem, including the stub files and any other files that may not yet be migrated.

Question #1: If I set migrequiresbkup=yes, will this essentially create two copies of the file: one that's the backup copy and one that's the migrated copy? I don't want this, so I'm assuming I should set this to "no."

Question #2: In other posts, people have said that backups of HSM filesystems can cause all the files to be retrieved. How do I avoid that? I just want whatever's on the filesystem, including stubs and any file not yet migrated, to get backed up. Migrated files should just stay where they are on the tape.

I should also add that I'm planning on using two different tape pools, one for backup and one for the migrated data. This is so I can expire backup data and reclaim the space. Migrated files, once written to tape, will only ever be read and then copied to offsite pools for disaster recovery. Once migrated files are sent offsite they'll basically stay there for years until it's time to move them to new media.

Thanks in advance for the help.

...adam
 
This is a tricky thing to address. You really should consider why you want only one copy of a file. For a disaster scenario you will probably want more than one copy around. For example, what would you do if the file space (partition) became corrupted, went bad, or got wiped out somehow? (Yes, I have seen this happen with an HSM-controlled partition.) By specifying MIGREQUIRESBKUP=YES you protect yourself against something like this. Use a management class that keeps only one copy of the data. Then you can have an offsite copy as well, and HSM will migrate the files to disk or tape as you wish. Once the initial backup of a file has taken place, incremental backups should back up the stub file without a recall. On a Windows system I know there is a SKIPMIGRATED parameter, but I don't remember seeing this for any *IX system. However, you do talk about archiving files. If you are archiving, then you do not need the extra copies of the files. The only thing I can add is this: tapes do fail. Disks do fail. Operators make mistakes.
 
Oh I definitely want copies... 2 in fact. Here's the HSM side of the equation:
- files are migrated to the tape pool designated for space management (SpacePool)
- migration happens fairly quickly after the data is initially written to disk, let's say within a few days
- shortly after that, SpacePool data gets copied to two additional copy pools that are sent offsite

So, within a week or so, I should have three copies of a file: the first is in the library and can spool back to disk when needed, and the other two are offsite.

Now to the backup side of the equation:
- during the intervening time before a file is migrated to SpacePool, backups should/would occur on the HSM filesystem
- backups are sent to a tape pool called BackupPool
- if the backup grabs some of those big files before they're migrated, I could eventually have 4 copies of my file: one in BackupPool, one in SpacePool and one in each of the two copy pools
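
To make the copy count above concrete, here is a minimal Python sketch. It only models the arithmetic from this post and does not talk to TSM. SpacePool and BackupPool are the pool names used in the thread; CopyPool1 and CopyPool2 are assumed names for the two offsite copy pools, and the function name is hypothetical.

```python
from collections import Counter

# Assumed names for the two offsite copy pools (not given in the thread).
COPY_POOLS = ("CopyPool1", "CopyPool2")


def copies_per_pool(backed_up_before_migration: bool) -> Counter:
    """Count the tape copies of one large file once every step has finished."""
    copies = Counter()
    copies["SpacePool"] += 1           # migrated copy that stays in the library
    for pool in COPY_POOLS:            # one offsite copy in each copy pool
        copies[pool] += 1
    if backed_up_before_migration:     # backup ran while the file was still resident
        copies["BackupPool"] += 1
    return copies


for caught_by_backup in (False, True):
    counts = copies_per_pool(caught_by_backup)
    print(caught_by_backup, sum(counts.values()), dict(counts))
```

With these assumptions the total is three copies when the backup never sees the full file and four when it does, which is the extra BackupPool copy the two options below try to avoid or expire.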

So what's the best way to handle this?
1) Schedule migrations of large files so that they happen before a backup run? That way, only stub files or other smaller files would get written to BackupPool. The only caveat is that I'd have to make sure that my copy pools get updated with the new data in case the one copy of the file in SpacePool fails.

2) Schedule BackupPool data to expire just after the number of days it takes for SpacePool to write data to a copy pool, and hope I have enough scratch tapes...

Thanks for the questions. This is really helping me think through this.

...adam
 
Sounds like a plan to me. However, if the large files get migrated by HSM before a backup happens, then when the backup does occur the large file will get recalled, backed up, and probably return to either the resident or the premigrated state. Then migration will have to happen again (which is not a big deal).
I am pretty sure this is the way it will happen. I have about 15 customers using the *IX version of HSM, and essentially there are two types of automatic migration: threshold and demand. You may get around all this by using "pmpercentage" and setting it to 100%. The files will then be premigrated but still be in the file space for backup, and no recall should happen.
The manual says: "The percentage of file system space that is available to contain premigrated files. The default is the difference between the percentage that you set for the high threshold and the percentage that you set for the low threshold. Specify a value from 0 through 100 percent."
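
As a rough illustration of the default described in the manual text above, the following Python sketch computes the premigration percentage that would apply. The function name and the example thresholds (90 and 80) are assumptions made for illustration; only the rule itself (default = high threshold minus low threshold, valid range 0 through 100) comes from the quoted text.

```python
from typing import Optional


def effective_pmpercentage(high_threshold: int,
                           low_threshold: int,
                           pmpercentage: Optional[int] = None) -> int:
    """Return the premigration percentage that would apply to a file system."""
    if pmpercentage is None:
        # Default per the quoted manual text: high threshold minus low threshold.
        return high_threshold - low_threshold
    if not 0 <= pmpercentage <= 100:
        raise ValueError("pmpercentage must be from 0 through 100")
    return pmpercentage


# Illustrative thresholds only; substitute the values set on your file system.
print(effective_pmpercentage(90, 80))        # default derived from the thresholds
print(effective_pmpercentage(90, 80, 100))   # explicit setting suggested above
```

Under these example thresholds the default would leave room for only a small share of premigrated data, which is why setting the value explicitly to 100 is suggested here.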
 
If I'm understanding you correctly, a backup has to occur in order to avoid any recall. If that's the case, it sounds like the best and most efficient way of scheduling everything is:
- premigrate files first to your space management pool
- run a backup, which essentially makes a second copy of your premigrated files
- finish the migration process, leaving stub files
- copy space management pool data to your copy pools
- re-run your backup

So long as you're only keeping one version of the file in backup, the large file will get rewritten as the smaller stub file.

Does that sound right?
 
Well, no. When a file is premigrated it is still on the disk (file system). File states are "r" for resident, "p" for premigrated and "m" for migrated. When a file is in the "m" state there is only a stub file on the disk. TSM is aware of data that has been backed up. However, if you are using migrequiresbkup=no, you will probably only get the backup once. The file will still be backed up as far as I know.
I could be totally wrong here so if anyone else has anything to add, please do so.
The best I can say at this point is give your plan a try and see how it works. Then do some random restores/recalls etc. to make sure you have the data out there.
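
For anyone following along, the three file states mentioned in this reply can be summarized in a small Python sketch. The class and helper names are hypothetical; the letters "r", "p" and "m" and the fact that a migrated file leaves only a stub on disk come from the reply, while "a premigrated file also has a copy in the space management pool" is my understanding of premigration rather than something stated here.

```python
from enum import Enum


class HsmState(Enum):
    RESIDENT = "r"      # data only on the file system
    PREMIGRATED = "p"   # data on the file system and (assumed) already copied to the pool
    MIGRATED = "m"      # only a stub file remains on the file system


def data_on_disk(state: HsmState) -> bool:
    """True when the full file content is still in the file system."""
    return state is not HsmState.MIGRATED


def copy_in_space_pool(state: HsmState) -> bool:
    """True when a copy should already exist in the space management pool."""
    return state is not HsmState.RESIDENT


for state in HsmState:
    print(state.value, state.name, data_on_disk(state), copy_in_space_pool(state))
```

The premigrated row is the interesting one for backups: the data is still on disk, so a backup can read it without triggering a recall.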
 