What's the scoop on hard links?

ldmwndletsm

ADSM.ORG Member
#1
Hi, I hunted around and found a few old discussions of this here, plus some sundry IBM documentation, but I still have several questions on this issue. We have a large collection of data that makes extensive use of hard links. As a test, I ran a first-time backup of a 1.7 TB file system (one of over a hundred such file systems), and it used up 2.6+ TB of tape! This was very puzzling until I finally figured out that TSM had backed up every hard link as its own full copy of the file. Sheesh!

I then tried to restore it, but the restore didn't rebuild the hard links, and we ran out of space on the target file system. I then found an IBM document (https://www.ibm.com/support/pages/apar/IT02889) stating that resourceutilization needs to be set to 1 (default for restores) to allow the links to be reestablished. We had it at a higher value for better optimization on backups. Anyway, with it set to 1 the restore worked, and everything looks correct. One source also suggested using a no-query restore as opposed to a classic restore. I used a classic restore, however, so I haven't tested that.
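For anyone wanting to check how much hard links inflate a per-link backup on their own file system, here is a minimal Python sketch. It is not a TSM tool, and the function name `hard_link_usage` is made up for illustration. It walks a tree, groups regular files by (device, inode), and compares the bytes stored once per inode against the bytes that result when every link is counted as a separate file.

```python
import os
import stat
from collections import defaultdict


def hard_link_usage(root):
    """Return (unique_bytes, per_link_bytes, groups) for regular files under root.

    unique_bytes   counts each inode once (roughly what the file system stores).
    per_link_bytes counts every path, as a backup that copies each link would.
    groups         maps (device, inode) to the list of paths sharing that inode.
    """
    groups = defaultdict(list)
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat so symlinks are not followed
            if not stat.S_ISREG(st.st_mode):
                continue
            key = (st.st_dev, st.st_ino)
            groups[key].append(path)
            sizes[key] = st.st_size
    unique_bytes = sum(sizes.values())
    per_link_bytes = sum(sizes[key] * len(paths) for key, paths in groups.items())
    return unique_bytes, per_link_bytes, groups
```

Calling `hard_link_usage("/some/filesystem")` (the path is a placeholder) and comparing the first two numbers should give a rough idea of the gap between something like 1.7 TB on disk and 2.6+ TB on tape. It only sees links that live under the same root, so links from outside the tree are not accounted for.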

1. Is there any way to turn off this blanket "back up each hard link instance" behavior and instead back up only the metadata for the hard links plus one copy of the data for a given set of links, not all of them?

This seems an utter waste of tape, time and resources.

2. Why does TSM have this behavior?

I don't see this with EMC NetWorker. If I back up a 1.7 TB file system, the size of the backup is 1.7 TB. Restoring the data rebuilds the links, and as I recall, there is the same caveat, as with TSM, in that all the links need to be restored simultaneously, not in groups or just some of them. Otherwise, it works just dandy with no redundant copies.

3. Is there some option we could set that would skip all but one link (for a given inode) on the backups and still allow us to rebuild the hard links?

We need to be able to rebuild the hard links from a restore. We don't want to have to manually recreate them. They do NOT have predictable naming conventions.

4. I thought I saw something in the IBM documentation that suggested that an archive, as opposed to a backup, does not do this? I can't find that page now, but I may have misunderstood.

We need to be able to run incrementals, so an archive would not work as that would force a full. Regardless, does anybody know if the behavior for archiving is the same for hard links?
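In case no such option exists, one possible workaround (purely a sketch, not a TSM feature, and not something I have tested against a real restore) would be to record the hard-link groups in a manifest before the backup and recreate any missing links from it afterwards. The function names `build_link_manifest` and `relink_from_manifest` are hypothetical. The sketch assumes at least one path from each group gets restored, and it never overwrites a path that already exists.

```python
import os
import stat


def build_link_manifest(root):
    """List groups of relative paths under root that share an inode."""
    groups = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                key = (st.st_dev, st.st_ino)
                groups.setdefault(key, []).append(os.path.relpath(path, root))
    # Keep only inodes with two or more links inside this tree.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


def relink_from_manifest(root, manifest):
    """Recreate missing links under root; return how many were created."""
    created = 0
    for paths in manifest:
        present = [p for p in paths if os.path.lexists(os.path.join(root, p))]
        if not present:
            continue  # nothing restored for this group, so nothing to link to
        source = os.path.join(root, present[0])
        for rel in paths:
            target = os.path.join(root, rel)
            if os.path.lexists(target):
                continue  # never replace a file that is already there
            os.makedirs(os.path.dirname(target) or root, exist_ok=True)
            os.link(source, target)
            created += 1
    return created
```

The manifest is a plain list of lists of strings, so it could be written out with `json.dump` and backed up alongside the data. This only helps if all but one link per group can actually be excluded from the backup, which is the part I don't know how to do cleanly when the names are unpredictable.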
 

ldmwndletsm

ADSM.ORG Member
#3
Marclant,

Thanks for your response. :) I had seen that earlier (I should have noted that in my post). I haven't spoken directly with IBM about this. They might be able to shed a little more light on it. If I come across anything, I'll follow up here with an update.

Perhaps there are not that many operations (relatively speaking, of course) making heavy use of hard links these days for which the TSM behavior is a major imposition. My understanding is that the product has migrated between three companies over the years, with IBM being the third, so maybe it's a legacy left over from before. Don't know, just surmising. Then again, since TSM tracks every file separately in the database, maybe there are some limitations, but it still seems odd that it would have to back up all the paths sharing an inode as their own entries.
 
