A question on hard links?

ldmwndletsm

ADSM.ORG Member
#1
I found a little discussion of this on this forum, and I did read this: https://www.ibm.com/support/knowledgecenter/en/SSEQVQ_8.1.4/client/c_bac_hardlnkunx.html

This is on Linux, using Spectrum Protect 8.1.3.

I conducted a test last night wherein I created two hard links (hlink2_file1, hlink2_duplicate_file1) to an already existing file (file1) under that same directory. I checked, and as expected, all three had the same inode number (`ls -li`), the link count was 3, and all modtimes, attributes and MD5 digests matched. The parent file system (ext4) had previously been backed up a number of times using an 'incr', and file1 existed prior to the very first backup. The two hard links had never previously existed. I then restored the file system to another file system (xfs), and all three files were restored, but one of them was assigned a different inode than the other two. So file1 was given an inode of 6381833, with a link count of 1, but hlink2_file1 and hlink2_duplicate_file1 were assigned an inode of 6291572, and their link count is 2. Otherwise, all the attributes, digests and modtimes are identical between the three.
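For anyone who wants to reproduce the setup, here is a minimal Python sketch of the pre-backup check described above. It only builds the three names in a scratch directory and reports what `ls -li` plus an MD5 digest would show; it does not invoke the backup client. The function names `make_hardlink_test_set` and `describe` are my own, and the file content is a placeholder.

```python
import hashlib
import os
import tempfile


def make_hardlink_test_set(directory):
    """Create file1 plus two hard links to it; return all three paths."""
    original = os.path.join(directory, "file1")
    with open(original, "wb") as fh:
        fh.write(b"sample content\n")  # placeholder data, not the real file

    paths = [original]
    for name in ("hlink2_file1", "hlink2_duplicate_file1"):
        link_path = os.path.join(directory, name)
        os.link(original, link_path)  # hard link, same inode as file1
        paths.append(link_path)
    return paths


def describe(paths):
    """Return (name, inode, link count, MD5 digest) for each path."""
    rows = []
    for path in paths:
        st = os.stat(path)
        with open(path, "rb") as fh:
            digest = hashlib.md5(fh.read()).hexdigest()
        rows.append((os.path.basename(path), st.st_ino, st.st_nlink, digest))
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for row in describe(make_hardlink_test_set(tmp)):
            print(row)
```

Running `describe` on the same three names under the restore target would show whether the restored copies still share a single inode.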

Of course, it's to be expected that restored files will be assigned new inode numbers, but I was a little surprised that TSM (or the OS?) didn't maintain the link count of 3 by giving all three restored files the same "new" inode. I carried out a similar test using EMC NetWorker on a different Linux box (source files on an ext3 file system; restored to a different ext3 file system), and when I restore hard links there, it works as expected. I think one problem I ran into in the past with NetWorker was that if you failed to recover all the hard links, then mischief might ensue. But this was not the case in these tests.

Does anyone know what might have happened here? Is this the expected TSM behavior, wherein 2 of the 3, or 4 of the 5, etc. would be restored as hard links but one would not? Unless I'm missing something obvious here, could anyone test this?


BACKGROUND
We have an archive of data wherein, when someone checks out a unit directory to make changes and then checks it back in, a new version subdirectory is created (02, 03, etc.). Any files that have not changed are hard links to their counterparts in the original subdirectory (version-01), and any new files are created in the new version directory. A given archive unit directory could have many version subdirectories. Symbolic links could be used instead, but I think the hard linking is a by-product of rsync, or some such thing, and was not a deliberate design choice.

Anyway, based on the behavior that I see from TSM, suppose we restore one of these, and let's say, for example, that it has a total size of 250 GB, with 10 version subdirectories, and most of the files have a link count of 10. Then even though we'd end up with 10 subdirectories, most of the files would have a link count of 9, and the files in one of the version subdirectories would have a link count of 1. Otherwise, everything would work fine, since the content, attributes, etc. would remain intact. But in this case, we'd have a redundant copy of the files (as far as the file system is concerned), not a subset or superset. Probably nobody would notice unless they were comparing the link counts with a manifest of the original data attributes. So nothing would break, but we'd end up with 500 GB of data, not 250 GB.
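One way to check a restored tree for this effect, sketched in Python under my own assumptions: group regular files by (device, inode), compare the space actually used with the apparent size, and flag files with identical content that sit on different inodes. The function names are hypothetical, and matching content only suggests that a hard-link set was split; it does not prove the files were linked at the source (empty files, for instance, will all match).

```python
import hashlib
import os
import stat
from collections import defaultdict


def group_by_inode(root):
    """Map (device, inode) -> list of regular-file paths under root."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode):
                groups[(st.st_dev, st.st_ino)].append(path)
    return groups


def usage_summary(root):
    """Return (bytes counted once per inode, bytes counted once per name)."""
    unique_bytes = 0
    apparent_bytes = 0
    for paths in group_by_inode(root).values():
        size = os.lstat(paths[0]).st_size
        unique_bytes += size
        apparent_bytes += size * len(paths)
    return unique_bytes, apparent_bytes


def split_link_candidates(root):
    """Return lists of path groups whose content matches across inodes."""
    by_digest = defaultdict(list)
    for paths in group_by_inode(root).values():
        with open(paths[0], "rb") as fh:
            # Reads whole files into memory; fine for a sketch, not for 250 GB.
            digest = hashlib.md5(fh.read()).hexdigest()
        by_digest[digest].append(sorted(paths))
    return [groups for groups in by_digest.values() if len(groups) > 1]
```

In the 10-subdirectory example above, `usage_summary` on the restored tree would report roughly twice the unique bytes of the source, and `split_link_candidates` would pair each 9-link group with its stray single copy.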
 
