Re: [Networker] Questions on backing up hard links?
2005-08-19 18:26:23
Here is a bit of technical info on how hard links work on a "standard"
UNIX filesystem (i.e. BSD type, although most others mimic this).
Let's call each data object that would be stored on a filesystem a
"file". This includes directories, device files, named pipes, symbolic
links, etc. A "hard link" is NOT a data object in itself.
Each file is stored on the filesystem in available data blocks, and an
inode is allocated that stores the pointers to the data blocks, and
everything there is to know about the file - except the name.
Names are associated to files via special files called directories. They
are nothing more than files that the system treats a little differently,
and all they are is a table that lists inodes and the names we want to
associate with them.
A "hard link" is nothing more than adding an additional directory entry
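that points to the same inode. So two or more different names refer to
the same inode, and thus the same data blocks. Because inode numbers are
only unique within one filesystem, all of the names have to live on that
same filesystem.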
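
To make that concrete, here is a minimal Python sketch (my own illustration, not anything NetWorker-specific). It assumes a POSIX filesystem that supports hard links; the names TEST and TEST.ln are borrowed from George's test, and demo_hard_link is just a hypothetical helper name.

```python
import os
import tempfile

def demo_hard_link():
    """Create a file plus a hard link and report what both names refer to."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "TEST")
        link = os.path.join(d, "TEST.ln")
        with open(original, "w") as f:
            f.write("hello\n")
        # os.link adds a second directory entry for the same inode.
        os.link(original, link)
        # Data appended through one name is visible through the other,
        # because there is only one set of data blocks.
        with open(link, "a") as f:
            f.write("more\n")
        with open(original) as f:
            content = f.read()
        return os.stat(original), os.stat(link), content

st_a, st_b, content = demo_hard_link()
print("same inode:", st_a.st_ino == st_b.st_ino)
print("content seen through the first name:", repr(content))
```

Neither name is "the original" here; both stat results describe the same inode.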
When you create a hard link, the "link count" value in the inode (which
starts at one for a newly created file) is incremented. When you remove a
directory entry ("rm" the file), it is decremented. Once the link count
reaches zero and no process still has the file open, the data blocks and
inode are marked unused.
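
The link count is visible from user space as st_nlink. A small sketch, again assuming a typical UNIX filesystem (link_counts is a made-up helper name):

```python
import os
import tempfile

def link_counts():
    """Return the link count after create, after link, and after unlink."""
    counts = []
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")
        open(a, "w").close()
        counts.append(os.stat(a).st_nlink)   # freshly created file
        os.link(a, b)
        counts.append(os.stat(a).st_nlink)   # second directory entry added
        os.unlink(a)
        counts.append(os.stat(b).st_nlink)   # one entry removed, file survives
    return counts

print(link_counts())
```

Note that removing "a" does not touch the data; the file stays reachable through "b" until that entry is removed as well.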
Now, as far as what NetWorker does, I'm not sure how it handles them
specifically. However, NetWorker does keep track of the inodes of the
files that are being backed up (you can see them with "ls -la" in
recover), and when it encounters another directory entry to a file it
has already backed up, it simply adds another index entry, referring to
the same data.
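
I don't know how NetWorker implements this internally, so treat the following purely as a sketch of the general technique: a backup tool can remember each (device, inode) pair it has already saved and, on meeting it again, record only an index entry. The function names build_index and demo are hypothetical.

```python
import os
import stat
import tempfile

def build_index(root):
    """Walk root; save each inode's data once, index every pathname."""
    seen = {}          # (st_dev, st_ino) -> first path saved with its data
    index = {}         # every path -> the path whose saved data it shares
    bytes_saved = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()                      # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue                     # this sketch handles regular files only
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen[key] = path
                bytes_saved += st.st_size    # data is "written" only once
            index[path] = seen[key]
    return index, bytes_saved

def demo():
    with tempfile.TemporaryDirectory() as d:
        test = os.path.join(d, "TEST")
        with open(test, "wb") as f:
            f.write(b"x" * 10240)
        os.link(test, os.path.join(d, "TEST.ln"))
        index, saved = build_index(d)
        names = {os.path.basename(p): os.path.basename(src)
                 for p, src in index.items()}
        return names, saved

print(demo())
```

With a 10 KB file and one hard link to it, the index holds two pathnames but only 10 KB is counted, which matches what George observed.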
When you recover the file, it isn't going to "recover the original and
rename it", since which exactly _is_ the original? That information is
not stored anywhere. If you only recover one particular entry to the
file, a new file is created (inode and data blocks), and a new directory
entry is created. It won't get its original inode number (except by
amazing luck) because that inode may be in use already. If you recover
more than one link to the same file, only one "file" is created - but
each directory entry recovered will point to the same inode - so there
is still only one "copy" of the data.
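
The recover side can be sketched the same way. This is an assumption about how such a tool could behave, not a description of NetWorker's code: the first entry of a group creates a new file, later entries of the same group in the same session are linked to it. The names restore, demo and the group id "g1" are invented for the example.

```python
import os
import tempfile

def restore(entries, data, dest):
    """entries: list of (name, group) pairs; data: group -> saved bytes."""
    created = {}       # group -> path already restored in this session
    for name, group in entries:
        target = os.path.join(dest, name)
        if group in created:
            os.link(created[group], target)      # just another directory entry
        else:
            with open(target, "wb") as f:        # new inode, new data blocks
                f.write(data[group])
            created[group] = target

def demo():
    data = {"g1": b"payload"}
    # Case 1: both names recovered together -> one file, two entries.
    with tempfile.TemporaryDirectory() as d:
        restore([("TEST", "g1"), ("TEST.ln", "g1")], data, d)
        a = os.stat(os.path.join(d, "TEST"))
        b = os.stat(os.path.join(d, "TEST.ln"))
        together = (a.st_ino == b.st_ino, a.st_nlink)
    # Case 2: TEST still exists on disk, only TEST.ln is recovered.
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "TEST"), "wb") as f:
            f.write(b"payload")
        restore([("TEST.ln", "g1")], data, d)
        a = os.stat(os.path.join(d, "TEST"))
        b = os.stat(os.path.join(d, "TEST.ln"))
        alone = (a.st_ino == b.st_ino, a.st_nlink)
    return together, alone

print(demo())
```

In the second case the recovered TEST.ln is an independent copy, since nothing tells the restore that the TEST already on disk is "the same" file.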
What if you have multiple links to the same file, delete one of them,
update the data through another, then recover the one that was deleted?
I'm not 100% sure, but I believe the result will be that the recovered
version will be a separate file unto itself, and not related to the
originals any more, so now you have two distinct copies.
What if, in the same scenario, you didn't delete the link, and recover
with the "overwrite" option? Well, that may depend on the file
operations used - either it truncates and rewrites the existing data, or
it unlinks the one you are recovering and you end up with a separate
file as above. I'd have to run a test to be sure, but can't at the moment.
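
Until someone runs that test, the two possibilities can at least be shown side by side. This sketch does not claim which one NetWorker uses; overwrite_in_place, overwrite_by_replacing and demo are hypothetical names.

```python
import os
import tempfile

def overwrite_in_place(path, data):
    """Truncate and rewrite the existing inode; every link sees the new data."""
    with open(path, "wb") as f:
        f.write(data)

def overwrite_by_replacing(path, data):
    """Unlink the name first, then create a brand-new file under it."""
    os.unlink(path)
    with open(path, "wb") as f:
        f.write(data)

def demo():
    results = {}
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "TEST")
        b = os.path.join(d, "TEST.ln")
        with open(a, "wb") as f:
            f.write(b"current")
        os.link(a, b)

        overwrite_in_place(b, b"recovered")
        with open(a, "rb") as f:
            results["in_place_other_name_sees"] = f.read()
        results["in_place_same_inode"] = os.stat(a).st_ino == os.stat(b).st_ino

        overwrite_by_replacing(b, b"recovered again")
        with open(a, "rb") as f:
            results["replace_other_name_sees"] = f.read()
        results["replace_same_inode"] = os.stat(a).st_ino == os.stat(b).st_ino
    return results

print(demo())
```

In the first case the link survives but the data behind every name is replaced; in the second the other names keep their data and the recovered name becomes a separate file.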
Note that symbolic links are completely and utterly different and are
nothing more than files that contain a path to the destination file, and
that are backed up and recovered as a regular file would be - except
they are very small.
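
For contrast, a short sketch of a symbolic link, under the same POSIX assumptions as above (TEST.sym and demo_symlink are names made up for the example):

```python
import os
import tempfile

def demo_symlink():
    """Create a 10 KB file and a symlink to it; report how they differ."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "TEST")
        sym = os.path.join(d, "TEST.sym")
        with open(target, "wb") as f:
            f.write(b"x" * 10240)
        os.symlink(target, sym)
        return {
            "target": target,
            "stored_path": os.readlink(sym),       # the link's own content
            "is_symlink": os.path.islink(sym),
            "own_inode": os.lstat(sym).st_ino != os.stat(target).st_ino,
            "symlink_size": os.lstat(sym).st_size, # size of the link itself
            "target_nlink": os.stat(target).st_nlink,
        }

info = demo_symlink()
print(info["stored_path"], info["symlink_size"], info["target_nlink"])
```

The symlink has its own inode and stores only a pathname, and the target's link count is not affected by it.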
George Sinclair wrote:
Does anyone know how NetWorker handles the backup/recovery of hard links
or how it's supposed to?
I tested backing up two files (TEST and TEST.ln). TEST was approx 10 KB.
The second file is a hard link to the first. The one thing I noticed is
that NetWorker seems to indicate that only 10 KB was backed up. If I run
nwrecover, I can see entries for both files, however. I guess this makes
sense because I wouldn't think NetWorker would actually back up files to
tape that are hard links since they share the same inode, and that would
be redundant data on tape, plus they are actually the same file as far
as the OS is concerned, anyway, and NetWorker only sees what the OS
tells it. In other words, if I have 10 original files totaling 50 MB,
and I then create a hard link to each one (link count now = 2 for all 20
files), and I back up all 20 files, NetWorker will indicate that only 50
MB was backed up, not 100 MB even though recover and verbose show all 20
pathnames.
So is NetWorker merely updating the client index with information about
the hard link but not writing it to tape?
Another thing I notice is that if I remove the hard link (TEST.ln), and
then I recover the hard link, NetWorker indicates that it's reading 10
KB, and when it restores it, while it has the same mtime, and same name
(TEST.ln), it now has a new inode, so it's technically no longer a hard
link to TEST. A diff between the files shows no differences, but editing
one is not reflected in the other, so NetWorker did not recover it as a
hard link. It appears to be just an ordinary copy with the original name.
How is it able to recover it? Does it just recover the original and
rename it TEST.ln? Is this the behavior one would expect?
Does it make any sense to back up hard links? My testing shows that they
are not recovered as hard links, so it seems pointless to do so. You'd have
your pathnames back, but they would just be copies taking up space, not
actual hard links. You'd have to re-create the hard links from scratch.
We're running an older 6.1.1 release on a Solaris 2.8 primary server. I
was doing my tests on a Linux client, but the backups and recovers were
from our Linux storage node.
Thanks.
George
To sign off this list, send email to listserv AT listserv.temple DOT edu and
type "signoff networker" in the body of the email. Please write to
networker-request AT listserv.temple DOT edu if you have any problems with
this list. You can access the archives at
http://listserv.temple.edu/archives/networker.html or via RSS at
http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER