Amanda-Users

Re: backup/recover using tar and hard links

2007-10-17 22:58:43
Subject: Re: backup/recover using tar and hard links
From: "Dustin J. Mitchell" <dustin AT zmanda DOT com>
To: "Olivier Nicole" <on AT cs.ait.ac DOT th>
Date: Wed, 17 Oct 2007 21:52:13 -0500
On 10/17/07, Olivier Nicole <on AT cs.ait.ac DOT th> wrote:
> That sounds like strange behaviour to me: every time a user reads the
> message, the file is modified (Status: R added), so the hard link is
> broken for that user.

Olivier -- that change is in the index, which Cyrus imapd keeps
separate from the messages themselves.  It really does do this
duplicate elimination.  IIRC, it also does duplicate elimination by
lazily hashing files to look for duplicates that the MTA didn't know
about (e.g., the same spam delivered from different bots in a
botherd).

> > I'm trying to recover the contents of a mailbox, but it contains some
> > hard links to mail messages in other mailboxes. Some of those other
> > mailboxes (that contained the actual file) have been removed (deleted)
> > in the past few months. The amrecover (tar actually) program complains
> > that it cannot hard link to <filename> because the other mailbox/file
> > doesn't exist: "tar: ./user/student5/626.: Cannot hard link to
> > `./user/student2/555.': No such file or directory".
>
> That also sounds weird to me: a hard link is a single file shared in
> multiple directories. The file does not reside inside one directory
> with links from the others; it is the same file under different
> names.

This sounds weird to me, too, and may be a bug in GNU Tar.  Here's a
test case (on my Mac desktop with tar 1.13.25):

erdos:~/tmp dustin$  mkdir files
erdos:~/tmp dustin$  echo "file1" > files/file1
erdos:~/tmp dustin$  ln files/file1 files/file2
erdos:~/tmp dustin$  ls -li files/ # verify both have the same inode
total 16
36261200 -rw-r--r--  2 dustin  dustin  6 17 Oct 21:38 file1
36261200 -rw-r--r--  2 dustin  dustin  6 17 Oct 21:38 file2
erdos:~/tmp dustin$  tar -cf files.tar files
erdos:~/tmp dustin$  rm files/file*
erdos:~/tmp dustin$  tar -xf files.tar files/file2
tar: files/file2: Cannot hard link to `files/file1': (null)
tar: Error exit delayed from previous errors

The same test on one of my Gentoo Linux boxes, with tar 1.18, gives:
tar: files/file2: Cannot hard link to `files/file1': No such file or directory

I'm guessing that tar notices inodes it's seen before, and stores the
later names as link entries that just point at the first name, with no
data of their own.  When tar finds the information about file2, it
tries to make a hard link, but isn't smart enough to realize it hasn't
extracted the target file.  Dennis, do you want to follow up on
bug-tar?
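
For what it's worth, the link entries can be inspected without GNU
tar.  Here is a minimal sketch using Python's standard tarfile module;
it rebuilds the files/file1 and files/file2 case from the test above in
a temporary directory.  The helper names hard_link_entries and
build_demo_archive are made up for illustration, and it assumes a
filesystem that supports hard links.

```python
import os
import tarfile
import tempfile


def hard_link_entries(tar_path):
    """Return {member name: link target} for the hard-link entries in an archive."""
    with tarfile.open(tar_path) as tar:
        return {m.name: m.linkname for m in tar.getmembers() if m.islnk()}


def build_demo_archive(workdir):
    """Create file1, hard-link it as file2, and archive both (hypothetical helper)."""
    file1 = os.path.join(workdir, "file1")
    file2 = os.path.join(workdir, "file2")
    with open(file1, "w") as f:
        f.write("file1\n")
    os.link(file1, file2)  # same inode, second name

    tar_path = os.path.join(workdir, "files.tar")
    with tarfile.open(tar_path, "w") as tar:
        # Added in this order, so file2 is the name seen second.
        tar.add(file1, arcname="files/file1")
        tar.add(file2, arcname="files/file2")
    return tar_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(hard_link_entries(build_demo_archive(workdir)))
```

Only the first name carries the file data; the second name is recorded
as a link to it, which is why extracting file2 on its own has nothing
to link to.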

> Short answer: if you have enough temporary disk space, restore all the
> mailboxes and you should have no more missing link problems. Then you
> should be able to move only the mailbox of that specific user.

Precisely.  The other option is to repeat the recovery, adding the
file that isn't found at each step; in this case, you'd add
"./user/student2/555." and retry the extraction.
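
If you can get at the tar image itself, the retry loop could in
principle be done up front.  This is only a rough sketch using Python's
tarfile module, not anything Amanda or amrecover provides: the function
names are hypothetical, and it assumes the link targets are in the same
archive as the links and that members are named by plain path prefixes.

```python
import tarfile


def selected(name, wanted):
    """True if name is one of the wanted paths or lies under one of them."""
    return any(name == w or name.startswith(w.rstrip("/") + "/") for w in wanted)


def with_link_targets(tar_path, wanted):
    """Return the wanted paths plus any hard-link targets they depend on."""
    with tarfile.open(tar_path) as tar:
        members = tar.getmembers()

    needed = list(wanted)
    for m in members:
        # A hard-link entry has no data of its own, so its target must be
        # extracted too if it is not already covered by the selection.
        if m.islnk() and selected(m.name, needed) and not selected(m.linkname, needed):
            needed.append(m.linkname)
    return needed
```

The returned list is what you would then ask for in a single
extraction, instead of finding the missing targets one error at a time.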

Obviously, this isn't ideal.  I'm surprised nobody else has been
snagged by this before.  I've been using Amanda on Cyrus mailboxes for
years, with lots of recoveries.  I guess I've just gotten lucky.

Dustin

-- 
Storage Software Engineer
http://www.zmanda.com
