Amanda-Users

Re: NFS mount tar incremental problem

2008-01-25 05:53:50
Subject: Re: NFS mount tar incremental problem
From: Paul Bijnens <Paul.Bijnens AT xplanation DOT com>
To: Jordan Desroches <jordan.d.desroches AT Dartmouth DOT EDU>
Date: Fri, 25 Jan 2008 11:43:48 +0100
On 2008-01-25 02:08, Jordan Desroches wrote:
To answer a portion of my own question, the Linux command (for the NFS mount point /mnt/thayerfs/home):

$ stat --format=%d /mnt/thayerfs/home

will give the decimal device number necessary to run the tar-snapshot-edit script. It still doesn't answer the more puzzling question of why tar is not picking up the mount as NFS, as the documentation says it should.
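
For anyone who would rather script this than call stat by hand, here is a minimal Python sketch of the same lookup. It assumes a Unix-like client; the function name device_number is made up for illustration and is not part of Amanda or tar. The value comes from the st_dev field, which is what stat prints for %d.

```python
import os


def device_number(path: str) -> int:
    """Return the decimal device number of the filesystem holding `path`.

    This is the st_dev field, the same value `stat --format=%d` reports.
    """
    return os.stat(path).st_dev


if __name__ == "__main__":
    # Mount point taken from the thread; substitute your own.
    mount_point = "/mnt/thayerfs/home"
    if os.path.exists(mount_point):
        dev = device_number(mount_point)
        # os.major/os.minor split the number into its two components.
        print(dev, os.major(dev), os.minor(dev))
```

The number printed is only meaningful for the current mount; after a remount or reboot it may differ.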

The idea to use the device number to identify the device is wrong.
Quoting Linus himself:
  "The device number is a random cookie, not a unique identifier."

  http://lwn.net/Articles/65195/

It used to be that the "device number" was a static number
(e.g. something like "device number = major*256+minor", in the days
when major and minor numbers were hardcoded in the kernel).

The right way currently is to not rely on the "device number"
to uniquely identify a device.  It is only unique among the
device numbers currently present on the system; the same device,
when unmounted and remounted, is not guaranteed to get the same
device number again.  That is why udev was invented: there
you can use some other property of the device to get a static
name:

  http://www.kernel.org/pub/linux/utils/kernel/hotplug/udev_vs_devfs

Gnutar still relies on the device number being
static and never changing, not even after a reconfiguration
of the devices (e.g. a reboot, or removing and reattaching a device).
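
Since the device number can change across a remount, one way to notice the problem before a backup run is to compare the number recorded in the snapshot file with the live one. The following is only a sketch under that assumption: device_number_changed is a hypothetical helper, and how gnutar itself compares the values is not shown in this thread.

```python
import os


def device_number_changed(recorded_dev: int, path: str) -> bool:
    """Return True if `path` now sits on a different device number
    than the one recorded earlier (e.g. in a gnutar snapshot file)."""
    return os.stat(path).st_dev != recorded_dev


if __name__ == "__main__":
    # 23 is the device number seen in the pre-reboot snapshot excerpts
    # quoted in this thread; the path is the mount point from the thread.
    mount_point = "/mnt/thayerfs/home"
    if os.path.exists(mount_point) and device_number_changed(23, mount_point):
        print("device number changed; the snapshot file needs editing")
```

A True result suggests the next incremental would redump everything unless the snapshot file is edited first.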




Best,

Jordan

On Jan 23, 2008, at 9:29 PM, Jordan Desroches wrote:

I've dug further into the gnutar-lists directory, and I think I know what is causing the problem, but I don't quite know what to do about it. I have an NFS-mounted directory, /mnt/thayerfs/home.

Here is a section of the incremental file from 1/19/08:

0^@1174939899^@122745500^@23^@10709081^@./chen..........

here from 1/20/08:

0^@1190136047^@734375000^@23^@553774^@./gre.....

here from 1/21/08:

0^@1166633525^@49097900^@23^@4647068^@./sp....

here from the 1/22/08:

0^@1190143403^@296875000^@23^@2470941^@./oli....

and here, from 1/23/08:

0^@1191075142^@436061900^@22^@21981625^@./st.....

The first thing that raises my concern is the leading 0. I believe, according to the tar docs, that it indicates that tar is NOT detecting the mount as an NFS-mounted partition. If it had detected it as an NFS partition (which it should, because apparently tar treats NFS partitions differently), there would be a leading 1.

The second thing is that the device number changes between the 22nd and the 23rd, from 23 to 22. There was a reboot between those days.
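
To make the two observations above concrete, here is a small Python sketch that splits the leading fields of one directory record. The field order (NFS flag, mtime seconds, mtime nanoseconds, device, inode, name) is my reading of the excerpts and of the tar documentation, so treat it as an assumption; DirRecord and parse_dir_header are made-up names, and the directory names are left truncated as they appear above.

```python
from typing import NamedTuple


class DirRecord(NamedTuple):
    nfs: bool        # leading flag: True if tar recorded the directory as NFS
    mtime_sec: int
    mtime_nsec: int
    dev: int         # device number at the time of the dump
    ino: int
    name: str


def parse_dir_header(record: bytes) -> DirRecord:
    """Parse the first six NUL-separated fields of a directory record."""
    fields = record.split(b"\0", 6)
    if len(fields) < 6:
        raise ValueError("record has fewer than six fields")
    nfs, sec, nsec, dev, ino, name = fields[:6]
    return DirRecord(nfs == b"1", int(sec), int(nsec), int(dev), int(ino),
                     name.decode("utf-8", "replace"))


def from_caret_notation(text: str) -> bytes:
    """Turn the ^@ notation used in this thread back into real NUL bytes."""
    return text.replace("^@", "\0").encode()


# Records from 1/22/08 and 1/23/08 as quoted above.
before = parse_dir_header(
    from_caret_notation("0^@1190143403^@296875000^@23^@2470941^@./oli...."))
after = parse_dir_header(
    from_caret_notation("0^@1191075142^@436061900^@22^@21981625^@./st....."))

print(before.nfs, before.dev, after.dev)
```

Both records carry a 0 NFS flag, and the device field differs between them, which matches what I see by eye.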

Is there a way of preventing the device number from changing? If not, is knowing the device number enough to run the script Paul suggested? If running the script is the solution (and to ask a potentially, I hope, simple question), how do I determine the device number of an NFS-mounted partition so I can tell the script to change it?

Thanks for your help :-)

Jordan

On Jan 23, 2008, at 11:19 AM, Paul Bijnens wrote:



Jordan Desroches wrote:
Hello all,
I've been having a problem with incremental dumps on an NFS-mounted NetApp. AMANDA runs great until I reboot the client (or remount the NFS shares on the client). At that point, while calcsize predicts what I believe is the correct incremental dump size, tar proceeds to do a full dump of all the NFS-mounted files. I believe this has to do with something changing between mounts that tar is interpreting as a change to all files.

Upon reading some of the documentation for tar, it indicated that in the incremental dump gnutar-lists, there should be a "1" preceding every entry to indicate that the file is NFS mounted, because (quoting http://www.gnu.org/software/automake/manual/tar/Incremental-Dumps.html):

"Metadata stored in snapshot files include device numbers, which, obviously is supposed to be a non-volatile value. However, it turns out that NFS devices have undependable values when an automounter gets in the picture. This can lead to a great deal of spurious redumping in incremental dumps, so it is somewhat useless to compare two NFS devices numbers over time. The solution implemented currently is to considers all NFS devices as being equal when it comes to comparing directories; this is fairly gross, but there does not seem to be a better way to go."

Here is an example from one of my gnutar-lists, showing what I believe are preceding zeroes, indicating that tar thinks that the files are not on NFS: 1201070794^@37216648^@0^@947801240^@0^@24^@8623377^@./unclaimed_afs/nmlhome/mcbride/.desktop-nauset.dartmouth.edu/0.0^@Y4Dwmdeskname^@Y4Dwmdesks^@Y4Dwmdesks.bak^@Y4Dwmsession^@^@^@0^@1180594753^@523384000^@24^@9457059^@./spacescience/web/wl/per/HenrysForkFishing^@YIMG_0103.jpg^@YIMG_0104.jpg^@YIMG_0105.jpg^@YIMG_0106.jpg^@YIMG_0107.jpg^@YIMG_0108.jpg^@YIMG_0109.jpg^@YIMG_0110.jpg^@YIMG_0111.jpg^@YIMG_0112.jpg^@YIMG_0113.jpg^@YIMG_0114.jpg^@YIMG_0115.jpg^@YIMG_0116.jpg^@YIMG_0117.jpg^@YIMG_0118.jpg^@YIMG_0119.jpg^@YIMG_0120.jpg^@YIMG_0121.jpg^@YIMG_0214.jpg^@^@^@0^@1170258810^@0^@24^@11505238^@./paulsen/MAC_Keith/Mac_NIH/Proposals/Breast PPG/Original Proposal '98/Letters^@
Here's how the FS is mounted in /etc/fstab:
192.168.0.2:/vol/research /mnt/thayerfs/research nfs hard,rsize=32768,wsize=32768 0 0
And here is an example disk list entry:
tardis  /mnt/thayerfs/research_p-z /mnt/thayerfs/research {
nocomp-test
include "./[p-zP-Z]*"
}
Has anyone run into this problem, or know how to fix it?


Very related to this:

http://wiki.zmanda.com/index.php/Tar_dumps_every_file_in_a_level-1_backup_after_a_hardware_change

and you can fix it (each time the change happens!!) with this script:

http://www.gnu.org/software/tar/utils/tar-snapshot-edit.html

This is actually a gnutar problem...


--
Paul Bijnens, xplanation Technology Services        Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out          *
***********************************************************************





