Amanda-Users

Re: some filesystems fail on a host

2005-07-07 14:15:06
Subject: Re: some filesystems fail on a host
From: Oscar Ricardo Silva <osilva AT scuff.cc.utexas DOT edu>
To: amanda-users AT amanda DOT org
Date: Thu, 07 Jul 2005 13:03:50 -0500
ARGH! Just to try something different I turned off iptables on the amanda server last night and everything worked fine. I was so sure it wasn't the cause since on another system two out of three filesystems were being backed up. On the one below, one out of four filesystems worked.

I'll slink off to the corner now ... and excuse the top-posting ...


Oscar


At 07:52 PM 7/6/2005, Paul Bijnens wrote:
Oscar Ricardo Silva wrote:
I have a few hosts where some filesystems fail to be backed up. I thought it might be firewall/iptables issues but the fact that at least one filesystem on the host is successful seems to ruin that idea. Also,
...
scuff.cc.utexas.edu  /      1      2030      2030   --    0:01 1401.2    0:02   838.6
scuff.cc.utexas.edu  /home  0 FAILED ---------------------------------------------------
scuff.cc.utexas.edu  /usr   0 FAILED ---------------------------------------------------
scuff.cc.utexas.edu  /var   0 FAILED ---------------------------------------------------

Are you sure the above report and details of amandad below are
from the same run?
Very very strange...


amandad: time 819.606: sending PREP packet:
----
Amanda 2.4 PREP HANDLE 016-38D50D09 SEQ 1120528857
OPTIONS features=fffffeff9ffe7f;
/ 0 SIZE 318970
/ 1 SIZE 2030
/home 0 SIZE 29349730
/home 1 SIZE 29349740
/var 0 SIZE 469370
/var 1 SIZE 469370
/var 2 SIZE 469370
/usr 0 SIZE 4436240

What is also very strange is that your lvl 0 and incremental lvls are
different (as expected) for / but are identical for /home and /var.
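The comparison can be checked mechanically. The following is a minimal Python sketch, not part of amanda: it parses the "disk level SIZE size" lines from the PREP packet quoted above and flags disks whose incremental estimate is no smaller than the level 0 estimate. The function names and the 0.99 threshold are illustrative assumptions. Note that for /home the level 1 estimate is actually 10 larger than level 0, so "identical" is only approximately true there.

```python
# Estimate lines copied from the PREP packet quoted above.
PREP_LINES = """\
/ 0 SIZE 318970
/ 1 SIZE 2030
/home 0 SIZE 29349730
/home 1 SIZE 29349740
/var 0 SIZE 469370
/var 1 SIZE 469370
/var 2 SIZE 469370
/usr 0 SIZE 4436240
"""


def parse_estimates(text):
    """Return {disk: {level: size}} from 'disk level SIZE size' lines."""
    estimates = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4 or parts[2] != "SIZE":
            continue  # skip anything that is not an estimate line
        disk, level, size = parts[0], int(parts[1]), int(parts[3])
        estimates.setdefault(disk, {})[level] = size
    return estimates


def suspicious_disks(estimates, threshold=0.99):
    """List disks whose smallest incremental is >= threshold * level 0.

    The threshold is an arbitrary illustrative value.
    """
    flagged = []
    for disk, levels in estimates.items():
        incrementals = [size for lvl, size in levels.items() if lvl > 0]
        if 0 in levels and incrementals and min(incrementals) >= threshold * levels[0]:
            flagged.append(disk)
    return flagged


if __name__ == "__main__":
    est = parse_estimates(PREP_LINES)
    for disk in suspicious_disks(est):
        print(disk, est[disk])
```

Run against the estimates above, this flags /home and /var but not /, which matches the observation. /usr has no incremental estimate, so it is not compared.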

Any idea why that is?  Do you run gnutar with --atime-preserve or so?
Or is there another program having the same effect?
Or are there problems with the gnutar-lists file (e.g. permissions)?
Is RH AS rel3 configured with SELinux, and is that prohibiting
something?



amandad: time 829.600: dgram_recv: timeout after 10 seconds
amandad: time 829.600: waiting for ack: timeout, retrying
amandad: time 839.600: dgram_recv: timeout after 10 seconds
amandad: time 839.600: waiting for ack: timeout, retrying
amandad: time 849.600: dgram_recv: timeout after 10 seconds
amandad: time 849.600: waiting for ack: timeout, retrying
amandad: time 859.600: dgram_recv: timeout after 10 seconds
amandad: time 859.600: waiting for ack: timeout, retrying
amandad: time 869.600: dgram_recv: timeout after 10 seconds
amandad: time 869.600: waiting for ack: timeout, giving up!
amandad: time 869.600: pid 29140 finish time Mon Jul  4 21:14:42 2005

Am I wrong and could it be some restrictions between client and server?

Usually this is indeed some firewall problem (the state for UDP reply
packets times out after 180 seconds by default in iptables, I believe,
and FW1 defaults to only 40 seconds).
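To make the timing argument concrete: the log above shows the PREP reply going out at time 819.606. Assuming amandad's debug timestamps count seconds from when it started handling the request, that is far longer than either timeout mentioned here. A minimal Python sketch of the comparison follows. The timeout figures are just the from-memory values in this message, not verified defaults, and the names are made up for illustration.

```python
# Time at which amandad sent the PREP packet, taken from the log above.
# Assumption: this counts seconds since amandad began handling the request.
REPLY_DELAY_SECONDS = 819.606

# Idle timeouts as recalled in this message; not verified defaults.
FIREWALL_UDP_TIMEOUTS = {"iptables": 180, "FW1": 40}


def expired_states(delay, timeouts):
    """Return the firewalls whose UDP state would have expired by `delay`."""
    return sorted(name for name, limit in timeouts.items() if delay > limit)


if __name__ == "__main__":
    for name in expired_states(REPLY_DELAY_SECONDS, FIREWALL_UDP_TIMEOUTS):
        limit = FIREWALL_UDP_TIMEOUTS[name]
        print(f"{name}: reply after {REPLY_DELAY_SECONDS:.0f}s exceeds {limit}s")
```

Under those assumptions both firewalls would already have dropped the state, so the reply would be treated as an unrelated incoming packet.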

Run tcpdump or ethereal on the server to verify whether the packet actually arrives.

A quick test, without amanda, can be done manually with netcat (nc):

On the server:  nc -u -l -p 900
On the client:  nc -vv -u theserver 900

And now everything you type on one side should appear on the other side.
Wait some time (40 seconds, 180, or 800 seconds) and type something
again, and verify if the connection is still valid.
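If netcat is not available, the same idea can be scripted. This is a rough Python sketch under stated assumptions, not an amanda tool: one side sends a UDP probe and waits, while the other side deliberately delays its reply, standing in for a long estimate. The function names, payload, and port choice are all hypothetical. On real hosts the delayed-echo half would run on the amanda client and the probing half on the amanda server, with the firewall in between. The demo at the bottom only exercises both halves over loopback.

```python
import socket
import threading
import time


def delayed_echo_server(sock, delay_seconds):
    """Receive one datagram, wait, then send it back to where it came from."""
    data, peer = sock.recvfrom(1024)
    time.sleep(delay_seconds)  # stands in for a long-running estimate
    sock.sendto(data, peer)


def reply_survives_delay(server_address, delay_seconds, grace_seconds=5.0):
    """Send a probe and report whether the delayed reply makes it back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(delay_seconds + grace_seconds)
        client.sendto(b"probe", server_address)
        try:
            data, _ = client.recvfrom(1024)
        except socket.timeout:
            return False  # nothing came back within the allowed time
        return data == b"probe"


if __name__ == "__main__":
    # Loopback demo in a single process; no firewall is involved here.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    worker = threading.Thread(target=delayed_echo_server, args=(server, 1.0))
    worker.start()
    print(reply_survives_delay(server.getsockname(), 1.0))
    worker.join()
    server.close()
```

Trying delays of roughly 40, 180, and 800 seconds between two real hosts, as suggested above, would show at which point the reply stops getting through.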


--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit,  ZZ, :q, :q!,  M-Z, ^X^C,  logoff, logout, close, bye,  /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* kill -9 1,  Alt-F4,  Ctrl-Alt-Del,  AltGr-NumLock,  Stop-A,  ...    *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out          *
***********************************************************************



