Amanda-Users

Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 10:58:35
Subject: Re: 2.6.6-rc2 and newer cause trouble with amanda
From: Paul Bijnens <paul.bijnens AT xplanation DOT com>
To: Andreas Sundstrom <sunkan AT zappa DOT cx>
Date: Tue, 15 Jun 2004 16:49:56 +0200
Andreas Sundstrom wrote:


> I have now made a dump on the traffic passing through lo during the amdump. I
> have also recompiled amanda with these two settings (as another friendly person
> suggested): --with-tcpportrange=50000,50040 --with-udpportrange=890,899
>
> The dump is available at ftp://zappa.cx/pub/amanda.pcap
>
> I used this filter in ethereal to get rid of some other stuff that I suppose is
> not interesting: "! dns and ! tcp.port == 25"

It seems ethereal stored only the first few bytes of each packet.
There is probably an option to set that size, similar to "tcpdump -s 1500".
That means I don't have the full info, but I believe I've seen enough!

There is something strange indeed.

13:00:30.573676 192.168.20.100.amanda > 192.168.20.100.890: udp 125 (DF)
0x0000   4500 0099 008d 4000 4011 8fae c0a8 1464        E.....@.@......d
0x0010   c0a8 1464 2760 037a 0085 aebc 416d 616e        ...d'`.z....Aman
0x0020   6461 2032 2e34 2052 4550 2048 414e 444c        da.2.4.REP.HANDL
0x0030   4520 3030 302d 4338 4443 3036 3038 2053        E.000-C8DC0608.S
0x0040   4551 2031 3038 3732 3937 3037 310a 434f        EQ.1087297071.CO
0x0050   4e4e                                           NN
13:00:30.574146 192.168.20.100.890 > 192.168.20.100.amanda: udp 50 (DF)
0x0000   4500 004e 0001 4000 4011 9085 c0a8 1464        E..N..@.@......d
0x0010   c0a8 1464 037a 2760 003a b654 416d 616e        ...d.z'`.:.TAman
0x0020   6461 2032 2e34 2041 434b 2048 414e 444c        da.2.4.ACK.HANDL
0x0030   4520 3030 302d 4338 4443 3036 3038 2053        E.000-C8DC0608.S
0x0040   4551 2031 3038 3732 3937 3037 310a             EQ.1087297071.

The above was the request to set up the tcp connections.  The tracer did
not dump the whole packet: it broke off after "CONN", which is followed by
the tcp port numbers, but you can find those in one of the
amandad.XXXX.debug logs as well.
Normally, there should be three consecutive numbers.
The three tcp port numbers are used in the next exchange to set up
three tcp connections, for data, error messages, and index respectively.
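
To see how much of the REP packet the tracer actually dropped, the IP and
UDP headers in the hex dump can be decoded by hand. Here is a minimal
Python sketch of that; the function name decode_ipv4_udp is mine, and the
only input is the first 28 bytes copied from the dump above.

```python
import struct

# IPv4 header (20 bytes) + UDP header (8 bytes) of the REP packet,
# copied from the first two rows of the hex dump.
HEADER_HEX = (
    "4500 0099 008d 4000 4011 8fae c0a8 1464 "
    "c0a8 1464 2760 037a 0085 aebc"
)

# Bytes shown in the dump: five full rows of 16 plus 2 bytes on the last row.
CAPTURED_BYTES = 5 * 16 + 2


def decode_ipv4_udp(raw):
    """Decode the fields of interest from an IPv4 + UDP header."""
    ihl = (raw[0] & 0x0F) * 4                      # IP header length in bytes
    total_len = struct.unpack("!H", raw[2:4])[0]   # whole IP datagram
    src = ".".join(str(b) for b in raw[12:16])
    dst = ".".join(str(b) for b in raw[16:20])
    sport, dport, udp_len, _checksum = struct.unpack("!HHHH", raw[ihl:ihl + 8])
    return {
        "src": src,
        "dst": dst,
        "sport": sport,
        "dport": dport,
        "total_len": total_len,
        "payload_len": udp_len - 8,                # UDP length includes its header
    }


info = decode_ipv4_udp(bytes.fromhex(HEADER_HEX))
missing = info["total_len"] - CAPTURED_BYTES

if __name__ == "__main__":
    print(info)
    print("bytes missing from the capture:", missing)
```

The decoded payload length agrees with the "udp 125" that tcpdump printed,
and the source port is 10080, which tcpdump shows by its service name
"amanda". The IP total length is larger than what was captured, so the
port numbers after "CONN" are simply not in this dump.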

13:00:45.622602 192.168.20.100.50027 > 192.168.20.100.50001: S ...
13:00:45.622696 192.168.20.100.50001 > 192.168.20.100.50027: S ...
13:00:45.622815 192.168.20.100.50027 > 192.168.20.100.50001: . ack ...

This was the handshake for the first connection: the data connection.
(I have shortened the lines to fit on screen.)
It connected to port 50001.

13:00:45.625090 192.168.20.100.50028 > 192.168.20.100.50002: S ...
13:00:45.625165 192.168.20.100.50002 > 192.168.20.100.50028: S ...
13:00:45.625237 192.168.20.100.50028 > 192.168.20.100.50002: . ack ...

The second handshake, to port 50002, is for the error messages.

13:00:45.627502 192.168.20.100.50029 > 192.168.20.100.65535: S ...
13:00:45.627564 192.168.20.100.65535 > 192.168.20.100.50029: R ...

You would expect a handshake to port 50003, for the index, but
instead there is a connection to port 65535, which is rejected.
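
The same check can be done mechanically on the trace text. The sketch
below is only an illustration: it parses the shortened tcpdump lines
quoted above, takes the first "S" packet of each address pair as the
connection attempt, and compares the target ports with the ones I would
expect. The expected list (50001, 50002, 50003) is my assumption; it is
not taken from the truncated CONNECT line.

```python
import re

# The shortened trace lines from this message.
TRACE = """\
13:00:45.622602 192.168.20.100.50027 > 192.168.20.100.50001: S ...
13:00:45.622696 192.168.20.100.50001 > 192.168.20.100.50027: S ...
13:00:45.622815 192.168.20.100.50027 > 192.168.20.100.50001: . ack ...
13:00:45.625090 192.168.20.100.50028 > 192.168.20.100.50002: S ...
13:00:45.625165 192.168.20.100.50002 > 192.168.20.100.50028: S ...
13:00:45.625237 192.168.20.100.50028 > 192.168.20.100.50002: . ack ...
13:00:45.627502 192.168.20.100.50029 > 192.168.20.100.65535: S ...
13:00:45.627564 192.168.20.100.65535 > 192.168.20.100.50029: R ...
"""

# timestamp, "addr.port", ">", "addr.port:", flags
LINE_RE = re.compile(r"^\S+ (\S+) > (\S+): (\S+)")

# Assumed expectation: data, error messages, index on consecutive ports.
EXPECTED_PORTS = [50001, 50002, 50003]


def initial_syn_ports(trace):
    """Return the destination port of each first SYN, in trace order."""
    seen = set()
    ports = []
    for line in trace.splitlines():
        m = LINE_RE.match(line)
        if not m or m.group(3) != "S":
            continue
        src, dst = m.group(1), m.group(2)
        pair = frozenset((src, dst))
        if pair in seen:
            continue  # the answering SYN of a connection already counted
        seen.add(pair)
        ports.append(int(dst.rsplit(".", 1)[1]))
    return ports


targets = initial_syn_ports(TRACE)
unexpected = [p for p in targets if p not in EXPECTED_PORTS]

if __name__ == "__main__":
    print("connection attempts to:", targets)
    print("not in the expected set:", unexpected)
```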

13:00:45.628082 192.168.20.100.50027 > 192.168.20.100.50001: F ...
13:00:45.628177 192.168.20.100.50028 > 192.168.20.100.50002: F ...
13:00:45.628849 192.168.20.100.50001 > 192.168.20.100.50027: . ack ...
13:00:45.628880 192.168.20.100.50002 > 192.168.20.100.50028: . ack ...

And amanda cleans up the other two connections.
Amanda tries again a few times with another set of ports,
but always tries to connect to 65535 for the index.
Then she gives up completely.


Can you verify in the amandad.XXXX.debug file that the index connection
was indeed requested on port 50003?
The relevant part of the debug file looks like this (search for the
string CONNECT):

  ====
  Amanda 2.4 REP HANDLE 000-C8DC0608 SEQ 1087282569
  CONNECT DATA 32771 MESG 32772 INDEX 32773
  OPTIONS features=fffffeff9ffe0f;
  ----
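
If you would rather not read the debug file by eye, something like the
following pulls out the three ports and checks that they are consecutive.
It is a rough sketch that assumes the CONNECT line has exactly the layout
of the sample above; parse_connect and is_consecutive are names I made up.

```python
import re

# Matches a line laid out like the sample: CONNECT DATA n MESG n INDEX n
CONNECT_RE = re.compile(
    r"^\s*CONNECT DATA (\d+) MESG (\d+) INDEX (\d+)\s*$", re.MULTILINE
)

# The sample from this message, used here as test input.
SAMPLE = """\
Amanda 2.4 REP HANDLE 000-C8DC0608 SEQ 1087282569
CONNECT DATA 32771 MESG 32772 INDEX 32773
OPTIONS features=fffffeff9ffe0f;
"""


def parse_connect(text):
    """Return the three ports of the first CONNECT line, or None."""
    m = CONNECT_RE.search(text)
    if m is None:
        return None
    data, mesg, index = (int(g) for g in m.groups())
    return {"DATA": data, "MESG": mesg, "INDEX": index}


def is_consecutive(ports):
    """True if MESG and INDEX directly follow DATA."""
    return (ports["MESG"] == ports["DATA"] + 1
            and ports["INDEX"] == ports["DATA"] + 2)


ports = parse_connect(SAMPLE)

if __name__ == "__main__":
    print(ports, "consecutive:", is_consecutive(ports))
```

To use it on a real log, read the file contents into a string and pass
that to parse_connect instead of SAMPLE.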

The next thing to find out is who decided to connect to port 65535
instead, and why and when.  Also note that number: it is all 1-bits,
16 bits wide (0xffff).
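
Why that number looks suspicious: 65535 is what you get when -1 ends up
in an unsigned 16-bit field such as a port number. The few lines below
only demonstrate the arithmetic; that a -1 is really what happened here
is a guess on my part, and nothing in the trace confirms it.

```python
import ctypes
import struct


def as_uint16(value):
    """Keep only the low 16 bits, as an unsigned 16-bit field would."""
    return value & 0xFFFF


# -1 squeezed into 16 unsigned bits becomes 65535.
minus_one_as_port = as_uint16(-1)

# The same conversion as done by a C unsigned short.
c_style = ctypes.c_uint16(-1).value

# On the wire (network byte order) the port is two 0xff bytes.
wire_bytes = struct.pack("!H", 65535)

# All sixteen bits are set.
bit_string = format(65535, "016b")

if __name__ == "__main__":
    print(minus_one_as_port, c_style, wire_bytes, bit_string)
```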

A kernel bug is indeed one of the possibilities.

Just for fun: if you disable the indexing, then the backup will run
fine, I believe.  ("index no" in dumptype).


--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit,  ZZ, :q, :q!,  M-Z, ^X^C,  logoff, logout, close, bye,  /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* kill -9 1,  Alt-F4,  Ctrl-Alt-Del,  AltGr-NumLock,  Stop-A,  ...    *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out          *
***********************************************************************


