Amanda-Users

Re: index tee cannot write [Broken pipe]

2007-08-02 13:07:20
Subject: Re: index tee cannot write [Broken pipe]
From: Paul Bijnens <Paul.Bijnens AT xplanation DOT com>
To: Mike Gallant <gallantm AT indiana DOT edu>
Date: Thu, 02 Aug 2007 18:44:06 +0200
On 2007-08-02 15:56, Mike Gallant wrote:
I have an amanda (2.5.1p3) server (solaris10) with seven amanda clients (2.4.4) and a new amanda (2.5.2) client on new hardware. I have been getting a "index tee cannot write [Broken pipe]" error message. I have looked at all the various aspects I can think of but still no joy. Could someone point me in the right direction for resolving this.


The error "index tee cannot write" is the collateral damage of some
earlier problem.  The index tcp-connection got closed, probably because
of some other error.
See also:
  http://wiki.zmanda.com/index.php/Mesg_read:_Connection_reset_by_peer
which has a similar effect on the client.

Don't focus on this "index tee" but look earlier.
I'm interested in the error messages from the server-side as well.



I have included the sendbackup.debug (client) and a good part of the amandad (client) missing the repetitive entries.

Thanks,
Mike

sendbackup: debug 1 pid 14529 ruid 105 euid 105: start at Thu Aug  2 01:55:27 
2007
sendbackup: version 2.5.2
[...]
sendbackup: time 3.199:  91:  normal(|):   DUMP: Estimated 5852872 blocks 
(2857.85MB) on 0.04 tapes.

So you're trying to dump less than 3 Gbyte.


sendbackup: time 3.245:  91:  normal(|):   DUMP: Dumping (Pass III) 
[directories]
sendbackup: time 16237.077: index tee cannot write [Broken pipe]

Here the index tcp-connection got broken, proabably because the server
gave up on this host.  The real problem happened much earlier.
(more than 4 hours after startup -- a really patient server :-) )


sendbackup: time 16237.077: pid 14531 finish time Thu Aug  2 06:26:04 2007



vReading conf file "/etc/opt/slis/amanda/amanda-client.conf".
[...]

amandad: time 0.005: security_getdriver(name=BSD) returns ff2dd95c
amandad: version 2.5.2
amandad: time 0.005: build: VERSION="Amanda-2.5.2"
amandad: time 0.005:        BUILT_DATE="Thu May 24 22:05:16 EDT 2007"

Did it work correct before?  Of do you struggle with the problem
until now?


[...]
amandad: time 0.024: accept recv REQ pkt:
<<<<<
SERVICE sendbackup
DUMP /zones  0 1970:1:1:0:0:0 OPTIONS |;auth=BSD;index;
 >>>>>
amandad: time 0.024: creating new service: sendbackup
DUMP /zones  0 1970:1:1:0:0:0 OPTIONS |;auth=BSD;index;

amandad: time 0.028: sending ACK pkt:
<<<<<
 >>>>>
amandad: time 0.028: dgram_send_addr(addr=30db0, dgram=ff2dddac)
amandad: time 0.028: (sockaddr_in *)30db0 = { 2, 731, 192.168.36.38 }
amandad: time 0.028: dgram_send_addr: ff2dddac->socket = 0
amandad: time 0.072: security_streaminit(stream=418f8, driver=ff2dd95c (BSD))
amandad: time 0.072: try_socksize: send buffer size is 65536
amandad: time 0.072: try_socksize: receive buffer size is 65536
amandad: time 0.072: stream_server: Could not bind to any port: Invalid argument


Here is the problem, I believe, although I do not know what is wrong.
Some syscall is giving "EINVAL" on one of the arguments.
Do you have some weird network setup?
e.g. do you have a lower limit then 64k for UDP-packets etc.


amandad: time 0.072: stream_server: Retrying entire range after 10 second delay.

... and so on ...

amandad: time 1800.842: security_seterror(handle=30d90, driver=ff2dd95c (BSD) error=can't create server stream: Invalid argument) amandad: time 1800.842: couldn't open stream to server: can't create server stream: Invalid argument amandad: time 1800.842: security_streaminit(stream=418f8, driver=ff2dd95c (BSD))

The upper layers repeating what the lower layers were saying...


amandad: time 1800.842: try_socksize: send buffer size is 65536
amandad: time 1800.842: try_socksize: receive buffer size is 65536
amandad: time 1800.842: stream_server: Could not bind to any port: Invalid 
argument
amandad: time 1800.842: stream_server: Retrying entire range after 10 second 
delay.

and starting again (not sure why??)


amandad: time 5402.381: stream_server: bind(in6addr_any) failed: Invalid 
argument

in6addr_any??  suddenly using IPV6??

I believe 2.5.2p1 fixed some IPV6 bugs.  (It has at least less bugs, so
better try that one.  I'm using it without problems.)





amandad: time 16247.142: sending REP pkt:
<<<<<
CONNECT DATA -1 MESG -1 INDEX -1
OPTIONS features=ffffffff9ffeffffffff00;
 >>>>>

More trouble, proabably because of the previous ones.
Normally amandad should reply with the tcp-ports it has opened
for the three streams, and "-1" is of course not valid.


amandad: time 16247.142: dgram_send_addr(addr=49e40, dgram=ff2dddac)
amandad: time 16247.142: (sockaddr_in *)49e40 = { 2, 731, 192.168.36.38 }
amandad: time 16247.142: dgram_send_addr: ff2dddac->socket = 0
amandad: time 16257.150: timeout
amandad: time 16257.150: timeout waiting for ACK for our REP

But after 16000 seconds the server isn't expecting anything from this
host probably.



--
Paul Bijnens, xplanation Technology Services        Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out          *
***********************************************************************


<Prev in Thread] Current Thread [Next in Thread>