Edson Noboru Yamada wrote:
I've been facing a problem when trying to back up one of our clients.
The backup starts normally, but after some time, the following message shows up
in the taper log:
dumper: stream_client: our side is 0.0.0.0.45740
driver: result time 553.824 from dumper0: FAILED 01-00002 [mesg read: Connection reset by peer]
dumper: kill index command
taper: reader-side: got label DMX224 filenum 1
Note: it is the "mesg" channel that was closed by the peer.
Probably because it was idle for too long.
On the client side, I can read something like this on the sendbackup log:
sendbackup-gnutar: time 0.248: /usr/local/libexec/runtar: pid 15147
sendbackup: time 0.309: started index creator: "/usr/bin/tar -tf - 2>/dev/null | sed -e 's/^\.//'"
sendbackup: time 301.700: index tee cannot write [Broken pipe]
sendbackup: time 301.700: pid 15145 finish time Tue Mar 21 15:39:18 2006
sendbackup: time 301.712: 124: strange(?): sendbackup: index tee cannot write [Broken pipe]
The index was closed by the server, after the mesg channel broke down.
Because the client does not need to send through the mesg channel yet,
it did not notice that. But it tries to write to the index channel,
which was closed by the server already.
I've already tried turning off indexing and the holding disk, but without success.
One important thing I've noticed is that the error always occurs after 300
seconds.
Is there some tunable timeout I'm forgetting?
Additional info: strangely, the backup appears successful, even when this
message shows up.
The same client is able to back up other file systems, and the one that fails
the most is the / filesystem.
Any ideas?
Is it the problem described here:
http://wiki.zmanda.com/index.php/Amdump_fails_to_backup_large_DLEs
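If so, make the TCP keepalive probes start sooner by lowering the idle time
before the first probe is sent (the Linux default is 7200 seconds):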
echo 90 > /proc/sys/net/ipv4/tcp_keepalive_time
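Before changing anything, it may help to see what the machine currently uses. The following is only a rough sketch, assuming a Linux host that exposes the keepalive settings under /proc/sys/net/ipv4; the function name show_keepalive and its optional directory argument are made up for illustration and are not part of Amanda.

```shell
#!/bin/sh
# Hypothetical helper (not part of Amanda): print the three Linux TCP
# keepalive settings. The directory defaults to /proc/sys/net/ipv4 but can
# be overridden with the first argument, e.g. to try the function on a copy.
show_keepalive() {
    dir="${1:-/proc/sys/net/ipv4}"
    for name in tcp_keepalive_time tcp_keepalive_intvl tcp_keepalive_probes; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (not readable)\n' "$name"
        fi
    done
}

show_keepalive

# Lowering the idle time as root, equivalent to the echo above:
#   sysctl -w net.ipv4.tcp_keepalive_time=90
# Keeping the value across reboots, assuming the system reads /etc/sysctl.conf:
#   net.ipv4.tcp_keepalive_time = 90
```

If tcp_keepalive_time turns out to be well above the 300 seconds at which the failure shows up, that would at least be consistent with an idle "mesg" connection being dropped before the first probe is ever sent.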
--
Paul Bijnens, Xplanation Tel +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM Fax +32 16 397.512
http://www.xplanation.com/ email: Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now: exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt, abort, hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e, kill -1 $$, shutdown, *
* kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ... "Are you sure?" ... YES ... Phew ... I'm out *
***********************************************************************