On 2008-05-25 18:55, jehan procaccia wrote:
hello,
Some clients with "big" partitions (>100 Gbytes) freeze my amdump; I
usually get dump errors and the run cannot end properly.
I have 2 questions:
1) How can I resolve that "client" error, timeout or whatever it is?
This looks suspiciously like the problem (and solution) described here:
http://wiki.zmanda.com/index.php/Mesg_read:_Connection_reset_by_peer
2) Why doesn't the Amanda server give up after dtimeout (1800s) and
finish its amdump?
I think the intermediate "% done" messages that dump normally generates
are too far apart on large dumps to keep the idle TCP connection open in
the firewall. The same problem can happen on the index TCP connection when
dumping a few very big files.
Amanda does not give up because the data itself is still flowing through
the data connection, so the "dtimeout" value is never exceeded.
However, some firewall in between has closed the idle TCP connection carrying
the messages or the index.
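For illustration only, here is how TCP keepalive is generally switched on
for a socket, which is the usual way to stop a stateful firewall from
forgetting a connection that carries no traffic for a long time. This is a
minimal Python sketch, not Amanda's code; the function name and the
idle/interval/count values are made up for the example, and whether your
Amanda build sets this option itself is something to check, not something
this sketch asserts.

```python
import socket


def enable_keepalive(sock, idle=600, interval=60, count=5):
    """Turn on TCP keepalive probes for sock.

    idle, interval and count are illustrative values (seconds, seconds,
    number of probes); they are not Amanda defaults.
    """
    # Portable part: ask the kernel to send keepalive probes at all.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning is platform specific (available on Linux);
    # skip quietly where the constants do not exist.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock


# Small self-check: create a socket, enable keepalive, read the flag back.
s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
keepalive_on = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
s.close()
```

The point is only that the first probe has to go out before the firewall's
idle timer expires; what that timer is on your firewall is not known from
this thread.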
I use amanda-2.5.0p2-4 on a CentOS 5.1 server with disk backup media
(virtual tapes on a RAID 5).
here's the client error:
sendbackup: time 2185.293: 87: normal(|): DUMP: 4.10% done at 2642 kB/s, finished in 13:38
sendbackup: time 2485.300: 87: normal(|): DUMP: 4.67% done at 2632 kB/s, finished in 13:37
....
sendbackup: time 33385.302: 87: normal(|): DUMP: 60.34% done at 2453 kB/s, finished in 6:04
sendbackup: time 33685.307: 87: normal(|): DUMP: 60.89% done at 2453 kB/s, finished in 5:59
sendbackup: time 33753.599: index tee cannot write [Broken pipe]
sendbackup: time 33753.599: pid 25328 finish time Sun May 25 07:37:50 2008
sendbackup: time 33753.600: 109: normal(|):
sendbackup: time 33753.601: 112: strange(?): gzip: stdout: Broken pipe
sendbackup: time 33753.601: 112: strange(?): sendbackup: index tee cannot write [Broken pipe]
sendbackup: time 33753.610: 87: normal(|): DUMP: Broken pipe
sendbackup: time 33753.611: 87: normal(|): DUMP: The ENTIRE dump is aborted.
sendbackup: time 33753.611: error [compress returned 1, /sbin/dump returned 3]
sendbackup: time 33753.611: pid 25325 finish time Sun May 25 07:37:50 2008
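The "time" field in the sendbackup lines above can be used to measure how
long the message connection actually stays quiet between two lines. A rough
Python sketch, assuming the debug lines are unwrapped (one message per
line); the helper names are invented for this example, and the firewall's
idle timeout you would compare the result against is not known from this
thread.

```python
import re

# Matches the "time <seconds>" field at the start of a sendbackup debug line.
TIME_RE = re.compile(r"^sendbackup: time (\d+\.\d+):")


def message_gaps(lines):
    """Return the gaps in seconds between consecutive timestamped lines."""
    times = []
    for line in lines:
        m = TIME_RE.match(line)
        if m:
            times.append(float(m.group(1)))
    return [round(b - a, 3) for a, b in zip(times, times[1:])]


def longest_gap(lines):
    """Longest quiet period in seconds, or 0.0 with fewer than two lines."""
    gaps = message_gaps(lines)
    return max(gaps) if gaps else 0.0


# Two consecutive progress lines copied from the log above.
sample = [
    "sendbackup: time 2185.293: 87: normal(|): DUMP: 4.10% done at 2642 kB/s, finished in 13:38",
    "sendbackup: time 2485.300: 87: normal(|): DUMP: 4.67% done at 2632 kB/s, finished in 13:37",
]

print(longest_gap(sample))
```

Run over the whole debug file, the largest value is the number to hold
against the idle timeout of whatever firewall sits between client and
server. It says nothing about the index connection, which has no lines of
its own in this log.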
[root@backup /var/lib/amanda/int]
$ amstatus int --dumping
Using /var/lib/amanda/int/amdump.1 from sam mai 24 22:06:48 CEST 2008
helios:/home4 0 85174m dumping 37083m ( 43.54%) (22:22:03)
helios:/home9 0 109374m dumping 30169m ( 27.58%) (22:15:15)
amanda.conf:
etimeout -1200
dtimeout 1800
tpchanger "chg-multi"
define dumptype bi-comp-user-size {
comp-user
comment "Non-root partitions on bi-proc machines"
maxdumps 2
estimate calcsize
}
Disklist example of a big partition:
helios /home4 bi-comp-user-size
Thanks.
--
Paul Bijnens, xplanation Technology Services Tel +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM Fax +32 16 397.512
http://www.xplanation.com/ email: Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now: exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt, abort, hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e, kill -1 $$, shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ... "Are you sure?" ... YES ... Phew ... I'm out *
***********************************************************************