I am still getting a timeout on one of my Solaris clients, even after increasing etimeout in amanda.conf to 900! Once again: the backup server is RH 8.0 running Amanda 2.4.2p2, and the client is Solaris 2.6 running 2.4.2p2. I have been running Amanda on this setup for over a year with no problems.
Report looks like this:
These dumps were to tape isymmdaily01.
The next tape Amanda expects to use is: isymmdaily06.
FAILURE AND STRANGE DUMP SUMMARY:
www.iclear / lev 0 FAILED [mesg read: Connection timed out]
STATISTICS:
                          Total       Full      Daily
                        --------   --------   --------
Estimate Time (hrs:min)     0:45
Run Time (hrs:min)          4:03
Dump Time (hrs:min)         1:00       1:00       0:00
Output Size (meg)         5923.7     5923.7        0.0
Original Size (meg)       5923.7     5923.7        0.0
Avg Compressed Size (%)      --         --         --
Filesystems Dumped            10         10          0
Avg Dump Rate (k/s)       1671.1     1671.1        --
Tape Time (hrs:min)         1:34       1:34       0:00
Tape Size (meg)           5924.0     5924.0        0.0
Tape Used (%)               51.2       51.2        0.0
Filesystems Taped             10         10          0
Avg Tp Write Rate (k/s)   1073.6     1073.6        --
FAILED AND STRANGE DUMP DETAILS:
/-- www.iclear / lev 0 FAILED [mesg read: Connection timed out]
sendbackup: start [www.iclear.com:/ level 0]
sendbackup: info BACKUP=/usr/local/bin/tar
sendbackup: info RECOVER_CMD=/usr/local/bin/tar -f... -
sendbackup: info end
\--------
NOTES:
planner: Incremental of www.iclear.com:/ bumped to level 2.
taper: tape isymmdaily01 kb 6066144 fm 10 [OK]
DUMP SUMMARY:
                                       DUMPER STATS              TAPER STATS
HOSTNAME     DISK        L ORIG-KB  OUT-KB COMP% MMM:SS    KB/s MMM:SS   KB/s
-------------------------- ------------------------------------ -------------
isymmdb      /etc        0    2336    2336    --   0:02  1044.2   0:02 1126.4
isymmdb      /export     0   77760   77760    --   0:20  3949.1   1:12 1082.1
isymmdb      /opt        0  371488  371488    --   2:11  2828.6   5:44 1078.9
isymmdb      -acle/admin 0 2886880 2886880    --   9:24  5116.0  44:39 1077.6
isymmdb      /usr        0  695872  695872    --   5:23  2152.5  10:45 1079.4
isymmdb      /var        0  176192  176192    --   1:00  2928.6   2:43 1080.1
web.isymmetr /etc        0    2656    2656    --   0:00  9490.6   0:15  175.9
web.isymmetr /home       0   64448   64448    --   0:02 34819.8   1:00 1080.0
web.isymmetr /var        0  213760  213760    --   0:18 11808.7   3:18 1079.1
www.iclear.c /           0 FAILED -------------------------------------------
xfer.iclear. /           0 1574432 1574432    --  41:49   627.6  24:32 1069.8
(brought to you by Amanda version 2.4.2p2)
The interesting thing is that I have 3 entries under amandad.*.debug. My cron job calls my dump script at 2:30 am. I just realized this client was 6 minutes or so off on the time, so it looks like it tried it at 2:22 am and got the timeout:
amandad.20030516022256:
<snip>
bsd security: remote host web.isymmetrics.com user amanda local user amanda
amandahosts security check passed
amandad: running service "/usr/local/libexec/sendsize"
amandad: sending REP packet:
----
Amanda 2.4 REP HANDLE 002-A0AB0708 SEQ 1053073823
OPTIONS maxdumps=1;
/ 0 SIZE 5407450
/ 1 SIZE 560220
/ 2 SIZE 349790
----
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, giving up!
amandad: pid 3711 finish time Fri May 16 02:24:44 2003
Then I have another amandad.*.debug:
amandad.20030516030756.debug :
<snip>
bsd security: remote host web.isymmetrics.com user amanda local user amanda
amandahosts security check passed
amandad: running service "/usr/local/libexec/sendsize"
amandad: sending REP packet:
----
Amanda 2.4 REP HANDLE 002-A0AB0708 SEQ 1053073823
OPTIONS maxdumps=1;
/ 0 SIZE 5407480
/ 1 SIZE 560250
/ 2 SIZE 349820
----
amandad: got packet:
----
Amanda 2.4 ACK HANDLE 002-A0AB0708 SEQ 1053073823
and again:
amandad.20030516041437.debug
bsd security: remote host web.isymmetrics.com user amanda local user amanda
amandahosts security check passed
amandad: running service "/usr/local/libexec/sendbackup"
amandad: sending REP packet:
----
Amanda 2.4 REP HANDLE 000-D8B80708 SEQ 1053073828
CONNECT DATA 52738 MESG 52739 INDEX 52740
OPTIONS ;bsd-auth;index;
----
amandad: got packet:
----
Amanda 2.4 ACK HANDLE 000-D8B80708 SEQ 1053073828
----
amandad: pid 3744 finish time Fri May 16 04:14:37 2003
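For reference, here is a quick way to line up the three debug files by time. This is only a rough sketch: it assumes the numeric part of each file name is a YYYYMMDDhhmmss timestamp taken from the client's (skewed) clock, which is my reading of the names and not something I have confirmed in the Amanda docs. The helper name `stamp` is just for illustration.

```python
from datetime import datetime

# Debug file names exactly as they appear on the client.
names = [
    "amandad.20030516022256",
    "amandad.20030516030756.debug",
    "amandad.20030516041437.debug",
]

def stamp(name):
    # Assumption: the second dot-separated field is YYYYMMDDhhmmss.
    return datetime.strptime(name.split(".")[1], "%Y%m%d%H%M%S")

times = [stamp(n) for n in names]
for name, t in zip(names, times):
    print(name, "->", t.isoformat(sep=" "))

# Gaps between consecutive amandad invocations.
gaps = [later - earlier for earlier, later in zip(times, times[1:])]
for g in gaps:
    print("gap:", g)
```

If that reading of the names is right, the second amandad started exactly 45 minutes after the first, which matches the Estimate Time of 0:45 in the report, and the sendbackup request came in about 1 hour 7 minutes after that.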
Why do I have 3 amandad entries for that dumpcycle, one that timed out and two that didn't, and yet the dump still failed? HELP!
Rebecca A. Crum
Systems Administrator
Unterberg & Associates, P.C.
(219) 736-5579 ext. 184