Amanda-Users

Connection timed out: STILL A PROBLEM:long entry

2003-05-16 10:04:45
Subject: Connection timed out: STILL A PROBLEM:long entry
From: "Rebecca Pakish Crum" <rebecca AT unterlaw DOT com>
To: <amanda-users AT amanda DOT org>
Date: Fri, 16 May 2003 08:59:37 -0500

I am still getting a timeout on one of my Solaris clients, even after increasing etimeout in amanda.conf to 900! Once again, the backup server is RH 8.0 running Amanda 2.4.2p2, and the client is Solaris 2.6 running 2.4.2p2. I have been running Amanda on this setup for over a year with no problems.

Report looks like this:
These dumps were to tape isymmdaily01.
The next tape Amanda expects to use is: isymmdaily06.

FAILURE AND STRANGE DUMP SUMMARY:
  www.iclear / lev 0 FAILED [mesg read: Connection timed out]


STATISTICS:
                          Total       Full      Daily
                        --------   --------   --------
Estimate Time (hrs:min)    0:45
Run Time (hrs:min)         4:03
Dump Time (hrs:min)        1:00       1:00       0:00
Output Size (meg)        5923.7     5923.7        0.0
Original Size (meg)      5923.7     5923.7        0.0
Avg Compressed Size (%)     --         --         --
Filesystems Dumped           10         10          0
Avg Dump Rate (k/s)      1671.1     1671.1        --

Tape Time (hrs:min)        1:34       1:34       0:00
Tape Size (meg)          5924.0     5924.0        0.0
Tape Used (%)              51.2       51.2        0.0
Filesystems Taped            10         10          0
Avg Tp Write Rate (k/s)  1073.6     1073.6        --


FAILED AND STRANGE DUMP DETAILS:

/-- www.iclear / lev 0 FAILED [mesg read: Connection timed out]
sendbackup: start [www.iclear.com:/ level 0]
sendbackup: info BACKUP=/usr/local/bin/tar
sendbackup: info RECOVER_CMD=/usr/local/bin/tar -f... -
sendbackup: info end
\--------


NOTES:
  planner: Incremental of www.iclear.com:/ bumped to level 2.
  taper: tape isymmdaily01 kb 6066144 fm 10 [OK]


DUMP SUMMARY:
                                     DUMPER STATS                TAPER STATS
HOSTNAME     DISK        L ORIG-KB  OUT-KB COMP% MMM:SS    KB/s MMM:SS   KB/s
-------------------------- ------------------------------------ -------------
isymmdb      /etc        0    2336    2336   --    0:02  1044.2   0:02 1126.4
isymmdb      /export     0   77760   77760   --    0:20  3949.1   1:12 1082.1
isymmdb      /opt        0  371488  371488   --    2:11  2828.6   5:44 1078.9
isymmdb      -acle/admin 0 2886880 2886880   --    9:24  5116.0  44:39 1077.6
isymmdb      /usr        0  695872  695872   --    5:23  2152.5  10:45 1079.4
isymmdb      /var        0  176192  176192   --    1:00  2928.6   2:43 1080.1
web.isymmetr /etc        0    2656    2656   --    0:00  9490.6   0:15  175.9
web.isymmetr /home       0   64448   64448   --    0:02 34819.8   1:00 1080.0
web.isymmetr /var        0  213760  213760   --    0:18 11808.7   3:18 1079.1
www.iclear.c /           0 FAILED ---------------------------------------
xfer.iclear. /           0 1574432 1574432   --   41:49   627.6  24:32 1069.8

(brought to you by Amanda version 2.4.2p2)


The interesting thing is that I have 3 amandad.*.debug files on the client for this run. My cron job calls my dump script at 2:30 am. I just realized this client's clock was 6 minutes or so off, so by the client's clock the first request came in at about 2:22 am, and that one got the timeout:

amandad.20030516022256:
<snip>
bsd security: remote host web.isymmetrics.com user amanda local user amanda
amandahosts security check passed
amandad: running service "/usr/local/libexec/sendsize"
amandad: sending REP packet:
----
Amanda 2.4 REP HANDLE 002-A0AB0708 SEQ 1053073823
OPTIONS maxdumps=1;
/ 0 SIZE 5407450
/ 1 SIZE 560220
/ 2 SIZE 349790
----

amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, giving up!
amandad: pid 3711 finish time Fri May 16 02:24:44 2003
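
The retry sequence in this first log can be tallied mechanically. What follows is a minimal Python sketch, assuming the debug file contains lines exactly like the ones quoted above; `summarize_retries` is a hypothetical helper name, not part of Amanda.

```python
import re

# Excerpt copied from the amandad debug log quoted above.
LOG = """\
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, giving up!
"""

TIMEOUT_RE = re.compile(r"dgram_recv: timeout after (\d+) seconds")


def summarize_retries(text):
    """Return (number of timeouts, total seconds waited, gave_up) for one log."""
    waits = [int(m.group(1)) for m in TIMEOUT_RE.finditer(text)]
    gave_up = "waiting for ack: timeout, giving up!" in text
    return len(waits), sum(waits), gave_up


timeouts, waited, gave_up = summarize_retries(LOG)
print(timeouts, waited, gave_up)
```

On this excerpt it reports five 10-second waits, so amandad spent about 50 seconds waiting for an ACK before giving up.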

Then I have another amandad.*.debug:
amandad.20030516030756.debug:
<snip>
bsd security: remote host web.isymmetrics.com user amanda local user amanda
amandahosts security check passed
amandad: running service "/usr/local/libexec/sendsize"
amandad: sending REP packet:
----
Amanda 2.4 REP HANDLE 002-A0AB0708 SEQ 1053073823
OPTIONS maxdumps=1;
/ 0 SIZE 5407480
/ 1 SIZE 560250
/ 2 SIZE 349820
----

amandad: got packet:
----
Amanda 2.4 ACK HANDLE 002-A0AB0708 SEQ 1053073823
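
The two sendsize replies can be compared line by line. This is a minimal Python sketch, assuming the REP body always uses the `disk level SIZE n` layout seen in these excerpts; `parse_estimates` is a hypothetical helper, not an Amanda function.

```python
import re

# Matches estimate lines such as "/ 0 SIZE 5407450"; the OPTIONS line is skipped.
SIZE_RE = re.compile(r"^(\S+) (\d+) SIZE (\d+)$", re.MULTILINE)

# REP bodies copied from the 02:22 and 03:07 debug logs quoted above.
FIRST = """\
OPTIONS maxdumps=1;
/ 0 SIZE 5407450
/ 1 SIZE 560220
/ 2 SIZE 349790
"""
SECOND = """\
OPTIONS maxdumps=1;
/ 0 SIZE 5407480
/ 1 SIZE 560250
/ 2 SIZE 349820
"""


def parse_estimates(rep_body):
    """Map (disk, level) to the estimated size in a sendsize REP body."""
    return {(disk, int(level)): int(size)
            for disk, level, size in SIZE_RE.findall(rep_body)}


first = parse_estimates(FIRST)
second = parse_estimates(SECOND)
for key in sorted(first):
    # disk/level, first estimate, second estimate, difference
    print(key, first[key], second[key], second[key] - first[key])
```

Each estimate differs by only 30 between the two runs, and both replies carry the same HANDLE and SEQ values.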

and again:
amandad.20030516041437.debug
bsd security: remote host web.isymmetrics.com user amanda local user amanda
amandahosts security check passed
amandad: running service "/usr/local/libexec/sendbackup"
amandad: sending REP packet:
----
Amanda 2.4 REP HANDLE 000-D8B80708 SEQ 1053073828
CONNECT DATA 52738 MESG 52739 INDEX 52740
OPTIONS ;bsd-auth;index;
----

amandad: got packet:
----
Amanda 2.4 ACK HANDLE 000-D8B80708 SEQ 1053073828
----

amandad: pid 3744 finish time Fri May 16 04:14:37 2003

Why do I have 3 amandad debug files for that run, one of which timed out and two of which didn't, and yet the dump still failed? HELP
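
To line the three debug files up side by side, here is a minimal Python sketch that pulls out the start time encoded in each file name, the service that was run, and whether the reply was acknowledged. It assumes the file names and log lines look exactly like the excerpts above; the `LOGS` contents are condensed from those excerpts, and `summarize` is a hypothetical helper, not an Amanda tool.

```python
import re
from datetime import datetime

SERVICE_RE = re.compile(r'running service "([^"]+)"')
REP_RE = re.compile(r"REP HANDLE (\S+) SEQ (\d+)")
ACK_RE = re.compile(r"ACK HANDLE (\S+) SEQ (\d+)")

# Key lines condensed from the three excerpts quoted above.
LOGS = {
    "amandad.20030516022256": (
        'amandad: running service "/usr/local/libexec/sendsize"\n'
        "Amanda 2.4 REP HANDLE 002-A0AB0708 SEQ 1053073823\n"
        "amandad: waiting for ack: timeout, giving up!\n"
    ),
    "amandad.20030516030756.debug": (
        'amandad: running service "/usr/local/libexec/sendsize"\n'
        "Amanda 2.4 REP HANDLE 002-A0AB0708 SEQ 1053073823\n"
        "Amanda 2.4 ACK HANDLE 002-A0AB0708 SEQ 1053073823\n"
    ),
    "amandad.20030516041437.debug": (
        'amandad: running service "/usr/local/libexec/sendbackup"\n'
        "Amanda 2.4 REP HANDLE 000-D8B80708 SEQ 1053073828\n"
        "Amanda 2.4 ACK HANDLE 000-D8B80708 SEQ 1053073828\n"
    ),
}


def summarize(name, text):
    """Return (start time, service, REP handle, acked) for one debug log."""
    # The second dot-separated field of the file name is YYYYMMDDHHMMSS.
    started = datetime.strptime(name.split(".")[1], "%Y%m%d%H%M%S")
    service = SERVICE_RE.search(text).group(1).rsplit("/", 1)[-1]
    rep = REP_RE.search(text)
    ack = ACK_RE.search(text)
    # Count the reply as acknowledged only if handle and sequence both match.
    acked = ack is not None and ack.groups() == rep.groups()
    return started.strftime("%H:%M:%S"), service, rep.group(1), acked


for name, text in LOGS.items():
    print(name, *summarize(name, text))
```

On these excerpts the first two files are both sendsize runs answering the same handle, and only the second reply was acknowledged. The third is the sendbackup request, which was also acknowledged. None of the three excerpts covers the data transfer itself.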

Rebecca A. Crum 
Systems Administrator
Unterberg & Associates, P.C.
(219) 736-5579 ext. 184

