Two kinds of timeouts

This morning my summary included this:

glastcolor /db lev 0 FAILED [hmm, disk was stranded on waitq]
glastcolor /usr/local lev 0 FAILED [hmm, disk was stranded on waitq]
glastcolor /home lev 0 FAILED [hmm, disk was stranded on waitq]
glastcolor /boot lev 0 FAILED [hmm, disk was stranded on waitq]
glastcolor / lev 0 FAILED [hmm, disk was stranded on waitq]
razzle /glast/03 lev 0 FAILED [disk /glast/03, all estimate timed out]
razzle /glast/02 lev 0 FAILED [disk /glast/02, all estimate timed out]
razzle /glast/01 lev 0 FAILED [disk /glast/01, all estimate timed out]
razzle /glast/00 lev 0 FAILED [disk /glast/00, all estimate timed out]
razzle /disk6 lev 0 FAILED [disk /disk6, all estimate timed out]
razzle /disk5 lev 0 FAILED [disk /disk5, all estimate timed out]

I saw the same thing once last week. There was a successful dump between the two failures.

The server is Linux with Amanda version 2.5.0. Glastcolor is Linux with version 2.4.4. Razzle is ancient Solaris 7 with 2.4.5. Of course, I haven't changed anything related to Amanda for several weeks.


I looked through sendsize.debug on razzle. All the filesystems which
succeeded sent their estimates before 900 seconds. Those that failed
came after 900 seconds. It looks like a simple problem with etimeout on
the server, doesn't it? Last week I saw that etimeout was set to 900.

That's supposed to be 900 seconds _per filesystem_, which should be
plenty. I thought there might be a misinterpretation or bug, so I
increased etimeout to 1200. It didn't help. The whole sendsize process
took only 1107 seconds. The server log shows "planner: time 26406.488:
getting estimates took 26405.915 secs", but the last sendsize estimate
came in at 349 seconds.
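
For reference, the per-filesystem reading of etimeout can be written out as arithmetic. This is only a minimal sketch, assuming the semantics described in the amanda.conf man page (a positive etimeout is multiplied by the number of disks on the client, a negative one is a total for the whole client). The function name is hypothetical, and the count of six disks for razzle is just the six FAILED entries above; the real disklist may have more.

```python
def estimate_budget(etimeout, ndisks):
    """Seconds the planner would wait for one client's estimates.

    Assumes the documented amanda.conf semantics: a positive etimeout is
    per disk, a negative etimeout is a total for the whole client.
    """
    if etimeout < 0:
        return -etimeout
    return etimeout * ndisks


# razzle has at least the six filesystems listed as FAILED above
# (assumed count, for illustration only).
for etimeout in (900, 1200):
    print(etimeout, estimate_budget(etimeout, 6))
```

If that reading is right, either setting gives razzle a budget of 5400 seconds or more, so a sendsize run of 1107 seconds should not have timed out.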

I don't know what to make of glastcolor. Its sendsize.debug looks
entirely normal, finishing after 442 seconds. The last few lines look
like this:

sendsize[14265]: argument list: /bin/tar --create --file /dev/null --directory /db --one-file-system --listed-incremental /var/lib/amanda/gnutar-lists/glastcolor_db_1.new --sparse --ignore-failed-read --totals .
sendsize[14265]: time 442.278: Total bytes written: 41461760 (40MiB, 737KiB/s)
sendsize[14265]: time 442.279: .....
sendsize[14265]: estimate time for /db level 1: 55.211
sendsize[14265]: estimate size for /db level 1: 40490 KB
sendsize[14265]: time 442.279: waiting for /bin/tar "/db" child
sendsize[14265]: time 442.282: after /bin/tar "/db" wait
sendsize[14265]: time 442.286: done with amname '/db', dirname '/db', spindle -1
sendsize[14237]: time 442.286: child 14265 terminated normally
sendsize: time 442.287: pid 14237 finish time Sun Oct 14 23:37:30 2007

The server log doesn't show any partial results from glastcolor at all.
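
To compare the client-side timings against the etimeout budget, the timestamps in sendsize.debug can be pulled out with a short script. This is a rough sketch in Python, assuming the line format seen in the excerpt above. The regular expressions and the function name are illustrative and not part of Amanda, and other Amanda versions may log differently.

```python
import re

# Matches lines such as: sendsize[14265]: time 442.279: waiting for ...
# The "[pid]" part is optional because the final line omits it.
TIME_RE = re.compile(r"^sendsize(?:\[(\d+)\])?: time (\d+\.\d+): (.*)$")
# Matches: sendsize[14265]: estimate time for /db level 1: 55.211
EST_RE = re.compile(r"estimate time for (\S+) level (\d+): (\d+\.\d+)")


def summarize(lines):
    """Return (latest timestamp seen, list of (disk, level, seconds))."""
    last = 0.0
    estimates = []
    for line in lines:
        m = TIME_RE.match(line)
        if m:
            last = max(last, float(m.group(2)))
        e = EST_RE.search(line)
        if e:
            estimates.append((e.group(1), int(e.group(2)), float(e.group(3))))
    return last, estimates


# Sample lines copied from the glastcolor excerpt above.
sample = [
    'sendsize[14265]: time 442.279: .....',
    'sendsize[14265]: estimate time for /db level 1: 55.211',
    'sendsize[14265]: estimate size for /db level 1: 40490 KB',
    'sendsize[14265]: time 442.282: after /bin/tar "/db" wait',
    'sendsize: time 442.287: pid 14237 finish time Sun Oct 14 23:37:30 2007',
]
last, estimates = summarize(sample)
print(last, estimates)
```

Running the same function over the whole file on each client would show when the last estimate finished relative to the start of sendsize, which is the number to hold against the planner's timeout.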