Amanda-Users

RE: amdump waits forever for estimates from one host

2005-06-11 23:32:44
Subject: RE: amdump waits forever for estimates from one host
From: Frank Smith <fsmith AT hoovers DOT com>
To: "Lengyel, Florian" <FLengyel AT gc.cuny DOT edu>, "'amanda-users AT amanda DOT org'" <amanda-users AT amanda DOT org>
Date: Sat, 11 Jun 2005 22:21:27 -0500
--On Saturday, June 11, 2005 23:03:22 -0400 "Lengyel, Florian" <FLengyel AT gc.cuny DOT edu> wrote:

> This is what I have in m254:/tmp/amanda/amandad.20050611172037000.debug
> 
>  ...
> /home/m254/yfsong/ 0 SIZE 91630
> /home/m254/yzhu/ 0 SIZE 10
> ----
> 
> amandad: time 260.577: dgram_recv: timeout after 10 seconds
> amandad: time 260.577: waiting for ack: timeout, retrying
> amandad: time 270.577: dgram_recv: timeout after 10 seconds
> amandad: time 270.577: waiting for ack: timeout, retrying
> amandad: time 280.577: dgram_recv: timeout after 10 seconds
> amandad: time 280.577: waiting for ack: timeout, retrying
> amandad: time 290.577: dgram_recv: timeout after 10 seconds
> amandad: time 290.577: waiting for ack: timeout, retrying
> amandad: time 300.577: dgram_recv: timeout after 10 seconds
> amandad: time 300.577: waiting for ack: timeout, giving up!
> amandad: time 300.577: pid 15364 finish time Sat Jun 11 17:25:37 2005
> [root@m254 amanda]#                                                      
> 
> I previously set
> 
> etimeout 400
> 
> up slightly from the original 300 seconds.
> 
> So it looks like a UDP packet never made it...Oh woe.

Looks like a firewall problem.  Do you have one on either machine
and/or one in between them?
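
One rough way to check this, assuming Python is available on both hosts, is
to see whether a UDP reply still gets through after a delay. The sketch below
does not speak the Amanda protocol; the function names, the test port, and the
delay are all made up for illustration. It only shows whether a late datagram
is dropped somewhere along the path.

```python
import socket
import time


def serve_one_echo(sock: socket.socket, delay: float = 0.0) -> None:
    """Receive one datagram on an already-bound UDP socket, wait `delay`
    seconds (to mimic a slow estimate), then send the same bytes back."""
    data, peer = sock.recvfrom(2048)
    time.sleep(delay)
    sock.sendto(data, peer)


def udp_round_trip(host: str, port: int, timeout: float = 10.0) -> bool:
    """Send one datagram to host:port and report whether the echo arrives
    before `timeout` seconds have passed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(b"ping", (host, port))
        try:
            data, _ = s.recvfrom(2048)
        except OSError:
            # Covers a timeout as well as an ICMP "port unreachable" error.
            return False
        return data == b"ping"
```

Bind a UDP socket on one host and call serve_one_echo on it with a delay of a
few minutes, then call udp_round_trip from the other host with a longer
timeout. If short delays work and long ones fail, a firewall that forgets the
UDP "connection" is a likely suspect. Repeat in the other direction as well.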

Frank
> 
> -----Original Message-----
> From: Frank Smith
> To: Lengyel, Florian; 'amanda-users AT amanda DOT org'
> Sent: 6/11/2005 10:35 PM
> Subject: Re: amdump waits forever for estimates from one host
> 
> --On Saturday, June 11, 2005 17:45:17 -0400 "Lengyel, Florian"
> <FLengyel AT gc.cuny DOT edu> wrote:
> 
>> Amanda version: amanda-2.4.4p3-1
>> OS: CentOS
>> Kernel: uname -a
>> Linux amanda.grid.cuny.edu 2.6.9-5.0.3.ELsmp #1 SMP Sat Feb 19 19:38:02 CST 2005 i686 i686 i386 GNU/Linux
>> 
>> Trouble: amanda is set up on a tape server; there are two clients so far.
>> One is running RH linux 7.3 but has the latest (2.4.4) amanda code built
>> from source...the other is using an older rpm under RH linux 9. The
>> source build machine gives estimates for its DLEs, and the other seems to
>> want to wait until grass grows under its mounting bracket, according to
>> amstatus Daily, part of which reads:
>> 
>> m254.gc.cuny.edu:/home/m254/yzhu/                        getting estimate
>> m254.gc.cuny.edu:/home/www                               getting estimate
>> m254.gc.cuny.edu:sda1                                    getting estimate
>> m254.gc.cuny.edu:sda2                                    getting estimate
>> m254.gc.cuny.edu:sda3                                    getting estimate
>> m254.gc.cuny.edu:sda5                                    getting estimate
>> m254.gc.cuny.edu:sda7                                    getting estimate
>> neptune-gw.gc.cuny.edu:hda1                  0     8390k estimate done
>> neptune-gw.gc.cuny.edu:hda5                  0  6361840k estimate done
>> neptune-gw.gc.cuny.edu:hda6                  0  1361030k estimate done
>> neptune-gw.gc.cuny.edu:hda7                  0   163620k estimate done
>> 
>> I'm checking through the documentation...amcheck succeeds. Have I made one
>> of the usual configuration oversights?
> 
> Try checking the debug files on m254.gc.cuny.edu (default is in /tmp/amanda)
> and see if there is more information there.
> 
> One possibility is a firewall blocking the estimate response, since the
> response usually occurs long after most firewall connection timeouts.
> Look for 'no response' errors in the client debug files.
> 
> Just so your backups don't hang forever you might want to make sure your
> etimeout isn't set larger than necessary. Don't forget it is multiplied by
> the number of DLEs on the host, so a setting of 1 hour on your m254 host
> could result in a wait of up to 7 hours. If you use a negative number,
> it will be the per host timeout instead of per DLE.
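
To make the arithmetic in the paragraph above concrete, here is a small
Python sketch. The function name is made up, and the DLE count of 7 is just
the figure implied by the "7 hours" example; it is not read from any Amanda
config. A negative etimeout is treated as a per-host total, as described.

```python
def planner_wait_seconds(etimeout: int, dle_count: int) -> int:
    """Upper bound, in seconds, on the wait for one host's estimates."""
    if etimeout < 0:
        # Negative value: the absolute value is the total for the whole host.
        return -etimeout
    # Non-negative value: the timeout applies per DLE, so it scales.
    return etimeout * dle_count


print(planner_wait_seconds(3600, 7))   # 1 hour per DLE, 7 DLEs
print(planner_wait_seconds(400, 7))    # etimeout 400, 7 DLEs
print(planner_wait_seconds(-3600, 7))  # negative: per-host total
```

With 7 DLEs, etimeout 3600 gives 25200 seconds (7 hours), etimeout 400 gives
2800 seconds, and etimeout -3600 caps the whole host at 3600 seconds.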
> 
> Frank
> 
> 
> --
> Frank Smith
> fsmith AT hoovers DOT com
> Sr. Systems Administrator                                 Voice: 512-374-4673
> Hoover's Online                                             Fax: 512-374-4501



--
Frank Smith                                                fsmith AT hoovers DOT com
Sr. Systems Administrator                                 Voice: 512-374-4673
Hoover's Online                                             Fax: 512-374-4501