Amanda-Users

Re: etimeout ignored?

2007-09-04 11:30:07
Subject: Re: etimeout ignored?
From: Paul Lussier <pll+amanda AT permabit DOT com>
To: amanda-users AT amanda DOT org
Date: Tue, 04 Sep 2007 10:14:30 -0400
Ralf Auer <Ralf.Auer AT physik.uni-erlangen DOT de> writes:

> Hi everybody,
>
> I have a little problem with the 'etimeout' setting in my
> amanda.conf.  I have set 'etimeout' to 900. To my understanding this
> makes Amanda wait for 15 minutes per DLE and client, then a timeout
> should occur.
>
> For some reason this value seems to be ignored here, because for my
> busy clients Amanda still waits for the estimates after several
> hours! The clients have only two DLEs, so I would expect her to wait
> at most 30 minutes, not more.
>
> I'm using version 2.5.2p1, everything else is doing fine, and there
> is nothing special to be found in the log files.
>
> Any ideas what I could have done wrong?

I'm seeing something similar.  I have a host which routinely gets no
estimates for several of the (NFS mounted) file systems.  I've got my
etimeout set way high (10800 sec, or 3 hours) because the file systems
are really big.

I'm using 2.5.1p1-2.1 from the debian packages.  The server being
backed up is (unfortunately) my backup server itself.  There are 22
file systems on this host, 18 of which are NFS mounts from our NFS
appliance (an OnStor). Of those 18, 6 never complete the estimate.

I've moved etimeout from as low as 3600 to as high as 10800 thinking
it might be timing out too soon (these are, in some cases, 100+ GB
file systems).  However, this latest run was kicked off Sat Sep 1
21:49:31 EDT 2007, which at this point is over 84 hours ago.  Even if
the timeout for the host was set to (etimeout * Num_hosts_DLEs),
that's still only 66 hours.  My backups haven't even begun dumping
yet, because amanda is still waiting for estimates from those 6 file
systems.  That estimate is *never* going to happen, given that amandad
isn't currently running, nor is there an estimate process (tar to
/dev/null) running on the host in question.
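
For reference, the arithmetic behind that 66-hour figure can be sketched
as below.  This is only a sketch under the assumption made above, namely
that the total wait for a host is etimeout multiplied by the number of
DLEs on that host; the function name is hypothetical and is not part of
Amanda.

```python
def estimate_timeout_budget_hours(etimeout_seconds: int, dle_count: int) -> float:
    """Upper bound (in hours) on the estimate wait for one host.

    Assumption: the per-host budget is etimeout * number of DLEs.
    This helper is illustrative only; it is not an Amanda API.
    """
    return etimeout_seconds * dle_count / 3600


if __name__ == "__main__":
    # This host: etimeout = 10800 s, 22 file systems (DLEs).
    print(estimate_timeout_budget_hours(10800, 22))  # 66.0
    # Ralf's clients: etimeout = 900 s, 2 DLEs each.
    print(estimate_timeout_budget_hours(900, 2))     # 0.5
```

Under the same assumption, Ralf's setup works out to half an hour per
client, which matches the 30 minutes he expected.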

Can someone point me at which logs I should be looking at to find out
why amandad died?

And can someone tell me why these file systems are not timing out when
they should?

--
Thanks,
Paul
