Amanda-Users

Re: Question about "data timeout".

2005-08-23 12:05:12
Subject: Re: Question about "data timeout".
From: Paul Bijnens <paul.bijnens AT xplanation DOT com>
To: amanda-users AT amanda DOT org
Date: Tue, 23 Aug 2005 17:51:25 +0200
Jon LaBadie wrote:
On Tue, Aug 23, 2005 at 11:19:59AM +0200, Erik P. Olsen wrote:

I have recently added a set of disks (file systems) to my back-up set
and that ended up with a failure due to "data timeout". I didn't even
know there was a dtimeout value to be specified in amanda.conf. I have
learnt that it is an idle time measured against the disks in question.

My question is now, how is this idle time measured and where is it
reported?
Only by knowing what amanda sees of the idle time am I able to specify a
reasonable dtimeout value.


I may be totally wrong here, but I don't think it is tracking "idle" time.
I believe it is total time to dump.  This would take care of "stuck" or
"runaway" dump scenarios.



Correct me if I'm wrong -- the coffee machine is broken here, writing
this on a diet of pure fresh water!

Reading through the sources, it seems that dtimeout is used as the
timeout value on a select() call in dumper.c, around line 1356 (amanda
2.4.5 sources).  The select waits for activity on the data stream or
on the messages stream.
That means that if no traffic is received on either of those streams
within dtimeout seconds, you get a "data timeout".
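
To make the idle-versus-total distinction concrete, here is a minimal
sketch of a select() loop with an idle timeout. It is not the dumper.c
code; the function name and return values are made up for illustration,
and it assumes a Unix-like system where select() works on pipes.

```python
import os
import select


def read_until_idle(fds, idle_timeout):
    """Read from fds until every one hits EOF, or until none of them
    is readable for idle_timeout seconds.

    Hypothetical helper: returns ("done", total_bytes) or
    ("data timeout", total_bytes).
    """
    open_fds = set(fds)
    total = 0
    while open_fds:
        # The timeout restarts on every call, so it bounds the idle
        # time between reads, not the total duration of the transfer.
        readable, _, _ = select.select(list(open_fds), [], [], idle_timeout)
        if not readable:
            return "data timeout", total
        for fd in readable:
            chunk = os.read(fd, 65536)
            if chunk:
                total += len(chunk)
            else:
                open_fds.discard(fd)  # EOF on this stream
    return "done", total


if __name__ == "__main__":
    data_r, data_w = os.pipe()
    mesg_r, mesg_w = os.pipe()
    os.write(data_w, b"x" * 1000)
    os.close(data_w)  # data stream finishes normally
    # mesg_w stays open and silent, so mesg_r never sees EOF.
    print(read_until_idle([data_r, mesg_r], idle_timeout=0.2))
```

In the demo the data stream completes, but the second stream never
delivers its end-of-file, so the loop still ends in "data timeout" once
the idle interval passes.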

The default 1800 seconds seems more than reasonable to me in that case.

A pathological case could be a sequence of very compressible data (all
"aaaaaaaaaaaaaaa"s or zeros, like an empty database file). When such a
sequence is compressed, with some buffering on client and server on top,
it could well take a long time before any bytes come out of the pipe.
But 1800 seconds seems to me more than enough even for those cases.
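
A rough way to see how little comes out of a compressor fed with such
data is the sketch below. It uses Python's zlib as a stand-in, which is
an assumption on my part: Amanda's actual pipeline (gzip plus network
buffering) will behave differently in the details. The function name is
made up for illustration.

```python
import zlib


def compressed_output_sizes(chunk, n_chunks):
    """Feed n_chunks copies of chunk to a streaming compressor and
    return the number of bytes emitted by each call, with the final
    flush as the last entry."""
    comp = zlib.compressobj()
    sizes = [len(comp.compress(chunk)) for _ in range(n_chunks)]
    sizes.append(len(comp.flush()))
    return sizes


if __name__ == "__main__":
    # 256 chunks of 4 KiB of zeros, 1 MiB of input in total.
    sizes = compressed_output_sizes(bytes(4096), 256)
    print("input bytes: ", 4096 * 256)
    print("output bytes:", sum(sizes))
    print("calls that emitted nothing:", sizes.count(0))
```

The reader on the other end of the pipe only sees the output bytes, so
a slow client working through a long run of zeros can look idle for a
while even though it is busy.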

There is also one of the latest "enhancements" in gnutar for handling
sparse files, which could result in a long time without emitting any
data (and some systems create sparse files with 64 bit sizes...):

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=154882
http://lists.gnu.org/archive/html/bug-tar/2005-07/msg00025.html

But that is only when doing estimates, or does it also affect the
backup itself?

And of course firewall timeouts come into play too, blocking one of
the streams (e.g. the messages stream usually has almost no traffic),
so that the end-of-file indication on that stream is never received.
That too results in a "data timeout" after dtimeout seconds.

--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out          *
***********************************************************************


