Amanda-Users

FIXED: Re: estimate timeouts

2002-12-04 11:47:44
From: Matthew Boeckman <matthewb AT saepio DOT com>
To: amanda-users AT amanda DOT org
Date: Wed, 04 Dec 2002 09:56:14 -0600
The combination of increasing my etimeout to 900 and assigning spindle numbers in disklist:
ultra           /dev/dsk/c0t0d0s0       nocomp-high 1
ultra           /dev/dsk/c0t0d0s1       nocomp-high 1
ultra           /webhome/bkup1          bkup1 2
ultra           /webhome/bkup2          bkup2 2
ultra           /webhome/bkup3          bkup3 2
ultra           /webhome/bkup4          bkup4 2

seems to have done the trick. The real test, of course, is getting through a full cycle, not just one night, but things look great so far. FWIW, my backup times:
                          Total       Full      Daily
                        --------   --------   --------
Estimate Time (hrs:min)    1:10
Run Time (hrs:min)         5:26
Dump Time (hrs:min)        4:10       3:41       0:29


Thanks to Jay, John, and the list for all their help.


-Matthew

Jay Lessert wrote:
On Tue, Dec 03, 2002 at 10:57:19AM -0600, Matthew Boeckman wrote:

After resolving my earlier problems, I've uncovered a new one. Amanda is timing out, apparently while waiting for estimates on one of my hosts. I think I can fix this by bumping the timeout value,


Looks like this would be the thing to do.

[clip]

nocomp tar on 4 directories (bkup1, 2, 3, 4) on the 100+GB filesystem

[clip]

The behavior: if I remove all but one of the bkup partitions from disklist, amanda runs fine. Whenever I try to run with bkup1-4 in the disklist, I get:
 ultra      /webhome/bkup4 lev 0 FAILED [Request to ultra timed out.]
for all partitions.

[clip]


perplexed, as it kind of appears from these two that amanda was trying to do both lvl 0's and lvl 1's of some of the partitions!


Rather, she's estimating what would happen if she did a level 0 or level 1,
so she can make a rational decision.

Total allowed estimate time for a client is (etimeout * # of DLEs).  If
all estimates for that client are not finished by then, the client is
skipped completely.  It is common for estimates to take a loooong
time on DLEs with a large number of small files; crank up etimeout
by 2X/4X and see what happens.

You may hit dtimeout next; you know what to do.  :-)

Make sure your GNU tar is appropriately recent (1.13.25); I've observed
older versions running pathologically long --listed-incremental times
on Solaris.


sendsize: debug 1 pid 1796 ruid 602 euid 602 start time Tue Dec  3 01:00:07 2002

[clip]

sendsize: pid 1796 finish time Tue Dec  3 02:19:31 2002


So the estimate does finish, in just over an hour.  Default etimeout is
5 minutes, and you're doing 6 DLEs on ultra, so amdump is only waiting
30 minutes, and you lose.  The one hour+ estimate time is survivable,
so make sure GNU tar is new, bump etimeout to 800 or 1000 or 1200
seconds, and go for it.
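
The arithmetic above can be checked with a short sketch. It assumes the rule exactly as stated in this thread (total wait per client = etimeout * number of DLEs) and uses the sendsize start and finish times from the debug log. The function and variable names are made up for illustration and are not part of Amanda; the real planner may account for time differently.

```python
from datetime import datetime

def estimate_budget_seconds(etimeout: int, num_dles: int) -> int:
    """Total estimate wait for one client, per the rule quoted in this thread."""
    return etimeout * num_dles

# sendsize start/finish times taken from the debug log in this thread.
start = datetime(2002, 12, 3, 1, 0, 7)
finish = datetime(2002, 12, 3, 2, 19, 31)
elapsed = int((finish - start).total_seconds())

NUM_DLES = 6  # disklist entries for host "ultra"

def fits(etimeout: int) -> bool:
    """True if the observed estimate would finish inside the budget."""
    return elapsed <= estimate_budget_seconds(etimeout, NUM_DLES)

if __name__ == "__main__":
    print(f"observed estimate: {elapsed} s")
    # 300 is the default mentioned above; the rest are the suggested values
    # plus the 900 that was actually used.
    for etimeout in (300, 800, 900, 1000, 1200):
        budget = estimate_budget_seconds(etimeout, NUM_DLES)
        verdict = "ok" if fits(etimeout) else "times out"
        print(f"etimeout={etimeout:>4}  budget={budget:>5} s  {verdict}")
```

Under these assumptions the default of 300 falls well short, 800 clears the observed run by well under a minute, and 900 or more leaves some headroom.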


--
Matthew Boeckman                        (816) 777-2160
Manager - Systems Integration           Saepio Technologies

