Amanda-Users

Re: gtar program still running after backup failed

2007-08-17 03:43:16
Subject: Re: gtar program still running after backup failed
From: Paul Bijnens <Paul.Bijnens AT xplanation DOT com>
To: fedora <zuki AT abamon DOT com>
Date: Fri, 17 Aug 2007 09:39:43 +0200
On 2007-08-17 06:09, fedora wrote:

On Wed, Aug 15, 2007 at 07:57:32PM -0700, fedora wrote:
In "normal" failure modes, this should be taken care of.  Can you give
some detail on the type of failure that's triggering this?

Also, amcleanup should function as a second line of defense for killing
such processes.
Here is the error in mail report:
FAILED [data timeout]
FAILED [cannot read header: got 0 instead of 32768]
FAILED [too many dumper retry: "[request failed: timeout waiting for
ACK]"]

It looks like you have a communication problem.  What auth are you using
for that client?  This is, unfortunately, not the sort of error that
amcheck will pick up on.  It's usually caused by bad firewall settings.

That was not a firewall problem. The network had a problem while amanda was
running amcheck, so amcheck did not run properly. I am using
"-auth=bsd" auth. Let's set aside this kind of error or network problem.
Whatever happens, in this situation or any other, what do I
need to set to kill the gtar program on the client automatically? Any opinions?

Instead of fighting the symptoms, you had better find out what is causing
the gtar program to keep running on that client.  Under normal
circumstances, that should not happen.  Rather than blindly killing it,
find the cause and eliminate that.  Blindly running amcleanup could make
things even worse, IMHO.

In all my experience with amanda (> 8 years now) I had to run amcleanup
only 2 or 3 times.

And, moreover, amcleanup will not reach out to the clients and kill
processes there, so it won't help you anyway.
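
To see what is actually left running on the client, something like the
following could be a starting point. It is only a sketch: the function
name filter_gtar is made up for illustration, and it assumes the leftover
process shows up with a command name of gtar or tar and that ps accepts
the POSIX -o column list shown in the comment.

```shell
#!/bin/sh
# Hypothetical helper (not part of amanda): filter "ps" output for
# tar/gtar processes.  Reads lines of the form
#   PID PPID ELAPSED COMMAND [ARGS...]
# on stdin and prints those whose command name (the basename of the
# fourth field) is exactly "gtar" or "tar".
filter_gtar() {
    awk '{
        n = split($4, parts, "/")
        if (parts[n] == "gtar" || parts[n] == "tar") print
    }'
}

# Example use on the client (assumes these ps columns are available):
#   ps -eo pid,ppid,etime,args | filter_gtar
```

The elapsed time and parent PID give a hint whether the gtar belongs to
the failed run and whether its parent is still alive, which is a more
useful clue than simply killing it.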



Here is my cronjob:
0 21 * * * /usr/local/sbin/amcleanup DailySet1
10 21 * * * /usr/local/sbin/amcheck DailySet1
30 21 * * * /usr/local/sbin/amdump DailySet1

I put amcleanup before amcheck and amdump. Is that a proper sequence?

No -- you shouldn't need to run amcleanup regularly, but only when a
failure occurs.

And if you frequently run into failures, fight those failures, instead
of leaving them and cleaning up each time.



I am running amcleanup before amcheck so that, if a run has failed, I don't
have to run amcleanup manually. We don't know when it will fail, right?

You should run amcheck during working hours instead, and, if it finds
problems, correct those problems before leaving for home.  Maybe (once
every few years) you need to run amcleanup manually to fix the problems.
But don't schedule it needlessly daily.



Also, running amcheck 20 minutes before your dump doesn't give you much
time to fix anything.  Most people run amcheck in the late afternoon --
after any tape swapping is done, but with enough time to correct any
errors before heading home for dinner.
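
As a rough sketch of that kind of schedule, the crontab could look
something like the one below. The 16:00 time and the weekday restriction
are assumptions for illustration only; the -m option is there so that
amcheck mails what it finds instead of printing it, and amcleanup is
left out of cron entirely.

```shell
# amcheck in the late afternoon on working days (time is an assumption),
# mailing any problems it finds
0 16 * * 1-5 /usr/local/sbin/amcheck -m DailySet1
# amdump at the same time as before
30 21 * * * /usr/local/sbin/amdump DailySet1
# no amcleanup entry: run it by hand, and only after a failed run
```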

So, should I schedule amcleanup after amdump has finished (at noon)?

You should not run amcleanup at all.  If you find you need to run
it regularly, there is something wrong in your procedures.


--
Paul Bijnens, xplanation Technology Services        Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out          *
***********************************************************************

