Amanda-Users

Re: AMANDA backup fails silently with "taper: Received signal 1"

2008-01-18 18:11:11
From: Jordan Desroches <jordan.d.desroches AT Dartmouth DOT EDU>
To: amanda-users AT amanda DOT org
Date: Fri, 18 Jan 2008 11:35:12 -0500
John,

After looking up what you suggested, I'm almost entirely sure you're right. If I start the job manually, I'll use nohup, disown or screen. Thanks so much for your help!

Jordan

On Jan 18, 2008, at 10:48 AM, John E Hein wrote:

Jordan Desroches wrote at 10:09 -0500 on Jan 18, 2008:
Greetings all,

In experimenting with AMANDA I've been running into a problem where
I'll start a backup, everything will go swimmingly, and sometime down
the line, the backup is stopped, and no AMANDA processes are running.
No failure report is sent, the backup just stops. When I run a manual
amreport, there is a suspicious message "taper: Received signal 1".
This error does not happen all the time. In this case, I had blown
away all the log, index and gnutar-lists files to try to start afresh.
Any ideas what may be causing this?

My amanda system:

Ubuntu 7.10 server install
Amanda 2.5.2p1
GNU Tar 1.18
IBM LTO-3 drive
netapp filer nfs mounted. network: gig-e, MTU9000, rsize and wsize
both 32768

I've attached the amreport results and the taper debug file.

Signal 1 is SIGHUP.

The shell, if it exits, will send SIGHUP to processes under its
control unless they have detached from the shell as process group
leader.
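
As a small illustration of the point above, here is a minimal Python sketch that installs a handler for SIGHUP and then sends the signal to itself to simulate a hangup. The names `received` and `on_hangup` are made up for this example and have nothing to do with Amanda; without a handler, the default action for SIGHUP is to terminate the process, which would match a taper that simply disappears.

```python
import os
import signal

received = []


def on_hangup(signum, frame):
    # Record the signal by name instead of dying (the default action for SIGHUP).
    received.append(signal.Signals(signum).name)


signal.signal(signal.SIGHUP, on_hangup)

# Simulate a hangup by sending the signal to this very process.
os.kill(os.getpid(), signal.SIGHUP)

# Show the numeric value of SIGHUP and what the handler recorded.
print(int(signal.SIGHUP), received)
```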

Ways around this include nohup(1), daemon(8) (not available on all
OSes), running in the background in a subshell, among others.
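
One more way to get the same effect, sketched in Python under the assumption of a POSIX system: start the command in its own session so that a hangup on the launching terminal is not delivered to it. The helper name `start_detached` and the log path are hypothetical, and this is not how Amanda itself starts its processes; it only shows the detaching idea.

```python
import os
import subprocess


def start_detached(argv, log_path):
    """Start argv in a new session, with output appended to log_path."""
    log = open(log_path, "ab")
    try:
        proc = subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,   # no terminal input
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,     # child calls setsid() before exec
        )
    finally:
        log.close()  # the child keeps its own copy of the descriptor
    return proc


# Hypothetical usage (the command and configuration name are placeholders):
#   child = start_detached(["amdump", "DailySet1"], "/tmp/amdump-run.log")
#   print(child.pid, os.getsid(child.pid))
```

Because the child becomes the leader of a new session, it has no controlling terminal, so closing the terminal that launched it should not cause a SIGHUP to reach it.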

Usually people start amdump in a cron job, and cron probably detaches
its jobs so they are parented by init(8) or parented by a forked cron
process which is parented by init (OS-dependent, but typical).  It
would be unusual for init or cron to send SIGHUP (I won't say it's
not possible since I haven't checked the code for this behavior).

Anyway, SIGHUP appears to be getting sent to taper by something.  I am
not aware of anything in amanda that sends SIGHUP, but someone here who
has looked will supply the correct information on that point I'm sure.

How do you start amdump?


Another cause of random process death is running out of memory/swap.
Some OSes will kill processes when memory runs low.
Typically that will be a SIGTERM (15) and/or SIGKILL (9) that is sent,
however.  I'm not sure about Linux's behavior in this situation, but
examining resource usage (with a script that records vmstat(8) /
top(1) types of info) may be useful.
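
A rough sketch of such a recording script, in Python and assuming Linux, where /proc/meminfo exposes fields such as MemFree and SwapFree. The function names, the output path and the default interval are all made up for illustration; vmstat(8) or top(1) in batch mode would do the same job.

```python
import time


def parse_meminfo(text):
    """Return {field: value} from /proc/meminfo-style text (values mostly in kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def log_memory(out_path, interval_s=60, samples=None,
               fields=("MemFree", "SwapFree")):
    """Append one timestamped line per sample; loop forever if samples is None."""
    taken = 0
    while samples is None or taken < samples:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        summary = " ".join(f"{k}={info.get(k, 'n/a')}kB" for k in fields)
        with open(out_path, "a") as out:
            out.write(f"{stamp} {summary}\n")
        taken += 1
        if samples is None or taken < samples:
            time.sleep(interval_s)


# Hypothetical usage: log_memory("/tmp/memlog.txt", interval_s=30)
```

Started before the backup (and itself detached from the terminal), a log like this would show whether free memory and swap were shrinking toward zero around the time the backup stopped.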