Amanda-Users

Re: dead processes

2007-04-24 14:20:10
Subject: Re: dead processes
From: Don Murray <samba AT geeksrus DOT ca>
To: Joshua Baker-LePain <jlb17 AT duke DOT edu>
Date: Tue, 24 Apr 2007 11:14:56 -0700
Joshua Baker-LePain wrote:
> On Mon, 23 Apr 2007 at 1:53pm, Don Murray wrote:
>
>> Note the "selfchecks" that are running with "D" process state - meaning they
>> are sleeping in the kernel and are uninterruptible and therefore unkillable.
>>
>> So - it looks like I need to reboot my client before I can get a backup from
>> it again, which is a little harsh.
>>
>> I was wondering whether anyone knows why Amanda client 2.4.4 would get wedged
>> like that. Is there something I can do to minimize the problem?  Also, if
>> anyone has ideas about avoiding the estimate issues altogether, I would
>> appreciate any advice.
>
> Look in /tmp/amanda on the clients for the *debug files relating to the
> hung processes.  They should have more details on what went wrong.  Also,
> the alternate estimate methods went in before 2.5 -- I'm running 2.4.5p1
> and 'man amanda.conf' says "estimate client|calcsize|server".

Joshua - thanks for the reply.


Whenever I include "estimate calcsize" or "estimate client" in my amanda.conf I get:

"/etc/amanda/daily/amanda.conf", line 36: configuration keyword expected
"/etc/amanda/daily/amanda.conf", line 36: end of line expected

on the "estimate" line. I've tried it as a general parameter and within a particular dumptype definition. Maybe I'm doing something wrong? The RPM doesn't appear to install an "amanda.conf" man page, but I did go to /usr/share/doc/amanda-server-2.4.4p3 and grep for "calcsize". The "whats.new" file mentions it and says it must be installed setuid root, which I believe it is:

[root@windsor amanda-server-2.4.4p3]# locate calcsize
/usr/lib/amanda/calcsize
[root@windsor amanda-server-2.4.4p3]# cd /usr/lib/amanda
[root@windsor amanda]# ls -l calcsize
-rwsr-x---  1 root disk 19401 Jun 28  2004 calcsize

So the calcsize program exists, but I am too daft to enable it in the amanda.conf file. Is the definition below incorrect? If I comment out the estimate line, all is good; otherwise I get a check error.

define dumptype remote {
   global
   compress client fast
   estimate calcsize
}
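
To double-check that line 36 in the error really is the "estimate" line, the reported line can be printed straight from the config file. This is only a minimal sketch: the helper name show_conf_line is made up for illustration and is not part of Amanda.

```shell
# show_conf_line FILE LINENO
# Print a single line of FILE, prefixed with its line number.
# (Hypothetical helper for illustration, not an Amanda tool.)
show_conf_line() {
    awk -v n="$2" 'NR == n { printf "%d: %s\n", NR, $0 }' "$1"
}
```

With the path and line number from the error above, that would be called as: show_conf_line /etc/amanda/daily/amanda.conf 36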


As for the selfcheck hanging: the log files are kept in /var/log/amanda on this system.

"selfcheck" is the process that is hung, and its log isn't very helpful:

[root@gilmore amanda]# cat  selfcheck.20070420200001.debug
selfcheck: debug 1 pid 12910 ruid 33 euid 33: start at Fri Apr 20 20:00:01 2007
/usr/lib/amanda/selfcheck: version 2.4.4p3
selfcheck: time 0.000: checking disk /backedup/home
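
Since the log stops right after "checking disk /backedup/home", it may help to see what the stuck processes are waiting on in the kernel. A rough sketch, assuming a Linux procps ps that accepts "-eo pid,stat,wchan:32,comm"; the function name d_state_procs is just for illustration. It filters ps output down to processes whose state starts with "D", and the WCHAN column may hint at what they are blocked on.

```shell
# d_state_procs: read "PID STAT WCHAN COMMAND" lines on stdin (including
# the header line) and print only processes in uninterruptible sleep,
# i.e. those whose STAT column starts with "D".
d_state_procs() {
    awk 'NR > 1 && $2 ~ /^D/'
}

# Typical use on the client (assumes procps ps):
#   ps -eo pid,stat,wchan:32,comm | d_state_procs
```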

I'm also pasting below the part of the "amandad.<date>.debug" file leading up to the first ERROR encountered. It all seems OK to me until it gets to "ERROR amandad busy".

I sure wish I didn't have to reboot to get rid of those hung processes. Each time I try to run amcheck I get:

Amanda Backup Client Hosts Check
--------------------------------
ERROR: NAK gilmore: amandad busy
Client check: 4 hosts checked in 10.120 seconds, 1 problem found

:(

Don



----
Amanda 2.4 REQ HANDLE 001-28D005F7 SEQ 1177124409
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=fffffeff9ffe0f;maxdumps=1;hostname=gilmore;
GNUTAR /backedup/home 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /backedup/project 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/vancouver 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/spruce 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/princeedward 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/nootka 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/glen 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/fs2 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/burrard 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR / 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-list=/tmp;exclude-optional;
----

amandad: time 30.047: received dup P_REQ packet, ACKing it
amandad: time 30.047: sending ack:
----
Amanda 2.4 ACK HANDLE 001-28D005F7 SEQ 1177124409
----

amandad: time 60.048: got packet:
----
Amanda 2.4 REQ HANDLE 001-28D005F7 SEQ 1177124409
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=fffffeff9ffe0f;maxdumps=1;hostname=gilmore;
GNUTAR /backedup/home 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /backedup/project 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/vancouver 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/spruce 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/princeedward 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/nootka 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/glen 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/fs2 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/burrard 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR / 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-list=/tmp;exclude-optional;
----

amandad: time 60.048: received dup P_REQ packet, ACKing it
amandad: time 60.048: sending ack:
----
Amanda 2.4 ACK HANDLE 001-28D005F7 SEQ 1177124409
----

amandad: time 25500.733: got packet:
----
Amanda 2.4 REQ HANDLE 001-00AD92F7 SEQ 1177149903
SECURITY USER amanda
SERVICE noop
OPTIONS features=fffffeff9ffe0f;
----

amandad: time 25500.748: received other packet, NAKing it
 addr: peer 192.168.0.63 dup 192.168.0.63, port: peer 583 dup 919
amandad: time 25500.748: sending nack:
----
Amanda 2.4 NAK HANDLE 001-00AD92F7 SEQ 1177149903
ERROR amandad busy
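
Reading the excerpt above: the duplicate selfcheck request is ACKed at 30 and 60 seconds, and the later noop request is NAKed with "amandad busy". To follow that sequence in a long debug file without the packet bodies, a small sketch like this can help. The function name amandad_events is illustrative only, and it assumes the "amandad: time N: ..." line format shown in this excerpt.

```shell
# amandad_events: read an amandad debug file on stdin and print only the
# timestamped event lines, dropping the packet bodies between the
# "----" separators.
amandad_events() {
    grep '^amandad: time [0-9.]*: '
}

# Typical use (file name pattern as described in this message):
#   amandad_events < /var/log/amanda/amandad.<date>.debug
```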


