Joshua Baker-LePain wrote:
> On Mon, 23 Apr 2007 at 1:53pm, Don Murray wrote:
>
>> Note the "selfchecks" that are running with "D" process state, meaning they
>> are sleeping in the kernel and are uninterruptible and therefore unkillable.
>> So it looks like I need to reboot my client before I can get a backup from
>> it again, which is a little harsh.
>>
>> I was wondering whether anyone knows why Amanda client 2.4.4 would get wedged
>> like that. Is there something I can do to minimize the problem? Also, if
>> anyone has ideas about avoiding the estimate issues altogether, I would
>> appreciate any advice.
>
> Look in /tmp/amanda on the clients for the *debug files relating to the
> hung processes. They should have more details on what went wrong. Also,
> the alternate estimate methods went in before 2.5 -- I'm running 2.4.5p1
> and 'man amanda.conf' says "estimate client|calcsize|server".
Joshua - thanks for the reply.
Whenever I include "estimate calcsize" or "estimate client" in my
amanda.conf I get:
"/etc/amanda/daily/amanda.conf", line 36: configuration keyword expected
"/etc/amanda/daily/amanda.conf", line 36: end of line expected
on the "estimate" line. I've tried it as a general parameter and I've
tried it within a particular dumptype definition. Maybe I'm doing
something wrong? There doesn't appear to be an "amanda.conf" man page
installed by the RPM, but I did go to
/usr/share/doc/amanda-server-2.4.4p3 and grep for "calcsize". There
is a mention of it in the "whats.new" file, which says it must be
installed setuid root. I believe it is:
[root@windsor amanda-server-2.4.4p3]# locate calcsize
/usr/lib/amanda/calcsize
[root@windsor amanda-server-2.4.4p3]# cd /usr/lib/amanda
[root@windsor amanda]# ls -l calcsize
-rwsr-x--- 1 root disk 19401 Jun 28 2004 calcsize
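For anyone who wants to check the same thing on several clients without reading `ls -l` output by eye, here is a minimal Python sketch of the setuid-root test. The helper names `is_setuid_root` and `check` are made up for illustration; the path is the one from the `locate` output above and may differ on other installs.

```python
import os
import stat

def is_setuid_root(mode, uid):
    """True if the setuid bit is set in `mode` and the owner is uid 0."""
    return bool(mode & stat.S_ISUID) and uid == 0

def check(path):
    """Stat `path` and apply the setuid-root test to it."""
    st = os.stat(path)
    return is_setuid_root(st.st_mode, st.st_uid)

if __name__ == "__main__":
    # Path taken from the `locate calcsize` output; adjust as needed.
    path = "/usr/lib/amanda/calcsize"
    if os.path.exists(path):
        print(path, "setuid root:", check(path))
    else:
        print(path, "not found")
```

A mode of `-rwsr-x---` owned by root, as in the listing above, passes this test.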
So the calcsize program exists, but I am too daft to enable it in the
amanda.conf file. Is the following incorrect? With this kind of
definition, if I comment out the estimate line all is good; otherwise I
get a check error.
define dumptype remote {
    global
    compress client fast
    estimate calcsize
}
As for the selfcheck hang: the log files are kept in /var/log/amanda on
this system. "selfcheck" is the process that is hung, and its log isn't
very helpful:
[root@gilmore amanda]# cat selfcheck.20070420200001.debug
selfcheck: debug 1 pid 12910 ruid 33 euid 33: start at Fri Apr 20 20:00:01 2007
/usr/lib/amanda/selfcheck: version 2.4.4p3
selfcheck: time 0.000: checking disk /backedup/home
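Since the selfcheck log stops right after "checking disk /backedup/home", it can help to list exactly which processes are sitting in the "D" state. The following is a rough, Linux-only Python sketch that reads `/proc/<pid>/stat` directly; the function names `parse_stat` and `uninterruptible` are hypothetical, and it only reports the stuck processes, it cannot unstick them.

```python
import os

def parse_stat(line):
    """Return (pid, comm, state) from the text of one /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')' instead of splitting on spaces.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    state = line[rparen + 1:].split()[0]
    return pid, comm, state

def uninterruptible(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # not a Linux-style /proc; nothing to report
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the file was unreadable
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in uninterruptible():
        print(pid, comm)
```

On a client in the state described here, the hung selfcheck processes would be expected to show up in this list.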
I'm also pasting below the lead-up in the "amandad.<date>.debug" file to
the first ERROR encountered. It all seems OK to me until it gets to
"ERROR amandad busy".
I sure wish I didn't have to reboot to get rid of those hung processes.
Each time I try to run amcheck I get:
Amanda Backup Client Hosts Check
--------------------------------
ERROR: NAK gilmore: amandad busy
Client check: 4 hosts checked in 10.120 seconds, 1 problem found
:(
Don
----
Amanda 2.4 REQ HANDLE 001-28D005F7 SEQ 1177124409
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=fffffeff9ffe0f;maxdumps=1;hostname=gilmore;
GNUTAR /backedup/home 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /backedup/project 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/vancouver 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/spruce 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/princeedward 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/nootka 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/glen 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/fs2 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/burrard 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR / 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-list=/tmp;exclude-optional;
----
amandad: time 30.047: received dup P_REQ packet, ACKing it
amandad: time 30.047: sending ack:
----
Amanda 2.4 ACK HANDLE 001-28D005F7 SEQ 1177124409
----
amandad: time 60.048: got packet:
----
Amanda 2.4 REQ HANDLE 001-28D005F7 SEQ 1177124409
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=fffffeff9ffe0f;maxdumps=1;hostname=gilmore;
GNUTAR /backedup/home 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /backedup/project 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/vancouver 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/spruce 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/princeedward 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/nootka 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/glen 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/fs2 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR /nonbackedup/work3/backups/burrard 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-optional;
GNUTAR / 0 OPTIONS |;bsd-auth;compress-fast;index;exclude-list=/tmp;exclude-optional;
----
amandad: time 60.048: received dup P_REQ packet, ACKing it
amandad: time 60.048: sending ack:
----
Amanda 2.4 ACK HANDLE 001-28D005F7 SEQ 1177124409
----
amandad: time 25500.733: got packet:
----
Amanda 2.4 REQ HANDLE 001-00AD92F7 SEQ 1177149903
SECURITY USER amanda
SERVICE noop
OPTIONS features=fffffeff9ffe0f;
----
amandad: time 25500.748: received other packet, NAKing it
addr: peer 192.168.0.63 dup 192.168.0.63, port: peer 583 dup 919
amandad: time 25500.748: sending nack:
----
Amanda 2.4 NAK HANDLE 001-00AD92F7 SEQ 1177149903
ERROR amandad busy
|
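For longer debug files, the timeline above can be boiled down with a small script. This is only a sketch: it assumes the timestamped lines always look like the "amandad: time N.NNN: message" lines in this excerpt, and the names `events` and `summarize` are invented for the example rather than part of Amanda.

```python
import re
import sys

# Matches lines such as "amandad: time 30.047: sending ack:"
EVENT = re.compile(r"^amandad: time (\d+\.\d+): (.*)$")

def events(lines):
    """Yield (seconds, message) for each timestamped amandad line."""
    for line in lines:
        m = EVENT.match(line.strip())
        if m:
            yield float(m.group(1)), m.group(2)

def summarize(lines):
    """Count duplicate REQ packets and collect the times NAKs were sent."""
    dup_reqs = 0
    nak_times = []
    for seconds, message in events(lines):
        if "received dup P_REQ" in message:
            dup_reqs += 1
        elif "sending nack" in message:
            nak_times.append(seconds)
    return dup_reqs, nak_times

if __name__ == "__main__":
    # Usage: python script.py /path/to/amandad.<date>.debug
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as fh:
            dup_reqs, nak_times = summarize(fh)
        print("duplicate REQ packets:", dup_reqs)
        print("NAKs sent at (seconds):", nak_times)
```

Run against the excerpt above, the long gap between the last duplicate REQ at 60.048 and the NAK at 25500.748 stands out: amandad was still tied up with the original selfcheck request when the later noop request arrived.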