Amanda-Users

Re: sendsize finishes, planner doesn't notice...

2007-10-11 15:50:20
Subject: Re: sendsize finishes, planner doesn't notice...
From: Paul Lussier <pll+amanda AT permabit DOT com>
To: amanda-users AT amanda DOT org
Date: Thu, 11 Oct 2007 15:39:24 -0400
Jean-Louis Martineau <martineau AT zmanda DOT com> writes:

> It's weird.
>
> Do you have an amdump log file or just amdump.1?
> The only way to get this is if you killed amanda process on the
> server, maybe a server crash.
> Do you still have amanda process running on the server?

I do now. I started amanda off Tuesday night at "Tue Oct  9 22:48:34 2007".

According the /var/log/amanda/amandad/amandad.20071009224834.debug file:

  amandad: time 21604.147: pid 26218 finish time Wed Oct 10 04:48:39 2007

According to sendsize.20071009224835.debug:

amanda2:/var/log/amanda/client/offsite# tail sendsize.20071009224835.debug 
errmsg is /usr/local/libexec/runtar exited with status 1: see 
/var/log/amanda/client/offsite/sendsize.20071009224835.debug
sendsize[26687]: time 37138.237: done with amname /permabit/user/uz dirname 
/permabit/user spindle -1
sendsize[26379]: time 37823.330: Total bytes written: 541649408000 (505GiB, 
14MiB/s)
sendsize[26379]: time 37823.453: .....
sendsize[26379]: time 37823.453: estimate time for /permabit/user/eh level 0: 
37823.251
sendsize[26379]: time 37823.453: estimate size for /permabit/user/eh level 0: 
528954500 KB
sendsize[26379]: time 37823.453: waiting for runtar "/permabit/user/eh" child
sendsize[26379]: time 37823.453: after runtar /permabit/user/eh wait
errmsg is /usr/local/libexec/runtar exited with status 1: see 
/var/log/amanda/client/offsite/sendsize.20071009224835.debug
sendsize[26379]: time 37823.537: done with amname /permabit/user/eh dirname 
/permabit/user spindle -1

So, sendsize claims to be done, yet planner doesn't think so:

  planner: time 16531.383: got partial result for host amanda2 disk \
     /permabit/user/uz: 0 -> -2K, -1 -> -2K, -1 -> -2K
  [...]
  planner: time 16531.384: got partial result for host amanda2 disk \
     /permabit/user/eh: 0 -> -2K, -1 -> -2K, -1 -> -2K

amdump is currently still running, amandad has finished, but we're
still waiting for estimates which will never arrive.

I also find it disturbing that the debug log I'm looking at,
sendsize.20071009224835.debug, tells me to look at the log I'm looking
at for further information:
 
    errmsg is /usr/local/libexec/runtar exited with status 1: see \
    /var/log/amanda/client/offsite/sendsize.20071009224835.debug

Any idea why amandad is dying before sending the estimate data back to
the planner?  My etimeout is currently set to:

  # grep timeout /etc/amanda/offsite/amanda.conf
  etimeout  72000          # number of seconds per filesystem for estimates.
  dtimeout  72000         # number of idle seconds before a dump is aborted.
  ctimeout    30          # maximum number of seconds that amcheck waits
  amanda2:/var/log/amanda/server/offsite# su - backup -c 'amadmin offsite 
config' | grep -i timeout
  ETIMEOUT              72000
  DTIMEOUT              72000
  CTIMEOUT              30

  amanda2:/var/log/amanda/server/offsite# /usr/local/sbin/amgetconf offsite 
etimeout
72000

su - backup -c 'amadmin offsite version'
build: VERSION="Amanda-2.5.2p1"
       BUILT_DATE="Tue Sep 4 15:45:27 EDT 2007"
       BUILT_MACH="Linux amanda2 2.6.18-4-686 #1 SMP Mon Mar 26 17:17:36 UTC 
2007 i686 GNU/Linux"
       CC="gcc-4.2"
       CONFIGURE_COMMAND="'./configure' '--prefix=/usr/local' '--enable-shared' 
'--sysconfdir=/etc' '--localstatedir=/var/lib' 
'--with-gnutar-listdir=/var/lib/amanda/gnutar-lists' 
'--with-index-server=localhost' '--with-user=backup' '--with-group=backup' 
'--with-bsd-security' '--with-amandahosts' 
'--with-smbclient=/usr/bin/smbclient' '--with-debugging=/var/log/amanda' 
'--with-dumperdir=/usr/lib/amanda/dumper.d' '--with-tcpportrange=50000,50100' 
'--with-udpportrange=840,860' '--with-maxtapeblocksize=256' 
'--with-ssh-security'"
paths: bindir="/usr/local/bin" sbindir="/usr/local/sbin"
       libexecdir="/usr/local/libexec" mandir="/usr/local/man"
       AMANDA_TMPDIR="/tmp/amanda"
       AMANDA_DBGDIR="/var/log/amanda" CONFIG_DIR="/etc/amanda"
       DEV_PREFIX="/dev/" RDEV_PREFIX="/dev/" DUMP=UNDEF
       RESTORE=UNDEF VDUMP=UNDEF VRESTORE=UNDEF XFSDUMP=UNDEF
       XFSRESTORE=UNDEF VXDUMP=UNDEF VXRESTORE=UNDEF
       SAMBA_CLIENT=UNDEF GNUTAR="/bin/tar"
       COMPRESS_PATH="/bin/gzip" UNCOMPRESS_PATH="/bin/gzip"
       LPRCMD="/usr/bin/lpr" MAILER="/usr/bin/Mail"
       listed_incr_dir="/var/lib/amanda/gnutar-lists"
defs:  DEFAULT_SERVER="localhost" DEFAULT_CONFIG="DailySet1"
       DEFAULT_TAPE_SERVER="localhost" HAVE_MMAP NEED_STRSTR
       HAVE_SYSVSHM LOCKING=POSIX_FCNTL SETPGRP_VOID DEBUG_CODE
       AMANDA_DEBUG_DAYS=4 BSD_SECURITY RSH_SECURITY USE_AMANDAHOSTS
       CLIENT_LOGIN="backup" FORCE_USERID HAVE_GZIP
       COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
       COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"


Am I missing something extremely obvious?  I've been using amanda for
over a decade, and I can't figure out why she's behaving like this.

If there's any more information you need in order to help me figure
this out, please let me know, the suspense here is killing me :)

-- 
Thanks,
Paul