Amanda-Users

Re: Disk was stranded on waitq, but estimates OK.. Meh? (server 2.5.2p1 - client 2.4.4p3)

2007-11-22 06:57:24
From: Jean-Louis Martineau <martineau AT zmanda DOT com>
To: Francis Galiegue <fg AT one2team DOT com>
Date: Thu, 22 Nov 2007 06:51:18 -0500
I only want debug files for a failed run.

I think I found the problem. Can you try the attached patch on the server?
The "hmm, disk was stranded on waitq" message hides a connection problem.

Jean-Louis

Francis Galiegue wrote:
On Wednesday 21 November 2007 20:07:47, you wrote:
You should never get a "hmm, disk was stranded on waitq" error message.
Post the complete amdump.X log file from the server and the complete
amanda.*.debug and sendsize.*.debug files from the client.


I'll send them one by one. My first attempt to attach them all to a single message didn't make it to the list :(

BTW, last night's backup didn't fail... Do you want the files for that run as well?

------------------------------------------------------------------------

amandad: debug 1 pid 8227 ruid 33 euid 33: start at Wed Nov 21 01:10:20 2007
amandad: version 2.4.4p3
amandad: build: VERSION="Amanda-2.4.4p3"
amandad:        BUILT_DATE="Mon Jun 28 15:53:32 EDT 2004"
amandad:        BUILT_MACH="Linux daffy.perf.redhat.com 2.4.21-15.5.ELsmp #1 SMP Sat May 15 17:30:24 EDT 2004 i686 i686 i386 GNU/Linux"
amandad:        CC="i386-redhat-linux-gcc"
amandad:        CONFIGURE_COMMAND="'./configure' '--build=i386-redhat-linux' 
'--host=i386-redhat-linux' '--target=i386-redhat-linux-gnu' '--program-prefix=' 
'--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' 
'--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' 
'--libdir=/usr/lib' '--libexecdir=/usr/lib/amanda' '--localstatedir=/var/lib' 
'--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' 
'--enable-shared' '--with-index-server=localhost' 
'--with-gnutar-listdir=/var/lib/amanda/gnutar-lists' 
'--with-smbclient=/usr/bin/smbclient' '--with-amandahosts' '--with-user=amanda' 
'--with-group=disk' '--with-tmpdir=/var/log/amanda' '--with-gnutar=/bin/tar'"
amandad: paths: bindir="/usr/bin" sbindir="/usr/sbin"
amandad:        libexecdir="/usr/lib/amanda" mandir="/usr/share/man"
amandad:        AMANDA_TMPDIR="/var/log/amanda"
amandad:        AMANDA_DBGDIR="/var/log/amanda" CONFIG_DIR="/etc/amanda"
amandad:        DEV_PREFIX="/dev/" RDEV_PREFIX="/dev/r"
amandad:        DUMP="/sbin/dump" RESTORE="/sbin/restore" VDUMP=UNDEF
amandad:        VRESTORE=UNDEF XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF
amandad:        VXRESTORE=UNDEF SAMBA_CLIENT="/usr/bin/smbclient"
amandad:        GNUTAR="/bin/tar" COMPRESS_PATH="/usr/bin/gzip"
amandad:        UNCOMPRESS_PATH="/usr/bin/gzip" LPRCMD="/usr/bin/lpr"
amandad:        MAILER="/usr/bin/Mail"
amandad:        listed_incr_dir="/var/lib/amanda/gnutar-lists"
amandad: defs:  DEFAULT_SERVER="localhost" DEFAULT_CONFIG="DailySet1"
amandad:        DEFAULT_TAPE_SERVER="localhost"
amandad:        DEFAULT_TAPE_DEVICE="/dev/null" HAVE_MMAP HAVE_SYSVSHM
amandad:        LOCKING=POSIX_FCNTL SETPGRP_VOID DEBUG_CODE
amandad:        AMANDA_DEBUG_DAYS=4 BSD_SECURITY USE_AMANDAHOSTS
amandad:        CLIENT_LOGIN="amanda" FORCE_USERID HAVE_GZIP
amandad:        COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
amandad:        COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
amandad: time 0.000: got packet:
--------
Amanda 2.5 REQ HANDLE 000-00000001 SEQ 1196328157
SECURITY USER amanda
SERVICE noop
OPTIONS features=ffffffff9ffeffffffff00;
--------

amandad: time 0.000: sending ack:
----
Amanda 2.4 ACK HANDLE 000-00000001 SEQ 1196328157
----

amandad: time 0.001: bsd security: remote host circe.olympe.o2t user amanda local user amanda
amandad: time 0.002: amandahosts security check passed
amandad: time 0.002: running service "noop"
amandad: time 0.002: sending REP packet:
----
Amanda 2.4 REP HANDLE 000-00000001 SEQ 1196328157
OPTIONS features=fffffeff9ffe0f;
----

amandad: time 0.002: got packet:
----
Amanda 2.5 ACK HANDLE 000-00000001 SEQ 1196328157
----

amandad: time 0.002: pid 8227 finish time Wed Nov 21 01:10:20 2007

Index: server-src/planner.c
===================================================================
--- server-src/planner.c        (revision 9117)
+++ server-src/planner.c        (working copy)
@@ -1909,10 +1909,8 @@
  error_return:
     i = 0;
     for(dp = hostp->disks; dp != NULL; dp = dp->hostnext) {
-       if(est(dp)->state != DISK_ACTIVE) continue;
-       qname = quote_string(dp->name);
-       est(dp)->state = DISK_DONE;
        if(est(dp)->state == DISK_ACTIVE) {
+           qname = quote_string(dp->name);
            est(dp)->state = DISK_DONE;
            remove_disk(&waitq, dp);
            enqueue_disk(&failq, dp);
@@ -1921,8 +1919,8 @@
            est(dp)->errstr = stralloc(errbuf);
            g_fprintf(stderr, _("error result for host %s disk %s: %s\n"),
                    dp->host->hostname, qname, errbuf);
+           amfree(qname);
        }
-       amfree(qname);
     }
     if(i == 0) {
        /*
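
To make the effect of the first hunk concrete, here is a minimal, self-contained C sketch of the two loop orderings as they read in the diff above. Everything in it (`toy_disk`, `queue_id`, `error_return_old`, `error_return_patched`, `stranded_after`) is a hypothetical stand-in invented for illustration, not Amanda's real data structures or API; it only models the state check and the move from the wait queue to the fail queue. Assuming the old side of the hunk is read correctly, the old loop set the state to DISK_DONE before testing for DISK_ACTIVE, so the block that calls remove_disk() and enqueue_disk() could never run and the disk stayed on waitq.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for planner.c's estimate state and queues. */
enum est_state { DISK_READY, DISK_ACTIVE, DISK_DONE };
enum queue_id  { ON_WAITQ, ON_FAILQ };

struct toy_disk {
    enum est_state state;   /* models est(dp)->state */
    enum queue_id  queue;   /* models which queue currently holds dp */
};

/* Ordering on the old side of the hunk: the state is overwritten
 * before it is tested, so the inner block is unreachable. */
void error_return_old(struct toy_disk *disks, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        struct toy_disk *dp = &disks[k];
        if (dp->state != DISK_ACTIVE) continue;
        dp->state = DISK_DONE;
        if (dp->state == DISK_ACTIVE) {   /* never true at this point */
            dp->state = DISK_DONE;
            dp->queue = ON_FAILQ;         /* models remove_disk + enqueue_disk */
        }
    }
}

/* Ordering on the new side of the hunk: test first, then update. */
void error_return_patched(struct toy_disk *disks, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        struct toy_disk *dp = &disks[k];
        if (dp->state == DISK_ACTIVE) {
            dp->state = DISK_DONE;
            dp->queue = ON_FAILQ;         /* models remove_disk + enqueue_disk */
        }
    }
}

/* Count the disks that are still sitting on the wait queue. */
size_t count_on_waitq(const struct toy_disk *disks, size_t n)
{
    assert(disks != NULL);
    size_t count = 0;
    for (size_t k = 0; k < n; k++) {
        if (disks[k].queue == ON_WAITQ) count++;
    }
    return count;
}

/* Run one variant on a fixed host: two active disks on the wait queue
 * and one disk that already failed earlier. Returns how many disks
 * remain on the wait queue afterwards. */
size_t stranded_after(int patched)
{
    struct toy_disk disks[3] = {
        { DISK_ACTIVE, ON_WAITQ },
        { DISK_ACTIVE, ON_WAITQ },
        { DISK_DONE,   ON_FAILQ },
    };
    if (patched) {
        error_return_patched(disks, 3);
    } else {
        error_return_old(disks, 3);
    }
    return count_on_waitq(disks, 3);
}
```

Under these assumptions, `stranded_after(0)` leaves both active disks on the wait queue, while `stranded_after(1)` leaves none. That would be consistent with the "stranded on waitq" message masking the underlying error for the host.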