Amanda-Users

one client stopped

2007-06-11 15:35:01
Subject: one client stopped
From: Jon LaBadie <jon AT jgcomp DOT com>
To: Jon LaBadie <amanda-users AT amanda DOT org>
Date: Mon, 11 Jun 2007 15:27:04 -0400
I'm stumped.  A few days ago, 6/6, one client stopped backing
up completely, all DLE's.  The basic error message(s) are

  "cannot read header: got 0 instead of 32768"
  "data timeout"

Of course I made no changes to either client or server around
that date  :))   (big smiley just in case)

Estimates are done on the server, so I assume the timeout
is waiting for the tar data.  Matching this is that the
previously successful amdump runs would complete in 60-90
minutes, now amdump doesn't finish for 5-6 hours,
waiting for data no doubt.

Server logs show the command being prepared but the
holding disk temp file is empty at at the timeout.

A surprise to me was the /tmp/amanda debug files on
the client.  On the successful days only two types of
file are present:

  sendbackup.*.exclude (empty)
  runtar.*.debug

On the failing days neither of those are present, instead
there are:

  amandad.*.debug               (1 included below)
  selfcheck.*.debug
  selfcheck.*.exclude
  sendbackup.*.debug            (1 included below)

amcheck is clean

client is Sol 9 x86 running amanda 2.4.4
server is Fedora 4 running amanda 2.5.0.p2


=== amandad.20070609063322.debug ===

amandad: debug 1 pid 7008 ruid 27 euid 27: start at Sat Jun  9 06:33:22 2007
amandad: version 2.4.4-20030529
amandad: build: VERSION="Amanda-2.4.4-20030529"
amandad:        BUILT_DATE="Thu May 29 10:57:59 EDT 2003"
amandad:        BUILT_MACH="SunOS butch 5.9 Generic_112234-04 i86pc i386 i86pc"
amandad:        CC="gcc"
amandad:        CONFIGURE_COMMAND="'./configure' '--prefix=/usr/local' 
'--enable-shared' '--with-user=amanda' '--with-group=backup' 
'--with-config=DS1' '--with-includes=/opt/sfw/include' 
'--with-libraries=/opt/sfw/lib' '--with-amandahosts' 
'--with-tape-device=/dev/rmt/0ln' '--with-changer-device=/dev/rmt/1ln' 
'--with-readline' '--with-mmap' '--with-gnutar=/usr/local/libexec/amgtar' 
'--with-pid-debug-files' '--without-termcap' '--without-ltermcap' 
'--without-LIBTERMCAP' '--with-ncurses=yes' '--with-lncurses=yes' 
'--with-LIBNCURSES=yes'"
amandad: paths: bindir="/usr/local/bin" sbindir="/usr/local/sbin"
amandad:        libexecdir="/usr/local/libexec" mandir="/usr/local/man"
amandad:        AMANDA_TMPDIR="/tmp/amanda" AMANDA_DBGDIR="/tmp/amanda"
amandad:        CONFIG_DIR="/usr/local/etc/amanda" DEV_PREFIX="/dev/dsk/"
amandad:        RDEV_PREFIX="/dev/rdsk/" DUMP="/usr/sbin/ufsdump"
amandad:        RESTORE="/usr/sbin/ufsrestore"
amandad:        SAMBA_CLIENT="/usr/sfw/bin/smbclient"
amandad:        GNUTAR="/usr/local/libexec/amgtar"
amandad:        COMPRESS_PATH="/usr/bin/gzip"
amandad:        UNCOMPRESS_PATH="/usr/bin/gzip" MAILER="/usr/ucb/Mail"
amandad:        listed_incr_dir="/usr/local/var/amanda/gnutar-lists"
amandad: defs:  DEFAULT_SERVER="butch" DEFAULT_CONFIG="DS1"
amandad:        DEFAULT_TAPE_SERVER="butch"
amandad:        DEFAULT_TAPE_DEVICE="/dev/rmt/0ln" HAVE_MMAP
amandad:        LOCKING=POSIX_FCNTL SETPGRP_VOID DEBUG_CODE
amandad:        AMANDA_DEBUG_DAYS=4 BSD_SECURITY USE_AMANDAHOSTS
amandad:        CLIENT_LOGIN="amanda" FORCE_USERID HAVE_GZIP
amandad:        COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
amandad:        COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
amandad: time 0.000: got packet:
--------
Amanda 2.5 REQ HANDLE 000-00000011 SEQ 1181629152
SECURITY USER amandabackup
SERVICE sendbackup
OPTIONS features=fffffeff9ffeffff07;hostname=butch;
GNUTAR HOME /home 4 2007:6:4:7:9:20 OPTIONS 
|;auth=BSD;compress-fast;index;exclude-list=/usr/local/lib/amanda/exclude.gtar;exclude-optional;
--------

amandad: time 0.000: sending ack:
----
Amanda 2.4 ACK HANDLE 000-00000011 SEQ 1181629152
----

amandad: time 0.001: bsd security: remote host bigcow.jgcomp.com user 
amandabackup local user amanda
amandad: time 0.002: amandahosts security check passed
amandad: time 0.002: running service "/usr/local/libexec/sendbackup"
amandad: time 0.019: sending REP packet:
----
Amanda 2.4 REP HANDLE 000-00000011 SEQ 1181629152
CONNECT DATA 35365 MESG 35366 INDEX 35367
OPTIONS features=fffffeff9ffe0f;
----

amandad: time 0.019: got packet:
----
Amanda 2.5 ACK HANDLE 000-00000011 SEQ 1181629152
----

amandad: time 0.019: pid 7008 finish time Sat Jun  9 06:33:22 2007



=== sendbackup.20070609053322.debug ===

sendbackup: debug 1 pid 6593 ruid 27 euid 27: start at Sat Jun  9 05:33:22 2007
/usr/local/libexec/sendbackup: version 2.4.4-20030529
  parsed request as: program `GNUTAR'
                     disk `OPT'
                     device `/opt'
                     level 2
                     since 2007:6:4:7:5:45
                     options 
`|;auth=BSD;compress-fast;index;exclude-list=/usr/local/lib/amanda/exclude.gtar;exclude-optional;'
sendbackup: try_socksize: send buffer size is 65536
sendbackup: time 0.005: stream_server: waiting for connection: 0.0.0.0.35282
sendbackup: time 0.005: stream_server: waiting for connection: 0.0.0.0.35283
sendbackup: time 0.006: stream_server: waiting for connection: 0.0.0.0.35284
sendbackup: time 0.006: waiting for connect on 35282, then 35283, then 35284
sendbackup: time 0.008: stream_accept: connection from 205.166.177.140.49101
sendbackup: time 0.008: stream_accept: connection from 205.166.177.140.35098
sendbackup: time 0.009: stream_accept: connection from 205.166.177.140.54287
sendbackup: time 0.009: got all connections
sendbackup: time 0.009: spawning /usr/bin/gzip in pipeline
sendbackup: argument list: /usr/bin/gzip --fast
sendbackup-gnutar: time 0.013: pid 6595: /usr/bin/gzip --fast
sendbackup-gnutar: time 0.255: doing level 2 dump as listed-incremental from 
/usr/local/var/amanda/gnutar-lists/butchOPT_1 to 
/usr/local/var/amanda/gnutar-lists/butchOPT_2.new


jl
-- 
Jon H. LaBadie                  jon AT jgcomp DOT com
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)

<Prev in Thread] Current Thread [Next in Thread>