Amanda-Users

Re: GNUTAR hanging

2007-03-29 10:39:21
Subject: Re: GNUTAR hanging
From: Chris Cameron <chris.cameron AT netthruput DOT com>
To: amanda-users AT amanda DOT org
Date: Thu, 29 Mar 2007 08:28:19 -0600
sendbackup:

sendbackup: debug 1 pid 28355 ruid 2 euid 2: start at Thu Mar 29 07:15:09 2007
sendbackup: version 2.5.1p3
Could not open conf file "/etc/amanda/amanda-client.conf": No such file or directory Could not open conf file "/etc/amanda/test/amanda-client.conf": No such file or directory sendbackup: debug 1 pid 28355 ruid 2 euid 2: rename at Thu Mar 29 07:15:09 2007 sendbackup req: <GNUTAR /dev/raid0g 0 1970:1:1:0:0:0 OPTIONS |;auth=BSD;compress-fast;encrypt-serv-cust=/opt/amanda/sbin/amcrypt-ossl;server-decrypt-option=-d;index;>
  parsed request as: program `GNUTAR'
                     disk `/dev/raid0g'
                     device `/dev/raid0g'
                     level 0
                     since 1970:1:1:0:0:0
options `|;auth=BSD;compress-fast;encrypt-serv-cust=/opt/amanda/sbin/amcrypt-ossl;server-decrypt-option=-d;index;'
sendbackup: start: carp0:/dev/raid0g lev 0
sendbackup: time 0.041: spawning /usr/bin/gzip in pipeline
sendbackup: argument list: /usr/bin/gzip --fast
sendbackup-gnutar: time 0.046: pid 31136: /usr/bin/gzip --fast
sendbackup-gnutar: time 0.064: doing level 0 dump as listed-incremental to '/var/amanda/gnutar-lists/carp0_dev_raid0g_0.new' sendbackup-gnutar: time 0.069: doing level 0 dump from date: 1970-01-01 0:00:00 GMT
sendbackup: time 0.072: spawning /usr/local/libexec/runtar in pipeline
sendbackup: argument list: runtar test gtar --create --file - --directory /home --one-file-system --listed-incremental /var/amanda/gnutar-lists/carp0_dev_raid0g_0.new --sparse --ignore-failed-read --totals .
sendbackup-gnutar: time 0.129: /usr/local/libexec/runtar: pid 14139
sendbackup: time 0.130: started backup
sendbackup: time 0.139: started index creator: "/usr/local/bin/gtar -tf - 2>/dev/null | sed -e 's/^\.//'"


sendbackup:
-auth=bsd: debug 1 pid 9120 ruid 2 euid 2: start at Thu Mar 29 08:21:53 2007
security_getdriver(name=BSD) returns 0x42af4038
-auth=bsd: version 2.5.1p3
-auth=bsd: build: VERSION="Amanda-2.5.1p3"
-auth=bsd:        BUILT_DATE="Mon Mar 26 08:10:30 MDT 2007"
-auth=bsd: BUILT_MACH="OpenBSD carp1.netthruput.com 4.0 CARP#0 sparc64"
-auth=bsd:        CC="gcc"
-auth=bsd: CONFIGURE_COMMAND="'./configure' '--prefix=/usr/local' '--sysconfdir=/etc' '--localstatedir=/var' '--without-server' '--with-user=operator' '--with-group=operator' '--with-gnutar=/usr/local/bin/gtar'"
-auth=bsd: paths: bindir="/usr/local/bin" sbindir="/usr/local/sbin"
-auth=bsd:        libexecdir="/usr/local/libexec" mandir="/usr/local/man"
-auth=bsd:        AMANDA_TMPDIR="/tmp/amanda" AMANDA_DBGDIR="/tmp/amanda"
-auth=bsd:        CONFIG_DIR="/etc/amanda" DEV_PREFIX="/dev/"
-auth=bsd:        RDEV_PREFIX="/dev/r" DUMP="/sbin/dump"
-auth=bsd:        RESTORE="/sbin/restore" VDUMP=UNDEF VRESTORE=UNDEF
-auth=bsd: XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF VXRESTORE=UNDEF
-auth=bsd:        SAMBA_CLIENT=UNDEF GNUTAR="/usr/local/bin/gtar"
-auth=bsd:        COMPRESS_PATH="/usr/bin/gzip"
-auth=bsd:        UNCOMPRESS_PATH="/usr/bin/gzip" LPRCMD="/usr/bin/lpr"
-auth=bsd:        MAILER="/usr/bin/Mail"
-auth=bsd:        listed_incr_dir="/var/amanda/gnutar-lists"
-auth=bsd: defs:  DEFAULT_SERVER="carp1.netthruput.com"
-auth=bsd:        DEFAULT_CONFIG="DailySet1"
-auth=bsd:        DEFAULT_TAPE_SERVER="carp1.netthruput.com" HAVE_MMAP
-auth=bsd:        HAVE_SYSVSHM LOCKING=POSIX_FCNTL DEBUG_CODE
-auth=bsd: AMANDA_DEBUG_DAYS=4 BSD_SECURITY RSH_SECURITY USE_AMANDAHOSTS
-auth=bsd:        CLIENT_LOGIN="operator" FORCE_USERID HAVE_GZIP
-auth=bsd:        COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
-auth=bsd:        COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
-auth=bsd: time 0.001: dgram_recv(dgram=0x42ef8960, timeout=0, fromaddr=0x42f08958) -auth=bsd: time 0.001: (sockaddr_in *)0x42f08958 = { 2, 906, 192.168.120.50 }
security_handleinit(handle=0x45a96000, driver=0x42af4038 (BSD))
-auth=bsd: time 0.006: accept recv REQ pkt:
<<<<<
SERVICE noop
OPTIONS features=fffffeff9ffeffffff7f;
>>>>>
-auth=bsd: time 0.007: creating new service: noop
OPTIONS features=fffffeff9ffeffffff7f;

-auth=bsd: time 0.012: sending ACK pkt:
<<<<<
>>>>>
-auth=bsd: dgram_send_addr(addr=0xfffffffffffee1b0, dgram=0x42ef8960)
-auth=bsd: time 0.013: (sockaddr_in *)0xfffffffffffee1b0 = { 2, 906, 192.168.120.50 }
-auth=bsd: dgram_send_addr: 0x42ef8960->socket = 0
-auth=bsd: time 0.056: sending REP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffffff7f;
>>>>>
-auth=bsd: dgram_send_addr(addr=0xfffffffffffee340, dgram=0x42ef8960)
-auth=bsd: time 0.056: (sockaddr_in *)0xfffffffffffee340 = { 2, 906, 192.168.120.50 }
-auth=bsd: dgram_send_addr: 0x42ef8960->socket = 0
-auth=bsd: time 0.057: dgram_recv(dgram=0x42ef8960, timeout=0, fromaddr=0x42f08958) -auth=bsd: time 0.057: (sockaddr_in *)0x42f08958 = { 2, 906, 192.168.120.50 }
-auth=bsd: time 0.057: received ACK pkt:
<<<<<
>>>>>
security_close(handle=0x45a96000, driver=0x42af4038 (BSD))
-auth=bsd: time 0.062: dgram_recv(dgram=0x42ef8960, timeout=0, fromaddr=0x42f08958) -auth=bsd: time 0.062: (sockaddr_in *)0x42f08958 = { 2, 906, 192.168.120.50 }
security_handleinit(handle=0x47c62000, driver=0x42af4038 (BSD))
-auth=bsd: time 0.064: accept recv REQ pkt:
<<<<<
SERVICE sendsize
OPTIONS features=fffffeff9ffeffffff7f;maxdumps=1;hostname=carp1;config=test;
GNUTAR /dev/raid0g 0 1970:1:1:0:0:0 -1 OPTIONS |;auth=BSD;compress-fast;encrypt-serv-cust=/opt/amanda/sbin/amcrypt-ossl;server-decrypt-option=-d;index;
>>>>>
-auth=bsd: time 0.064: creating new service: sendsize
OPTIONS features=fffffeff9ffeffffff7f;maxdumps=1;hostname=carp1;config=test;
GNUTAR /dev/raid0g 0 1970:1:1:0:0:0 -1 OPTIONS |;auth=BSD;compress-fast;encrypt-serv-cust=/opt/amanda/sbin/amcrypt-ossl;server-decrypt-option=-d;index;

-auth=bsd: time 0.071: sending ACK pkt:
<<<<<
>>>>>
-auth=bsd: dgram_send_addr(addr=0xfffffffffffee1b0, dgram=0x42ef8960)
-auth=bsd: time 0.072: (sockaddr_in *)0xfffffffffffee1b0 = { 2, 906, 192.168.120.50 }
-auth=bsd: dgram_send_addr: 0x42ef8960->socket = 0
-auth=bsd: time 0.136: sending PREP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffffff7f;
>>>>>
-auth=bsd: dgram_send_addr(addr=0xfffffffffffee320, dgram=0x42ef8960)
-auth=bsd: time 0.137: (sockaddr_in *)0xfffffffffffee320 = { 2, 906, 192.168.120.50 }
-auth=bsd: dgram_send_addr: 0x42ef8960->socket = 0
-auth=bsd: time 0.169: sending REP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffffff7f;
>>>>>
-auth=bsd: dgram_send_addr(addr=0xfffffffffffee340, dgram=0x42ef8960)
-auth=bsd: time 0.169: (sockaddr_in *)0xfffffffffffee340 = { 2, 906, 192.168.120.50 }
-auth=bsd: dgram_send_addr: 0x42ef8960->socket = 0
-auth=bsd: time 0.169: dgram_recv(dgram=0x42ef8960, timeout=0, fromaddr=0x42f08958) -auth=bsd: time 0.169: (sockaddr_in *)0x42f08958 = { 2, 906, 192.168.120.50 }
-auth=bsd: time 0.169: received ACK pkt:
<<<<<
>>>>>
security_close(handle=0x47c62000, driver=0x42af4038 (BSD))
-auth=bsd: time 30.184: pid 9120 finish time Thu Mar 29 08:22:24 2007



Thanks,
Chris


Jean-Louis Martineau wrote:
The 'Illegel seek' is not a problem.
The sendsize is working correctly, the problem is with sendbackup

Could you post the sendbackup and amandad debug files?

Jean-Louis

Chris Cameron wrote:
I have Amanda 2.5.1p3 running on Solaris 8 for the server, and the client is the same version running on OpenBSD 4.0.

Using GNUTAR, the dump times out. The error on the client machine is:

sendsize[3860]: time 0.058: waiting for any estimate child: 1 running
sendsize[28668]: time 0.061: calculating for amname /dev/raid0g, dirname /home, spindle -1 sendsize[28668]: time 0.063: getting size via gnutar for /dev/raid0g level 0 sendsize[28668]: time 0.094: spawning /usr/local/libexec/runtar in pipeline sendsize[28668]: argument list: runtar test /usr/local/bin/gtar --create --file /dev/null --directory /home --one-file-system --listed-incremental /var/amanda/gnutar-lists/carp0_dev_raid0g_0.new --sparse --ignore-failed-read --totals . sendsize[28668]: time 0.337: Total bytes written: 15718400 (15MiB, 124MiB/s)
sendsize[28668]: time 0.342: .....
sendsize[28668]: estimate time for /dev/raid0g level 0: 0.248
sendsize[28668]: estimate size for /dev/raid0g level 0: 15350 KB
sendsize[28668]: time 0.342: waiting for runtar "/dev/raid0g" child
sendsize[28668]: time 0.347: after runtar /dev/raid0g wait
gnutar_calc_estimates: warning - seek failed: Illegal seek
sendsize[28668]: time 0.365: done with amname /dev/raid0g dirname /home spindle -1
sendsize[3860]: time 0.370: child 28668 terminated normally



The tar part seems to work, as I can also run it just at a command line without error. However the message:

warning - seek failed: Illegal seek

Doesn't seem good.


In the report email:

  carp0  /dev/raid0g     lev 0  FAILED [data timeout]
carp0 /dev/raid0g lev 0 FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"] carp0 /dev/raid0g lev 0 FAILED [cannot read header: got 0 instead of 32768]


When amdump gives up, it leaves the following processes on the client machine:

operator 30460  0.0  0.2   968  2200 ??  Is     9:09AM    0:00.13 amandad
operator 10245 0.0 0.2 816 2240 ?? I 9:10AM 0:00.08 /usr/local/libexec/sendbackup amandad bsd operator 15664 0.0 0.1 824 1528 ?? I 9:10AM 0:00.04 /usr/bin/gzip --fast operator 31389 0.0 0.1 816 1392 ?? I 9:10AM 0:00.02 /usr/local/libexec/sendbackup amandad bsd operator 22560 0.0 0.1 768 576 ?? I 9:10AM 0:00.01 sh -c sh operator 28437 0.0 0.1 432 1192 ?? I 9:10AM 0:00.05 sed -e s/^\\.// operator 18449 0.0 0.1 792 648 ?? I 9:10AM 0:00.03 /usr/local/bin/gtar -tf -


Watching tcpdump I don't see any obvious retrying going on. I have a whack of other Solaris 8 machines that don't have any problem with tar (1.16)


Anyone know what I've done wrong here?


Thanks,
Chris



<Prev in Thread] Current Thread [Next in Thread>