Amanda-Users

RE: odd dump timeout symptoms

2005-08-17 10:20:52
Subject: RE: odd dump timeout symptoms
From: "Scott R. Burns" <Scott.Burns AT Netcontech DOT Com>
To: "Jamie Wilkinson" <jaq AT spacepants DOT org>, <amanda-users AT amanda DOT org>
Date: Tue, 16 Aug 2005 22:50:56 -0400
Can the version of GNU tar you are using handle single archives of this size?
There were some older versions that used signed long internals that
overflowed on me in the past and caused problems.

Have you tried to tar/gzip it directly without amanda to see if that works?
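
Something along these lines would do as a direct test. This is only a
sketch: it assumes GNU tar is reachable as "tar" (set TAR=gtar if that is
how it is installed on the client), it reuses the options shown in the
sendbackup argument list minus the Amanda-specific incremental and exclude
files, and the function name tar_gzip_test is made up for illustration.
The archive is streamed through gzip and only counted, so nothing is
written to disk:

```sh
# Hypothetical helper: archive a directory with GNU tar, compress it with
# gzip --best, and print the compressed size in bytes, bypassing Amanda.
# Errors from tar or gzip are not trapped; they show up on stderr.
tar_gzip_test() {
    # Assumption: ${TAR:-tar} is GNU tar (the options below are GNU tar options).
    "${TAR:-tar}" --create --file - --directory "$1" \
        --one-file-system --sparse --ignore-failed-read . \
      | gzip --best \
      | wc -c
}
```

Running it as "time tar_gzip_test /data/home" on the fileserver shows how
long tar and gzip need on their own, with no network or Amanda timeouts
involved.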

Scott R. Burns
NETCON Technologies Inc.
Suite 135 - 4474 Blakie Road
London, Ontario, Canada
N6L 1G7
Voice: +1.519.652.0401
Fax: +1.519.652.9275


-----Original Message-----
From: owner-amanda-users AT amanda DOT org
[mailto:owner-amanda-users AT amanda DOT org]On Behalf Of Jamie Wilkinson
Sent: Tuesday, August 16, 2005 10:36 PM
To: amanda-users AT amanda DOT org
Subject: odd dump timeout symptoms


I have a very large DLE, approaching 100GB, on my fileserver.  The backup
server is running 2.4.5, and the fileserver is running 2.4.5b1.

The dump on this DLE is returning the following error:

  bulkhead.b /data/home lev 0 FAILED [data read: Connection reset by peer]

in the summary, which looks like this in the sendbackup log on this client:

sendbackup: time 10.050: spawning /usr/lib/amanda/runtar in pipeline
sendbackup: argument list: gtar --create --file - --directory
/home --one-file-system --listed-incremental
/var/lib/amanda/gnutar-lists/bulkhead.backup_data_home_0.new --sparse
--ignore-failed-read --totals --exclude-from
/var/log/amanda/sendbackup._data_home.20050817013729.exclude .
sendbackup-gnutar: time 10.051: /usr/lib/amanda/runtar: pid 6101
sendbackup: time 10.055: started index creator: "/bin/tar -tf - 2>/dev/null
| sed -e 's/^\.//'"
sendbackup: time 22182.305: index tee cannot write [Broken pipe]
sendbackup: time 22182.305: 126: strange(?):
sendbackup: time 22182.326: pid 6099 finish time Wed Aug 17 07:47:01 2005
sendbackup: time 22182.327: 126: strange(?): gzip: stdout: Connection timed
out
sendbackup: time 22182.328: 126: strange(?): sendbackup: index tee cannot
write
[Broken pipe]
sendbackup: time 22182.385: error [compress returned 1, /bin/tar got signal
13]
sendbackup: time 22182.385: pid 6096 finish time Wed Aug 17 07:47:01 2005


and in the server's amdump.1, relevant lines are:

planner: time 0.082: setting up estimates for bulkhead.backup:/data/home
bulkhead.backup:/data/home overdue 21 days for level 0
setup_estimate: bulkhead.backup:/data/home: command 0, options: none
last_level 1 next_level0 -21 level_days 1    getting estimates 0 (-2) 1 (-2)
2 (-2)
planner time 2.638: got result for host bulkhead.backup disk /data/home:
0 -> 88513630K, 1 -> 2750443K, 2 -> 3325880K
  0: bulkhead.backup /data/home
pondering bulkhead.backup:/data/home... next_level0 -21 last_level 1 (due
for level 0) (picking inclevel for degraded mode)   pick: size 2750443 level
1 days 1 (thresh 20480K, 1 days)
  bulkhead.backup /data/home pri 23 lev 0 size 57655638
DUMP bulkhead.backup fffffeff9ffe0f /data/home 20050817 23 0 1970:1:1:0:0:0
57655638 32360 1 2005:7:19:15:50:32 850574 9129
driver: send-cmd time 2237.343 to dumper3: FILE-DUMP 03-00004
/data/amanda/anchor/20050817010002/bulkhead.backup._data_home.0
bulkhead.backup fffffeff9ffe0f /data/home NODEVICE 0 1970:1:1:0:0:0 1048576
GNUTAR 57657440
|;bsd-auth;compress-best;index;exclude-list=.amandaexclude;exclude-optional;


My dtimeout was set to 7200 at the start of this, and over the last few
nights I have set it to 14400 and then 21600.  These changes have had no
effect on the actual timeout; the times reported at the 'index tee' failure
vary between roughly 6000, 22000, and 37000 seconds regardless of the
dtimeout.  I've now set it back to 7200, which had been working for this DLE
about a month ago (you can see it's been 21 days since the DLE was correctly
backed up).
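
For anyone wanting to make the same comparison on their own logs, here is
a rough sketch. It assumes the sendbackup debug log has the line format
quoted above; the function names and the log path in the usage line are
placeholders of mine, not anything Amanda provides:

```sh
# Hypothetical helper: print the time (seconds since sendbackup started) of
# the first top-level "index tee cannot write" line in a sendbackup log.
index_tee_failure_time() {
    sed -n 's/^sendbackup: time \([0-9.]*\): index tee cannot write.*/\1/p' "$1" \
      | head -n 1
}

# Hypothetical helper: show the failure time next to a given dtimeout value.
report_failure_vs_dtimeout() {
    t=$(index_tee_failure_time "$1")
    if [ -z "$t" ]; then
        echo "no index tee failure found in $1" >&2
        return 1
    fi
    awk -v t="$t" -v d="$2" 'BEGIN {
        printf "failure at %.0f s, dtimeout %d s, difference %.0f s\n", t, d, t - d
    }'
}
```

Usage would be something like
"report_failure_vs_dtimeout /path/to/sendbackup.debug 7200", run once per
night's log with that night's dtimeout.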

So I'm stumped as to what to try next.  Can anyone think of anything I might
have missed, or hand out a clue?

