Amanda-Users

Anyone have any idea why this is happening?

2003-01-19 11:56:23
Subject: Anyone have any idea why this is happening?
From: stan <stanb AT awod DOT com>
To: "amanda; amanda users lisr" <amanda-users AT amanda DOT org>
Date: Sat, 18 Jan 2003 14:28:14 -0500
Here is the scenario. I have a cluster of machines at home that I've had
Amanda working on for years. I even had to go through the protocol
conversion. It's been working well for all these years.

Last weekend I decided to move the Amanda tape/index server from an HP-UX
10.20 9000/835 to an Athlon 1.2 GHz machine with a 400G holding disk.

There is one machine that I can't get to accept authentication yet (an NIS
problem, I think), but other than that amcheck runs fine. I _do_ have some
data size vs tape size issues, since one of the things that drove me to do
this was the addition of 2 more 40G drives that are over 50% full, but I
think I've got that under control, as I am adding more tapes to the
tapecycle and doubling dumpcycle.

However, over and above those issues, I'm seeing _a lot_ of strange
failures. Some of these are on machines that I have yet to get 2.4.3b4 to
compile on, but the example below is from a Debian GNU/Linux machine that
_is_ running 2.4.3b4:


sendbackup: debug 1 pid 8969 ruid 106 euid 106: start at Sat Jan 18 14:01:25 2003
/usr/local/amanda/libexec/sendbackup: version 2.4.3b4
  parsed request as: program `DUMP'
                     disk `hda1'
                     device `hda1'
                     level 2
                     since 2003:1:18:12:50:58
                     options `|;auth=bsd;compress-best;index;'
sendbackup: try_socksize: send buffer size is 65536
sendbackup: time 0.000: stream_server: waiting for connection: 0.0.0.0.32793
sendbackup: time 0.000: stream_server: waiting for connection: 0.0.0.0.32794
sendbackup: time 0.000: stream_server: waiting for connection: 0.0.0.0.32795
sendbackup: time 0.000: waiting for connect on 32793, then 32794, then 32795
sendbackup: time 0.002: stream_accept: connection from 205.159.77.224.2575
sendbackup: time 0.003: stream_accept: connection from 205.159.77.224.2576
sendbackup: time 0.004: stream_accept: connection from 205.159.77.224.2577
sendbackup: time 0.004: got all connections
sendbackup: time 0.004: spawning /bin/gzip in pipeline
sendbackup: argument list: /bin/gzip --best
sendbackup-dump: time 0.005: pid 8971: /bin/gzip --best
sendbackup: time 0.061: spawning /sbin/dump in pipeline
sendbackup: argument list: dump 2usf 1048576 - /dev/hda1
sendbackup: time 0.078: started index creator: "/sbin/restore -tvf - 2>&1 | sed -e '
s/^leaf[        ]*[0-9]*[       ]*\.//
t
/^dir[  ]/ {
s/^dir[         ]*[0-9]*[       ]*\.//
s%$%/%
t
}
d
'"
sendbackup: time 0.088:  91:  normal(|):   DUMP: Date of this level 2 dump: Sat Jan 18 14:01:25 2003
sendbackup: time 0.089:  91:  normal(|):   DUMP: Date of last level 1 dump: Sat Jan 18 07:50:59 2003
sendbackup: time 0.090:  91:  normal(|):   DUMP: Dumping /dev/hda1 (/) to standard output
sendbackup: time 0.091:  91:  normal(|):   DUMP: Added inode 7 to exclude list (resize inode)
sendbackup: time 0.264:  91:  normal(|):   DUMP: Label: none
sendbackup: time 0.265:  91:  normal(|):   DUMP: mapping (Pass I) [regular files]
sendbackup: time 167.711:  91:  normal(|):   DUMP: mapping (Pass II) [directories]
sendbackup: time 217.670:  91:  normal(|):   DUMP: estimated 124486 tape blocks.
sendbackup: time 217.706:  91:  normal(|):   DUMP: Volume 1 started with block 1 at: Sat Jan 18 14:05:03 2003
sendbackup: time 217.845:  91:  normal(|):   DUMP: dumping (Pass III) [directories]
sendbackup: time 218.193:  91:  normal(|):   DUMP: dumping (Pass IV) [regular files]
sendbackup: time 517.247:  91:  normal(|):   DUMP: 52.86% done at 219 kB/s, finished in 0:04
sendbackup: time 817.596:  91:  normal(|):   DUMP: 80.75% done at 167 kB/s, finished in 0:02
sendbackup: time 943.223: index tee cannot write [Broken pipe]
sendbackup: time 943.223: pid 8972 finish time Sat Jan 18 14:17:09 2003
sendbackup: time 943.829: 112:  normal(|): 
sendbackup: time 943.830: 115: strange(?): gzip: stdout: Connection reset by peer
sendbackup: time 943.831: 115: strange(?): sendbackup: index tee cannot write [Broken pipe]
sendbackup: time 943.832:  91:  normal(|):   DUMP: Broken pipe
sendbackup: time 943.833:  91:  normal(|):   DUMP: The ENTIRE dump is aborted.
sendbackup: time 944.265: error [compress returned 1, /sbin/dump returned 3]
sendbackup: time 944.265: pid 8969 finish time Sat Jan 18 14:17:10 2003
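
For anyone reading the log, the sed script passed to the index creator can be paraphrased in Python. This is only a sketch of what the filter appears to do: it assumes the bracket expressions in the script hold a space and a tab, the function name index_lines is made up for illustration, and the sample input lines are invented rather than taken from a real restore run.

```python
import re

# Assumption: the [ ... ] groups in the sed script contain a space and a tab.
LEAF_PREFIX = re.compile(r"^leaf[ \t]*[0-9]*[ \t]*\.")
DIR_LINE = re.compile(r"^dir[ \t]")
DIR_PREFIX = re.compile(r"^dir[ \t]*[0-9]*[ \t]*\.")


def index_lines(restore_output):
    """Yield index entries from 'restore -tvf' style lines (hypothetical helper)."""
    for raw in restore_output:
        line = raw.rstrip("\n")
        # s/^leaf.../ followed by t: a stripped leaf line is printed as-is.
        stripped, count = LEAF_PREFIX.subn("", line, count=1)
        if count:
            yield stripped
        # /^dir[ ]/ { ... }: strip the prefix and append a trailing slash.
        elif DIR_LINE.match(line):
            yield DIR_PREFIX.sub("", line, count=1) + "/"
        # d: every other line (headers, messages) is dropped.


if __name__ == "__main__":
    # Invented sample input, for illustration only.
    sample = ["Dump date: x", "dir\t 2\t.", "leaf\t 12\t./etc/passwd"]
    for entry in index_lines(sample):
        print(entry)
```

So the index stream is one path per line, with directories marked by a trailing slash. The "index tee cannot write [Broken pipe]" message in the log is sendbackup failing to write to that index pipeline.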

These failures don't seem localized to any one machine or filesystem. Can
anyone suggest what steps to take to debug this?
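
One possible first step, offered as a sketch rather than a known fix, is to pull the strange(?) and error lines out of each client's sendbackup debug output, so that the time of failure can be compared across machines. The function name find_failures is hypothetical, and the patterns assume the line layout seen in the log above, with each entry on a single unwrapped line.

```python
import re

# Patterns modelled on the log lines quoted in this message (an assumption,
# not a documented Amanda format).
STRANGE = re.compile(
    r"^sendbackup: time (\d+\.\d+):\s*\d+:\s*strange\(\?\):\s*(.*)$"
)
ERROR = re.compile(r"^sendbackup: time (\d+\.\d+): error \[(.*)\]$")


def find_failures(lines):
    """Return (seconds, kind, text) for each strange or error line."""
    hits = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = STRANGE.match(line)
        if match:
            hits.append((float(match.group(1)), "strange", match.group(2)))
            continue
        match = ERROR.match(line)
        if match:
            hits.append((float(match.group(1)), "error", match.group(2)))
    return hits


if __name__ == "__main__":
    log = [
        "sendbackup: time 817.596:  91:  normal(|):   DUMP: 80.75% done at 167 kB/s, finished in 0:02",
        "sendbackup: time 943.830: 115: strange(?): gzip: stdout: Connection reset by peer",
        "sendbackup: time 944.265: error [compress returned 1, /sbin/dump returned 3]",
    ]
    for seconds, kind, text in find_failures(log):
        print(f"{seconds:10.3f}  {kind:8s} {text}")
```

If the elapsed time at the first strange line turns out to be similar on every failing client, that would point toward something on the server side or the network dropping the connection rather than toward dump or gzip on any one client.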


-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
                                                -- Benjamin Franklin
