Amanda-Users

Re: backup run stalled- how to restart?

2004-11-19 14:48:12
From: Eric Sproul <esproul AT ntelos DOT net>
To: Gavin Henry <ghenry AT suretecsystems DOT com>
Date: Fri, 19 Nov 2004 14:38:57 -0500
On Fri, 2004-11-19 at 11:37, Gavin Henry wrote:
> Alexander Jolk said:
> > Eric Sproul wrote:
> >> the dumps seemed to have picked up
> >> again on their own.
> >
> > Data timeout from the hung machine, which aborted the dumper and freed
> > it up for other tasks?  dtimeout is half an hour at my place...
> 
> Of course. Eric, if this happens again, experiment with this value.

Agreed.  This was an unusual, catastrophic event on the core router
(hardware bug with no workaround) that is being addressed by our routing
guys.  IP connectivity was severely impaired for several hours this
morning, and I noticed on the final report that the stalled partition's
dumper (FreeBSD-4, UFS dump) did in fact give up because it was unable
to send:

|   DUMP: Date of this level 1 dump: Fri Nov 19 08:12:40 2004
|   DUMP: Date of last level 0 dump: Thu Nov 18 04:38:24 2004
|   DUMP: Dumping /dev/da1s1f (/var/netlog) to standard output
|   DUMP: mapping (Pass I) [regular files]
|   DUMP: mapping (Pass II) [directories]
|   DUMP: estimated 5752130 tape blocks.
|   DUMP: dumping (Pass III) [directories]
|   DUMP: dumping (Pass IV) [regular files]
|   DUMP: 3.35% done, finished in 2:24
|   DUMP: 6.29% done, finished in 2:29
|   DUMP: 8.71% done, finished in 2:37
|   DUMP: 10.77% done, finished in 2:45
? sendbackup: index tee cannot write [Broken pipe]
|   DUMP:   DUMP: 12.73% done, finished in 3:45
? Broken pipe
|   DUMP: The ENTIRE dump is aborted.
sendbackup: error [/sbin/dump returned 3, index returned 1]
\--------
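
For anyone who wants to pull the same information out of a report
automatically, here is a minimal sketch in Python. It assumes, as in the
excerpt above, that lines starting with "? " are the ones flagged as
unexpected and that the final status line starts with "sendbackup: error".
The function name summarize and the embedded REPORT string are
illustrative only and are not part of Amanda.

```python
import re

# A few lines copied from the report excerpt above; a real report
# would be read from a file or from the mail body instead.
REPORT = """\
|   DUMP: 10.77% done, finished in 2:45
? sendbackup: index tee cannot write [Broken pipe]
|   DUMP:   DUMP: 12.73% done, finished in 3:45
? Broken pipe
|   DUMP: The ENTIRE dump is aborted.
sendbackup: error [/sbin/dump returned 3, index returned 1]
"""


def summarize(report):
    """Return (flagged lines, last percent done, final error line or None)."""
    flagged = []
    last_pct = None
    error = None
    for line in report.splitlines():
        if line.startswith("? "):
            # Assumption: "? " marks a line flagged as unexpected.
            flagged.append(line[2:])
        elif line.startswith("sendbackup: error"):
            error = line
        # Track the most recent progress figure reported by dump.
        matches = re.findall(r"(\d+(?:\.\d+)?)% done", line)
        if matches:
            last_pct = float(matches[-1])
    return flagged, last_pct, error


flagged, last_pct, error = summarize(REPORT)
print("flagged lines:", flagged)
print("last progress:", last_pct)
print("final error:", error)
```

On the excerpt this reports the two broken-pipe lines, the last progress
figure before the abort, and the sendbackup error line.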

I have otherwise had no problems with data timeouts, so I am going to
leave dtimeout alone.

Thanks again,
Eric

