Re: Backup issues with OpenBSD 4.5 machines
2009-09-01 14:07:50
Since the logs indicate an error with the "index tee", I thought I'd try turning off the index generation. The backup still failed, but with a different error:
1251826429.341455: sendbackup: pid 26726 ruid 150 euid 150 version 2.6.2alpha: start at Tue Sep 1 11:33:49 2009
1251826429.342070: sendbackup: Version 2.6.2alpha 1251826429.364640: sendbackup: pid 26726 ruid 150 euid 150 version 2.6.2alpha: rename at Tue Sep 1 11:33:49 2009 1251826429.366628: sendbackup: Parsed request as: program `DUMP'
1251826429.366665: sendbackup: disk `/' 1251826429.366690: sendbackup: device `/' 1251826429.366714: sendbackup: level 0 1251826429.366737: sendbackup: since NODATE
1251826429.366761: sendbackup: options `' 1251826429.367758: sendbackup: start: zirconium.example.com:/ lev 0 1251826429.369745: sendbackup: dumping device '/dev/rsd1a' with 'ffs'
1251826429.370181: sendbackup: Spawning "/sbin/dump dump 0usf 1048576 - /dev/rsd1a" in pipeline 1251826429.375384: sendbackup: Started backup 1251826429.395855: sendbackup: 90: normal(|): DUMP: Date of this level 0 dump: Tue Sep 1 11:33:49 2009
1251826429.460067: sendbackup: 90: normal(|): DUMP: Date of last level 0 dump: the epoch 1251826429.461111: sendbackup: 90: normal(|): DUMP: Dumping /dev/rsd1a (/) to standard output 1251826429.538792: sendbackup: 90: normal(|): DUMP: mapping (Pass I) [regular files]
1251826430.811052: sendbackup: 90: normal(|): DUMP: mapping (Pass II) [directories] 1251826430.812848: sendbackup: 90: normal(|): DUMP: estimated 47600 tape blocks. 1251826430.814960: sendbackup: 90: normal(|): DUMP: Volume 1 started at: Tue Sep 1 11:33:50 2009
1251826430.820737: sendbackup: 90: normal(|): DUMP: dumping (Pass III) [directories] 1251826430.829936: sendbackup: 90: normal(|): DUMP: End of tape detected 1251826430.833563: sendbackup: 90: normal(|): DUMP: Volume 1 completed at: Tue Sep 1 11:33:50 2009
1251826430.836947: sendbackup: 90: normal(|): DUMP: Change Volumes: Mount volume #2 1251826430.838051: sendbackup: 86: strange(?): DUMP: fopen on /dev/tty fails: Device not configured 1251826430.839081: sendbackup: 90: normal(|): DUMP: The ENTIRE dump is aborted.
1251826430.839642: sendbackup: critical (fatal): error [dump (1011) /sbin/dump returned 3]
Hope that helps.
-- Michael
On Tue, Sep 1, 2009 at 11:31 AM, Michael Burk <burkml AT gmail DOT com> wrote:
I applied the 3-line patch to the 0831 snapshot and ran a full backup on both machines, with 4 file systems each. All 8 completed successfully with no "strange" messages.
Next, I commented out the 3 new lines and tried the backup again on one of the machines. This time all 4 file systems failed; e.g.:
1251825776.552104: sendbackup: pid 24374 ruid 150 euid 150 version 2.6.2alpha: start at Tue Sep 1 11:22:56 2009 1251825776.552688: sendbackup: Version 2.6.2alpha 1251825776.575326: sendbackup: pid 24374 ruid 150 euid 150 version 2.6.2alpha: rename at Tue Sep 1 11:22:56 2009
1251825776.577315: sendbackup: Parsed request as: program `DUMP' 1251825776.577352: sendbackup: disk `/' 1251825776.577377: sendbackup: device `/' 1251825776.577400: sendbackup: level 0
1251825776.577424: sendbackup: since NODATE 1251825776.577446: sendbackup: options `' 1251825776.578210: sendbackup: start: zirconium.example.com:/ lev 0 1251825776.579426: sendbackup: dumping device '/dev/rsd1a' with 'ffs'
1251825776.582918: sendbackup: Spawning "/sbin/dump dump 0usf 1048576 - /dev/rsd1a" in pipeline 1251825776.587089: sendbackup: Started backup 1251825776.619734: sendbackup: 90: normal(|): DUMP: Date of this level 0 dump: Tue Sep 1 11:22:56 2009
1251825776.632529: sendbackup: Started index creator: "/sbin/restore -tvf - 2>&1 | sed -e ' s/^leaf[ ]*[0-9]*[ ]*\.// t /^dir[ ]/ { s/^dir[ ]*[0-9]*[ ]*\.//
s%$%/%
t } d '"
1251825776.721646: sendbackup: 90: normal(|): DUMP: Date of last level 0 dump: the epoch 1251825776.722686: sendbackup: 90: normal(|): DUMP: Dumping /dev/rsd1a (/) to standard output
1251825776.779507: sendbackup: 90: normal(|): DUMP: mapping (Pass I) [regular files] 1251825778.010107: sendbackup: 90: normal(|): DUMP: mapping (Pass II) [directories] 1251825778.012085: sendbackup: 90: normal(|): DUMP: estimated 47600 tape blocks.
1251825778.013367: sendbackup: 90: normal(|): DUMP: Volume 1 started at: Tue Sep 1 11:22:58 2009 1251825778.020808: sendbackup: critical (fatal): index tee cannot write [Resource temporarily unavailable] 1251825778.020918: sendbackup: 90: normal(|): DUMP: dumping (Pass III) [directories]
1251825778.022805: sendbackup: 115: strange(?): sendbackup: index tee cannot write [Resource temporarily unavailable] 1251825778.042344: sendbackup: 90: normal(|): DUMP: Broken pipe 1251825778.047559: sendbackup: 90: normal(|): DUMP: The ENTIRE dump is aborted.
1251825778.048234: sendbackup: critical (fatal): error [dump (21317) /sbin/dump returned 3]
So it seems reliable that those 3 lines fix the problem somehow. Anything else you want to try before I ask for help on the OpenBSD list?
Thanks, MichaelOn Tue, Sep 1, 2009 at 9:32 AM, Michael Burk <burkml AT gmail DOT com> wrote:
I checked the errata for OpenBSD 4.5, but saw nothing that looked related. I applied the patch to the 0831 snapshot and am building it now. After we find the minimal patch, as Jean-Louis said, I'll post on the OpenBSD-misc list to see if anyone has an explanation.
Thanks guys for working on this - I know it's a frustrating one.
-- MichaelOn Tue, Sep 1, 2009 at 8:33 AM, Dustin J. Mitchell <dustin AT zmanda DOT com> wrote:
On Tue, Sep 1, 2009 at 7:45 AM, Jean-Louis
> We need to find a minimal patch that fix the problem.
> Cat you try the attached patch?
This is starting to look like a kernel bug -- is there an associated
OpenBSD bug or something that we could reference in comments in the
code to explain this strange formulation? Also, we'll probably need
to do this everywhere we fork and plumb a new process -- in
pipespawn.c at least.
|
|
|