Amanda-Users

Re: Backup issues with OpenBSD 4.5 machines

2009-08-31 16:36:33
Subject: Re: Backup issues with OpenBSD 4.5 machines
From: Michael Burk <burkml AT gmail DOT com>
To: Jean-Louis Martineau <martineau AT zmanda DOT com>
Date: Mon, 31 Aug 2009 14:13:29 -0600
Here are the two sendbackup.*.debug files for the / fs.

First:

1251693102.343436: sendbackup: pid 29087 ruid 150 euid 150 version 2.6.2alpha: start at Sun Aug 30 22:31:42 2009
1251693102.345326: sendbackup: Version 2.6.2alpha
1251693102.379331: sendbackup: pid 29087 ruid 150 euid 150 version 2.6.2alpha: rename at Sun Aug 30 22:31:42 2009
1251693102.382860: sendbackup:   Parsed request as: program `DUMP'
1251693102.382940: sendbackup:                      disk `/'
1251693102.382991: sendbackup:                      device `/'
1251693102.383039: sendbackup:                      level 0
1251693102.383086: sendbackup:                      since NODATE
1251693102.383134: sendbackup:                      options `'
1251693102.388833: sendbackup: start: bromine.example.com:/ lev 0
1251693102.391475: sendbackup: dumping device '/dev/rsd0a' with 'ffs'
1251693102.398325: sendbackup: Spawning "/sbin/dump dump 0usf 1048576 - /dev/rsd0a" in pipeline
1251693102.403120: sendbackup: Started backup
1251693102.461663: sendbackup:  90:  normal(|):   DUMP: Date of this level 0 dump: Sun Aug 30 22:31:42 2009
1251693102.468669: sendbackup:  90:  normal(|):   DUMP: Date of last level 0 dump: the epoch
1251693102.487536: sendbackup: Started index creator: "/sbin/restore -tvf - 2>&1 | sed -e '
s/^leaf[        ]*[0-9]*[       ]*\.//
t
/^dir[  ]/ {
s/^dir[         ]*[0-9]*[       ]*\.//
s%$%/%
t
}
d
'"
1251693102.519939: sendbackup:  90:  normal(|):   DUMP: Dumping /dev/rsd0a (/) to standard output
1251693102.588086: sendbackup:  90:  normal(|):   DUMP: mapping (Pass I) [regular files]
1251693104.583392: sendbackup:  90:  normal(|):   DUMP: mapping (Pass II) [directories]
1251693104.585824: sendbackup:  90:  normal(|):   DUMP: estimated 35134 tape blocks.
1251693104.595314: sendbackup:  90:  normal(|):   DUMP: Volume 1 started at: Sun Aug 30 22:31:44 2009
1251693104.615444: sendbackup:  90:  normal(|):   DUMP: dumping (Pass III) [directories]
1251693104.919691: sendbackup:  90:  normal(|):   DUMP: dumping (Pass IV) [regular files]
1251693105.800795: sendbackup: critical (fatal): index tee cannot write [Resource temporarily unavailable]
1251693105.803486: sendbackup: 115: strange(?): sendbackup: index tee cannot write [Resource temporarily unavailable]
1251693105.827339: sendbackup:  90:  normal(|):   DUMP: Broken pipe
1251693105.832068: sendbackup:  90:  normal(|):   DUMP: The ENTIRE dump is aborted.
1251693105.839675: sendbackup: critical (fatal): error [dump (7257) /sbin/dump returned 3]

Second:

1251693123.840560: sendbackup: pid 32144 ruid 150 euid 150 version 2.6.2alpha: start at Sun Aug 30 22:32:03 2009
1251693123.841613: sendbackup: Version 2.6.2alpha
1251693123.892773: sendbackup: pid 32144 ruid 150 euid 150 version 2.6.2alpha: rename at Sun Aug 30 22:32:03 2009
1251693123.896360: sendbackup:   Parsed request as: program `DUMP'
1251693123.896437: sendbackup:                      disk `/'
1251693123.896487: sendbackup:                      device `/'
1251693123.896535: sendbackup:                      level 0
1251693123.896584: sendbackup:                      since NODATE
1251693123.896632: sendbackup:                      options `'
1251693123.902332: sendbackup: start: bromine.example.com:/ lev 0
1251693123.905012: sendbackup: dumping device '/dev/rsd0a' with 'ffs'
1251693123.911778: sendbackup: Spawning "/sbin/dump dump 0usf 1048576 - /dev/rsd0a" in pipeline
1251693123.916860: sendbackup: Started backup
1251693123.992261: sendbackup: Started index creator: "/sbin/restore -tvf - 2>&1 | sed -e '
s/^leaf[        ]*[0-9]*[       ]*\.//
t
/^dir[  ]/ {
s/^dir[         ]*[0-9]*[       ]*\.//
s%$%/%
t
}
d
'"
1251693123.995743: sendbackup:  90:  normal(|):   DUMP: Date of this level 0 dump: Sun Aug 30 22:32:03 2009
1251693124.009013: sendbackup:  90:  normal(|):   DUMP: Date of last level 0 dump: the epoch
1251693124.038977: sendbackup:  90:  normal(|):   DUMP: Dumping /dev/rsd0a (/) to standard output
1251693124.143454: sendbackup:  90:  normal(|):   DUMP: mapping (Pass I) [regular files]
1251693126.173916: sendbackup:  90:  normal(|):   DUMP: mapping (Pass II) [directories]
1251693126.176287: sendbackup:  90:  normal(|):   DUMP: estimated 35134 tape blocks.
1251693126.185677: sendbackup:  90:  normal(|):   DUMP: Volume 1 started at: Sun Aug 30 22:32:06 2009
1251693126.205305: sendbackup:  90:  normal(|):   DUMP: dumping (Pass III) [directories]
1251693126.505055: sendbackup:  90:  normal(|):   DUMP: dumping (Pass IV) [regular files]
1251693126.797271: sendbackup: critical (fatal): index tee cannot write [Resource temporarily unavailable]
1251693126.799929: sendbackup: 115: strange(?): sendbackup: index tee cannot write [Resource temporarily unavailable]
1251693126.821488: sendbackup:  90:  normal(|):   DUMP: Broken pipe
1251693126.825377: sendbackup:  90:  normal(|):   DUMP: The ENTIRE dump is aborted.
1251693126.833573: sendbackup: critical (fatal): error [dump (22791) /sbin/dump returned 3]

Thanks,
Michael

On Mon, Aug 31, 2009 at 12:49 PM, Jean-Louis Martineau <martineau AT zmanda DOT com> wrote:
You get the error in the index pipe instead of the data path. The backup is correct but your index is empty. That's why you get a STRANGE result instead of a failure.

Can you post a sendbackup.*.debug for a dle that failed?

Jean-Louis

Michael Burk wrote:
Hello,

I applied the patches to the 20090827 snapshot. I tried it on two OpenBSD 4.5 sparc64 systems, forcing both to do full backups. One system seemed to work on all 3 file systems, the other failed on all 4 file systems with the same errors as before (exactly like what Stan reported). I'm using bsdtcp auth.

The system that worked, however, did give a "strange" result in the report on one filesystem. Here's the corresponding "sendbackup" debug file:

1251692700.529842: sendbackup: pid 30459 ruid 150 euid 150 version 2.6.2alpha: start at Sun Aug 30 2
2:25:00 2009
1251692700.530411: sendbackup: Version 2.6.2alpha
1251692700.553135: sendbackup: pid 30459 ruid 150 euid 150 version 2.6.2alpha: rename at Sun Aug 30
22:25:00 2009
1251692700.555394: sendbackup:   Parsed request as: program `DUMP'
1251692700.555435: sendbackup:                      disk `/usr'
1251692700.555461: sendbackup:                      device `/usr'
1251692700.555484: sendbackup:                      level 0
1251692700.555507: sendbackup:                      since NODATE
1251692700.555530: sendbackup:                      options `'
1251692700.556750: sendbackup: start: zirconium.example.com:/usr lev 0
1251692700.559495: sendbackup: dumping device '/dev/rsd1d' with 'ffs'
1251692700.566094: sendbackup: Spawning "/sbin/dump dump 0usf 1048576 - /dev/rsd1d" in pipeline
1251692700.571185: sendbackup: Started backup
1251692700.572425: sendbackup: fd 3 is set with O_NONBLOCK before dup2
1251692700.597388: sendbackup: Started index creator: "/sbin/restore -tvf - 2>&1 | sed -e '
s/^leaf[        ]*[0-9]*[       ]*\.//
t
/^dir[  ]/ {
s/^dir[         ]*[0-9]*[       ]*\.//
s%$%/%
t
}
d
'"
1251692700.611526: sendbackup:  90:  normal(|):   DUMP: Date of this level 0 dump: Sun Aug 30 22:25:
00 2009
1251692700.669703: sendbackup:  90:  normal(|):   DUMP: Date of last level 0 dump: the epoch
1251692700.670724: sendbackup:  90:  normal(|):   DUMP: Dumping /dev/rsd1d (/usr) to standard output
1251692700.763600: sendbackup:  90:  normal(|):   DUMP: mapping (Pass I) [regular files]
1251692705.618345: sendbackup:  90:  normal(|):   DUMP: mapping (Pass II) [directories]
1251692705.620040: sendbackup:  90:  normal(|):   DUMP: estimated 1330085 tape blocks.
1251692705.623976: sendbackup:  90:  normal(|):   DUMP: Volume 1 started at: Sun Aug 30 22:25:05 2009
1251692705.650687: sendbackup:  90:  normal(|):   DUMP: dumping (Pass III) [directories]
1251692745.051547: sendbackup:  90:  normal(|):   DUMP: dumping (Pass IV) [regular files]
1251692747.915767: sendbackup: 115: strange(?): sed: stdout: Resource temporarily unavailable
1251692954.633083: sendbackup:  43:    size(|):   DUMP: 1437829 tape blocks
1251692954.637414: sendbackup:  90:  normal(|):   DUMP: Date of this level 0 dump: Sun Aug 30 22:25:00 2009
1251692954.641414: sendbackup:  90:  normal(|):   DUMP: Volume 1 completed at: Sun Aug 30 22:29:14 2009
1251692954.641858: sendbackup: Index pipe exited with status 1
1251692954.642719: sendbackup:  90:  normal(|):   DUMP: Volume 1 took 0:04:09
1251692954.643918: sendbackup:  90:  normal(|):   DUMP: Volume 1 transfer rate: 5774 KB/s
1251692954.645095: sendbackup:  90:  normal(|):   DUMP: Date this dump completed:  Sun Aug 30 22:29:14 2009
1251692954.646239: sendbackup:  90:  normal(|):   DUMP: Average transfer rate: 5774 KB/s
1251692954.647430: sendbackup:  90:  normal(|):   DUMP: level 0 dump on Sun Aug 30 22:25:00 2009
1251692954.653480: sendbackup:  90:  normal(|):   DUMP: DUMP IS DONE
1251692954.653841: sendbackup: Parsed backup messages
1251692954.654032: sendbackup: pid 30459 finish time Sun Aug 30 22:29:14 2009

-- Michael

On Fri, Aug 28, 2009 at 5:22 AM, stan <stanb AT panix DOT com <mailto:stanb AT panix DOT com>> wrote:

   On Thu, Aug 27, 2009 at 03:35:47PM -0600, Michael Burk wrote:
   > Cool - can you send me the patch? Or is it already in the 0826
   snapshot?
   >
   > FYI - I got 2.5.0p2 working, but could never get 2.5.1p3
   working, even with
   > the same config. No idea why.
   > I started looking at the source code last night in the 0825
   snapshot. I'll
   > discontinue that search if there's a patch.
   >
   This is interesting. I sent Jean-Louis Martineau an email, saying
   teaks for
   fixing things, and he replied that the patch he sent me just added
   debugging,
   it was not intended to fix anything.

   Bit, using this snapshot 20090813, and the patch, which I will
   send to you.
   I have it working on my 3 OpenBSD 4.5 machines that I have
   upgraded so far.
   I ran several test yesterday, and a full fledged backup run last
   night,
   and I have not seen any issues since installing this code.

   I am going to attach the patch to this email.

   I would greatly appreciate it, if you would apply this patch to
   the above
   snapshot, test, and tell me what your results are.

   I am very puzzled at the moment.

   BTW, is your failure the same as mine? That is amcheck passes, and
   most
   times you get a PARTIAL backup?

   BTW, I am using bsdtcp "auth" and client compression now, but I
   have made
   this work with this code, with bsd auth, and no compression, so I
   don't
   think that has any affect on this issue.

   Thanks for the help!

   --
   One of the main causes of the fall of the roman empire was that,
   lacking
   zero, they had no way to indicate successful termination of their C
   programs.