Amanda-Users

Re: post-upgrade, multiple errors

2007-03-31 08:01:20
Subject: Re: post-upgrade, multiple errors
From: Jean-Louis Martineau <martineau AT zmanda DOT com>
To: Charles Sprickman <spork AT bway DOT net>
Date: Sat, 31 Mar 2007 07:55:54 -0400
Could you post the amdump.1 log file and the curinfo/b02.foo.com/_var_db_pkg/info file?

Charles Sprickman wrote:
Hello all,

I recently upgraded from 2.4.3 to 2.5.1p3 and things have mostly been working correctly except for a few rough edges. To complicate matters, we also upgraded from an AIT drive to an LTO drive. We're currently using 100GB tapes, which comfortably fit an entire level 0 dump (about 60GB).

We run bsdtcp auth on all hosts. That was the main reason for the upgrade: some remote sites had MTU issues that caused large estimates to get dropped.

We have about 20GB of holding disk space available.

We use a mix of gtar (1.16.1) and dump, all on FreeBSD 4.11.

Going through the logs it looks like I've got a number of problems.

First is the tape size issue:

[devel2]/var/log/amanda # grep FAIL log.20070329.0
FAIL planner b02.foo.com /var/qmail 20070329 0 [dump larger than available tape space, 2147483647 KB, but cannot incremental dump new disk]
FAIL planner b02.foo.com /var/db/pkg 20070329 0 [dump larger than available tape space, 2147483647 KB, but cannot incremental dump new disk]

This seems odd since not only do we have more than enough tape space, but those are very, very small tar backups:

[b02]/var/db/pkg # du -hs
3.8M    .
[b02]/var/qmail # du -hs
1.5M    .
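
A side note on the size in the planner failure above: 2147483647 is exactly 2^31 - 1, the largest value a signed 32-bit integer can hold. That hints the figure may be a sentinel or an overflowed estimate rather than a real measurement, though this is only an observation about the number, not a confirmed diagnosis. A minimal Python sketch of the arithmetic (the variable names are illustrative, not anything from Amanda):

```python
# Size reported by the planner in the FAIL lines, in KB.
reported_kb = 2147483647

# Largest value of a signed 32-bit integer.
INT32_MAX = 2**31 - 1

# True if the reported size is exactly the 32-bit signed maximum.
is_int32_max = reported_kb == INT32_MAX

# Convert KB to TiB (1 TiB = 1024**3 KB) to show the scale of the figure.
reported_tib = reported_kb / 1024**3

print(is_int32_max)
print(f"{reported_tib:.2f} TiB")
```

For directories of a few megabytes, a value of this magnitude is far more likely to be a placeholder than an estimate.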

Then we have errors that look like permission problems, but I can't reproduce them by hand:

FAIL dumper b02.foo.com / 20070329 0 [err create /var/db/amanda/index/b02.foo.com/_/20070329_0.gz.tmp: Operation not permitted]

[devel2]/var/log/amanda # su -m operator -c "touch /var/db/amanda/index/b02.foo.com/_/20070329_0.gz.tmp"
[devel2]/var/log/amanda # ls -l /var/db/amanda/index/b02.foo.com/_/20070329_0.gz.tmp
-rw-r--r-- 1 operator wheel 0 Mar 30 20:24 /var/db/amanda/index/b02.foo.com/_/20070329_0.gz.tmp

And then I assume this error is related:

FAIL chunker b02.foo.com / 20070329 0 [cannot read header: got 0 instead of 32768]

And on the client side:

sendbackup: time 1.226: 87: normal(|): DUMP: dumping (Pass III) [directories]
sendbackup: time 1.387: 87: normal(|): DUMP: dumping (Pass IV) [regular files]
sendbackup: time 6426.054: index tee cannot write [Broken pipe]
sendbackup: time 6426.055: pid 64720 finish time Thu Mar 29 05:19:43 2007

And also some chunker errors that have no corresponding dumper errors:

FAIL chunker h10.foo.com /spool 20070329 0 [cannot read header: got 0 instead of 32768]

On the client:

sendbackup: time 5543.622: 87: normal(|): DUMP: 3.63% done, finished in 37:33
sendbackup: time 5843.560: 87: normal(|): DUMP: 3.86% done, finished in 37:19
sendbackup: time 6143.781: 87: normal(|): DUMP: 4.09% done, finished in 37:07
sendbackup: time 6358.556: index tee cannot write [Broken pipe]
sendbackup: time 6358.557: pid 67054 finish time Thu Mar 29 05:20:00 2007

And then at the end of the log I have a number of warnings:

WARNING driver chunker4 pid 18178 exited with signal 1
WARNING driver dumper4 pid 18032 exited with signal 1
WARNING driver dumper3 pid 18031 exited with signal 1
WARNING driver dumper2 pid 18030 exited with signal 1
WARNING driver chunker2 pid 18852 exited with signal 1
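
For reference, signal 1 on POSIX systems (including FreeBSD) is SIGHUP, so these processes were hung up rather than crashing on their own. What sent the signal is not visible in this excerpt. A quick way to confirm the mapping, assuming a Unix-like host with Python available:

```python
import signal

# Look up the symbolic name for signal number 1 on this platform.
# On POSIX systems this is SIGHUP (hangup).
sig_name = signal.Signals(1).name

print(sig_name)
```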

Can anyone help me kind of focus on a few root issues here? I'm having a really hard time chasing down all these different errors at once.
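
One way to narrow things down is to group the FAIL lines by host and disk, so that related failures (for example the dumper and chunker errors for the same disk) show up together. The following is a rough Python sketch, not an Amanda tool: the function name and the regular expression are illustrative, and the line layout (program, host, disk, datestamp, level, bracketed message) is assumed from the FAIL lines quoted in this message.

```python
import re
from collections import defaultdict

# Assumed layout: FAIL <program> <host> <disk> <datestamp> <level> [<message>]
FAIL_RE = re.compile(r"^FAIL (\S+) (\S+) (\S+) (\d+) (\d+) \[(.*)\]$")

def group_failures(lines):
    """Group FAIL lines by (host, disk), keeping program and message."""
    groups = defaultdict(list)
    for line in lines:
        match = FAIL_RE.match(line.strip())
        if match:
            program, host, disk, _datestamp, _level, message = match.groups()
            groups[(host, disk)].append((program, message))
    return dict(groups)

# FAIL lines copied from the log excerpts in this message.
sample = [
    "FAIL planner b02.foo.com /var/qmail 20070329 0 [dump larger than available tape space, 2147483647 KB, but cannot incremental dump new disk]",
    "FAIL planner b02.foo.com /var/db/pkg 20070329 0 [dump larger than available tape space, 2147483647 KB, but cannot incremental dump new disk]",
    "FAIL dumper b02.foo.com / 20070329 0 [err create /var/db/amanda/index/b02.foo.com/_/20070329_0.gz.tmp: Operation not permitted]",
    "FAIL chunker b02.foo.com / 20070329 0 [cannot read header: got 0 instead of 32768]",
    "FAIL chunker h10.foo.com /spool 20070329 0 [cannot read header: got 0 instead of 32768]",
]

grouped = group_failures(sample)
for (host, disk), failures in grouped.items():
    programs = ", ".join(program for program, _ in failures)
    print(f"{host} {disk}: {programs}")
```

In practice the input would be the lines of the log file (here log.20070329.0) rather than a hard-coded list.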

Thanks,

Charles

