Amanda-Users

Re: post-upgrade, multiple errors

2007-03-31 00:57:25
Subject: Re: post-upgrade, multiple errors
From: Charles Sprickman <spork AT bway DOT net>
To: Gene Heskett <gene.heskett AT verizon DOT net>
Date: Fri, 30 Mar 2007 23:48:36 -0400 (EDT)
On Fri, 30 Mar 2007, Gene Heskett wrote:

On Friday 30 March 2007, Charles Sprickman wrote:
Hello all,

I recently upgraded from 2.4.3 to 2.5.1p3 and things have mostly been
working correctly except for a few rough edges.  To complicate matters,
we also upgraded from an AIT drive to an LTO drive.  We're currently
using 100GB tapes, which comfortably fit an entire level 0 dump (about
60GB).

We run bsdtcp auth on all hosts (the main thing driving the update, some
remote sites had mtu issues that would cause big estimates to get
dropped).

We have about 20GB of holding disk space available.

We use a mix of gtar (1.16.1) and dump, all on FreeBSD 4.11.

Going through the logs it looks like I've got a number of problems.

First is the tape size issue:

[devel2]/var/log/amanda # grep FAIL log.20070329.0
FAIL planner b02.foo.com /var/qmail 20070329 0 [dump larger than
available tape space, 2147483647 KB, but cannot incremental dump new
disk]
FAIL planner b02.foo.com /var/db/pkg 20070329 0 [dump larger than
available tape space, 2147483647 KB, but cannot incremental dump new
disk]

Hummm, why do I seem to detect the odor of a 2GB file size limit in your
BSD filesystems?  I'd surely think this has been fixed, but I believe
that number is 2GB-1 in decimal notation.

Nope, no such limitation. Plus the sizes of those two disks are only a few MB, not GB (see next paragraph). It seems like somewhere amanda is getting very confused, or some number is wrapping.
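
The number in those FAIL lines can be checked quickly. The following is a minimal sketch in Python (not part of Amanda; the variable names are illustrative). It shows that 2147483647 is the largest value a signed 32-bit integer can hold, and that, read as KB, it would correspond to about 2 TiB. That fits the idea of a wrapped or placeholder estimate better than a real size for directories of a few MB, though it does not by itself say where the value comes from.

```python
# Value reported by the planner in the FAIL lines, in KB.
reported_kb = 2147483647

# Largest value a signed 32-bit integer can hold.
int32_max = 2**31 - 1

print(reported_kb == int32_max)

# The same figure interpreted as a size: KB -> bytes -> TiB.
tib = reported_kb * 1024 / 2**40
print(f"{tib:.2f} TiB")
```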

This seems odd since not only do we have more than enough tape space,
but those are very, very small tar backups:

[b02]/var/db/pkg # du -hs
3.8M    .
[b02]/var/qmail # du -hs
1.5M    .

Then we have issues that look like permissions issues, but I can't
reproduce them by hand:

FAIL dumper b02.foo.com / 20070329 0 [err create
/var/db/amanda/index/b02.foo.com/_/20070329_0.gz.tmp: Operation not
permitted]

SELinux?  Or are you perchance running amdump as root?

Nope, all clients and the server are FreeBSD 4.11. Amanda runs as operator. As I show below, the operator user is able to create the file with no problems.

[devel2]/var/log/amanda # su -m operator -c "touch
/var/db/amanda/index/b02.foo.com/_/20070329_0.gz.tmp"
[devel2]/var/log/amanda # ls -l
/var/db/amanda/index/b02.foo.com/_/20070329_0.gz.tmp
-rw-r--r--  1 operator  wheel  0 Mar 30 20:24
/var/db/amanda/index/b02.foo.com/_/20070329_0.gz.tmp

And then I assume this error is related:

FAIL chunker b02.foo.com / 20070329 0 [cannot read header: got 0
instead of 32768]

And on the client side:

sendbackup: time 1.226:  87:  normal(|):   DUMP: dumping (Pass III)
[directories]
sendbackup: time 1.387:  87:  normal(|):   DUMP: dumping (Pass IV)
[regular files]
sendbackup: time 6426.054: index tee cannot write [Broken pipe]
sendbackup: time 6426.055: pid 64720 finish time Thu Mar 29 05:19:43
2007

And also some chunker errors that have no corresponding dumper errors:

FAIL chunker h10.foo.com /spool 20070329 0 [cannot read header: got 0
instead of 32768]

On the client:

sendbackup: time 5543.622:  87:  normal(|):   DUMP: 3.63% done, finished
in 37:33
sendbackup: time 5843.560:  87:  normal(|):   DUMP: 3.86% done, finished
in 37:19
sendbackup: time 6143.781:  87:  normal(|):   DUMP: 4.09% done, finished
in 37:07
sendbackup: time 6358.556: index tee cannot write [Broken pipe]
sendbackup: time 6358.557: pid 67054 finish time Thu Mar 29 05:20:00
2007

And then at the end of the log I have a number of warnings:

WARNING driver chunker4 pid 18178 exited with signal 1
WARNING driver dumper4 pid 18032 exited with signal 1
WARNING driver dumper3 pid 18031 exited with signal 1
WARNING driver dumper2 pid 18030 exited with signal 1
WARNING driver chunker2 pid 18852 exited with signal 1
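
For reference, "signal 1" in these warnings can be mapped to a name. The following is a small sketch, assuming a POSIX system such as FreeBSD, where signal number 1 is SIGHUP. SIGHUP is typically delivered when a controlling process or terminal goes away; the log alone does not establish that this is what happened here.

```python
import signal

# On POSIX systems (FreeBSD included), signal number 1 is SIGHUP.
sig = signal.Signals(1)
print(sig.name)
```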

Can anyone help me kind of focus on a few root issues here?  I'm having
a really hard time chasing down all these different errors at once.
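
One way to narrow things down is to group the FAIL lines by program and message, so that repeated symptoms collapse into a few distinct ones. The following is a rough sketch in Python, not an Amanda tool. The line layout in the regular expression is inferred from the FAIL lines quoted above and assumes each entry sits on a single line in the real log (they are wrapped here only by the mail client). The function name and the sample list are hypothetical.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the FAIL lines quoted in this message:
# FAIL <program> <host> <disk> <date> <level> [<message>]
FAIL_RE = re.compile(r"^FAIL (\S+) (\S+) (\S+) (\d{8}) (\d+) \[(.*)\]$")

def group_failures(lines):
    """Group (host, disk) pairs by (program, message)."""
    groups = defaultdict(list)
    for line in lines:
        m = FAIL_RE.match(line.strip())
        if not m:
            continue  # not a FAIL line, or a layout this sketch does not know
        program, host, disk, _date, _level, message = m.groups()
        groups[(program, message)].append((host, disk))
    return dict(groups)

# Sample input built from the log excerpts in this message.
sample = [
    "FAIL planner b02.foo.com /var/qmail 20070329 0 [dump larger than "
    "available tape space, 2147483647 KB, but cannot incremental dump new disk]",
    "FAIL planner b02.foo.com /var/db/pkg 20070329 0 [dump larger than "
    "available tape space, 2147483647 KB, but cannot incremental dump new disk]",
    "FAIL chunker b02.foo.com / 20070329 0 "
    "[cannot read header: got 0 instead of 32768]",
    "FAIL chunker h10.foo.com /spool 20070329 0 "
    "[cannot read header: got 0 instead of 32768]",
]

groups = group_failures(sample)
for (program, message), disks in groups.items():
    print(program, len(disks), message)
```

With the sample above, the four FAIL lines reduce to two distinct symptoms: one planner size problem and one chunker header problem.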

Thanks,

Charles

First, and it might not be related, amanda is to be built by a normal
user, preferably a user named amanda that you have added, and who has
been made a member of the group disk (or backup, there are others too)
that have enough perms to wander about the system.

I built amanda from the ports collection, and the build does run as root, but as there are thousands of other people using that same method I'm going to assume that should not cause any problems.

I'll also add that I did the update over the period of a month or so by upgrading the clients first. There were no problems during that trial; the oddities only started showing up after the server was also upgraded.

Any other ideas? I had a hard time finding any upgrade guides, but looking at the changelog I think I found all the major gotchas to look for (underscores in disk names, quoting of all amanda.conf values, changes to .amandahosts that make permissions more granular, ownership of .amandahosts).

Thanks,

Charles

Only the install is to be done as root.

Doing that will eliminate one potential list of perms problems.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
We were so poor we couldn't afford a watchdog.  If we heard a noise at
night, we'd bark ourselves.
                -- Crazy Jimmy

