Amanda-Users

Re: planner: disk xxx:/, estimate of level N failed

2006-07-13 02:06:24
Subject: Re: planner: disk xxx:/, estimate of level N failed
From: Geert Uytterhoeven <geert AT linux-m68k DOT org>
To: Frank Smith <fsmith AT hoovers DOT com>
Date: Thu, 13 Jul 2006 11:56:18 +0200 (CEST)
On Wed, 12 Jul 2006, Frank Smith wrote:
> In the last few days I've started seeing a couple of messages like
> the above in the NOTES section of my daily reports.  The DLEs in
> question do get backed up, but it is a level 0 each night. Originally
> it was just one server, now it is two.

Interesting. I'm seeing the same since July 8, except that for me it's the
estimates of level 2 or 3 that fail, so Amanda falls back to level 1 or 2,
filling my (now too small) vtapes using the same (too low) level each night in
a row :-(

>    Both are running 2.5.0p2-1 Debian etch packages, one is x86, the

Same for me.

> other is arm.   A third amd64 machine is still working fine so far.
> The appearance of the errors may have coincided with package updates,
> but I'm not sure (although they have been 2.5.x for awhile).
>    Since I've been running Amanda for years, my first thought was
> just to check the /tmp/amanda/*debug files, but was surprised to
> find that those files don't exist on any of the 2.5.0 machines,
> all that is in /tmp/amanda are selfcheck*exclude, sendbackup*exclude,
> and sendsize*exclude files.

They're in /var/log/amanda

This is what I see:

| endsize[11906]: time 1.363: calculating for amname '<MY_DLE>', dirname 
'/home/p72', spindle 1
| sendsize[11906]: time 1.363: getting size via gnutar for /home/p72/orig level 0
| sendsize[11906]: time 1.365: spawning /usr/lib/amanda/runtar in pipeline
| sendsize[11906]: argument list: /bin/tar --create --file /dev/null 
--directory /home/p72 --one-file-system --numeric-owner --listed-incremental 
/var/lib/amanda/gnutar-lists/anakin_home_p72_orig_0.new --sparse 
--ignore-failed-read --totals --exclude-from 
/tmp/amanda/sendsize._home_p72_orig.20060713004503.exclude --files-from 
/tmp/amanda/sendsize._home_p72_orig.20060713004503.include
| sendsize[11806]: time 1.367: waiting for any estimate child: 1 running
| sendsize[11906]: time 1.495: Total bytes written: 1435545600 (1.4GiB, 11GiB/s)
| sendsize[11906]: time 1.496: .....
| sendsize[11906]: estimate time for /home/p72/orig level 0: 0.130
| sendsize[11906]: estimate size for /home/p72/orig level 0: 1401900 KB
| sendsize[11906]: time 1.496: waiting for /bin/tar "/home/p72/orig" child
| sendsize[11906]: time 1.496: after /bin/tar "/home/p72/orig" wait
| sendsize[11906]: time 1.496: getting size via gnutar for /home/p72/orig level 
2
| sendsize[11906]: time 1.510: spawning /usr/lib/amanda/runtar in pipeline
| sendsize[11906]: argument list: /bin/tar --create --file /dev/null 
--directory /home/p72 --one-file-system --numeric-owner --listed-incremental 
/var/lib/amanda/gnutar-lists/anakin_home_p72_orig_2.new --sparse 
--ignore-failed-read --totals --exclude-from 
/tmp/amanda/sendsize._home_p72_orig.20060713004504.exclude --files-from 
/tmp/amanda/sendsize._home_p72_orig.20060713004504.include
| sendsize[11906]: time 1.548: Total bytes written: 1435545600 (1.4GiB, 43GiB/s)
| sendsize[11906]: time 1.549: .....
| sendsize[11906]: estimate time for /home/p72/orig level 2: 0.039
| sendsize[11906]: estimate size for /home/p72/orig level 2: 1401900 KB
| sendsize[11906]: time 1.549: waiting for /bin/tar "/home/p72/orig" child
| sendsize[11906]: time 1.549: after /bin/tar "/home/p72/orig" wait
| sendsize[11906]: time 1.549: getting size via gnutar for /home/p72/orig level 
3
| sendsize[11906]: time 1.569: spawning /usr/lib/amanda/runtar in pipeline
| sendsize[11906]: argument list: /bin/tar --create --file /dev/null 
--directory /home/p72 --one-file-system --numeric-owner --listed-incremental 
/var/lib/amanda/gnutar-lists/anakin_home_p72_orig_3.new --sparse 
--ignore-failed-read --totals --exclude-from 
/tmp/amanda/sendsize._home_p72_orig.20060713004504000.exclude --files-from 
/tmp/amanda/sendsize._home_p72_orig.20060713004504000.include
| sendsize[11906]: time 1.574: /bin/tar: Unexpected field value in snapshot file
| sendsize[11906]: time 1.574: /bin/tar: Error is not recoverable: exiting now
| sendsize[11906]: time 1.575: .....
| sendsize[11906]: estimate time for /home/p72/orig level 3: 0.005
| sendsize[11906]: no size line match in /bin/tar output for "/home/p72/orig"
| sendsize[11906]: .....
| sendsize[11906]: estimate size for /home/p72/orig level 3: -1 KB
| sendsize[11906]: time 1.575: waiting for /bin/tar "/home/p72/orig" child
| sendsize[11906]: time 1.575: after /bin/tar "/home/p72/orig" wait
| sendsize[11906]: time 1.575: done with amname '/home/p72/orig', dirname 
'/home/p72', spindle 1
| sendsize[11806]: time 1.576: child 11906 terminated normally

If I run the tar commands by hand (but that's _after_ the amanda run), I get:

| Total bytes written: 1435545600 (1.4GiB, 45GiB/s)
| Total bytes written: 1435545600 (1.4GiB, 42GiB/s)
| Total bytes written: 20480 (20KiB, 4.0MiB/s)

I.e. all succeeded.

>    It's not critical, theses are my home machines and I have the
> space to handle nightly level 0s, but I had been considering an
> upgrade of my work machines to 2.5 from 2.4.5 and am now concerned
> that this might be a bug that needs fixing before I upgrade. The
> estimate error may be just a Debian package thing, either in the
> Amanda package or possibly in tar, but evidently the debug files have
> been missing longer than the estimate error.

I thought about a tar issue, too. In the mean time I upgraded tar to the one in
Debian unstable, but no improvement.

BTW, I'm using server side estimates. Which makes me wonder: why would it want
to run tar for the estimates in that case? Or is this just an argument list for
another Amanda program, which ignores the tar command when using server side
estimates?

Gr{oetje,eeting}s,

                                                Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert AT linux-m68k 
DOT org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                                            -- Linus Torvalds