Amanda-Users

Re: "got partial result" and "seek failed: Illegal seek"

2007-04-11 07:29:24
Subject: Re: "got partial result" and "seek failed: Illegal seek"
From: Jean-Louis Martineau <martineau AT zmanda DOT com>
To: Charles Sprickman <spork AT bway DOT net>
Date: Wed, 11 Apr 2007 07:21:42 -0400
When it say "was successfully retried", that's because the dump succeeded on the second try. It still give the error of the first try "FAILED [cannot read header: got 0 instead of 32768]" You should be able to find more about it in the amdump.1 log file and the dumper.*.debug and chunker.*.debug on the server and amandad.*.debug and sendbackup.*.debug in the client.

Jean-louis

Charles Sprickman wrote:
On Tue, 10 Apr 2007, Jean-Louis Martineau wrote:

The "seek failed: Illegal seek" are normal error.
The "got partial result" lines are expected.

You have no problem with the estimate phase, don't try to search for bug there.

Just for my own curiosity, what are those errors telling me? The partial result error does show up for every host.

I'm down to one host that has recurring errors. The dumps eventually complete, but I'd like to track down what's happening here. Looking through the older amreport output, I see one host consistently getting this error and then retrying:

h10.biglist.com / lev 3 FAILED [cannot read header: got 0 instead of 32768]
  h10.biglist.com     /                lev 3  the dump "


h10.biglist.com /spool lev 4 FAILED [cannot read header: got 0 instead of 32768]
  h10.biglist.com     /spool         lev 4  was successfully retried


h10.biglist.com / lev 2 FAILED [cannot read header:
got 0 instead of 32768]
  h10.biglist.com     /                   lev 2  was successfully retried

The only errors of note I could find in either the server or client logs were what I had posted below. This host is in the same location and running the same OS version as other hosts that dump without errors.

What should I be looking for?

I'm not seeing any errors in the sendbackup debug logs.

Thanks,

Charles

Jean-Louis


Charles Sprickman wrote:
Perhaps the root of the problems I'm having, perhaps not.

I keep saying "I'm done with this" and then I go back and correlate more server/client logs.

This is my latest finding.

While backups seem to be mostly working (even though I'm having some amrecover issues in another thread), I have not seen anyone comment on the insane amount of errors in amdump about "partial result":

[devel2]/var/log/amanda # grep "partial result" amdump.1 | wc -l
    2468

Lines look basically like this, both for tar AND dump, and on EVERY client:

planner: time 20.742: got partial result for host h16.biglist.com disk /usr/local: 0 -> 130180K, 1 -> 68K, -1 -> -2K planner: time 20.742: got partial result for host h16.biglist.com disk /home: 0 -> 65074K, 2 -> 44K, -1 -> -2K planner: time 20.742: got partial result for host h16.biglist.com disk /: 0 -> 45968K, 1 -> 66K, -1 -> -2K planner: time 20.815: got partial result for host h20.biglist.com disk /var/qmail/bin: 0 -> -2K, 1 -> -2K, -1 -> -2K

And then on a client when it's sending the size estimate:

root@h07[/tmp/amanda/client/DailySet1]# grep "seek failed" sendsize.20070409* | wc -l
      13

And both for gtar and dump types:

dump_calc_estimates: warning - seek failed: Illegal seek
gnutar_calc_estimates: warning - seek failed: Illegal seek

And the errors in context:

(dump)
sendsize: debug 1 pid 56571 ruid 2 euid 2: rename at Mon Apr 9 03:10:02 2007
sendsize[56571]: time 0.014: waiting for any estimate child: 1 running
sendsize[56573]: time 0.014: calculating for amname /, dirname /, spindle -1
sendsize[56573]: time 0.015: getting size via dump for / level 0
sendsize[56573]: time 0.015: calculating for device /dev/da0s1a with ufs sendsize[56573]: time 0.016: running "/sbin/dump 0Shsf 0 1048576 - /dev/da0s1a"
sendsize[56573]: time 0.017: running /usr/local/libexec/amanda/killpgrp
sendsize[56573]: time 0.022: DUMP: Date of this level 0 dump: Mon Apr 9 03:10:02 2007 sendsize[56573]: time 0.023: DUMP: Date of last level 0 dump: the epoch sendsize[56573]: time 0.024: DUMP: Dumping /dev/da0s1a (/) to standard output
sendsize[56573]: time 0.077:   DUMP: mapping (Pass I) [regular files]
sendsize[56573]: time 0.648:   DUMP: mapping (Pass II) [directories]
sendsize[56573]: time 0.649:   DUMP: estimated 43991 tape blocks.
sendsize[56573]: time 0.650: .....
sendsize[56573]: estimate time for / level 0: 0.633
sendsize[56573]: estimate size for / level 0: 43991 KB
sendsize[56573]: time 0.650: asking killpgrp to terminate
dump_calc_estimates: warning - seek failed: Illegal seek
--> Do we start over here??
sendsize[56573]: time 1.660: getting size via dump for / level 1
sendsize[56573]: time 1.660: calculating for device /dev/da0s1a with ufs sendsize[56573]: time 1.660: running "/sbin/dump 1Shsf 0 1048576 - /dev/da0s1a"
sendsize[56573]: time 1.661: running /usr/local/libexec/amanda/killpgrp
sendsize[56573]: time 1.667: DUMP: Date of this level 1 dump: Mon Apr 9 03:10:04 2007 sendsize[56573]: time 1.668: DUMP: Date of last level 0 dump: Sat Apr 7 04:03:16 2007 sendsize[56573]: time 1.668: DUMP: Dumping /dev/da0s1a (/) to standard output
sendsize[56573]: time 1.710:   DUMP: mapping (Pass I) [regular files]
sendsize[56573]: time 2.204:   DUMP: mapping (Pass II) [directories]
sendsize[56573]: time 2.347:   DUMP: estimated 76 tape blocks.
sendsize[56573]: time 2.347: .....
sendsize[56573]: estimate time for / level 1: 0.686
sendsize[56573]: estimate size for / level 1: 76 KB
sendsize[56573]: time 2.347: asking killpgrp to terminate
dump_calc_estimates: warning - seek failed: Illegal seek
sendsize[56573]: time 3.350: done with amname / dirname / spindle -1
sendsize[56571]: time 3.351: child 56573 terminated normally
sendsize[56571]: time 3.352: waiting for any estimate child: 1 running

(gtar)

sendsize[56571]: time 13.583: waiting for any estimate child: 1 running
sendsize[56623]: time 13.583: calculating for amname /var/db/pkg, dirname /var/db/pkg, spindle -1 sendsize[56623]: time 13.583: getting size via gnutar for /var/db/pkg level 0 sendsize[56623]: time 13.585: spawning /usr/local/libexec/amanda/runtar in pipeline sendsize[56623]: argument list: runtar DailySet1 /usr/local/bin/gtar --create -- file /dev/null --directory /var/db/pkg --one-file-system --listed-incremental /usr/local/var/amanda/gnutar-lists/h07.biglist.com_var_db_pkg_0.new --sparse --ignore-failed-read --totals . sendsize[56623]: time 13.675: Total bytes written: 2816000 (2.7MiB, 117MiB/s)
sendsize[56623]: time 13.676: .....
sendsize[56623]: estimate time for /var/db/pkg level 0: 0.091
sendsize[56623]: estimate size for /var/db/pkg level 0: 2750 KB
sendsize[56623]: time 13.676: waiting for runtar "/var/db/pkg" child
sendsize[56623]: time 13.677: after runtar /var/db/pkg wait
gnutar_calc_estimates: warning - seek failed: Illegal seek
--> Again, looks like we repeat
sendsize[56623]: time 13.678: getting size via gnutar for /var/db/pkg level 1 sendsize[56623]: time 13.679: spawning /usr/local/libexec/amanda/runtar in pipeline sendsize[56623]: argument list: runtar DailySet1 /usr/local/bin/gtar --create --file /dev/null --directory /var/db/pkg --one-file-system --listed-incremental /usr/local/var/amanda/gnutar-lists/h07.biglist.com_var_db_pkg_1.new --sparse --ignore-failed-read --totals . sendsize[56623]: time 13.759: Total bytes written: 71680 (70KiB, 5.3MiB/s)
sendsize[56623]: time 13.760: .....
sendsize[56623]: estimate time for /var/db/pkg level 1: 0.080
sendsize[56623]: estimate size for /var/db/pkg level 1: 70 KB
sendsize[56623]: time 13.760: waiting for runtar "/var/db/pkg" child
sendsize[56623]: time 13.761: after runtar /var/db/pkg wait
gnutar_calc_estimates: warning - seek failed: Illegal seek
sendsize[56623]: time 13.761: getting size via gnutar for /var/db/pkg level 2 sendsize[56623]: time 13.763: spawning /usr/local/libexec/amanda/runtar in pipeline sendsize[56623]: argument list: runtar DailySet1 /usr/local/bin/gtar --create --file /dev/null --directory /var/db/pkg --one-file-system --listed-incremental /usr/local/var/amanda/gnutar-lists/h07.biglist.com_var_db_pkg_2.new --sparse --ignore-failed-read --totals . sendsize[56623]: time 13.843: Total bytes written: 71680 (70KiB, 5.3MiB/s)
sendsize[56623]: time 13.844: .....
sendsize[56623]: estimate time for /var/db/pkg level 2: 0.081
sendsize[56623]: estimate size for /var/db/pkg level 2: 70 KB
sendsize[56623]: time 13.844: waiting for runtar "/var/db/pkg" child
sendsize[56623]: time 13.845: after runtar /var/db/pkg wait
gnutar_calc_estimates: warning - seek failed: Illegal seek
sendsize[56623]: time 13.846: done with amname /var/db/pkg dirname /var/db/pkg spindle -1
sendsize[56571]: time 13.846: child 56623 terminated normally
sendsize[56571]: time 13.847: waiting for any estimate child: 1 running

Am I correct in assuming these sendsize problems are causing the "partial result" messages in the amdump log?

If so, I wonder if I'm really getting proper size estimates into amanda...

Any thoughts?

Thanks,

Charles