Amanda-Users

Re: failures in 2.6.0 ("cannot read header ...")

2008-04-11 09:23:58
Subject: Re: failures in 2.6.0 ("cannot read header ...")
From: Charles Stroom <charles AT stremen.xs4all DOT nl>
To: amanda-users AT amanda DOT org
Date: Fri, 11 Apr 2008 14:40:52 +0200
I have been looking at the enormous amount of debug files, and what I
can see is that always there are "fail" lines in the dumper.*.debug
files of the type:
1207905536.305211: dumper: connect_portrange: connect to
127.0.0.2.10080 failed: Cannot assign requested address
"

However, that seems to be normal, because even in normal cases, these
lines are present.

In the case there are these "cannot read header" problems, the
chunker.*.debug file report an error.  For example, today the same
DLE /home/charles/pictures failed in the chuncker:
"
1207647127.918807: chunker: pid 8024 ruid 1002 euid 1002: start at Tue Apr  8 
11:32:07 2008
1207647127.919828: chunker: pid 8024 ruid 1002 euid 1002: rename at Tue Apr  8 
11:32:07 2008
1207647127.919889: chunker: getcmd: START 20080408110501
1207647127.919912: chunker: getcmd: PORT-WRITE 03-00013 
/work-1/amanda/dumps/20080408110501/fiume.localnet._home_charles_pictures.1 
fiume.localnet ffff
ffff9ffeffffffff00 /home/charles/pictures 1 2008:4:4:9:7:56 1048576 GNUTAR 928 
|;auth=bsdtcp;index;
1207647127.920075: chunker: try_socksize: receive buffer size is 65536
1207647127.924152: chunker: bind_portrange2: Try  port 11028: Available - 
Success
1207647127.924186: chunker: stream_server: waiting for connection: 0.0.0.0.11028
1207647127.924193: chunker: putresult: 23 PORT
1207647127.928237: chunker: stream_accept: connection from 127.0.0.1.11002
1207647127.928247: chunker: try_socksize: receive buffer size is 65536
1207647137.985112: chunker: putresult: 10 FAILED
1207647137.985202: chunker: pid 8024 finish time Tue Apr  8 11:32:17 2008
"

I think the reason for the "cannot read header" message is because the
file left in the dumps disk is zero length:
"
ll /work-1/amanda/dumps/200804*/*
-rw------- 1 amandabackup disk 0 Apr  6 11:22 
/work-1/amanda/dumps/20080406110501/fiume.localnet._home_charles_pictures.1
-rw------- 1 amandabackup disk 0 Apr  8 11:32 
/work-1/amanda/dumps/20080408110501/fiume.localnet._home_charles_pictures.1
-rw------- 1 amandabackup disk 0 Apr 11
11:18 
/work-1/amanda/dumps/20080411110501/fiume.localnet._home_charles_pictures.1
"

If there are problems, it is always this /home/charles/pictures DLE which
fails and sometimes also another, but often the other recovers, as today:
"
fiume.localnet /usr       lev 1  FAILED [cannot read header: got 0 bytes 
instead of 32768]fiume.localnet /usr       lev 1  was successfully retried
"

Thanks for looking.

Charles




On Thu, 10 Apr 2008 10:18:10 -0400
"Dustin J. Mitchell" <dustin AT zmanda DOT com> wrote: 
> On Tue, Apr 8, 2008 at 10:38 AM, Charles Stroom
> <charles AT stremen.xs4all DOT nl> wrote:
> >  FAILURE DUMP SUMMARY:
> >    fiume.localnet /home/charles/pictures lev 1  FAILED [cannot read
> >  header: got 0 bytes instead of 32768]
> >  fiume.localnet /home/charles/pictures lev 1  FAILED [too many
> > dumper retry: "[request failed: Resource temporarily unavailable]"]
> >  fiume.localnet /home/charles/pictures lev 1  FAILED [cannot read
> >  header: got 0 bytes instead of 32768]
> 
> My apologies -- now that the 2.6.0 release has been out, I've all been
> excitedly hacking on new work for 2.6.1.  A side-effect is that I've
> just been "skimming" amanda-users, and I missed this email.
> 
> There could be a number of problems underlying this, unfortunately.
> Basically, the header didn't arrive on the server, but there's no
> indication of what caused that.  The next step is to look around in
> the debug logs on both the server and the client for anything unusual
> about those runs (unusual = different from the runs that succeeded).
> On the server, have a look at dumper.*.  On the client, have a look at
> amandad.* and sendbackup*.
> 
> "Resource temporarily unavailable" corresponds to EAGAIN, and is
> usually seen with non-blocking IO, which Amanda doesn't use, so I'm
> not sure what to make of that information.  The dumper debug log may
> clarify that.
> 
> Dustin
> 


-- 
Charles Stroom
email: charles at no-spam.stremen.xs4all.nl (remove the "no-spam.")