Amanda-Users

Re: broken pipe (WAS: Re: timeout while waiting for REP)

2006-07-06 20:37:58
Subject: Re: broken pipe (WAS: Re: timeout while waiting for REP)
From: Paul Bijnens <paul.bijnens AT xplanation DOT com>
To: Cameron Matheson <cameron.matheson AT fjcomm DOT com>
Date: Fri, 07 Jul 2006 02:30:10 +0200
Cameron Matheson wrote:

Well, changing the etimeout/estimate-method in amanda.conf definitely
helped, but now I'm getting a broken-pipe error.  Here's the excerpt
from this morning's e-mail:

  aspapp2.tonservices.com   /opt/webapp/images  lev 0  FAILED [data timeout]
  aspapp2.tonservices.com   /opt/webapp/images  lev 0  FAILED [dump to tape failed]

  taper: no split_diskbuffer specified: using fallback split size of 10240kb to buffer aspapp2.tonservices.com:/opt/webapp/images.0 in-memory
  taper: tape DailySet1012 kb 59249888 fm 25 [OK]

That looks bad... I looked up what split_diskbuffer was in the
amanda.conf manpage, but I'm not sure how setting that to anything
different would help me (I think I'll just let it use its default until
I understand it better).

When Amanda writes to tape and hits the end of the tape, it has to
rewrite the whole chunk on the next tape, because you cannot be sure
which byte was the last to actually reach the tape unless you write
a filemark.

The "tape_splitsize" sets how large each such tape chunk, ending with
a filemark, will be.  In the worst case Amanda will lose that
amount of tape capacity at the end of a tape.
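
To get a feel for the trade-off, here is a small shell sketch of the
arithmetic.  The 50 Gbyte dump size is a made-up figure and
chunk_count is a hypothetical helper, not part of Amanda; it only
shows how the chunk size drives the number of chunks (each ending
with a filemark), while the worst-case loss per tape stays at one
chunk.

```shell
# chunk_count DUMP_KB SPLIT_KB
# Number of tape chunks needed for a dump (ceiling division).
chunk_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

dump_kb=$((50 * 1024 * 1024))      # hypothetical 50 Gbyte dump, in KB

chunk_count "$dump_kb" 10240       # 10 Mbyte in-memory fallback chunks
chunk_count "$dump_kb" 1048576     # 1 Gbyte tape_splitsize
```

With these assumed numbers the small fallback chunks give 5120 chunks
and the 1 Gbyte chunks give 50, at the price of up to 1 Gbyte lost at
the end of each tape instead of 10 Mbyte.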

That chunk needs to be buffered somewhere.  If you're not using a
holding disk, which would contain the complete dump file, then
Amanda can use some disk space to buffer that chunk: a file in
the directory which you specify with "split_diskbuffer".

As a last resort it buffers the data in memory, in which case the tape
chunks are only 10 Mbyte by default.

A good choice of directory for "split_diskbuffer" is your holding disk
directory.

And if your machine has plenty of RAM, then increase the 10 Mbyte
"fallback_splitsize" too.
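
Put together, the three settings could look something like this in a
dumptype in amanda.conf.  This is only a sketch: the dumptype name,
the directory and both sizes are assumptions, so substitute your own
holding disk path and values that fit your tape and RAM.

```
define dumptype images-split {        # hypothetical dumptype name
    tape_splitsize 1 Gb               # size of each chunk on tape (assumed value)
    split_diskbuffer "/dumps/amanda"  # assumed path, e.g. your holding disk directory
    fallback_splitsize 64 Mb          # in-memory chunk size if the disk buffer cannot be used
}
```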

One remark:  Amanda reads from the disk buffer with an mmap() as large
as the "tape_splitsize" value.  This means that your virtual memory
limits how large that value can be.  If you specify too large a value,
then Amanda will fall back to the fallback_splitsize, even if
you have 100 Gbyte of free holding disk space.


sendbackup: time 0.080: started backup
sendbackup: time 33267.859: index tee cannot write [Broken pipe]
sendbackup: time 33267.866: pid 15806 finish time Thu Jul  6 11:31:56 2006
sendbackup: time 33267.880: 117: strange(?): sendbackup: index tee cannot write [Broken pipe]
[...]
It takes 9 hours, so I can see why that might time-out, but I'm not sure
why the pipe is closed... isn't the output from tar going over the
network the entire time?  How do I prevent this from occurring?


Could it be yet another symptom of the problem described here:

http://wiki.zmanda.com/index.php/Amdump:_mesg_read:_Connection_reset_by_peer

Try this and see if it helps:

  echo 180 > /proc/sys/net/ipv4/tcp_keepalive_time
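
Before changing anything, it may be worth checking what the client
currently uses (the Linux default is 7200 seconds).  The sketch below
only reads the value and prints a suggestion; the function name and
the KEEPALIVE_FILE override are hypothetical conveniences, and the
180 second threshold is simply the value suggested above.

```shell
# Standard Linux procfs location; can be overridden for testing.
KEEPALIVE_FILE="${KEEPALIVE_FILE:-/proc/sys/net/ipv4/tcp_keepalive_time}"

# keepalive_needs_lowering MAX_SECONDS
# Succeeds if the current keepalive time is larger than MAX_SECONDS.
keepalive_needs_lowering() {
    current=$(cat "$KEEPALIVE_FILE") || return 2
    [ "$current" -gt "$1" ]
}

if keepalive_needs_lowering 180; then
    echo "tcp_keepalive_time is $(cat "$KEEPALIVE_FILE") s; as root, try:"
    echo "  sysctl -w net.ipv4.tcp_keepalive_time=180"
fi
```

Note that a value written to /proc or set with sysctl -w is lost at
reboot; to keep it, add "net.ipv4.tcp_keepalive_time = 180" to
/etc/sysctl.conf.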


--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit,  ZZ, :q, :q!,  M-Z, ^X^C,  logoff, logout, close, bye,  /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* kill -9 1,  Alt-F4,  Ctrl-Alt-Del,  AltGr-NumLock,  Stop-A,  ...    *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out          *
***********************************************************************
