[Fwd: Sendsize Timeout Errors]

An update:

Been working this since the original email - backups are everything, you know. Thanks in part to Mr. John Hein and Bill Nolf (a guy here where I work), I decided to run the same backup scheme using ufsdump, and that seems to have corrected the problem. Note that I am using 2.5.0p2. For versions above that, Kevin Till tells us that bsdtcp auth is now tcp exclusively. This negates any udp socket size issues I might be having, I'd imagine.

I originally used gtar (v1.13). Although I've not been able to confirm it yet, I suspect there is a socket size issue with udp. Once I can safely say my backups are operational, I can test this theory. I don't have a test bench, so testing at leisure is not possible.


John Hein's note stated:

Sounds similar to the issue I saw in the past when I started backing
up lots of data.  This was on FreeBSD, but it turned out that the udp
socket was not using its max size.

See these messages and patch:

http://article.gmane.org/gmane.comp.archivers.amanda.devel/1148/match=message+long
http://article.gmane.org/gmane.comp.archivers.amanda.devel/1152/match=message+long
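
For anyone who wants to look at the udp socket size theory on their own host, here is a minimal Python sketch. It is not Amanda code and not the patch linked above. It only asks the kernel what send and receive buffer sizes a fresh udp socket gets, and what it actually grants when a larger size is requested. The function name udp_buffer_sizes and the 64 KiB request are my own choices for illustration.

```python
import socket


def udp_buffer_sizes(requested=None):
    """Return (send, recv) buffer sizes in bytes for a fresh UDP socket.

    If `requested` is given, ask the kernel for that size first; the
    kernel may clamp, round, or refuse the request, so the returned
    values are whatever it actually granted.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if requested is not None:
            for opt in (socket.SO_SNDBUF, socket.SO_RCVBUF):
                try:
                    sock.setsockopt(socket.SOL_SOCKET, opt, requested)
                except OSError:
                    # Request refused (e.g. above the system maximum);
                    # keep whatever size the socket already has.
                    pass
        snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        return snd, rcv
    finally:
        sock.close()


if __name__ == "__main__":
    print("default (send, recv):", udp_buffer_sizes())
    print("after requesting 64 KiB:", udp_buffer_sizes(64 * 1024))
```

If the default comes back much smaller than what the system allows, that would at least be consistent with the behaviour John describes, though it proves nothing about Amanda by itself.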


Best to all,

Sean

-------- Original Message ----------

Hi all,

My first post here. I have perused all resources I know of to answer this question, but results have been very limited. Forgive me if this question has been posed and answered before, but I cannot find the answer.

My issue is a sendsize timeout error. When it happens, amstatus shows the final filesystem as "getting estimate", and it'll hang there for days. The only error comes out of the amandad.xxx log (in /tmp/amanda), and it is:

amandad: time 21599.605: /usr/local/libexec/sendsize timed out waiting for REP data

amandad: time 21599.605: sending NAK pkt:
<<<<<
ERROR timeout on reply pipe

amandad: time 21605.615: pid 17898 finish time Thu Jan 4 20:40:06 2007
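
A small Python sketch for pulling these timeout lines out of an amandad debug log and converting the "time" field to hours. The regular expression is based only on the three lines quoted above, so it is an assumption about the log format in general, and find_timeouts is a name made up for this example.

```python
import re

# Matches lines shaped like the excerpt above:
#   amandad: time <seconds>: <message>
LOG_LINE = re.compile(r"^amandad: time (?P<secs>\d+\.\d+): (?P<msg>.*)$")


def find_timeouts(lines):
    """Return (seconds, hours, message) for each 'timed out' log line."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and "timed out" in m.group("msg"):
            secs = float(m.group("secs"))
            hits.append((secs, secs / 3600.0, m.group("msg")))
    return hits


sample = [
    "amandad: time 21599.605: /usr/local/libexec/sendsize timed out waiting for REP data",
    "amandad: time 21599.605: sending NAK pkt:",
    "amandad: time 21605.615: pid 17898 finish time Thu Jan 4 20:40:06 2007",
]

if __name__ == "__main__":
    for secs, hours, msg in find_timeouts(sample):
        print(f"{secs:.3f} s = {hours:.2f} h: {msg}")
```

On the excerpt above, the timeout falls at 21599.605 seconds, which is just under six hours into the amandad run. That looks more like a fixed timeout expiring than a crash, but that is my reading of the numbers and not something I have confirmed.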

That's it. All other logs basically say OK to everything. Does anyone know anything about this? Is this something that has been seen before?


My environment is:

Two servers: one is the Amanda server, and one is the Amanda client. Workstations are not backed up.

Attached to the Amanda server is an Apple X-Serve RAID, configured as RAID 5. The largest partitions are 250GB each.

The greatest amount of data on a single partition is 100GB.

Note that the failure only began when I installed the Apple X-Serve RAID, and began using very large partitions. I am leaning toward an issue with gtar trying to calculate backup size on 100+GB worth of data.


Amanda Server:
SunFire V240
Solaris 8 (patched to 02/06)
Amanda 2.5.0 (presently - error exists up to 2.5.1p2)
gtar (dumper in use) is version 1.13.1

Dumptype uses:
GNUTAR
compress server fast (client fast for client dumptype)
holdingdisk yes

Presently, I am testing this config because of what I think may be a gtar issue. As yet I have no data:

Dumptype:
<same as above except>
compress none
estimate calcsize

Anything anyone can contribute to this issue will be greatly appreciated.

Sean

Sean Connors
Systems Administrator
ArgonST