Bacula-users

Re: [Bacula-users] Fatal error: backup.c:892 Network send error to SD. ERR=Connection reset by peer

2010-04-12 10:26:26
Subject: Re: [Bacula-users] Fatal error: backup.c:892 Network send error to SD. ERR=Connection reset by peer
From: Jon Schewe <jpschewe AT mtu DOT net>
To: Matija Nalis <mnalis+bacula AT CARNet DOT hr>
Date: Mon, 12 Apr 2010 09:23:51 -0500
On 4/12/10 9:00 AM, Matija Nalis wrote:
> On Mon, Apr 12, 2010 at 08:45:36AM -0500, Jon Schewe wrote:
>   
>> On 4/12/10 8:39 AM, Matija Nalis wrote:
>>     
>>> echo 60 > /proc/sys/net/ipv4/tcp_keepalive_time
>>>
>>> (or edit /etc/sysctl.d/* or /etc/sysctl.conf to retain value across
>>> reboots). Can you check what "netstat -to" says after you lower that
>>> limit and rerun the backups?
>>>
>>>       
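For reference, the value written above is one of three Linux keepalive tunables, and they only apply to sockets that have SO_KEEPALIVE enabled. The following is a minimal Python sketch, not part of Bacula, that reads them and estimates how long an idle connection to a dead peer survives. The helper names and the `base` parameter are made up for illustration.

```python
from pathlib import Path

# Standard location of the IPv4 sysctl files on Linux.
PROC_IPV4 = Path("/proc/sys/net/ipv4")

def read_keepalive_settings(base=PROC_IPV4):
    """Return the three kernel keepalive tunables as integers."""
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")
    return {name: int((Path(base) / name).read_text().strip()) for name in names}

def seconds_until_drop(settings):
    """Idle time before the first probe plus the time spent on unanswered probes."""
    return (settings["tcp_keepalive_time"]
            + settings["tcp_keepalive_intvl"] * settings["tcp_keepalive_probes"])
```

Calling `read_keepalive_settings()` with no argument reads the live values; after the `echo 60` above, `tcp_keepalive_time` should come back as 60, which is what the "netstat -to" timers should reflect for newly opened connections.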
>> Now I see the timer down where I expect it. Should I only need this on
>> the client?
>>     
> If only that client is having timeout problems, then yes (as
> I understand it, your Director and SD are on the same server, so you
> should not have timeout issues there, as no networking is involved).
>
> (SO_KEEPALIVE will work even if only one side of the connection has
> it enabled).
>
>   
So I should only need the heartbeat on that client's setup as well,
right? Getting rid of extra heartbeats would be nice.
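
To illustrate the point that keepalive can be enabled on one end only, here is a rough Python sketch of turning it on for a single socket. It is not how Bacula configures its own connections; the function name and the default timer values are assumptions chosen for the example.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Enable TCP keepalive on one socket and, where supported, set its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket timer options are platform-specific (present on Linux),
    # so each one is set only if this Python build exposes it.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock
```

The peer needs no special setup: its TCP stack acknowledges the probes whether or not it has keepalive enabled itself. Per-socket timers, where available, override the system-wide /proc values for that socket only.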

>>> If "netstat -to" then reports smaller timers (60 or less), then it
>>> should fix your problem, so you can try turning accurate back to yes.
>>>
>>> Does that help?
>>>       
>> It's running, I'll know in a couple of hours.
>>     
> Good, let us know how it fares.
>
>   
It seems to be running, but I've run into a problem with bconsole. Once
I started the job, if I run bconsole and then "status dir", the console
hangs. If I strace the bconsole process, it's stuck in a select call:
>strace -p 18452
Process 18452 attached - interrupt to quit
select(4, [3], NULL, NULL, {9, 461287}) = 0 (Timeout)
read(3, 0x655d80, 5)                    = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], NULL, NULL, {10, 0})     = 0 (Timeout)
read(3, 0x655d80, 5)                    = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], NULL, NULL, {10, 0}
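
The pattern in the trace (select times out, then a non-blocking read fails with EAGAIN) is what a client looks like when it is waiting for data that never arrives. The Python sketch below reproduces only that system-call pattern; it is not bconsole's code, and `poll_once` is a name invented for this example. The 5-byte read simply mirrors the trace.

```python
import select
import socket

def poll_once(sock, timeout):
    """Wait up to `timeout` seconds, then attempt a 5-byte non-blocking read.

    Returns (was_readable, data); data is None when the read would block,
    which is the EAGAIN case shown in the strace output.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    try:
        data = sock.recv(5)
    except BlockingIOError:
        return bool(readable), None
    return bool(readable), data
```

With a non-blocking socket whose peer stays silent, every call reports that nothing was readable and no data was returned, matching the repeated Timeout and EAGAIN lines above.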


-- 
Jon Schewe | http://mtu.net/~jpschewe
If you see an attachment named signature.asc, this is my digital
signature. See http://www.gnupg.org for more information.

