Bacula-users

Re: [Bacula-users] Fatal error: backup.c:892 Network send error to SD. ERR=Connection reset by peer

2010-04-18 12:50:28
Subject: Re: [Bacula-users] Fatal error: backup.c:892 Network send error to SD. ERR=Connection reset by peer
From: Jon Schewe <jpschewe AT mtu DOT net>
To: Matija Nalis <mnalis+bacula AT CARNet DOT hr>
Date: Sun, 18 Apr 2010 11:46:33 -0500
On 04/16/2010 08:30 AM, Matija Nalis wrote:
> On Mon, Apr 12, 2010 at 03:59:49PM -0500, Jon Schewe wrote:
>   
>> On 4/12/10 9:40 AM, Matija Nalis wrote:
>>     
>>> It is especially a problem with bigger databases and MySQL instead of
>>> PostgreSQL, see http://bugs.bacula.org/view.php?id=1472, where it can
>>> take even several hours! (note that while it talks about "restore"
>>> speed, it is also related to accurate backups which employ similar
>>> SQL queries)
>>>
>>>       
>> Must be what it is then. I've been thinking about switching to postgres,
>> but haven't because the opensuse packages for bacula are only for mysql.
>> This may motivate me more.
>>     
> You should probably switch soon, before you get to like your
> database. Exporting Bacula MySQL tables for import into PostgreSQL
> can be very painful and problematic; it is much better to just drop
> the database and create a fresh one.
>
>   
I'll keep that in mind as I go forward.

>> The backup finished, so it seems that in version 3.0.3 bacula does NOT
>> set the socket option SO_KEEPALIVE.
>>     
> Hmm, yeah, I've checked the code casually, and it indeed looks like the
> heartbeats are not setting SO_KEEPALIVE timeouts (note that it does
> set SO_KEEPALIVE on the socket, otherwise the advice above wouldn't
> work -- it just doesn't set TCP_KEEPIDLE on it[1] to specify
> user-defined timeouts and instead uses system defaults).
>
> The heartbeats do seem to do other things (application-level,
> not socket-level), but as you saw they are not a perfect fix for network
> idleness problems, so you also MUST set the system defaults.
>
> I've updated the FAQ at:
> http://wiki.bacula.org/doku.php?id=faq#my_backup_starts_but_dies_after_a_while_with_connection_reset_by_peer_error
>
>
> [1] It actually tries that at one point in src/lib/bsock.c if
>     TCP_KEEPIDLE support is detected, but it fails to detect it
>     properly because <netinet/tcp.h> is not included.
>
>     However, even after fixing that (and the missing semicolon in the
>     'int opt = heart_beat' line), it still doesn't look like it sets
>     TCP_KEEPIDLE correctly on the FD->SD connection, so maybe this
>     codepath is not used there.
>
>     Anyway, I gave up debugging there and just set the system
>     defaults. But I just thought I'd mention that in case someone
>     else wants to continue chasing the bug.
>
>   
Hmm, this sounds like a bug that should be fixed; once it is fixed, it
may remove a bunch of problems with firewalls.
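
For anyone who wants to keep chasing this, here is a minimal sketch of what
setting both options on a socket looks like. This is not Bacula's code:
enable_keepalive() and its idle_secs parameter are names I made up for
illustration, and I am assuming a platform (such as Linux) where
<netinet/tcp.h> defines TCP_KEEPIDLE. Where it is not defined, the sketch
only turns on SO_KEEPALIVE and the system defaults still apply.

```c
/* Sketch only, not Bacula's code. enable_keepalive() is a hypothetical helper. */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>   /* defines TCP_KEEPIDLE where supported (e.g. Linux) */

/*
 * Turn on TCP keepalive for fd and, where the platform supports it,
 * set the idle time before the first keepalive probe to idle_secs.
 * Returns 0 on success, -1 if a setsockopt() call fails.
 */
int enable_keepalive(int fd, int idle_secs)
{
    int on = 1;

    /* Socket-level switch: without this no keepalive probes are sent at all. */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
        return -1;
    }

#ifdef TCP_KEEPIDLE
    /* Per-socket override of the system-wide idle time, in seconds. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0) {
        return -1;
    }
#else
    /* No per-socket idle time here; the system default stays in effect. */
    (void)idle_secs;
#endif

    return 0;
}
```

Note that the #ifdef only works because <netinet/tcp.h> is included; without
that header the second setsockopt() is silently compiled out, which matches
the detection problem described in the footnote above.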

