Bacula-users

Re: [Bacula-users] Network send error to SD. ERR=Connection reset by peer

2013-06-13 08:59:05
Subject: Re: [Bacula-users] Network send error to SD. ERR=Connection reset by peer
From: "Clark, Patricia A." <clarkpa AT ornl DOT gov>
To: "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Thu, 13 Jun 2013 08:54:43 -0400
On 6/12/13 10:41 AM, "Josh Fisher" <jfisher AT pvct DOT com> wrote:


>
>On 6/11/2013 11:10 AM, Leonardo - Mandic wrote:
>> Hello,
>>
>> After upgrading to Bacula 5.2.13 I have Bacula storage problems. It
>> appears to be a network problem, but it isn't: I have a gigabit network
>> dedicated to Bacula. The problem occurs on backups that run for many
>> hours or days (a full backup of 500 GB takes 2 days, for example).
>>
>> The timing is random, but 70% of servers show these same errors.
>>
>> Older versions never had this problem, and it is the same network and
>> the same servers as with the old Bacula versions.
>>
>> Has anybody else had this problem on 5.2.13?
>>
>> The error is:
>>
>>
>> 2013-06-10 23:51:01 servert-fd JobId 266: Error: bsock.c:429 Write
>> error sending 64562 bytes to Storage daemon:10.1.0.60:9103:
>> ERR=Connection reset by peer
>> 2013-06-10 23:51:01 servert-fd JobId 266: Fatal error: backup.c:1200
>> Network send error to SD. ERR=Connection reset by peer
>
>In my experience, it has always been hardware related. In particular,
>aggressive power saving modes will cause this when one of the systems
>cuts power to its Ethernet PHY at an inappropriate time. This can be
>because the device driver's default is geared toward early power savings
>and the op hasn't changed it, or a buggy device driver shuts off the PHY
>when it shouldn't. Bacula requires that TCP connections remain up
>throughout the job lifetime. Anything that might cause a delay could
>cause this if the power save timeout for the Ethernet controller is
>shorter than the delay. For example, if the database server is restarted
>by a nightly cron job and you are not spooling attributes, then the
>delay could allow the device driver to shut down the PHY due to
>"inactivity".
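
As an illustration of the point above about idle periods on a long-lived
TCP connection, here is a minimal Python sketch that turns on kernel TCP
keepalive probes for a socket. It is not Bacula code and not a confirmed
fix for this error: the function name enable_keepalive and the timing
values are assumptions chosen for the example. Bacula's own Heartbeat
Interval directive addresses a similar concern at the application level.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Ask the kernel to probe an idle TCP connection periodically.

    Hypothetical helper for illustration; the default timings are
    assumptions, not values taken from Bacula.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning constants are platform-specific (all three
    # exist on Linux); skip any that this platform does not provide.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock

# Usage: configure a fresh socket and confirm the option took effect.
sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

Keepalive probes only generate traffic on an otherwise idle connection, so
they would matter only if the reset really is caused by an inactivity
timeout somewhere along the path.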

I would suggest that this is not the cause of this issue.  I have seen the
error on a server that is busy running multiple backups, where one of them
will fail with it.  Everything is on that server, so I am not reaching
out to a separate client.  I do not use any of the power-saving features
on the server either.

Patti Clark
Linux System Administrator
Research and Development Systems Support Oak Ridge National Laboratory




