Bacula-users

Re: [Bacula-users] bacula fatal error all the time

2012-12-15 06:34:41
Subject: Re: [Bacula-users] bacula fatal error all the time
From: lst_hoe02 AT kwsoft DOT de
To: bacula-users AT lists.sourceforge DOT net
Date: Sat, 15 Dec 2012 12:31:48 +0100
Quoting b_rom AT mail DOT ru:

> On Nov 29, 2012, at 11:45 PM, lst_hoe02 AT kwsoft DOT de wrote:
>
>>
>> Quoting b_rom AT mail DOT ru:
>>
>>> On Nov 29, 2012, at 11:05 PM, Dan Langille <dan AT langille DOT org> wrote:
>>>
>>>>
>>>> On Nov 29, 2012, at 2:26 PM, b_rom AT mail DOT ru wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I have a couple of hosts that are backed up through Bacula. I have run
>>>>> into the following situation: one host can't complete a full backup.
>>>>> The backup job always ends with this error:
>>>>> Error: bsock.c:389 Write error sending 262144 bytes to Storage
>>>>> daemon:IP ADDRESS:9103: ERR=Broken pipe
>>>>> Fatal error: backup.c:1190 Network send error to SD. ERR=Broken pipe
>>>>> Error: Director's comm line to SD dropped
>>>>> Error: Bacula dir.bacula.HOSTNAME. 5.2.6
>>>>>
>>>>> The error does not always occur after the same elapsed time. Here
>>>>> are the logs:
>>>>> Elapsed time:           11 mins 56 secs
>>>>> Elapsed time:           1 hour 10 mins 18 secs
>>>>> Elapsed time:           47 mins 36 secs
>>>>> Elapsed time:           1 hour 14 mins 40 secs
>>>>> Elapsed time:           1 hour 40 mins 1 sec
>>>>>
>>>>> I don't think this is related to a network issue. But the client
>>>>> side performs a lot of I/O operations, and its disk system is busy
>>>>> almost all the time. Maybe this is the cause.
>>>>> How can I solve this? I don't see any timeout directives that
>>>>> could help me.
>>>>>
>>>>> Client side is FreeBSD 7.2 amd64  bacula  5.2.6
>>>>
>>>> HEADS UP.  FreeBSD 7.2 was end-of-life'd in 2010 which means no
>>>> security patches will be issued for it.  Upgrading is recommended.
>>>>
>>>> see http://www.freebsd.org/security/#unsup
>>>>
>>>> For that matter, that's an old version of FreeBSD too.  :)
>>>>
>>> I know, but an upgrade is impossible in this case, and I don't think
>>> this is related to our problem with Bacula.
>>>>> Server side (DIR and SD on the same host) is FreeBSD 9.0 amd64
>>>>> bacula  5.2.6 (also tried  5.2.12 with the same result)
>>>>
>>>> Have you looked at trying the 'Heartbeat Interval' settings?
>>>>
>>>> You should try setting it on the SD.  There is also a 'Heartbeat
>>>> Interval' on the FD, but that doesn't seem to be the error you're
>>>> getting.
>>>
>>> Yes, I have tried playing with "Heartbeat Interval". Unfortunately,
>>> it doesn't help.
>>
>
> This is not related to the NIC in my case. I see the same issue with
> various cards and operating systems: the same behaviour on FreeBSD and
> CentOS, with Broadcom and Intel cards.
> I have no idea what the cause could be. Does anybody have any thoughts?
>
>> Try with another NIC. We first had problems with our Bacula Server
>> failing two clients out of ~20 with connection failures randomly.
>> After ditching the Onboard GE (Marvell PHY) and using a PCIe NIC on
>> the Server the problem went away.
>>

It is a problem outside Bacula, which simply uses a standard long-lived
TCP connection. Transferring a large amount of data over a single
connection for a long time does reveal subtle bugs in hardware and
OS/drivers from time to time. So your only chance to solve it is to swap
the related parts (NICs, switches, routers, cables ...) until the culprit
is found.
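
To check each part without running a full backup every time, one option
is to reproduce the long-lived transfer outside Bacula. The following is
only a rough sketch, not anything Bacula ships: the function names
(receive_all, send_stream, loopback_check) are made up for illustration,
and the chunk size is simply the 262144-byte write size from the error
quoted above. It pushes data over a single TCP connection and reports how
far it got before any socket error.

```python
import socket
import threading

CHUNK = 262144  # assumption: same write size as in the bsock.c error


def receive_all(server_sock, result):
    """Accept one connection and count bytes until the peer closes."""
    conn, _addr = server_sock.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            total += len(data)
    result["received"] = total


def send_stream(host, port, total_bytes, chunk=CHUNK):
    """Send total_bytes over one TCP connection.

    Returns (bytes_sent, error). On failure bytes_sent counts only the
    chunks that were fully handed to the kernel before the error.
    """
    payload = b"\0" * chunk
    sent = 0
    try:
        with socket.create_connection((host, port)) as sock:
            while sent < total_bytes:
                n = min(chunk, total_bytes - sent)
                sock.sendall(payload[:n])
                sent += n
    except OSError as exc:  # includes BrokenPipeError
        return sent, exc
    return sent, None


def loopback_check(total_bytes=4 * CHUNK):
    """Self-test on 127.0.0.1 so the sketch can be tried on one machine."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    worker = threading.Thread(target=receive_all, args=(server, result))
    worker.start()
    sent, err = send_stream("127.0.0.1", port, total_bytes)
    worker.join()
    server.close()
    return sent, result.get("received"), err
```

For a real test you would run the receiving side on the SD host and call
send_stream from the client with a total size comparable to a full backup,
then repeat after swapping each NIC, switch port or cable. If this plain
transfer also breaks at random times, that points away from Bacula.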

Regards

Andreas


