Re: [Bacula-users] Director's comm line to SD dropped

2013-07-25 12:28:51
From: Josh Fisher <jfisher AT pvct DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 25 Jul 2013 12:25:08 -0400
On 7/25/2013 9:58 AM, Yann Cézard wrote:
> On 24/07/2013 16:26, Jeff Dickens wrote:
>> Hi. I keep getting this error on one of my servers.
>>
>> A few words about the setup:
>>
>> The director is running under Ubuntu 12.04 LTS in a virtual machine 
>> on a very lightly loaded Xeon X3 with SSDs.
>>
>> The FD and SD are both running on a QNAP NAS box.  I hand-built a 
>> bacula package for the QNAP.  All are running 5.2.12.
>>
>> I have two identical QNAP NAS boxes, running the identical bacula 
>> package. One works perfectly and has been extensively tested. The 
>> only difference is that the failing one is connected to the director 
>> via a VPN (over Comcast Internet) and the working one is on the same 
>> LAN.
>>
>> Jobs on the remote QNAP will sometimes work for days and then 
>> sometimes only for a few hours. Restarting the SD will always make 
>> it work for a while. The log looks like this:
>>
>> 22-Jul 23:05 kiskadee-dir JobId 6445: No prior Full backup Job record 
>> found.
>> 22-Jul 23:05 kiskadee-dir JobId 6445: No prior or suitable Full 
>> backup found in catalog. Doing FULL backup.
>> 24-Jul 09:43 kiskadee-dir JobId 6445: Start Backup JobId 6445, 
>> Job=mill1-gene.2013-07-22_23.05.01_39
>> 24-Jul 09:43 kiskadee-dir JobId 6445: Purging oldest volume 
>> "mill1-gene-full-0117"
>> 24-Jul 09:43 kiskadee-dir JobId 6445: 1 File on Volume 
>> "mill1-gene-full-0117" purged from catalog.
>> 24-Jul 09:43 kiskadee-dir JobId 6445: There are no more Jobs 
>> associated with Volume "mill1-gene-full-0117". Marking it purged.
>> 24-Jul 09:43 kiskadee-dir JobId 6445: All records pruned from Volume 
>> "mill1-gene-full-0117"; marking it "Purged"
>> 24-Jul 09:43 kiskadee-dir JobId 6445: Using Device "FileStorage"
>> 24-Jul 09:44 mill1-sd JobId 6445: Recycled volume 
>> "mill1-gene-full-0117" on device "FileStorage" (/share/bacula), all 
>> previous data lost.
>> 24-Jul 09:43 kiskadee-dir JobId 6445: Max Volume jobs=1 exceeded. 
>> Marking Volume "mill1-gene-full-0117" as Used.
>> 24-Jul 10:02 mill1-fd JobId 6445: Error: bsock.c:429 Write error 
>> sending 65809 bytes to Storage 
>> daemon:mill1.intranet.seamanpaper.com:9103: ERR=Connection reset 
>> by peer
>> 24-Jul 10:02 mill1-fd JobId 6445: Fatal error: backup.c:1200 Network 
>> send error to SD. ERR=Connection reset by peer
>> 24-Jul 10:02 kiskadee-dir JobId 6445: Error: Director's comm line to 
>> SD dropped.
>> 24-Jul 10:02 kiskadee-dir JobId 6445: Error: Bacula kiskadee-dir 
>> 5.2.12 (12Sep12):
>>
>> Thanks in advance for any light you can shed on this.
>>
>> -- 
>> *Jeff Dickens*
>>      IT Manager      978-632-1513
>>
> Hi Jeff,
>
> You should read this FAQ entry:
> http://wiki.bacula.org/doku.php?id=faq#my_backup_starts_but_dies_after_a_while_with_connection_reset_by_peer_error
> Pay particular attention to the part about keepalive, since you are 
> using a VPN.
>
> Regards,
> -- 
> Yann Cézard - administrateur systèmes serveurs
> Centre de ressources informatiques    -http://cri.univ-pau.fr
> Université de Pau et des pays de l'Adour -http://www.univ-pau.fr

In addition to routers/switches/firewalls between the FD and SD, I have 
also seen the machine running the FD cause this issue. For example, if a 
relatively weak client machine takes a long time to compress/encrypt a 
large file, a network device driver with overly aggressive power 
management settings can decide there is no activity and power down the 
PHY, even though there is an open socket. I don't know if that is a bug 
or a feature. I doubt that is the case with the QNAP box, but I have 
seen it with Windows and Mac clients, particularly when connected over 
WiFi.
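
For anyone unfamiliar with the keepalive idea mentioned in the FAQ entry, here is a minimal Python sketch of what it means at the socket level: the OS periodically sends probes on an otherwise idle TCP connection so that intermediate devices keep seeing traffic. This is not Bacula code. The function name enable_keepalive and the timing values are assumptions chosen for illustration, and the three tuning options use the Linux names, so they are skipped on platforms that lack them. Bacula's daemons are tuned through their own Heartbeat Interval configuration directive, not through anything like this.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for a socket (illustrative helper).

    idle:     seconds of silence before the first probe is sent
    interval: seconds between subsequent probes
    count:    unanswered probes before the connection is declared dead
    """
    # SO_KEEPALIVE is the portable switch that enables probing at all.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The per-socket timing options are platform-specific (Linux names).
    # Skip any that this platform's socket module does not expose.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    # A non-zero value means keepalive is enabled on this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    s.close()
```

Note that keepalive probes only help with devices that drop idle connections (VPN endpoints, NAT, firewalls). They would not necessarily stop a driver from powering down the interface.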

