Bacula-users

Re: [Bacula-users] Director's comm line to SD dropped

2013-07-25 10:20:33
Subject: Re: [Bacula-users] Director's comm line to SD dropped
From: Yann Cézard <yann.cezard AT univ-pau DOT fr>
To: Jeff Dickens <jeff AT seamanpaper DOT com>
Date: Thu, 25 Jul 2013 15:58:31 +0200
On 24/07/2013 16:26, Jeff Dickens wrote:
Hi. I keep getting this error on one of my servers.

A few words about the setup:

The director is running under Ubuntu 12.04 LTS in a virtual machine on a very lightly loaded Xeon X3 with SSDs.

The FD and SD both run on a QNAP NAS box, for which I hand-built a Bacula package. All daemons are running 5.2.12.

I have two identical QNAP NAS boxes running the same Bacula package. One works perfectly and has been extensively tested. The only difference is that one is connected to the director via a VPN (over Comcast Internet) and the other is on the same LAN as the director.

The remote QNAP sometimes works for days and sometimes only for a few hours. Restarting the SD always makes it work again for a while. The log looks like this:

22-Jul 23:05 kiskadee-dir JobId 6445: No prior Full backup Job record found.
22-Jul 23:05 kiskadee-dir JobId 6445: No prior or suitable Full backup found in catalog. Doing FULL backup.
24-Jul 09:43 kiskadee-dir JobId 6445: Start Backup JobId 6445, Job=mill1-gene.2013-07-22_23.05.01_39
24-Jul 09:43 kiskadee-dir JobId 6445: Purging oldest volume "mill1-gene-full-0117"
24-Jul 09:43 kiskadee-dir JobId 6445: 1 File on Volume "mill1-gene-full-0117" purged from catalog.
24-Jul 09:43 kiskadee-dir JobId 6445: There are no more Jobs associated with Volume "mill1-gene-full-0117". Marking it purged.
24-Jul 09:43 kiskadee-dir JobId 6445: All records pruned from Volume "mill1-gene-full-0117"; marking it "Purged"
24-Jul 09:43 kiskadee-dir JobId 6445: Using Device "FileStorage"
24-Jul 09:44 mill1-sd JobId 6445: Recycled volume "mill1-gene-full-0117" on device "FileStorage" (/share/bacula), all previous data lost.
24-Jul 09:43 kiskadee-dir JobId 6445: Max Volume jobs=1 exceeded. Marking Volume "mill1-gene-full-0117" as Used.
24-Jul 10:02 mill1-fd JobId 6445: Error: bsock.c:429 Write error sending 65809 bytes to Storage daemon:mill1.intranet.seamanpaper.com:9103: ERR=Connection reset by peer
24-Jul 10:02 mill1-fd JobId 6445: Fatal error: backup.c:1200 Network send error to SD. ERR=Connection reset by peer
24-Jul 10:02 kiskadee-dir JobId 6445: Error: Director's comm line to SD dropped.
24-Jul 10:02 kiskadee-dir JobId 6445: Error: Bacula kiskadee-dir 5.2.12 (12Sep12):

Thanks in advance for any light you can shed on this.

--
     Jeff Dickens
     IT Manager      978-632-1513

Hi Jeff,

You should read this: http://wiki.bacula.org/doku.php?id=faq#my_backup_starts_but_dies_after_a_while_with_connection_reset_by_peer_error
See especially the part about keepalive, since you are using a VPN.
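
On the Bacula side, the keepalive is controlled by the Heartbeat Interval directive. Here is a minimal sketch, using the daemon names from your log. The 60-second value is only my assumption, not a recommendation taken from the FAQ; pick something shorter than the idle timeout of your VPN or firewall. All other directives in each resource stay as they are.

```conf
# bacula-fd.conf on the QNAP (assumed value: 60 seconds)
FileDaemon {
  Name = mill1-fd
  # ... existing directives unchanged ...
  Heartbeat Interval = 60
}

# bacula-sd.conf on the QNAP
Storage {
  Name = mill1-sd
  # ... existing directives unchanged ...
  Heartbeat Interval = 60
}

# bacula-dir.conf on the director
Director {
  Name = kiskadee-dir
  # ... existing directives unchanged ...
  Heartbeat Interval = 60
}
```

Restart each daemon after changing its configuration so the new setting takes effect.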

Regards,
-- 
Yann Cézard - administrateur systèmes serveurs
Centre de ressources informatiques    -     http://cri.univ-pau.fr
Université de Pau et des pays de l'Adour -  http://www.univ-pau.fr