My two cents: this looks related to the TCP MSS.
I remember having similar problems when I deployed my
strongSwan setup (not sure whether it was with Bacula or some other
app).
If so, it is completely unrelated to Bacula; try playing with the
iptables TCPMSS target in the appropriate chain/table.
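To illustrate, here is a rough sketch of what I mean, not a tested recipe for your setup. The FORWARD chain, the 1360 value and the probe size are assumptions; "remoteserver" is just the placeholder name from the logs. If the IPsec endpoint is the backup host itself rather than a gateway, the OUTPUT chain would be the place to look instead.

```shell
# Probe the path through the tunnel with "don't fragment" set (Linux iputils ping).
# 1400 bytes of payload + 28 bytes of ICMP/IP headers = 1428-byte packets.
# If this fails while smaller sizes work, an MTU/MSS problem is likely.
ping -c 3 -M do -s 1400 remoteserver

# Use ONE of the two rules below, not both.

# Option 1: clamp the MSS of forwarded TCP connections to the path MTU.
iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
         -j TCPMSS --clamp-mss-to-pmtu

# Option 2: force a fixed MSS (1360 is only an example value; pick one
# that leaves room for the ESP overhead on your link).
iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
         -j TCPMSS --set-mss 1360
```

Both rules only rewrite the MSS option in SYN packets, so they affect new connections only; restart the backup job after adding one.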
Regards.
On 27/09/2011 10:20, Kevin Keane wrote:
Hi,
I recently made several changes to my network, and ever since my bacula backups to the affected server error out after exactly 15 minutes. I don't know exactly which change is the culprit, but I suspect it is somehow IPSec-related, and would appreciate some help troubleshooting the problem.
I have servers in two locations: my main office with the bacula director and SD, and several clients, and two remote servers in a data center 2000 miles away. These servers only run the bacula-fd.
The director used to use SSH to connect to both remote servers, and this worked reliably.
Recently, I implemented IPSec to one of the two remote servers. Director, remote FD and SD can now connect directly to each other, without any other tunnel (logically, IPSec is transparent). At the same time, I also switched my Internet connection from Cable modem to DSL. The second server still uses the SSH tunnel to do the backups.
Unfortunately, using IPSec, the backups seem to fail after 15 minutes; the connection from FD to SD seems to get severed. The backups to the second server, using SSH, work without a problem.
The director produces this log output:
26-Sep 19:31 my-dir JobId 10686: Start Backup JobId 10686, Job=remoteserver.2011-09-26_19.05.00_06
26-Sep 19:31 my-dir JobId 10686: Created new Volume "remoteserver_20110926193150_Differential.bacula" in catalog.
26-Sep 19:31 my-dir JobId 10686: Using Device "SATADisk1"
26-Sep 19:31 Disk1 JobId 10686: Labeled new Volume "remoteserver_20110926193150_Differential.bacula" on device "SATADisk1" (/misc/BACKUP1).
26-Sep 19:31 Disk1 JobId 10686: Wrote label to prelabeled Volume "remoteserver_20110926193150_Differential.bacula" on device "SATADisk1" (/misc/BACKUP1)
26-Sep 19:31 my-dir JobId 10686: Max Volume jobs=1 exceeded. Marking Volume "remoteserver_20110926193150_Differential.bacula" as Used.
26-Sep 19:46 my-dir JobId 10686: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
26-Sep 19:46 Disk1 JobId 10686: JobId=10686 Job="remoteserver.2011-09-26_19.05.00_06" marked to be canceled.
26-Sep 19:46 Disk1 JobId 10686: Error: bsock.c:548 Read expected 65536 got 1392 from client:xxx.xxx.xxx.xxx:366432
The FD produces this error message:
Sep 27 02:46:58 remoteserver bacula-fd: bsock.c:393 Write error sending 270 bytes to client::::36387: ERR=Connection reset by peer
------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2dcopy1
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
--
Alexandre Chapellon
Open source systems and network engineering.
Follow me on twitter: @alxgomz