Hi,
I recently made several changes to my network, and ever since, my Bacula
backups to the affected server have failed after exactly 15 minutes. I don't
know which change is the culprit, but I suspect it is somehow IPSec-related,
and I would appreciate some help troubleshooting the problem.
I have servers in two locations: my main office, which hosts the Bacula
director, the SD, and several clients; and a data center 2000 miles away,
which hosts two remote servers. The remote servers run only bacula-fd.
The director used to connect to both remote servers over SSH, and this worked
reliably.
Recently, I set up IPSec to one of the two remote servers. The director, the
remote FD, and the SD can now connect directly to each other without any other
tunnel (logically, IPSec is transparent). At the same time, I also switched my
Internet connection from cable modem to DSL. The other remote server still
uses the SSH tunnel for its backups.
Unfortunately, over IPSec the backups fail after 15 minutes; the connection
from the FD to the SD appears to get severed. The backups to the other remote
server, which still uses SSH, work without a problem.
The director produces this log output:
26-Sep 19:31 my-dir JobId 10686: Start Backup JobId 10686, Job=remoteserver.2011-09-26_19.05.00_06
26-Sep 19:31 my-dir JobId 10686: Created new Volume "remoteserver_20110926193150_Differential.bacula" in catalog.
26-Sep 19:31 my-dir JobId 10686: Using Device "SATADisk1"
26-Sep 19:31 Disk1 JobId 10686: Labeled new Volume "remoteserver_20110926193150_Differential.bacula" on device "SATADisk1" (/misc/BACKUP1).
26-Sep 19:31 Disk1 JobId 10686: Wrote label to prelabeled Volume "remoteserver_20110926193150_Differential.bacula" on device "SATADisk1" (/misc/BACKUP1)
26-Sep 19:31 my-dir JobId 10686: Max Volume jobs=1 exceeded. Marking Volume "remoteserver_20110926193150_Differential.bacula" as Used.
26-Sep 19:46 my-dir JobId 10686: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
26-Sep 19:46 Disk1 JobId 10686: JobId=10686 Job="remoteserver.2011-09-26_19.05.00_06" marked to be canceled.
26-Sep 19:46 Disk1 JobId 10686: Error: bsock.c:548 Read expected 65536 got 1392 from client:xxx.xxx.xxx.xxx:366432
The FD produces this error message:
Sep 27 02:46:58 remoteserver bacula-fd: bsock.c:393 Write error sending 270 bytes to client::::36387: ERR=Connection reset by peer
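
The 15-minute figure can be checked against the timestamps in the director
log. The following is only a minimal Python sketch: the year 2011 is an
assumption taken from the job name (the log lines themselves omit it), the
function name minutes_between is purely illustrative, and month parsing
assumes an English locale.

```python
from datetime import datetime

# Bacula director log lines look like "26-Sep 19:31 ..." and carry no year,
# so a year is appended before parsing (assumed to be 2011 from the job name).
LOG_FORMAT = "%d-%b %H:%M %Y"


def minutes_between(start: str, end: str, year: int = 2011) -> float:
    """Return the minutes elapsed between two director log timestamps."""
    t0 = datetime.strptime(f"{start} {year}", LOG_FORMAT)
    t1 = datetime.strptime(f"{end} {year}", LOG_FORMAT)
    return (t1 - t0).total_seconds() / 60


job_start = "26-Sep 19:31"    # "Start Backup JobId 10686"
fatal_error = "26-Sep 19:46"  # "Fatal error: Network error with FD"

elapsed = minutes_between(job_start, fatal_error)
print(f"Elapsed before failure: {elapsed:.0f} minutes")
```

Because the director log only has minute resolution, this confirms the gap to
the nearest minute, not to the second.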