Hi James,
On Wed, 08 Jul 2009, James Harper wrote:
> 7200 + 9 * 75 = 7875 seconds = 2 hours, 11 minutes and 15 seconds. I
> don't think that's a coincidence.
I'm inclined to agree. Thanks :-)
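
For the archives, a quick check of that arithmetic. The three numbers match the usual Linux TCP keepalive defaults (tcp_keepalive_time = 7200, tcp_keepalive_probes = 9, tcp_keepalive_intvl = 75). I am assuming the host still runs with those defaults, and the function name below is only for illustration:

```python
# Assumed Linux defaults for net.ipv4.tcp_keepalive_*; check sysctl on the
# actual host, since these may have been tuned.
KEEPALIVE_TIME = 7200    # seconds a connection sits idle before the first probe
KEEPALIVE_PROBES = 9     # unanswered probes before the connection is dropped
KEEPALIVE_INTVL = 75     # seconds between probes


def keepalive_timeout(idle, probes, interval):
    """Seconds until an idle, silently dropped connection is declared dead."""
    return idle + probes * interval


total = keepalive_timeout(KEEPALIVE_TIME, KEEPALIVE_PROBES, KEEPALIVE_INTVL)
hours, remainder = divmod(total, 3600)
minutes, seconds = divmod(remainder, 60)
print(f"{total} seconds = {hours} h {minutes} min {seconds} s")
```

If the firewall had already forgotten the idle connection, every probe would go unanswered and the error would show up at exactly this point.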
> There are 3 TCP connections when a backup runs:
> DIR->SD
> DIR->FD
> FD->SD
>
> The FD->SD one moves a lot of traffic, and one of the DIR->SD or DIR->FD
> ones moves a bit (attributes) but I can't remember which one. The other
> one is probably sitting idle and a firewall somewhere is timing out the
> connection long before the 2 hours that Linux uses as a minimum.
I see. I was trying to work out why the FD->SD connection had died, but it
was killed by bacula-dir when it detected that one of the others had died.
Looking at the error, I'm going to guess that the problem is on the DIR->FD
socket:
07-Jul 13:19 cuimhne-dir JobId 58: Fatal error: Network error with FD during Backup: ERR=Connection timed out
I've set a heartbeat interval of 60 seconds on the director and am running
the backup again to see what happens.
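
Roughly, the change looks like the fragment below. This is only a sketch: the director name is taken from the log line above, every other directive is omitted, and I am assuming the Director resource in bacula-dir.conf is the right place for it in this setup.

```
# bacula-dir.conf (fragment; remaining directives omitted)
Director {
  Name = cuimhne-dir
  Heartbeat Interval = 60    # seconds; intended to keep idle sockets from timing out
}
```

The idea is that a little traffic every 60 seconds should keep the otherwise idle connection from looking dead to the firewall.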
I see similar issues with ssh connections through that firewall and have a
heartbeat set on them too, though I'd never noticed the significance of
7875 seconds. Hopefully this will be a fix.
Thanks very much,
Gavin
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users