Re: [Bacula-users] Network problem?

2009-05-27 04:35:37
Subject: Re: [Bacula-users] Network problem?
From: Arno Lehmann <al AT its-lehmann DOT de>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 27 May 2009 10:30:18 +0200

Hello,

26.05.2009 15:15, Benedikt Carda wrote:
> Hi Arno,
> 
> thanks for your reply. Over the last few days I tried to implement your 
> suggestions. As far as I could find out, the heartbeat interval 
> configuration option has to be added to the configuration of the 
> director, so this is what my director configuration looks like right now:
> 
> Director {                            # define myself
>   Name = plato-dir
>   DIRport = 9101                # where we listen for UA connections
>   QueryFile = "/usr/lib/bacula/query.sql"
>   WorkingDirectory = "/var/lib/bacula"
>   PidDirectory = "/var/run"
>   Maximum Concurrent Jobs = 1
>   Password = "somepass"         # Console password
>   Messages = Daemon
>   Heartbeat Interval = 60
> }
> 
> Anyway, the error stays the same. Did I add this configuration option 
> correctly?

Basically, yes. But it might help if you also added it to the SD and 
FD configurations.
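
As a rough sketch only, that could look something like the following. 
I am assuming the daemon names match what your job report shows 
(plato-sd and merkur-fd) and that you reuse the 60-second interval 
from your DIR; everything else in your existing files stays as it is:

# bacula-sd.conf (Storage resource of the storage daemon)
Storage {
  Name = plato-sd
  # ... your existing directives, unchanged ...
  Heartbeat Interval = 60
}

# bacula-fd.conf on the client (FileDaemon resource)
FileDaemon {
  Name = merkur-fd
  # ... your existing directives, unchanged ...
  Heartbeat Interval = 60
}

The SD and FD need to be restarted afterwards to pick up the change.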

I would have expected the setting in the DIR to be sufficient, though...

Which leaves the potential problem of some intermediate router 
ignoring the heartbeat and simply closing down the session. If that's 
the case, you might have more success with setting up a VPN(-like) 
connection to your remote clients.

SSH with port forwarding is easy to set up and manage (and can ensure 
connections stay open by providing dummy traffic). There's a great 
article at wiki.bacula.org from Kevin.

A full-blown VPN using IPSec or OpenVPN can be easier to manage once 
you get beyond a certain number of clients, but is definitely harder 
to set up and maintain. OpenVPN is relatively simple, and can hide 
connection loss from the higher-level applications (IIRC, I once set up 
an OpenVPN tunnel just to work around this problem, and it worked well).

Cheers,

Arno

> Best Regards,
> Benedikt
> 
> 
> 
> Arno Lehmann wrote:
>> Hi,
>>
>> 19.05.2009 07:51, Benedikt Carda wrote:
>>   
>>> Dear All,
>>>     
>> welcome to the mailing list! I hope we can help you with your 
>> problems, and I hope you enjoy the way the Bacula community works 
>> together!
>>
>>   
>>> I kindly ask for help to solve the below problem:
>>>
>>> My configuration:
>>> OS: Centos 5.3
>>> Bacula version: 2.4.4.1
>>>
>>> I am trying to set up a backup job to back up just one database file (a 
>>> mysql dump) nightly over the internet (in order to have the 
>>> backup off site). I am already using this to back up other files and 
>>> the connection works; only with large files,
>>>     
>> Ah... quite obvious, if you ask me :-)
>>
>>   
>>> it seems bacula has a 
>>> problem. The error is exactly the same every day:
>>>
>>> 19-Mai 04:05 plato-dir JobId 206: No prior Full backup Job record found.
>>> 19-Mai 04:05 plato-dir JobId 206: No prior or suitable Full backup found 
>>> in catalog. Doing FULL backup.
>>> 19-Mai 04:05 plato-dir JobId 206: Start Backup JobId 206, 
>>> Job=merkur-database.2009-05-19_04.05.00.27
>>> 19-Mai 04:05 plato-dir JobId 206: Using Device "FileStorage"
>>> 19-Mai 04:05 plato-sd JobId 206: Volume "2009-0003" previously written, 
>>> moving to end of data.
>>> 19-Mai 04:05 plato-sd JobId 206: Ready to append to end of Volume 
>>> "2009-0003" size=777761344
>>> 19-Mai 05:11 plato-sd JobId 206: Job write elapsed time = 01:06:18, 
>>> Transfer rate = 39.45 K bytes/second
>>> 19-Mai 06:16 plato-dir JobId 206: Fatal error: Network error with FD 
>>> during Backup: ERR=Die Wartezeit für die Verbindung ist abgelaufen 
>>> [in English: Connection timed out]
>>> 19-Mai 06:16 plato-dir JobId 206: Fatal error: No Job status returned 
>>> from FD.
>>>     
>> This means that the connection between DIR and FD is dead.
>>
>>   
>>> 19-Mai 06:16 plato-dir JobId 206: Error: Bacula plato-dir 2.4.4 
>>> (28Dec08): 19-Mai-2009 06:16:18
>>>   Build OS:               i686-redhat-linux-gnu redhat
>>>   JobId:                  206
>>>   Job:                    merkur-database.2009-05-19_04.05.00.27
>>>   Backup Level:           Full (upgraded from Incremental)
>>>   Client:                 "merkur-fd" 2.4.4 (28Dec08) 
>>> i686-redhat-linux-gnu,redhat,
>>>   FileSet:                "merkur-database" 2009-02-26 04:05:00
>>>   Pool:                   "plato" (From Job resource)
>>>   Storage:                "plato-sd" (From Job resource)
>>>   Scheduled time:         19-Mai-2009 04:05:00
>>>   Start time:             19-Mai-2009 04:05:02
>>>   End time:               19-Mai-2009 06:16:18
>>>   Elapsed time:           2 hours 11 mins 16 secs
>>>   Priority:               10
>>>   FD Files Written:       0
>>>   SD Files Written:       2
>>>   FD Bytes Written:       0 (0 B)
>>>   SD Bytes Written:       156,970,404 (156.9 MB)
>>>   Rate:                   0.0 KB/s
>>>   Software Compression:   None
>>>   VSS:                    no
>>>   Storage Encryption:     no
>>>   Volume name(s):         2009-0003
>>>   Volume Session Id:      95
>>>   Volume Session Time:    1238603747
>>>   Last Volume Bytes:      934,848,632 (934.8 MB)
>>>   Non-fatal FD errors:    0
>>>   SD Errors:              0
>>>   FD termination status:  Error
>>>   SD termination status:  OK
>>>   Termination:            *** Backup Error ***
>>>
>>> It seems like the network connection was terminated at some point.
>>>     
>> Well, not the network connection itself, but the TCP session between 
>> DIR and FD.
>>
>>   
>>> But 
>>> that is not true as it is a permanent connection with a static IP 
>>> address on both sides.
>>>     
>> There's more to it than just IP addresses... Bacula uses TCP, so there 
>> is a persistent connection established. Any router / firewall between 
>> DIR and FD can affect the state of that connection, and that's most 
>> likely what you observe: after a certain period of inactivity, some 
>> router decides the connection is stale and closes it. Some routers 
>> do that instead of following the standards that, at the TCP/IP 
>> level, keep connections open (this can be a feature, too, if you need 
>> to prevent DoS attacks in which the attacker opens so many 
>> connections that resources are exhausted).
>>
>>   
>>> Furthermore, the line "SD Files Written:" states 
>>> that two files were written to the storage daemon. This is even 
>>> one more than expected; it should only back up this one directory, which 
>>> contains only one file. Secondly, the "SD Bytes Written:" line 
>>> shows roughly the size the mysql dump file had on that day (a few bytes 
>>> more than the file actually has). This means it actually wrote the whole 
>>> file, but after that I get an error, or let's say the FD doesn't respond 
>>> anymore or something. How can this happen? What could be a solution?
>>>     
>> How this happens is explained above. The cure is probably to use 
>> Bacula's "Heartbeat interval" directive, to force Bacula to send some 
>> packets between DIR and FD regularly. That should enable any router in 
>> between to understand that this connection is still in use and should 
>> not be forcibly dropped.
>>
>> Arno
>>
>>   
>>> Thanks in advance.
>>>
>>> Best Regards,
>>> Benedikt.

-- 
Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
www.its-lehmann.de

_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
