Re: [Bacula-users] Network problem?

2009-05-26 09:20:32
Subject: Re: [Bacula-users] Network problem?
From: Benedikt Carda <carda AT two-wings DOT net>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 26 May 2009 15:15:48 +0200
Hi Arno,

thanks for your reply. Over the last few days I tried to implement your 
suggestion. As far as I could find out, the Heartbeat Interval 
directive has to be added to the Director's configuration, so this is 
what my Director configuration looks like right now:

Director {                            # define myself
  Name = plato-dir
  DIRport = 9101                # where we listen for UA connections
  QueryFile = "/usr/lib/bacula/query.sql"
  WorkingDirectory = "/var/lib/bacula"
  PidDirectory = "/var/run"
  Maximum Concurrent Jobs = 1
  Password = "somepass"         # Console password
  Messages = Daemon
  Heartbeat Interval = 60
}

Anyway, the error stays the same. Did I add this configuration option 
correctly?
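
For comparison, here is a minimal sketch of where the same directive could go on the other two daemons, since the File daemon and the Storage daemon each accept their own Heartbeat Interval. The resource names merkur-fd and plato-sd are taken from the job report quoted below; the ports, directories, and the value of 60 seconds are assumptions, not settings confirmed anywhere in this thread.

```
# bacula-fd.conf on the client (merkur); port and paths are assumed defaults
FileDaemon {
  Name = merkur-fd
  FDport = 9102
  WorkingDirectory = "/var/lib/bacula"
  Pid Directory = "/var/run"
  Heartbeat Interval = 60       # assumed value, same as in the Director above
}

# bacula-sd.conf on the storage host (plato); port and paths are assumed defaults
Storage {
  Name = plato-sd
  SDPort = 9103
  WorkingDirectory = "/var/lib/bacula"
  Pid Directory = "/var/run"
  Heartbeat Interval = 60       # assumed value
}
```

The File daemon and the Storage daemon read their configuration at startup, so each would need a restart before such a change takes effect.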

Best Regards,
Benedikt



Arno Lehmann wrote:
> Hi,
>
> 19.05.2009 07:51, Benedikt Carda wrote:
>   
>> Dear All,
>>     
>
> welcome to the mailing list! I hope we can help you with your 
> problems, and I hope you enjoy the way the Bacula community works 
> together!
>
>   
>> I kindly ask for help to solve the below problem:
>>
>> My configuration:
>> OS: Centos 5.3
>> Bacula version: 2.4.4.1
>>
>> I am trying to set up a backup job that backs up just one database file (a 
>> mysql dump) every night over the internet (in order to have the 
>> backup off site). I already use this setup to back up other files and 
>> the connection works; only with large files,
>>     
>
> Ah... quite obvious, if you ask me :-)
>
>   
>> it seems bacula has a 
>> problem. The error is exactly the same every day:
>>
>> 19-Mai 04:05 plato-dir JobId 206: No prior Full backup Job record found.
>> 19-Mai 04:05 plato-dir JobId 206: No prior or suitable Full backup found 
>> in catalog. Doing FULL backup.
>> 19-Mai 04:05 plato-dir JobId 206: Start Backup JobId 206, 
>> Job=merkur-database.2009-05-19_04.05.00.27
>> 19-Mai 04:05 plato-dir JobId 206: Using Device "FileStorage"
>> 19-Mai 04:05 plato-sd JobId 206: Volume "2009-0003" previously written, 
>> moving to end of data.
>> 19-Mai 04:05 plato-sd JobId 206: Ready to append to end of Volume 
>> "2009-0003" size=777761344
>> 19-Mai 05:11 plato-sd JobId 206: Job write elapsed time = 01:06:18, 
>> Transfer rate = 39.45 K bytes/second
>> 19-Mai 06:16 plato-dir JobId 206: Fatal error: Network error with FD 
>> during Backup: ERR=Connection timed out
>> 19-Mai 06:16 plato-dir JobId 206: Fatal error: No Job status returned 
>> from FD.
>>     
>
> This means that the connection between DIR and FD is dead.
>
>   
>> 19-Mai 06:16 plato-dir JobId 206: Error: Bacula plato-dir 2.4.4 
>> (28Dec08): 19-Mai-2009 06:16:18
>>   Build OS:               i686-redhat-linux-gnu redhat
>>   JobId:                  206
>>   Job:                    merkur-database.2009-05-19_04.05.00.27
>>   Backup Level:           Full (upgraded from Incremental)
>>   Client:                 "merkur-fd" 2.4.4 (28Dec08) 
>> i686-redhat-linux-gnu,redhat,
>>   FileSet:                "merkur-database" 2009-02-26 04:05:00
>>   Pool:                   "plato" (From Job resource)
>>   Storage:                "plato-sd" (From Job resource)
>>   Scheduled time:         19-Mai-2009 04:05:00
>>   Start time:             19-Mai-2009 04:05:02
>>   End time:               19-Mai-2009 06:16:18
>>   Elapsed time:           2 hours 11 mins 16 secs
>>   Priority:               10
>>   FD Files Written:       0
>>   SD Files Written:       2
>>   FD Bytes Written:       0 (0 B)
>>   SD Bytes Written:       156,970,404 (156.9 MB)
>>   Rate:                   0.0 KB/s
>>   Software Compression:   None
>>   VSS:                    no
>>   Storage Encryption:     no
>>   Volume name(s):         2009-0003
>>   Volume Session Id:      95
>>   Volume Session Time:    1238603747
>>   Last Volume Bytes:      934,848,632 (934.8 MB)
>>   Non-fatal FD errors:    0
>>   SD Errors:              0
>>   FD termination status:  Error
>>   SD termination status:  OK
>>   Termination:            *** Backup Error ***
>>
>> It seems like the network connection was terminated at some point.
>>     
>
> Well, not the network connection itself, but the TCP session between 
> DIR and FD.
>
>   
>> But 
>> that is not true as it is a permanent connection with a static IP 
>> address on both sides.
>>     
>
> There's more to it than just IP addresses... Bacula uses TCP, so there 
> is a persistent connection established. Any router / firewall between 
> DIR and FD can affect the state of that connection, and that's most 
> likely what you observe: after a certain period of inactivity, some 
> router decides the connection is stale and closes it. Some routers 
> do that and ignore the standards that, at the TCP/IP level, keep 
> connections open (this can be a feature, too, if you need to prevent 
> DoS attacks that open so many connections that resources are 
> exhausted).
>
>   
>> Furthermore, the line "SD Files Written:" states 
>> that two files were written to the storage daemon. That is even 
>> one more than expected; it should only back up this one directory, which 
>> contains only one file. Secondly, the "SD Bytes Written:" line 
>> shows nearly the size the mysql dump file had on that day (a few bytes 
>> more than the file actually has). This means it actually wrote the whole 
>> file, but after that I get an error, or let's say the FD doesn't respond 
>> anymore. How can this happen? What could be a solution?
>>     
>
> How this happens is explained above. The cure is probably to use 
> Bacula's "Heartbeat interval" directive, to force Bacula to send some 
> packets between DIR and FD regularly. That should enable any router in 
> between to understand that this connection is still in use and should 
> not be forcibly dropped.
>
> Arno
>
>   
>> Thanks in advance.
>>
>> Best Regards,
>> Benedikt.
>>

