Re: [Bacula-users] Trunking Bacula-Dir/Bacula-SD on the same layer 3 network.

2008-09-19 05:38:48
From: Matthew Ife <Matthew.Ife AT ukfast.co DOT uk>
To: "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Fri, 19 Sep 2008 10:38:24 +0100

Does anyone have any thoughts on this?

From: bacula-users-bounces AT lists.sourceforge DOT net [mailto:bacula-users-bounces AT lists.sourceforge DOT net] On Behalf Of Matthew Ife
Sent: 12 September 2008 14:39
To: bacula-users AT lists.sourceforge DOT net
Subject: [Bacula-users] Trunking Bacula-Dir/Bacula-SD on the same layer 3 network.

Hi guys,

We have an unusual Bacula network setup, and after spending a few weeks trying configuration after configuration I have hit a stumbling block that I am unable to work out.

Firstly, I apologize for the length of this, but I believe the history of what I have tried is relevant to what I am trying to achieve.

We offer backup services to our clients, who all have different needs: some are attached to a shared firewall and some are not. The shared firewall runs transparently, and we control access through it by placing customers who are behind the firewall on one VLAN and customers who are not behind it on another.

Since doing this we have seen numerous connectivity issues, caused by large volumes of backup traffic passing through the firewall, which eventually leads to connections being dropped and backups failing. Worse still for us, all this backup traffic running through the firewall causes loss of service, even for clients we do not back up.

To avoid this, we placed the backup servers on the same layer 3 network as our client machines, since only traffic destined for the gateway is directed through the firewall. However, we were still seeing connectivity issues.

Both firewalled and non-firewalled customers reside on the same layer 3 network but not on the same layer 2 network, so ARPing for the backup server doesn’t always work if the client is in the wrong broadcast domain (the firewalled VLAN). As a result, even traffic for backup machines connected to the same switch on the same layer 3 network was going out via the gateway, through the firewall, and back down from the firewall to the backup server.

To resolve this we decided to trunk the backup server so that it belongs to BOTH VLANs. In theory this should work, but in practice we have had problems.

So the backup server now has two VLANs, two interfaces and two IP addresses: one resides on VLAN A, the other on VLAN B. To prevent traffic going through any firewall, I changed our build script to generate Bacula configs that use interface A if the machine resides on VLAN A and interface B if it resides on VLAN B. After a few days of testing it became apparent that we were still seeing connectivity issues, though not quite the same ones: we were getting lots of "Authentication Failed" messages even though the authentication details were definitely correct.
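
As a rough illustration of that per-client choice (this is not the actual build script), the selection could look something like the Python sketch below. The function and table names are made up for the example. The VLAN numbers and host names are taken from the routing tables and configs further down, on the assumption that 163.186.srvlist.ukfast.net is the address on eth0.63 (firewalled) and 163.75.srvlist.ukfast.net is the address on eth0.64 (non-firewalled).

```python
# Hypothetical sketch of choosing the Storage address per client VLAN.
# Names such as SD_ADDRESS_BY_VLAN and storage_stanza are invented here.

# Backup server host name to use for each VLAN (assumed mapping, see above).
SD_ADDRESS_BY_VLAN = {
    63: "163.186.srvlist.ukfast.net",  # firewalled VLAN (eth0.63)
    64: "163.75.srvlist.ukfast.net",   # non-firewalled VLAN (eth0.64)
}


def storage_stanza(client_ip: str, vlan: int) -> str:
    """Render a Storage resource that points at the SD address on the client's VLAN."""
    tag = client_ip.replace(".", "_")
    address = SD_ADDRESS_BY_VLAN[vlan]
    return (
        "Storage {\n"
        f"  Name = file-{tag}\n"
        f"  Address = {address}\n"
        "  SDPort = 9103\n"
        '  Password = "XXXXXXX"\n'
        f"  Device = storage-{tag}\n"
        "  Media Type = File\n"
        "}\n"
    )


if __name__ == "__main__":
    print(storage_stanza("78.109.163.116", 63))  # firewalled client
    print(storage_stanza("78.109.163.43", 64))   # non-firewalled client
```

The only thing that varies between the two kinds of client is the Address line, which is what decides which of the server's two IPs the client's file daemon is told to contact.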

 

Further investigation revealed that, because both VLAN IPs are on the same network, the routing table was not always choosing the right interface to send traffic down. (In fact, it was nearly always sending down one. I think it would sometimes send down the other, and as the source/destination IPs would then mismatch it would intermittently throw an authentication error.) So I set to work once again, and this time I set up the routing using advanced routing (multiple routing tables, selected by source IP address). This time I could definitely confirm that I was sending data down the right interface through the correct VLAN. But now all the backups on the server failed! I received the following error: "Fatal error: Bad response from stored to open command".

Thus I am still unable to send backup traffic down the right interface using a trunked VLAN.

Our general configuration is as follows:

bacula-dir and bacula-sd reside on the same server.

Client machines are generally all connected to the same switch.

The backup server is trunked so that we avoid passing Bacula traffic through the firewall (well, that’s the intention!).

Any advice that helps me set this up correctly would be great.

Ultimately we need Bacula to send traffic down the right interface to the right client without causing problems. We cannot centralize our storage daemon, since the sheer number of customers we back up at any one time makes it unfeasible bandwidth-wise, and we are not keen on generating large quantities of inter-switch backup traffic. We are also restricted in the times at which we can run backups, because backing up a server has traditionally, for all intents and purposes, caused loss of its other services: Bacula uses up a large share of the customer's outbound bandwidth.

My most immediate questions are:

1. The authentication error we have experienced: we suspect this has something to do with how Bacula keeps authentication "tokens" for each client. If traffic suddenly arrives on the wrong interface (i.e. the interface it DIDN’T authenticate on), Bacula requires re-authentication. This can happen because the routing table contains a route for each VLAN; since both match, the first is nearly always chosen, but sometimes it seems the second is chosen. Can you shed any light on this?

2. When the routing rules are sorted out properly so that traffic has to go down the right interface, Bacula fails immediately and consistently with "Fatal error: Bad response from stored to open command". What does this mean and how can we fix it? Why would it appear when forcing traffic down one interface?

3. Does Bacula handle multiple interfaces better when the IP addresses are on different networks rather than on the same one?
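
One way to narrow these down, independently of Bacula, might be to check whether a plain TCP connection succeeds when it is pinned to each of the server's two source addresses. The sketch below is only an illustration of that idea: the helper name can_connect is made up, and it uses Python's standard socket.create_connection with its source_address parameter. It says nothing about Bacula's own authentication, only whether the TCP path works from a given source IP.

```python
import socket


def can_connect(source_ip: str, host: str, port: int, timeout: float = 5.0) -> bool:
    """Try a TCP connection to host:port from the given local source address."""
    try:
        # Port 0 lets the kernel pick the local port; only the address is pinned.
        with socket.create_connection(
            (host, port), timeout=timeout, source_address=(source_ip, 0)
        ):
            return True
    except OSError:
        # Covers refused connections, timeouts and failure to bind the source.
        return False


# Example use on the backup server (addresses taken from the routing tables in
# this post; 9102 is the FDPort from the client configs):
#
#   for src in ("78.109.163.186", "78.109.163.75"):
#       print(src, can_connect(src, "78.109.163.43", 9102))
```

If the connection works from one source address but not the other, that would point at routing or the firewall rather than at Bacula itself.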

 

Below are the routing rules we have set up.

[root@163 ~]# ip rule ls
0:      from all lookup 255
32765:  from 78.109.163.75 lookup vlan64
32766:  from all lookup main
32767:  from all lookup default

Please note, default implies vlan 63.

 

[root@163 ~]# ip route show table vlan64
78.109.163.0/24 dev eth0.64  scope link
default via 78.109.163.3 dev eth0.64

[root@163 ~]# ip route show table main
78.109.163.0/24 dev eth0.63  proto kernel  scope link  src 78.109.163.186
169.254.0.0/16 dev eth0.64  scope link
default via 78.109.163.3 dev eth0.63
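
To make the effect of these rules easier to reason about, here is a small Python model of the lookup: rules are tried in priority order, a rule applies only if the packet's source address matches, and within the selected table the most specific route wins. This is a simplified sketch and not how the kernel is implemented. It leaves out the local table (255) and the default table, on the assumption that neither contains a route relevant to client traffic.

```python
import ipaddress

# Rules in priority order, as in "ip rule ls" above: (source prefix, table).
RULES = [
    ("78.109.163.75/32", "vlan64"),
    ("0.0.0.0/0", "main"),
]

# Routes per table: (destination prefix, outgoing device).
TABLES = {
    "vlan64": [("78.109.163.0/24", "eth0.64"), ("0.0.0.0/0", "eth0.64")],
    "main": [
        ("78.109.163.0/24", "eth0.63"),
        ("169.254.0.0/16", "eth0.64"),
        ("0.0.0.0/0", "eth0.63"),
    ],
}


def lookup(src: str, dst: str):
    """Return (table, device) for a packet, or None if nothing matches."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for src_prefix, table in RULES:
        if src_ip not in ipaddress.ip_network(src_prefix):
            continue
        # Longest-prefix match within the selected table.
        matches = [
            (ipaddress.ip_network(prefix), dev)
            for prefix, dev in TABLES[table]
            if dst_ip in ipaddress.ip_network(prefix)
        ]
        if matches:
            _, dev = max(matches, key=lambda m: m[0].prefixlen)
            return table, dev
    return None
```

With these tables, lookup("78.109.163.75", "78.109.163.43") selects vlan64 and eth0.64, while lookup("78.109.163.186", "78.109.163.116") selects main and eth0.63. A source of 0.0.0.0 also falls through to main and eth0.63. As far as I understand it, that is the case for a connection opened without first binding to a source address, so connections initiated by the server may leave on eth0.63 regardless of which VLAN the client is on.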

 

#########Example of a firewalled client.##########

 

FileSet {
  Name = "78.109.163.116 Full Set"
  Include {
    Options {
      wildfile = "*pagefile.sys"
      wildfile = "*.log"
      exclude = yes
      signature = MD5
      onefs = no
      fstype = ntfs
    }
    File = C:/
    File = D:/
  }
}

Client {
  Name = srv-78_109_163_116
  Address = 163.116.srvlist.ukfast.net
  FDPort = 9102
  Catalog = MyCatalog
  Password = "JUbuVYFC"
  File Retention = 7 days
  Job Retention = 7 days
  AutoPrune = yes
}

Storage {
  Name = file-78_109_163_116
  Address = 163.186.srvlist.ukfast.net
  SDPort = 9103
  Password = "XXXXXXX"    # This is the password for the director to the SD - Don't get confused
  Device = storage-78_109_163_116
  Media Type = File
}

JobDefs {
  Name = "78_109_163_116 Job"
  Type = Backup
  Level = Incremental
  Client = srv-78_109_163_116
  FileSet = "78.109.163.116 Full Set"
  Schedule = "WeeklyCycle"
  Storage = file-78_109_163_116
  Messages = Standard
  Pool = pool-78_109_163_116
  Priority = 10
}

Job {
  Name = "78_109_163_116 Job"
  JobDefs = "78_109_163_116 Job"
  ClientRunBeforeJob = "C:/windows/sysstate.bat"
  Write Bootstrap = "/home/bacula/bootstraps/srv-78_109_163_116.bsr"
}

Pool {
  Name = pool-78_109_163_116
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Maximum Volumes = 5
  Maximum Volume Jobs = 7
  Maximum Volume Bytes = 5g
  VolumeRetention = 7d
  Volume Use Duration = 0
  LabelFormat = "srv-78_109_163_116-"
}

##########Example of a non-firewalled client#############

 

FileSet {
  Name = "78.109.163.43 Full Set"
  Include {
    Options {
      signature = MD5
      onefs = no
      fstype = ext2
      Exclude = yes
      wildfile = "*access.log"
      wildfile = "*access.log.*.*.gz"
      wildfile = "*access.log.*.*"
      wildfile = "*error.log"
    }
    File = /
  }
  Exclude {
    File = /proc
    File = /tmp
    File = /.journal
    File = /.fsck
    File = /sys
  }
}

Client {
  Name = srv-78_109_163_43
  Address = 163.43.srvlist.ukfast.net
  FDPort = 9102
  Catalog = MyCatalog
  Password = "oL3HxG38"
  File Retention = 7 days
  Job Retention = 7 days
  AutoPrune = yes
}

Storage {
  Name = file-78_109_163_43
  Address = 163.75.srvlist.ukfast.net    # we send traffic down the non-firewalled VLAN
  SDPort = 9103
  Password = "XXXXXXX"    # This is the password for the director to the SD - Don't get confused
  Device = storage-78_109_163_43
  Media Type = File
}

JobDefs {
  Name = "78_109_163_43 Job"
  Type = Backup
  Level = Incremental
  Client = srv-78_109_163_43
  FileSet = "78.109.163.43 Full Set"
  Schedule = "WeeklyCycle"
  Storage = file-78_109_163_43
  Messages = Standard
  Pool = pool-78_109_163_43
  Priority = 10
}

Job {
  Name = "78_109_163_43 Job"
  JobDefs = "78_109_163_43 Job"
  Write Bootstrap = "/home/bacula/bootstraps/srv-78_109_163_43.bsr"
}

Pool {
  Name = pool-78_109_163_43
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Maximum Volumes = 5
  Maximum Volume Jobs = 7
  Maximum Volume Bytes = 5g
  VolumeRetention = 7d
  Volume Use Duration = 0
  LabelFormat = "srv-78_109_163_43-"
}

Any information you can provide would be very helpful, and if you need more information from me please let me know.
