Bacula-users

Re: [Bacula-users] Bacula Setup for 300 clients & 20 Servers

2009-04-23 04:05:09
From: Arno Lehmann <al AT its-lehmann DOT de>
To: Bacula-users AT lists.sourceforge DOT net
Date: Thu, 23 Apr 2009 09:58:55 +0200
Hello,

and welcome to the mailing list!

I hope we can help you solve any issues you encounter with running 
Bacula, and of course I also hope that you'll help others as well!

22.04.2009 23:29, Jayson Broughton wrote:
> 
> 
> Hello list,
> 
>  
> 
> We are currently in the process of scrapping our old backup solution

Good idea :-)

> (windows nt client backup), and going with Bacula.  I have been tasked 
> with setting up a test environment for this.  As of right now I have 
> bacula backing up 6 clients to an external SD.  The DIR & SD are located 
> on the same server for testing purposes (The SD is an external 1TB drive 
> for testing purposes).

I assume this works as you want it to and you've got some experience 
with Bacula by now.

> 
> But alas, I need a little guidance on how to properly implement this.

Sure, see below. And keep in mind that there is also commercial 
support available - if you want to move some of the design and 
implementation work to someone specialized in this sort of task, 
there are companies available to assist you (Bacula Systems SA and 
its partners would be a good choice, of course ;-)

>  
> 
> This is the layout:
> 
>  
> 
> Local Servers = ~ 20 (Linux servers running oracle & windows servers 
> running various services)

Here, the services will be the major task. VSS probably covers some of 
them, but others - Oracle of course - will need some glue scripting.

> Remote Servers = ~ 10 (Linux & Windows)

Remote as in "connected by WAN/VPN"... ok.

> Local Clients = ~150 (Windows XP, Vista)
> 
> Remote Clients = ~150 (20 off-site locations scattered across the US).

Nicely sized network. You'll probably have some work to do with 
firewalls and VPN gateways, but that's probably no news to you.

>  
> 
> As of right now:
> 
> *I figured we would have a different pool for each department 
> (accounting, legal, etc) to better manage pools. 

Reasonable, but it might be simpler if you use fewer pools. A pool is 
most useful to organize backups into volumes with different retention 
settings and media types - finding the right volumes to restore a 
certain client from is handled by Bacula itself quite reasonably.

If you need to assign responsibilities for groups of clients to 
different backup operators or admins, reflecting that responsibility 
in the pools is, of course, reasonable.

> *The filesets are individual for each user (as each user has different 
> things needed backed up).

So creating and maintaining all those filesets will be a good part of 
your day-to-day workload with Bacula. I suggest you start 
experimenting with include files and auto-generated configuration soon.
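
A rough sketch of the include mechanism, assuming a layout I made up 
for illustration (the paths, the client name, the JobDefs name, the 
catalog name and the password are all hypothetical). The Director 
reads "@file" as "include this file here" and "@|command" as "include 
whatever this command prints":

```conf
# bacula-dir.conf: pull in one generated file per client
@|"sh -c 'for f in /etc/bacula/clients.d/*.conf ; do echo @${f} ; done'"

# /etc/bacula/clients.d/ws-legal-017.conf (generated, hypothetical)
Client {
  Name = ws-legal-017-fd
  Address = ws-legal-017.example.com
  Catalog = MyCatalog
  Password = "changeme"
}
FileSet {
  Name = "ws-legal-017-files"
  Include {
    Options {
      signature = MD5
    }
    File = "C:/Documents and Settings/jdoe/My Documents"
  }
}
Job {
  Name = "ws-legal-017"
  JobDefs = "LegalDefaults"       # shared settings live in one JobDefs
  Client = ws-legal-017-fd
  FileSet = "ws-legal-017-files"
}
```

The per-client files can then be written by whatever script or 
inventory you already have, and the main configuration file stays 
small. Run "bacula-dir -t" against the result before reloading.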

> *Each Department has a different Pool, along with a different Volume 
> (max 2gb each vol) that is labeled for their department (IT001, 
> Legal001, etc).

Ok, that's reasonable. I assume you want volumes to be labeled 
automatically.
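
For automatic labeling, something along these lines should do; the 
resource names are invented for the example. Note that with a plain 
Label Format string Bacula appends a four-digit number, so you would 
get Legal0001 rather than Legal001:

```conf
# bacula-dir.conf
Pool {
  Name = Legal
  Pool Type = Backup
  Label Format = "Legal"          # volumes become Legal0001, Legal0002, ...
  Maximum Volume Bytes = 2G       # the 2 GB limit from your plan
}

# bacula-sd.conf: the device has to permit automatic labeling
Device {
  Name = SANFile                  # hypothetical device name
  # other Device directives omitted here
  LabelMedia = yes
}
```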

> * The backup schedules are tailored to the department as well, to easily 
> manage backup times across the network (so that there are not 190+ 
> backups queued on the director waiting to backup)

That in itself would not be a major problem, provided you've got 
enough storage daemons to run many jobs in parallel.

> * Client backups are saved for a max of 14 days, at which time the 
> volumes will be overwritten with the newest data.

I suggest you start thinking the way Bacula works, as it makes life 
easier in the long run: "at which time the volumes *can* be 
overwritten". If you want to force Bacula to overwrite volumes you're 
starting a long struggle.
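
In configuration terms, the 14 days become a retention setting on the 
pool, not a schedule for overwriting. A minimal sketch, with a 
made-up pool name:

```conf
# bacula-dir.conf
Pool {
  Name = Legal
  Pool Type = Backup
  Volume Retention = 14 days      # after this, the volume *may* be pruned
  AutoPrune = yes                 # prune expired jobs when a volume is needed
  Recycle = yes                   # allow purged volumes to be reused
}
```

The retention period is counted from the last write to the volume, 
and Bacula only recycles a volume when it actually needs one. So a 
volume can live somewhat longer than 14 days, which is usually what 
you want anyway.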

>  
> 
> SD:
> 
> *Remote Clients will be backed up to a remote SD

Ok, that's reasonable.

> *Local clients are backed up to a local SAN

Ok.

Don't forget to assign unique media types to all the SDs (or rather, 
storage devices) you use.
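
As an illustration (host names, device names, paths and passwords 
are placeholders I invented), the Media Type in the Director's 
Storage resource has to match the one in the corresponding SD 
Device resource, and each storage location gets its own:

```conf
# bacula-dir.conf
Storage {
  Name = LocalSAN
  Address = sd-main.example.com
  SDPort = 9103
  Password = "changeme"
  Device = SANFile
  Media Type = File-SAN           # unique to the local SAN
}
Storage {
  Name = RemoteDenver
  Address = sd-denver.example.com
  SDPort = 9103
  Password = "changeme"
  Device = DenverFile
  Media Type = File-Denver        # unique to this remote SD
}

# bacula-sd.conf on sd-main
Device {
  Name = SANFile
  Media Type = File-SAN           # must match the Director's Storage
  Archive Device = /srv/bacula/san
  LabelMedia = yes
  Random Access = yes
  AutomaticMount = yes
  RemovableMedia = no
}
```

With distinct media types Bacula will not ask one SD to mount a 
volume that actually lives on another one during a restore.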

> *Servers are backed up every day (full once a week, 
> incremental-differential during the week) to the SAN, then full back-ups 
> to an 8-tape autoloader at the end of the week.

Probably better and more efficient to use a copy job to copy the 
latest full backup to tape. Saves bandwidth, and it doesn't break the 
relationship between full level backups and incremental / differential 
ones.
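
A sketch of how such a copy job could look, assuming a Bacula 
version that has copy jobs (3.0 or later) and assuming, as far as I 
know, that the disk device and the autoloader are served by the same 
SD. All names are hypothetical:

```conf
# bacula-dir.conf
Pool {
  Name = ServerFull               # full backups on the SAN
  Pool Type = Backup
  Storage = LocalSAN
  Next Pool = ServerTape          # where copies of this pool go
}
Pool {
  Name = ServerTape
  Pool Type = Backup
  Storage = Autoloader
}
Job {
  Name = "CopyFullsToTape"
  Type = Copy
  Pool = ServerFull               # source pool
  Selection Type = PoolUncopiedJobs
  Schedule = "WeeklyCopy"
  Messages = Standard
  # Client and FileSet are required by the parser but not used here
  Client = dummy-fd
  FileSet = "DummySet"
}
```

PoolUncopiedJobs copies every job in the source pool that has not 
been copied yet, so to copy only the fulls you would send them to 
their own pool, for example with "Full Backup Pool = ServerFull" in 
the backup jobs.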

> 
> What Im wondering is this:
> 
> *Is this the right way to go about this?  Different Pools for different 
> departments (I think it would be easier to maintain a scheduled backup 
> time for each department rather than having to remember what computer 
> backs up at what time)

The schedules don't have anything to do with the pools.

How you use pools is mostly a high-level decision, i.e. you could use 
one pool only per storage device, or one pool per client. Quite often, 
this is a matter of taste - but one pool per department sounds like a 
reasonable compromise.
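
To make the separation visible: the schedule and the pool are two 
independent directives of a job. A small sketch with invented names 
and times:

```conf
# bacula-dir.conf
Schedule {
  Name = "LegalNightly"
  Run = Full 1st sun at 22:00
  Run = Differential 2nd-5th sun at 22:00
  Run = Incremental mon-sat at 22:00
}
JobDefs {
  Name = "LegalDefaults"
  Type = Backup
  Schedule = "LegalNightly"       # when the jobs run
  Pool = Legal                    # which volumes they are written to
  Storage = LocalSAN
  Messages = Standard
}
```

Two departments could share a pool and still run at different 
times, or share a schedule and write to different pools.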

> *Because of the vast volume of machines being backed up, should I have 3 
> separate directors (Servers, Local Clients, Remote Servers)?

I guess that, for your network size plus some growth over time, one 
central DIR should well be able to handle it. You'll need a decently 
performing server and a well-performing catalog database, of course. 
And reliable WAN connections...

> *The remote offices have between 5-20 clients, and 1 server at each 
> location.  I figured that they would all report to the local director 
> here at the main office, and back up to a SD at their remote site.  This 
> way we can minimize the amount of traffic going through the VPN/WAN, but 
> yet still be able to administer the restoring and backup functionality 
> remotely in a centralized location.

Yup, that's what I'd recommend. Keep in mind that the traffic can be 
more than you expect, due to all the metadata that gets sent to the 
catalog!

>  
> 
>  
> 
> Has anyone done a backup to this scale with such a large variety of 
> servers, clients & remote clients?

Not personally, no. I do manage or personally know some Bacula 
installations that are similar in number of clients, but those 
typically have all the backed up machines in one data center. I think 
the main issue you'll see is the WAN performance and stability. VPN 
tunnels can make stability much less of an issue, and ssh tunnels 
allow you to pass firewalls quite easily if no VPN or dedicated WAN 
link is available - you might want to check the list archives, as 
during the last few weeks, ssh tunneling has been discussed a bit.

> 
> Thank you for your time,

You're welcome - your setup sounds like a nice and interesting 
project! If you want commercial support don't hesitate to ask me! ;-)

(Or ask Bacula Systems - though we don't have a partner in the US yet, 
we might be able to help you anyway.)

Arno

> 
> Jayson Broughton
> 
> Linux Systems Administrator
> 
> True Oil LLC
> 
> jbroughton AT truecos DOT com <mailto:jbroughton AT truecos DOT com>

-- 
Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
www.its-lehmann.de

_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users