-------- Original Message --------
Subject: Re: [Bacula-users] Hardware Specs & Sizing for Bacula?
Local Time: February 19, 2016 10:02 pm
UTC Time: February 19, 2016 10:02 PM
On 19/02/16 19:10, paul.hutchings
wrote:
Alan thanks, I omitted that we have a Spectra LTO6
library which would be SAS attached to the server in
question but I didn't mention it as my initial query was
more about the hardware specs.
It all ties together.
The rough plan would be D2D2T and we'd probably run one
of our fileservers (linux) directly to a local directly
attached small LTO6/7 library as it's not data where we need
a long retention and it feels dumb to be running it over the
network just to send it to tape to keep for a week.
You're better off centralising it. Seriously. Even if
you're only keeping the backups a few days.
The hardware we have happens to have an 800GB SSD in it by a lucky
coincidence, which I thought could be used for the Postgres
database (not used Bacula enough to know how big the database
may grow)
800GB is big enough, but is it fast enough? ("There are
SSDs and there are SSDs.")
With the kind of use it's getting, you need to know the
speed of garbage collection, as this is going to be the driving
factor far more than trim commands. On top of that you need to
know the endurance of the drives so you can calculate when
they'll need replacing (consumer SSDs are rated for roughly
1000 times their capacity in total writes; enterprise drives
will generally run to 100 times more than that.)
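That endurance rule of thumb translates into a simple
replacement-interval estimate. A sketch - every figure here is an
illustrative assumption, not a vendor spec:

```python
# Rough SSD replacement-interval estimate from rated write endurance.
# All numbers below are illustrative assumptions, not vendor specs.

def years_until_worn(capacity_gb, endurance_multiple, daily_writes_gb):
    """endurance_multiple: rated total writes as a multiple of drive
    capacity (roughly 1000x for consumer drives, far more for
    enterprise)."""
    total_write_budget_gb = capacity_gb * endurance_multiple
    return total_write_budget_gb / daily_writes_gb / 365

# 800GB consumer-class drive with an assumed 200GB/day of catalog churn:
print(round(years_until_worn(800, 1000, 200), 1))  # -> 11.0
```

Heavy attribute-insert workloads can push daily writes far above that
assumed figure, which is why the enterprise endurance class matters.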
By way of comparison, 2-million-file full backups were
taking hours to insert attributes into the database on a
6-drive RAID6 spinning set. That came down to "5 minutes" when I
moved to a RAID1 pair of Samsung 840 Pro 500GB drives, but became
"an hour" after a while. Flushing and trimming the disks brought
the speed back down, but as the same blocks are being
repeatedly written there's no trim sent in normal operation,
and they're gradually slowing down again even though they're
now on a controller which supports trim commands. Whilst 840s
are fairly notorious for their GC speed, they're faster than
most consumer drives, and the plan is to replace them with a
pair of SM843s as 500GB isn't large enough anyway.
Raid is a must - you really don't want to try a restore
without an intact database. This is worst-case disaster
scenario material and you need to treat the database as
business-critical - which it is when things go wrong. (The
database can also function as an IDS (intrusion detection
system); the Bacula manuals have details on how to use it for
that.)
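For reference, the IDS-style use is Bacula's Verify job type: an
InitCatalog run records a baseline, then periodic Catalog-level runs
flag files whose checksums or attributes have changed. A minimal
sketch - resource names are placeholders, see the manual's Verify
documentation for the full option set:

```
# Sketch only - "fileserver-fd" etc. are placeholder resource names.
Job {
  Name = "VerifyFileserver"
  Type = Verify
  Level = Catalog          # compare live files against the baseline;
                           # run once with Level = InitCatalog to seed it
  Client = fileserver-fd
  FileSet = "Full Set"
  Storage = File
  Pool = Default
  Messages = Standard
}
```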
> which I imagine would benefit from it immensely but I'm
not clear what the "spool" is that you're referring to - a
quick dig suggests it could be attribute spooling or a spool
area for data that's going to tape?
Correct on both counts - and if you're feeding LTO-anything
you MUST spool or you'll shoeshine the tapes, and you MUST use
SSDs for concurrent backups on anything faster than LTO3, as
the raw speed of the tapes is faster than the sequential read
speed of even 15krpm drives.
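For concreteness, both kinds of spooling are plain directives in the
configs. A sketch with placeholder paths and sizes - check the manual
for your version's exact directive names:

```
# Sketch only - paths and sizes are placeholder assumptions.
Device {
  Name = LTO6-Drive0
  Media Type = LTO-6
  Archive Device = /dev/nst0
  Spool Directory = /spool/bacula    # fast SSD/PCIe flash, not spinning disk
  Maximum Spool Size = 300G
}
Job {
  Name = "BigFileserver"
  Spool Data = yes                   # stream to spool first, avoid shoeshining
  Spool Attributes = yes             # batch catalog inserts instead of trickling
  # ...
}
```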
The moment you start randomly seeking on spinning media,
your throughput and IOPS will plummet, raid or no raid. On top
of that, spinning drives used for spool will self-destruct
regularly (even HGST 7k4s) due to the cumulative seek load,
which translates to unnecessary downtime and hassle.
On the current backup box I'm using a RAID0 5-disk set of
64GB Intel E25 drives. At the time they were $800 each, and the
spool area is really only about half the size it needs to be.
Spending that kind of money now will get you an extremely
nice, blisteringly fast PCIe SSD. You need at least 600MB/s
(sustained) so stay away from SAS/SATA for the spool.
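The 600MB/s figure falls out of simple arithmetic: an LTO6 drive
ingests well above its 160MB/s native rate when the data compresses,
and the spool fills and drains at the same time. A back-of-envelope
sketch (the compression ratio is an assumption):

```python
# Back-of-envelope spool throughput budget (all figures assumed).
LTO6_NATIVE_MBS = 160    # LTO6 native write speed, MB/s
COMPRESSION = 2.0        # assumed compression ratio of the data

# Host-side rate needed to keep one drive streaming compressible data:
drain_mbs = LTO6_NATIVE_MBS * COMPRESSION    # 320 MB/s
# Spooling the next job overlaps with despooling, so budget both:
total_mbs = drain_mbs + LTO6_NATIVE_MBS      # 480 MB/s
print(drain_mbs, total_mbs)                  # hence ~600MB/s with headroom
```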
Sounds like we're good on the hardware but if necessary
throw in some RAM.
If you keep with the plan of spinning drives, you'll regret
it very quickly.
More RAM is a must, as is proper database tuning.
(Postgres is good, but you need to tell it how much RAM
is available to use and give it optimisations for SSDs. MySQL
is tuning hell.)
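As a starting point, the RAM/SSD tuning amounts to a handful of
postgresql.conf settings. Illustrative values only, assuming ~32GB of
RAM and the catalog on SSD - tune to your own hardware:

```
# postgresql.conf fragment - illustrative values, not a recommendation.
shared_buffers = 8GB           # ~25% of RAM is a common starting point
effective_cache_size = 24GB    # tell the planner how much the OS caches
random_page_cost = 1.1         # SSDs make random reads nearly as cheap as
                               # sequential ones
maintenance_work_mem = 1GB     # speeds up index builds after big inserts
```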
Use separate SSDs for the database and spool. Consider SSD
for the OS.
Use software RAID and dump the PERCs unless you can switch
them to IT (initiator-target) mode from IR (initiator-RAID)
mode, as they'll slow you down (PERCs are mpt2sas-based - a
low-end SAS chipset with significant RAID performance limits).
The spool device is disposable. Everything else is not.
Your backup system needs to be treated as business-critical
and built accordingly, along with the tape storage (a filing
cabinet or shelves is nowhere near good enough). When things
go bang you need it to work first time in order to be up and
running as quickly as possible.
Some of the more paranoid people I know use 3-4 way raid1
mirroring on the OS and database disksets, specifically so
they can keep one disk from each raidset in the datasafe at
all times.
With regard to safes: we use 2 of the large ones pictured
at
http://www.phoenixsafeusa.com/primary-designation/media-safes
- these hold ~800 LTOs apiece, and they should be positioned
close to your tape library - which in turn should be in a
temperature/humidity-controlled, dust-free environment _out_ of
your main server areas. (The last thing you want if the server
rooms catch fire is to lose your backup system too, and server
rooms always end up dusty, which kills tape drives.)
If you buy a lot of tapes, consider an LTO cleaner from
mptapes.com - most tape-drive related contamination incidents
we've seen have been the result of new media with
contamination on it contaminating the drives, which in turn
cross-contaminated a lot of other tapes. This shows up as drives
requesting excessive cleaning cycles and tapes showing as
"full" at significantly less than their raw capacity (in the
worst cases tapes were only holding 100GB of data; the rest
was taken up by rewrites).
We're so new to Bacula that
I'll be blunt and admit there's lots I simply haven't got my
head around yet if we do go with it so apologies if some of
this is dumb/obvious to most of you :)
Bacula installations range from home systems to major
banks. There's no "one size fits all", but there are some
fairly important guidelines you need to adhere to in order to
ensure that your backups are there and usable when you need
them. A restore is always a high-stress event, whether it's "I
just deleted XYZ important file and I need it back NOW" or
"the main fileserver caught fire and we need to rebuild it",
so plan ahead.
As a rule of thumb for LTO - try not to let individual
backup sets go much over 1TB. The bigger they are, the greater
the chances of something going wrong during the backup/restore
procedure and you don't want full backups going over 24 hours
in any case as this starts interfering with daily backups of
the backup server itself.
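The 1TB guideline is easy to sanity-check with back-of-envelope
arithmetic (the throughput figure is an assumption):

```python
# Full-backup wall-clock time at a sustained throughput (rate assumed).
def backup_hours(size_tb, throughput_mbs):
    # 1TB = 1,000,000MB; result in hours
    return size_tb * 1_000_000 / throughput_mbs / 3600

print(round(backup_hours(1, 160), 1))   # 1TB at LTO6 native rate -> 1.7
print(round(backup_hours(12, 160), 1))  # a 12TB set -> 20.8, most of a day
```

A 1TB set finishes well inside a nightly window; a 12TB set is already
brushing the 24-hour limit before anything goes wrong.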
If someone tells you they need a 12TB filesystem, it's
quite likely they don't, and they haven't thought through what
happens if it needs fscking (which is another good reason for
keeping backed-up filesets under 1TB; beyond that, fscks at
startup can eat a lot of time even when parallelised. One such
machine here gets rebooted every 6 months and usually spends a
day in fsck before it's ready for use.)
-------- Original Message --------
Subject: Re: [Bacula-users] Hardware Specs & Sizing for Bacula?
Local Time: February 19, 2016 6:58 pm
UTC Time: February 19, 2016 6:58 PM
On 19/02/16 18:12,
paul.hutchings wrote:
We're new to Bacula and are still considering if it's viable for us.
Our test environment is quite small (it is a test environment), and when I read the docs I'm not sure how current the hardware recommendations are.
For example, if I were to suggest a box with dual 8-core E5 CPUs, a hardware PERC RAID card with 1GB cache, 48TB of 7.2k SATA in RAID6 and 32GB (or more) of RAM running as an SD, would people be thinking "hmmm, may need more horsepower" or would people be thinking "that should handle hundreds/thousands of clients"?
It depends. For SD-only use, your CPU is overkill and
even 16GB of RAM would be overkill.
RAM requirements are for the director and database.
These can be on the same box and probably should be to
avoid networking penalties. You don't need VMs - and
really shouldn't play that game on backup-dedicated
hardware as VMs come with performance penalties ranging
from noticeable to major.
Even with the DB and DIR on the box, your CPUs are more
than adequate.
Assuming SD + DIR + Postgres (don't mess with MySQL for
million+-file installations, it doesn't scale well), I'd
add more RAM. It's cheap enough these days that you
should think about running at least 96GB if you're backing
up tens of TB and tens of millions of files (even more if
you can afford it).
The real issue if you're running backups at this scale:
disk is a liability. It's too slow, and drives will end up
shaking themselves to pieces, making the backup pool your
single point of failure. You _need_ tape - a decent robot
and several drives, along with a suitably sized data safe.
We currently back up about 250 million files over 400TB,
and I'm currently using a Quantum i500 with a 14U extension
and 6 SAS LTO6 drives; previously we had an Overland
Neo8000 with 7 FC LTO5 drives.
Once you bite the bullet and use tape, dump the SATA
spinning disks. Use something like a RAID1 pair of 500GB
SM843s for your OS, put in a second dedicated 1TB RAID1 pair
for the database, and use a _fast_ 200-800GB PCIe flash
drive for spool.
10Gb networking is an absolute must. Don't try to play
games with 1Gb/s bonding: any given data stream will only
run at 1Gb/s maximum.
On the other hand, the setup above would be an
expensive waste of time for backing up 10TB of data -
although for that size you could keep the spinning media
and keep the rest - but bear in mind that 48TB is only
going to allow 3 full backups of 15TB (any fewer than 3
full backups is asking for trouble), without taking
differentials or incrementals into account.
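The arithmetic behind that is just integer division of the pool size
by the full-backup size:

```python
# Full backups that fit in a disk pool, ignoring incrementals and
# differentials (which only make the real number smaller).
def fulls_that_fit(pool_tb, full_size_tb):
    return pool_tb // full_size_tb

print(fulls_that_fit(48, 15))  # -> 3, the bare minimum worth having
```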
For 20TB+ you may want to look at a single-drive tape
autochanger capable of holding at least 10 tapes. The last
thing you want to be doing is feeding new LTO6/7s into it
every 2-3 hours when a full backup is running (yes, they
will fill up that quickly)
Director could be on the same physical box but would ideally be a VM with a couple of CPU cores and as much RAM as is needed to handle a couple dozen clients, though the largest two clients are around 10TB and each have millions of files.
Impression I get is that network and disk will be a bottleneck way before RAM and CPU should be?
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users