Re: [Bacula-users] Bacula and 16 bay JBOD

2011-03-17 20:01:18
Subject: Re: [Bacula-users] Bacula and 16 bay JBOD
From: Phil Stracchino <alaric AT metrocast DOT net>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 17 Mar 2011 19:57:58 -0400
On 03/17/11 18:46, Marcello Romani wrote:
> Il 16/03/2011 18:38, Phil Stracchino ha scritto:
>> On 03/16/11 13:08, Mike Hobbs wrote:
>>>    Hello,  I'm currently testing bacula v5.0.3 and so far so good.  One
>>> of my issues though, I have a 16 bay Promise Technologies VessJBOD.  How
>>> do I get bacula to use all the disks for writing volumes to?
>>>
>>> I guess the way I envision it working would be, 50gb volumes would be
>>> used and when disk1 fills up, bacula switches over to disk2 and starts
>>> writing out volumes until that disk is filled, then on to disk3, etc..
>>> eventually coming back around and recycling the volumes on disk 1.
>>>
>>> I'm not sure the above scenario is the best way to go about this, I've
>>> read that some people create a "pool" for each drive.  What is the most
>>> common practice when setting up a JBOD unit with bacula?  Any
>>> suggestions or advice would be appreciated.
>>
>> That scheme sounds like a bad and overly complex idea, honestly.
>> Depending on your data load, I'd use software RAID to make them into a
>> single RAID5 or RAID10 volume.  RAID10 would be faster and, if set up
>> correctly[1], more redundant; RAID5 is more space-efficient, but slower.
>>
>>
>> [1] There's a right and a wrong way to set up RAID10.  The wrong way is
>> to set up two five-disk stripes, then mirror them; lose one disk from
>> each stripe, and you're dead in the water.  The right way is to set up
>> five mirrored pairs, then stripe the pairs; this will survive multiple
>> disk failures as long as you don't lose both disks of any single pair.
>>
>>
> 
> Hi Phil,
>      that last sentence sounds a little scary to me: "this will survive 
> multiple disk failures *as long as you don't lose both disks of any 
> single pair*".
> Isn't RAID6 a safer bet ?

That depends.

With RAID6, you can survive any one or two disk failures, in degraded
mode.  You'll have more usable space than RAID10, but performance
will be slower because of the overhead of parity calculations.  A third
failure will bring the array down and you will lose the data.
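
To put rough numbers on the space trade-off, here is a minimal sketch.  It
assumes equal-sized drives and ignores hot spares and filesystem overhead;
the function name usable_drives is just made up for illustration.

```python
def usable_drives(level: str, drives: int) -> int:
    """Return how many drives' worth of space is left for data.

    Assumes equal-sized drives, no hot spares, no filesystem overhead.
    """
    if level == "raid5":
        return drives - 1      # one drive's worth of parity
    if level == "raid6":
        return drives - 2      # two drives' worth of parity
    if level == "raid10":
        return drives // 2     # every drive is mirrored once
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    for level in ("raid5", "raid6", "raid10"):
        print(level, usable_drives(level, 16), "of 16 drives usable")
```

On a sixteen-bay unit that is the difference between fourteen data drives
for RAID6 and eight for RAID10.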

With RAID10 with sixteen drives, you can survive any one drive failure
with minimal performance degradation.  There is a 1 in 15 chance that a
second failure will be the other drive of that pair, and bring the array
down.  If not, then there is a 1 in 7 chance that a third drive failure
will be on the same pair as one of the two drives already failed.  If
not, the array will still continue to operate, with some read
performance degradation, and there is now a slightly less than 1 in 4
chance (3/13) that a fourth drive failure will be on the same pair as
one of the three already failed.  ... And so on.  There is a cumulative 38%
chance that four random failures will fail the entire array, which rises
to 59% with five failures, and 78% with six.  (91% at seven, 98% at
eight, and no matter how many leprechauns live in your back yard, at
nine failures you're screwed of course.  It's like the joke about the
two men in the airliner.)
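
If you want to check the arithmetic, the following is a minimal sketch of
the calculation.  It assumes failures hit the surviving drives uniformly at
random and that no drive is replaced in between; the function name
raid10_loss_probability is hypothetical, not anything from Bacula or mdadm.

```python
from fractions import Fraction


def raid10_loss_probability(pairs: int, failures: int) -> Fraction:
    """Chance that `failures` random drive failures kill a RAID10 array.

    The array is built from `pairs` two-way mirrors and dies as soon as
    both drives of any one pair have failed.
    """
    drives = 2 * pairs
    survive = Fraction(1)
    for k in range(failures):
        # k drives have already failed, each in a different pair, so
        # drives - k are still running and k of those are the lone
        # survivors of a half-dead pair.
        remaining = drives - k
        safe = remaining - k
        if safe <= 0:
            return Fraction(1)  # more failures than pairs: certain loss
        survive *= Fraction(safe, remaining)
    return 1 - survive


if __name__ == "__main__":
    for n in range(1, 10):
        p = raid10_loss_probability(8, n)
        print(f"{n} failures: {float(p):.1%} chance the array is gone")
```

Calling it with 8 pairs reproduces the 1 in 15 figure for a second failure,
and changing the first argument lets you try other array sizes.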

But if the array was RAID6, it already went down for the count when the
third drive failed.



Now, granted, multiple failures like that are rare.  But ... I had a
cascade failure of three drives out of a twelve-drive RAIDZ2 array
between 4am and 8am one morning.  Each drive that failed pushed the load
on the remaining drives higher, and after a couple of hours of that, the
next weakest drive failed, which pushed the load still higher.  And when
the third drive failed, the entire array went down.  It can happen.

But ...  I'm running RAIDZ3 right now, and as soon as I can replace the
rest of the drives with new drives, I'll be going back to RAIDZ2.
Because RAIDZ3 is a bit too much of a performance hit on my server, and
- with drives that aren't dying of old age - RAIDZ2 is redundant
*enough* for me.  There is no data on the array that is crucial *AND*
irreplaceable *AND* not also stored somewhere else.

What it comes down to is, you have to decide for yourself what your
priorities are - redundancy, performance, space efficiency - and how
much of each you're willing to give up to get as much as you want of the
others.


-- 
  Phil Stracchino, CDK#2     DoD#299792458     ICBM: 43.5607, -71.355
  alaric AT caerllewys DOT net   alaric AT metrocast DOT net   phil AT co.ordinate DOT org
  Renaissance Man, Unix ronin, Perl hacker, SQL wrangler, Free Stater
                 It's not the years, it's the mileage.
