Subject: Re: [BackupPC-users] How to use backuppc with TWO HDD
From: dan <dandenson AT gmail DOT com>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Tue, 2 Jun 2009 18:38:52 -0600


On Tue, Jun 2, 2009 at 12:36 AM, Adam Goryachev <mailinglists AT websitemanagers.com DOT au> wrote:

dan wrote:
> Unfortunately there is a 'rebuild hole' in many redundant
> configurations.  In RAID1 that is when one drive fails and just one
> remains.  This can be eliminated by running 3 drives so that 1 drive can
> fail and 2 would still be operational.
>
> There are plenty of charts online to give % of redundancy for regular
> RAID arrays.
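
For reference, the sort of "% of redundancy" chart mentioned above is easy to reproduce for the simple mirror case. A minimal sketch (idealized: it ignores metadata, hot spares and rebuild behaviour, and is just meant to show the trade-off):

# Usable fraction of raw space vs. failures tolerated for N-way mirrors.
# After the last tolerated failure you are in the 'rebuild hole': one more
# failure loses data.
for n in (2, 3, 4):
    usable_pct = 100 // n          # percent of raw capacity you can use
    tolerated = n - 1              # drive failures survived outright
    print(f"{n}-way mirror: {usable_pct}% usable, survives {tolerated} failure(s)")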

I must admit, this is something I have never given a lot of thought
to... Then again, I've not yet worked in an environment with large
numbers of disks. Of course, that is no excuse, and I'm always
interested in filling in knowledge gaps...

Is it really worthwhile considering a 3 drive RAID1 system, or even a 4
drive RAID1 system (one hot spare)? Of course, worthwhile depends on the
cost of not having access to the data, but from a "best practice" point
of view, i.e. looking at any of the large "online backup" companies, or
the gmail backend, etc., what level of redundancy is considered acceptable?
(Somewhat surprising actually that google/hotmail/yahoo/etc have ever
lost any data...)

Redundancy is the key for these companies.  They use databases that can be spread out among servers and replicated many times across their network.  Google, for instance, could have 20 copies of data on different servers, so a catastrophic loss at one facility has no effect on the whole (or little effect anyway).

I might also add that these companies have a lot of losable data.  Website caches are simply rebuilt in the event the data is lost.
 

> With a modern filesystem capable of multiple copies of each file this
> can be overcome. ZFS can handle multiple drive failures by selecting the
> number of redundant copies of each file to store on different physical
> volumes.  Simply put, a ZFS RAIDZ with 4 drives can be set to have 3
> copies which would allow 2 drives to fail.  This is somewhat better than
> RAID1 and RAID5  both because more storage is available yet still allows
> up to 2 drives to fail before leaving a rebuild hole where the storage
> is vulnerable to a single drive failure during a rebuild or resilver.

So, using 4 x 100G drives provides 133G usable storage... we can lose
any two drives without any data loss. However, from my calculations
(which might be wrong), RAID6 would be more efficient. On a 4 drive 100G
system you get 200G available storage, and can lose any two drives
without data loss.
Well, really the key to filesystems with built-in volume management is that a large array can be broken down into smaller chunks with various levels of redundancy across different data stores.  Using 4x100 you would likely do a raidz2, which calculates 2 pieces of parity, which is something like raid6.

The real issue with raid6 is abysmal performance on software raid because of the double parity computation, and limited support in hardware cards, where the load on the card's CPU similarly slows things down.

The argument is always data safety vs. access speed.  Keep in mind that the raid5 write hole also applies to raid6.
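
To make the 4x100G numbers above concrete, here is a quick back-of-envelope sketch. The layouts are idealized (no metadata or allocation overhead), and note that ZFS tries to put the extra copies=N blocks on different devices but that is not as strict a guarantee as parity:

# Usable capacity of 4 x 100G drives under the schemes discussed above.
# Idealized numbers; real pools lose some space to metadata/overhead.
DRIVES, SIZE_G = 4, 100
raw = DRIVES * SIZE_G

copies3 = raw / 3                  # copies=3: every block stored three times
raidz2 = (DRIVES - 2) * SIZE_G     # raidz2/raid6: two drives' worth of parity
raid10 = raw / 2                   # striped mirrors, for comparison

print(f"copies=3    : ~{copies3:.0f}G usable")
print(f"raidz2/raid6: {raidz2}G usable")
print(f"raid10      : {raid10:.0f}G usable")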

 

> Standard RAID is not going to have this capability and is going to
> require more drives to improve, though each drive also decreases
> reliability, as more drives are likely to fail.

Well, doesn't RAID6 do exactly that (add an additional drive to improve
data security)? How is ZFS better than RAID6? Not that I am suggesting
ZFS is bad, I'm just trying to understand the differences...

raid6 has a write hole during parity computation that can catch you surprisingly often; zfs does not have this.  Not to be a zfs fanboy: btrfs will also have such capabilities.
 

> ZFS also is able to put metadata on a different volume and even have a
> cache on a different volume which can spread out the chance of a loss.
> Very complicated schemes can be developed to minimize data loss.

In my experience, if it is too complicated:
1) Very few people use it because they don't understand it
2) Some people who use it, use it incorrectly, and then don't
understand why they lose data (see the discussion of people who use RAID
controller cards but don't know enough to read the logfile on the RAID
card when recovering from failed drives).

Also, I'm not sure what the advantage of metadata on a different volume
is. If you lose all your metadata, how easily will you recover your
files? Perhaps you should be just as concerned about protecting your
metadata as you are about your data, so why separate it?

What is the advantage of using another volume as a cache? Sure, you
might be lucky enough that the data you need is still in cache when you
lose the whole array, but that doesn't exactly sound like a scenario to
plan for. (For performance, the cache might be a faster/more expensive
drive, read SSD or similar, but we are discussing reliability here.)

As far as relocating metadata goes, you can put metadata on an additional redundant array for performance, and additionally the parity checks on the data can be spread across different controllers, which not only improves performance but allows the parity to be calculated in parallel.  That means it is completed and written sooner, which means a smaller window for data loss.
 

> This is precisely the need for next-gen filesystems like ZFS and soon
> BTRFS: to fill these gaps in storage needs.  Imagine the 10TB drives of
> tomorrow that are only capable of being read at 100MB/s.  That's a 30
> hour rebuild under ideal conditions.  Even when SATA3 or SATA6 are
> standardized (or SAS) you can cut that to 7.5 or 15 hours, but that is
> still a very large window for a rebuild.
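
The rebuild-time arithmetic there is easy to check; a quick sketch (the 10TB size and the throughput figures are the hypothetical ones from the quote, not measurements, and rebuild_hours is just a throwaway helper):

# Time for one full sequential read of a drive -- a lower bound on a
# rebuild/resilver. Sizes and throughputs are the hypothetical figures
# from the quote above.
def rebuild_hours(capacity_tb, mb_per_sec):
    return capacity_tb * 1_000_000 / mb_per_sec / 3600

for mbps in (100, 200, 400):   # 200/400 MB/s roughly match the 15/7.5 hour guesses
    print(f"10TB at {mbps} MB/s: ~{rebuild_hours(10, mbps):.1f} hours")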

Last time I heard of someone using ZFS for their backuppc pool under
linux, they didn't seem to consider it ready for production use due to
the significant failures. Is this still true, or did I misread something?

Personally, I used reiserfs for years, and once or twice had some
problems with it (actually due to RAID hardware problems). I have
somewhat moved to ext3 now due to the 'stigma' that seems to be attached
to reiserfs. I don't want to move to another FS before it is very stable...

ZFS on linux = bad.  ZFS is a Solaris thing and will be for some time.  Someday *BSD will have stable ZFS, but I doubt linux ever will.  btrfs will likely be in wide use by then and will serve many of the same purposes as ZFS.

The reason to use a next-gen filesystem is exactly as you stated above: some data loss caused by funky raid hardware.  With traditional RAID, where the filesystem has no awareness of the disk geometry and disk errors, you can get silent corruption that the filesystem isn't aware of and can't take steps to correct.  Next-gen filesystems like btrfs and zfs are aware of the disk geometry and can correct for these issues.

reiserfs is a good filesystem but is susceptible to disk errors.  This is typical of older filesystems.

 

> On-line rebuilds and
> filesystems aware of the disk systems are becoming more and more relevant.

I actually thought it would be better to disable these since it:
1) increases wear 'n' tear on the drives
2) what happens if you have a drive failure in the middle of the rebuild?

1) This is an arguable point.  Many would say that disk usage makes little to no difference to disk life; heat is what affects disk life most.  Also, in important workloads disks should have a scheduled lifetime and be rotated out.
2) Drive failure during rebuild is certainly a worst-case but likely scenario.  It is even more likely because all of the disks in an array are likely the same age and may all be nearing the MTBF, which increases the probability of failure (rough numbers in the sketch below).  This is what raid6 or 10 or 5+ or whatever multiply-redundant RAID level is for.
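
A purely illustrative sketch of 2): it assumes independent, exponentially distributed failures, and the MTBF and rebuild window are made-up numbers (same-aged drives in one chassis will do worse than this), but it shows why the second level of redundancy matters during the rebuild window:

# Chance of losing data during a rebuild window after one drive has already
# failed. Assumes independent exponential failures; MTBF_H and WINDOW_H are
# made-up illustrative numbers, and same-aged drives will be worse than this.
import math

MTBF_H = 500_000          # assumed per-drive MTBF in hours
WINDOW_H = 30             # assumed rebuild/resilver window in hours
p_one = 1 - math.exp(-WINDOW_H / MTBF_H)   # one given drive dies in the window

for n in (4, 8, 12):                        # drives in the array before the failure
    m = n - 1                               # survivors carrying the rebuild
    p_single_parity = 1 - (1 - p_one) ** m  # raid5/raidz1: any further failure = loss
    p_double_parity = (1 - (1 - p_one) ** m
                       - m * p_one * (1 - p_one) ** (m - 1))  # need 2+ more failures
    print(f"{n} drives: single parity ~{p_single_parity:.1e}, "
          f"double parity ~{p_double_parity:.1e}")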

 


Mainly the 2nd one scared me the most.

Sorry for such a long post, but hopefully a few other people will learn
a thing or two about storage, which is fundamentally important to
backuppc...

Regards,
Adam



I have really done a ton of testing on filesystems in MySQL and PostgreSQL environments as well as backuppc systems.  There is a catch-22 for backuppc: the absolute best filesystem I have found for backuppc is zfs used directly on the physical volumes, running raidz2 on sata/sas with metadata and cache on SSD.  The catch-22 is that this must be run on *solaris; *BSD does not have a zfs ready for such uses.

With that being said, I am running backuppc on debian systems on raid10 with ext3.  I run a limited number of hosts on each machine and use multiple servers to handle my needs.  My raid10 is a 4 drive setup in Dell 2U hardware (I have 2 spare drive bays, ready for zfs and some SSDs).  I am not a slowlaris fan and am dying to see stable zfs on bsd.

