BackupPC-users

Re: [BackupPC-users] reducing i/o

From: Eric Chowanski <eric AT wwek DOT org>
To: backuppc-users AT lists.sourceforge DOT net
Date: Sat, 05 Nov 2011 13:18:35 -0700
Hi,

Being a BackupPC 'newb', I can only say I haven't found much information
on i/o tuning.  Naively, it would seem that the bulk of BackupPC i/o is
spent comparing hashes.  *If* that's true, I think it would be
interesting to see if splitting hashes from the actual data and putting
them on much faster storage such as SSDs would benefit overall
throughput.  I'd suspect that's only possible at the code level.
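
To make the hash-comparison idea concrete, here is a toy sketch of content-addressed pooling in the spirit of BackupPC's pool.  The layout, paths, and hash choice are illustrative only, not BackupPC's actual on-disk format:

```python
import hashlib
import os

# Toy content-addressed pool: files are stored under their content hash,
# so a duplicate costs only an existence check, not a data write.

def pool_path(pool_dir: str, digest: str) -> str:
    # Fan out by leading hex digits so no single directory grows huge.
    return os.path.join(pool_dir, digest[0], digest[1], digest)

def store(pool_dir: str, data: bytes) -> str:
    digest = hashlib.md5(data).hexdigest()
    path = pool_path(pool_dir, digest)
    if os.path.exists(path):          # the "hash comparison": metadata i/o only
        return path                   # duplicate content, nothing written
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:       # new content: full data write
        f.write(data)
    return path
```

Every file backed up costs at least one such existence check, and that metadata i/o is the part that would land on the SSDs.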

But I also notice that you didn't provide much architectural info, and
that's an area that can have a large impact.  Here are some thoughts,
presented more as 'food for thought' than as recommendations.

I'd think there are two options.  The first option is to use network
storage, but then you quickly hit bandwidth limitations, so you need
fatter (and possibly lower-latency) pipes like 10Gb Ethernet or Myrinet.

The newer crop of network storage such as GlusterFS (being purchased by
Red Hat) is nice for several reasons.  It scales nearly linearly in i/o,
available storage, and redundancy into the petabyte range.  It makes the
server the redundant unit and allows you to remove the redundancy at the
disk level; and the more parts you can "throw out the window" and
recover from, the better.

The other option is to put more disks in each server.  Making big
assumptions, if you've got 16TB of data and 50% utilization on 1TB
disks, that's 32 disks spread across your 8 servers, or only 4 disks per
server.  That doesn't buy you much in raid 10 and is probably quite slow
in raid 5.
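Spelling out that back-of-envelope math (using the ~16TB and 8 servers from your mail, and my assumed 50% utilization):

```python
# Back-of-envelope disk count from the figures above.
data_tb = 16        # total backup data
utilization = 0.5   # disks roughly half full
disk_tb = 1         # 1 TB drives
servers = 8         # from the original mail

raw_tb = data_tb / utilization        # 32 TB of raw disk needed
disks_total = raw_tb / disk_tb        # 32 drives in total
disks_per_server = disks_total / servers
print(int(disks_per_server))          # -> 4
```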

Putting all your disks in one server with 12 or 16 bays, plus an eSATA
chassis, would significantly increase throughput.

A note about raid levels.  Raid 5 is really, really not the thing to use
here: it has very poor performance for some realistic workloads, and I'd
expect BackupPC to be one of those workloads.  If you can, use raid 10,
which is both very fast and _ought_ to be more robust.  That is, in the
event of a disk failure and rebuild, raid 10 only requires reading the
failed disk's mirror partner, lessening the chance that the rebuild
kills other disks, as occasionally happens in raid 5 (where every
surviving disk must be read in full).
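To make the rebuild asymmetry concrete, here is my own rough model (a simplification: classic mirroring vs. single-parity striping, ignoring hot spares and partial rebuilds):

```python
# How much data must be READ to rebuild one failed 1 TB disk. This is
# the stress that the rebuild puts on the surviving drives.
def rebuild_reads_tb(level: str, disks: int, disk_tb: float = 1.0) -> float:
    if level == "raid10":
        return disk_tb                 # only the failed disk's mirror partner
    if level == "raid5":
        return (disks - 1) * disk_tb   # every surviving disk, in full
    raise ValueError(f"unknown raid level: {level}")

print(rebuild_reads_tb("raid10", 8))   # -> 1.0
print(rebuild_reads_tb("raid5", 8))    # -> 7.0
```

In an 8-disk array, raid 5 reads seven times as much as raid 10 during a rebuild, and it reads it from every drive at once.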

Lastly, a note about disk choice.  I too use cheap SATA drives.
Optimizing for cost, i/o, and throughput, it's pretty easy to choose a
1TB drive and only use e.g. 20% of the space versus a much more
expensive SAS drive.  The beauty of this is you still have a lot of
unused disk should you need it or should you migrate disks to other
uses.
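As a toy comparison of that trade-off (the prices below are made up for illustration; only the shape of the result matters):

```python
# Hypothetical drive prices, for illustration only.
sata_price, sata_tb = 80.0, 1.0    # assumed cheap 1 TB SATA drive
sas_price, sas_tb = 300.0, 0.5     # assumed 15k SAS drive

use_fraction = 0.2                 # deliberately use only 20% of the SATA drive

sata_cost_per_used_tb = sata_price / (sata_tb * use_fraction)
sas_cost_per_tb = sas_price / sas_tb
# Even "short-stroked" to 20%, the SATA drive comes out cheaper per TB
# actually used, and the remaining 80% is still there if you need it.
```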

If development on BackupPC should go in that direction (or should it be
possible today), splitting off the higher-i/o part of the workload onto
SSDs would make those larger but slower SATA disks much more attractive
relative to faster but smaller SAS disks.

Eric


On Sat, 2011-11-05 at 08:40 +0100, pv wrote:
> Hi
> 
> is there a way to reduce i/o load on the backup-servers significantly?
> 
> we have been using backuppc for years in many different combinations 
> of hardware and filesystems, and i/o-wait is always the killer.
> 
> we are now running 8 backuppc servers holding ~16TB of backup data 
> (quickly changing), and the handling is getting tricky (which host is 
> the client backed up on? is there a backup of every host? when do I 
> have the time to finally really start programming backuppc-hq?)
> 
> so: we are willing to do anything to reduce the number of backup 
> servers (best would be only one).
> 
> e.g. we could give up deduplication and compression, increase RAM and 
> CPU power, change filesystem and OS (debian and xfs now), change 
> raid level (none, raid-0, raid-1 and raid-10 now), and so on.
> 
> what we can't do, for financial reasons, is drop the cheap SATA 
> drives. changing to 15k SAS, e.g., would be much more expensive (even 
> when factoring in the rackspace, power, machines, manpower and so on 
> of the current pool of 8 backup servers)
> 
> any tips?
> 
> ys
> Peter
> 
> 
> ------------------------------------------------------------------------------
> RSA(R) Conference 2012
> Save $700 by Nov 18
> Register now
> http://p.sf.net/sfu/rsa-sfdev2dev1
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users AT lists.sourceforge DOT net
> List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki:    http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/
> 



