Subject: Re: [BackupPC-users] What file system do you use?
From: Mark Campbell <mcampbell AT emediatrade DOT com>
To: Russell R Poyner <rpoyner AT engr.wisc DOT edu>, "backuppc-users AT lists.sourceforge DOT net" <backuppc-users AT lists.sourceforge DOT net>
Date: Tue, 17 Dec 2013 10:52:57 -0700
I've done virtually no tuning of ZFS.  In my initial experiments with ZFS, I 
was blowing away my array so often while trying different combinations of 
BackupPC compression/ZFS compression/ZFS dedup that I wrote a little shell 
script to recreate the ZFS array and populate the directories BackupPC needs:

#!/bin/bash
# Recreate the ZFS pool and the directory tree BackupPC expects.
# WARNING: zpool destroy wipes all data on the pool.
bpcdir=/backup/BackupPC
service backuppc stop
zpool destroy -f backup
zpool create backup raidz2 sdc sdd sde sdf sdg sdh
zfs set compression=on backup
zfs set dedup=on backup
zfs set atime=off backup
mkdir -p "$bpcdir"
chown backuppc "$bpcdir"
chmod 750 "$bpcdir"
cd "$bpcdir" || exit 1
mkdir cpool pc pool
chown backuppc cpool pc pool
chmod 750 cpool pc pool
service backuppc start
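
After a run, a quick sanity check confirms the pool and settings took:

zpool status backup
zfs get compression,dedup,atime backup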

That's pretty much the extent of my tuning of ZFS.  This is a CentOS 6.4 x86_64 
system with ZFS on Linux (ZoL) installed.  It has six 250GB disks on a 3ware 
SATA RAID card configured as standalone disks, and 16GB of RAM, which has 
served me well: no swapping, so that's been good.  I've tried to avoid 
intermediary caches for the sake of performance.  As for the compression 
level, I'm just using whatever the default is (isn't it normally 6?).
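
For what it's worth, compression=on just picks ZFS's default algorithm (lzjb 
on ZoL 0.6.x, if I remember right); the 6 only applies to gzip, where plain 
gzip means gzip-6.  Pinning it explicitly on my pool would look like:

# Use gzip at its default level 6 instead of the ZFS default algorithm,
# then check what the pool is actually getting out of it:
zfs set compression=gzip-6 backup
zfs get compression,compressratio backup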

In my case, my data comes from 16 hosts, a mix of Linux, WinXP, Win7, and Win8 
machines.  My biggest backup client is a network drive, CentOS 6 based, that 
houses all sorts of files, but is also itself a repository of backups for some 
Server 2012 machines.  That's a variety of stuff, ranging from SQL Server 
backups to full bare-metal system backups, which, most unfortunately, presents 
itself as a few gigantic files.  Some of these backups are dozens, if not 
hundreds, of GB in size, and all that was changing in these files from day to 
day was a few MB, so I can totally empathize with Timothy's gripes about 
Exchange.  File-based deduplication wasn't helping me here, which is why I 
tried out ZFS.  And boy, does it work.  I'm really only doing about 10 backups 
at a time right now; the same basic system (minus ZFS, plus BackupPC 
compression) was reaching 95% capacity with just 3 backups before.  I 
guarantee that if I were storing more backups, my 7.5-fold reduction would 
skyrocket even higher.
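
For the record, that 7.5 figure is just the numbers from my earlier message 
below:

# BackupPC-reported totals vs. actual ZFS usage:
echo "scale=1; (1800 + 2400) / 557" | bc    # 4200 GB / 557 GB => 7.5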

Thanks,

--Mark


-----Original Message-----
From: Russell R Poyner [mailto:rpoyner AT engr.wisc DOT edu] 
Sent: Tuesday, December 17, 2013 11:12 AM
To: backuppc-users AT lists.sourceforge DOT net
Cc: Mark Campbell
Subject: Re: [BackupPC-users] What file system do you use?

Mark,

Questions, and some comments.

Questions:

What have you done to tune your zfs?
Do you use a ZIL and/or an L2ARC?
How much ram do you have?
What compression level are you using on zfs?

I reflexively put a ZIL on my system, but I'm curious whether anyone has 
experimented with BackupPC performance on ZFS with and without the ZIL.
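
For anyone who wants to run that experiment, log and cache devices can be 
added to and removed from a live pool, so an A/B test is cheap (pool and 
device names here are made up):

zpool add tank log /dev/sdi1      # separate log device (SLOG) for the ZIL
zpool add tank cache /dev/sdj1    # L2ARC read cache
zpool remove tank /dev/sdi1       # take the log device back out to compare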

Comments:

I built a BackupPC-on-ZFS system at my last job, but I took the opposite 
approach on compression: I disabled compression and dedup on ZFS and let 
BackupPC handle those jobs. I haven't seen load problems, but I do notice that 
the transfer speed reported by BackupPC varies a lot between different Windows 
clients, anywhere from 1.4 MB/s to 41 MB/s. This is partly due to network 
speed, since some machines are on Gb connections but most are on 100Mb. There 
also seems to be some dependence on the age and condition of the Windows boxes.

BackupPC reports 76486 GB of fulls and 1442 GB of incrementals.
zfs list shows 11.9 TB allocated from the 65 TB pool for BackupPC data.
That gives me about a 6.4-fold reduction in storage, slightly less than the 
roughly 7.5-fold reduction that you see. My data comes from user files on 12 
Windows 7 machines.
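
Spelled out, assuming zfs list is reporting binary (1024-based) units:

# (76486 GB fulls + 1442 GB incrementals) / 11.9 TB allocated
echo "scale=2; (76486 + 1442) / 1024 / 11.9" | bc    # => 6.39, call it 6.4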

This isn't a perfect comparison, since we have different data sets, but it 
would appear that BackupPC's internal dedup and compression are comparable to, 
or only slightly worse than, what ZFS achieves. That's in spite of the 
expectation that ZFS's block-level dedup might find more duplication than 
BackupPC's file-level dedup.
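
If anyone wants to test that expectation without actually enabling dedup, zdb 
can simulate it against an existing pool (pool name made up):

# Builds a simulated DDT and prints a histogram with the estimated
# dedup ratio, without turning dedup on.
zdb -S tank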

Russ Poyner



On 12/17/13 07:50, Mark Campbell wrote:
> I too am using ZFS, and I can honestly say that ZFS works great, up to a 
> point.  rsync does seem to take up an inordinate amount of resources, but in 
> a smaller shop like mine, it's been tolerable.  I think it would work in a 
> larger shop too, but the system resource requirements (CPU/RAM) would grow 
> larger than what you would normally expect.  I've had a couple of instances 
> of performance issues in my setup where, over time, rsync was uploading data 
> to the system faster than ZFS could process it, so I'd watch my load go 
> through the roof (8.00+ on a quad-core system) and would have to stop 
> BackupPC for an hour or so to let ZFS catch up.  Other than that, though, 
> this system has actually handled it fairly well.
>
> What I really like about ZFS though, is the deduplication coupled with 
> compression.  I've disabled compression in BackupPC to allow ZFS to properly 
> do the dedup & compression (enabling compression in BackupPC kills ZFS' dedup 
> ability, since it messes with the checksums of the files), and I'm getting 
> numbers in the range of 4.xx deduplication.  My ZFS array is 1.12TB in size, 
> yet, according to BackupPC, I've got 1800GB in fulls, and 2400GB in 
> incrementals.  When I query the array for actual disk usage, it says I'm 
> using 557GB of space...  Now that's just too cool.
>
> Thanks,
>
> --Mark
>
>
> -----Original Message-----
> From: Tim Connors [mailto:tconnors AT rather.puzzling DOT org]
> Sent: Monday, December 16, 2013 10:00 PM
> To: General list for user discussion, questions and support
> Subject: Re: [BackupPC-users] What file system do you use?
>
> On Mon, 16 Dec 2013, Timothy J Massey wrote:
>
>> One last thing:  everyone who uses ZFS raves about it.  But seeing as
>> (on Linux) you're limited to either FUSE or out-of-tree kernel modules
>> (of questionable legality:  ZFS' CDDL license is *not* GPL compatible),
>> it's not my first choice for a backup server, either.
> I am using it, and it sucks for a BackupPC load.  In fact, judging by the 
> mailing list, it is currently (and has been for a couple of years) terrible 
> on an rsync-style workload - any metadata-heavy workload will eventually 
> crash the machine after a couple of weeks of uptime.  Some patches that look 
> promising are being tested out of tree right now, but I won't be testing 
> them myself until they hit master in 0.6.3.
>
> Problem for me is that it takes about a month to migrate to a new 
> filesystem.  I migrated to ZFS a couple of years ago with insufficient 
> testing.  I should have stayed on ext4+mdadm (XFS was terrible too - no 
> faster than ext4, and given that I've always lost data with it on various 
> systems because it's such a flaky filesystem, I wasn't gaining anything).
> mdadm is more flexible than ZFS, although harder to configure.  With
> mdadm+ext4, you can choose any disk arrangement you like without being
> limited to simple RAID-Z(n) arrangements of equal-sized disks.  That said, I 
> do prefer ZFS's scrubbing to mdadm's, but only slightly.  If I were starting 
> from scratch and didn't have 4-5 years of backup archives, I'd tell BackupPC 
> to turn off compression and munging of the pool and let ZFS do it.
>
> I used JFS 10 years ago, and "niche buggy product" would be my description 
> of it.  Basically, go with the well-tested, popular FSs, because they're not 
> as bad as everyone makes them out to be.
>
> --
> Tim Connors