Subject: Re: [BackupPC-users] What file system do you use?
From: Mark Campbell <mcampbell AT emediatrade DOT com>
To: Russell R Poyner <rpoyner AT engr.wisc DOT edu>, "backuppc-users AT lists.sourceforge DOT net" <backuppc-users AT lists.sourceforge DOT net>
Date: Wed, 18 Dec 2013 07:08:10 -0700
Thanks, Russell, for the link.  I'd previously used another link that contained 
similar info on RAM calculation, and it's curious that I get very different 
results between the two.  Your link suggests running zdb -b <array>, multiplying 
the bp count by 320 bytes to get the approximate dedup table size, and later 
multiplying that by 4 so that all metadata and cached block data can be kept in 
RAM.

In the article I'd previously found, it suggests running zdb -S <array> and 
multiplying the total of allocated blocks by 320 bytes.  The strange thing is 
that these two block counts are significantly different.  Running zdb -S yielded 
6.01M for me, while zdb -b yielded 29,499,480, almost 5x the first number...  
Any ideas on why these numbers vary so drastically?
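
Just so we're comparing apples to apples, here's roughly how I'm running the 
numbers (a rough sketch; "backup" is my pool name, and the 4x factor is the 
rule of thumb from your link):

zdb -b backup | grep 'bp count'   # total block pointers in the pool, metadata included
zdb -S backup                     # simulate the dedup table (DDT) and print a histogram
# rule of thumb: blocks x 320 bytes, then x4 to keep the DDT plus metadata in RAM
echo $((29499480 * 320 * 4)) | awk '{printf "%.1f GiB\n", $1/2^30}'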

Thanks,

--Mark


-----Original Message-----
From: Russell R Poyner [mailto:rpoyner AT engr.wisc DOT edu] 
Sent: Tuesday, December 17, 2013 1:34 PM
To: Mark Campbell; backuppc-users AT lists.sourceforge DOT net
Subject: Re: [BackupPC-users] What file system do you use?

Thanks Mark.

From the ZFS man page for ZoL, it looks like the default compression is lzjb, 
the same as other ZFS implementations.  I generally use lz4, which is basically 
lzjb with some performance upgrades.  It's a minor tweak unless you have a lot 
of incompressible files.

If you are seeing decent data rates without a separate ZIL, it likely means 
that BackupPC is not doing synchronous writes.  That answers one of my 
longstanding questions about BPC.
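
A quick way to sanity-check that (just a sketch; "backup" is the pool name from 
your script below) is to force synchronous semantics temporarily and see 
whether the data rate collapses:

zfs get sync backup            # standard by default
zfs set sync=always backup     # treat every write as synchronous
# ...run a backup and compare throughput...
zfs set sync=standard backup   # put it back when done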

It's possible that your performance stalls are related to the size of your 
dedupe table.  Performance will tank if you have to read the dedupe table from 
disk rather than having all of it cached in RAM.  This is a well-known 
performance issue with ZFS dedupe.  There is a good discussion of the issue here:

http://constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe

I suspect that using an SSD as L2ARC to hold the overflow from the dedupe table 
would give adequate performance for backups.
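
If you want to try that, it's only a couple of commands (a sketch; lz4 needs a 
reasonably recent ZoL, and /dev/sdX1 stands in for whatever SSD partition you 
have spare):

zfs set compression=lz4 backup
zpool add backup cache /dev/sdX1
zpool iostat -v backup   # the cache device gets its own line once it starts filling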

RP

On 12/17/13 11:52, Mark Campbell wrote:
> I've done virtually no tuning of ZFS.  In my initial experiments with ZFS, I 
> was blowing away my array so often while trying different combinations of BPC 
> compress/ZFS compress/ZFS dedup that I wrote a little shell script to recreate 
> the ZFS array & populate the necessary directories for BackupPC:
>
> #!/bin/bash
> bpcdir=/backup/BackupPC
> service backuppc stop
> zpool destroy -f backup
> zpool create backup raidz2 sdc sdd sde sdf sdg sdh
> zfs set compression=on backup
> zfs set dedup=on backup
> zfs set atime=off backup
> mkdir $bpcdir
> chown backuppc $bpcdir
> chmod 750 $bpcdir
> cd $bpcdir
> mkdir cpool pc pool
> chown backuppc *
> chmod 750 *
> service backuppc start
>
> That's pretty much the extent of my ZFS tuning.  This is a CentOS 6.4 x86_64 
> system with ZoL installed.  I've got six 250GB disks on a 3ware SATA RAID card 
> configured as standalone disks, and 16GB of RAM, which has served me well; no 
> swapping, so that's been good.  I've tried to avoid intermediary caches for 
> the sake of performance.  For the compression level, it's just whatever the 
> default is (isn't it normally 6?).
>
> In my case, my data comes from 16 hosts, a mix of Linux, WinXP, Win7, & Win8 
> machines.  My biggest backup client is a network drive, CentOS 6 based, that 
> houses all sorts of files but is also itself a repository for some Server 2012 
> backups.  This is a variety of stuff, ranging from SQL Server backups to full 
> bare-metal system backups, which, most unfortunately, present themselves as a 
> few gigantic files.  Some of these backups are dozens, if not hundreds, of GB 
> in size, and all that was changing in these files from day to day was a few MB 
> worth, so I can totally empathize with Timothy's gripes about Exchange.  
> File-based deduplication wasn't helping me here, so that's why I tried out 
> ZFS.  And boy, does it work.  I'm really only doing about 10 backups at a time 
> right now; the same basic system (without ZFS, with BackupPC compression) was 
> reaching 95% capacity with just 3 backups before.  I guarantee that were I 
> storing more backups, my 7.5-fold reduction would skyrocket even higher.
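>
> If you want to compare our numbers directly, the ratios I'm quoting come 
> straight from the pool tools (a rough sketch; "backup" is my pool name):
>
> zpool list backup                   # the DEDUP column is the pool-wide dedup ratio
> zpool get dedupratio backup
> zfs get compressratio,used backup   # compression ratio and actual space consumed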
>
> Thanks,
>
> --Mark
>
>
> -----Original Message-----
> From: Russell R Poyner [mailto:rpoyner AT engr.wisc DOT edu]
> Sent: Tuesday, December 17, 2013 11:12 AM
> To: backuppc-users AT lists.sourceforge DOT net
> Cc: Mark Campbell
> Subject: Re: [BackupPC-users] What file system do you use?
>
> Mark,
>
> Questions, and some comments.
>
> Questions:
>
> What have you done to tune your zfs?
> Do you use a ZIL? and or an L2ARC?
> How much ram do you have?
> What compression level are you using on zfs?
>
> I reflexively put a ZIL on my system, but I'm curious whether anyone has 
> experimented with BackupPC performance on ZFS with and without the ZIL.
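>
> Testing that should be cheap if there's a spare SSD partition around, since a 
> log device can be added to and removed from a live pool (a sketch only; the 
> pool and device names are placeholders):
>
> zpool add <pool> log <ssd-partition>     # attach a separate ZIL (SLOG)
> zpool remove <pool> <ssd-partition>      # detach it again after the comparison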
>
> Comments:
>
> I built a BackupPC-on-ZFS system at my last job, but I took the opposite 
> approach on compression: I disabled compression and dedupe on ZFS and let 
> BackupPC handle those jobs.  I haven't seen load problems, but I do notice 
> that the transfer speed reported by BackupPC varies a lot between different 
> Windows clients, anywhere from 1.4 MB/s to 41 MB/s.  This is partly due to 
> network speed, since some machines are on Gb connections but most are on 
> 100Mb.  There also seems to be some dependence on the age and condition of the 
> Windows boxes.
>
> BackupPC reports 76,486 GB of fulls and 1,442 GB of incrementals, and zfs 
> list shows 11.9 TB allocated from the 65 TB pool for BackupPC data.  That 
> gives me about a 6.4-fold reduction in storage, slightly less than the roughly 
> 7.5-fold reduction that you see.  My data comes from user files on 12 Windows 
> 7 machines.
>
> This is a poor comparison since we have different data sets, but it would 
> appear that BackupPC's internal dedupe and compression are comparable to, or 
> only slightly worse than, what ZFS achieves.  This is in spite of the 
> expectation that ZFS block-level dedupe might find more duplication than 
> BackupPC's file-level dedupe.
>
> Russ Poyner
>
>
>
> On 12/17/13 07:50, Mark Campbell wrote:
>> I too am using ZFS, and I can honestly say that ZFS works great, up to a 
>> point.  rsync does seem to take up an inordinate amount of resources, but in 
>> a smaller shop like mine it's been tolerable.  I think it would work in a 
>> larger shop too, but the system resource requirements (CPU/RAM) would grow 
>> larger than you would normally expect.  I've had a couple of instances of 
>> performance issues in my setup where, over time, rsync was uploading data to 
>> the system faster than ZFS could process it, so I'd watch my load go through 
>> the roof (8.00+ on a quad-core system) and would have to stop BackupPC for an 
>> hour or so, so that ZFS could catch up.  Other than that, this system has 
>> actually handled it fairly well.
>>
>> What I really like about ZFS, though, is the deduplication coupled with 
>> compression.  I've disabled compression in BackupPC to let ZFS properly do 
>> the dedup & compression (enabling compression in BackupPC kills ZFS's dedup 
>> ability, since it changes the file contents and thus the block checksums), 
>> and I'm getting numbers in the range of 4.xx deduplication.  My ZFS array is 
>> 1.12TB in size, yet according to BackupPC I've got 1800GB in fulls and 2400GB 
>> in incrementals.  When I query the array for actual disk usage, it says I'm 
>> using 557GB of space...  Now that's just too cool.
>>
>> Thanks,
>>
>> --Mark
>>
>>
>> -----Original Message-----
>> From: Tim Connors [mailto:tconnors AT rather.puzzling DOT org]
>> Sent: Monday, December 16, 2013 10:00 PM
>> To: General list for user discussion, questions and support
>> Subject: Re: [BackupPC-users] What file system do you use?
>>
>> On Mon, 16 Dec 2013, Timothy J Massey wrote:
>>
>>> One last thing:  everyone who uses ZFS raves about it.  But seeing as (on 
>>> Linux) you're limited to either FUSE or out-of-tree kernel modules (of 
>>> questionable legality:  ZFS' CDDL license is *not* GPL compatible), it's 
>>> not my first choice for a backup server, either.
>> I am using it, and it sucks for a BackupPC load.  In fact, from the mailing 
>> list, it is currently (and has been for a couple of years) terrible on an 
>> rsync-style workload: any metadata-heavy workload will eventually crash the 
>> machine after a couple of weeks of uptime.  Some patches are being tested 
>> right now out of tree that look promising, but I won't be testing them myself 
>> until they hit master for 0.6.3.
>>
>> The problem for me is that it takes about a month to migrate to a new 
>> filesystem.  I migrated to ZFS a couple of years ago with insufficient 
>> testing.  I should have stayed on ext4+mdadm (XFS was terrible too: no faster 
>> than ext4, and given that I've always lost data on various systems with it 
>> because it's such a flaky filesystem, I wasn't gaining anything).  mdadm is 
>> more flexible than ZFS, although harder to configure.  With mdadm+ext4 you 
>> can choose any disk arrangement you like without being limited to simple 
>> RAID-Z(n) arrangements of equal-sized disks.  That said, I do prefer ZFS's 
>> scrubbing to mdadm's, but only slightly.  If I were starting from scratch and 
>> didn't have 4-5 years of backup archives, I'd tell BackupPC to turn off 
>> compression and munging of the pool, and let ZFS do it.
>>
>> I used JFS 10 years ago, and "niche buggy product" would be my description 
>> of it.  Basically, go with the well-tested, popular FSs, because they're not 
>> as bad as everyone makes them out to be.
>>
>> --
>> Tim Connors
>>



_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
