Re: [BackupPC-users] improving the deduplication ratio

2008-04-09 11:41:12
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: backuppc-users AT lists.sourceforge DOT net
Date: Wed, 09 Apr 2008 10:12:09 -0500
Tino Schwarze wrote:
>
>> I've seen that some commercial backup systems with built-in
>> deduplication (which often run under Linux :-) split files into
>> 128k or 256k chunks prior to deduplication.
>>
>> That would improve the deduplication ratio for big log files, mbox
>> files, rarely updated binary databases, etc. Only the last chunk of a log
>> file would create a new entry in the backuppc pool.
>>
>> Does anybody have ideas about how this feature could be added without major
>> changes to backuppc? (Replacing hard links with text files containing
>> the list of chunks?)
> 
> Oh no, not even more small files in the file system! :-<
> 
> But the idea's not too bad. And it would need to be enabled per host
> (possibly per file or backed up directory) anyway.

I'd probably look at what rdiff-backup does with incremental differences
and, instead of chunking everything, just track changes where the
differences are small.
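
To make the trade-off concrete, here is a minimal sketch of the fixed-size
chunking idea quoted above. It is not how BackupPC or any commercial product
works: the in-memory dict standing in for the pool, the SHA-256 digests, and
the names store_chunks and restore are all assumptions for illustration.

```python
import hashlib

CHUNK_SIZE = 128 * 1024  # 128 KiB, the smaller size mentioned in the quote


def store_chunks(data, pool, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks, add unseen chunks to the pool,
    and return the file's manifest (the ordered list of chunk digests)."""
    manifest = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        pool.setdefault(digest, chunk)  # stored only if not pooled yet
        manifest.append(digest)
    return manifest


def restore(manifest, pool):
    """Rebuild the file contents from its manifest."""
    return b"".join(pool[digest] for digest in manifest)


# Hypothetical log file: 30000 distinct 14-byte lines, then 100 more appended.
pool = {}
old_log = b"".join(b"line %08d\n" % i for i in range(30000))
new_log = old_log + b"".join(b"line %08d\n" % i for i in range(30000, 30100))

old_manifest = store_chunks(old_log, pool)
chunks_before = len(pool)
new_manifest = store_chunks(new_log, pool)
added_by_append = len(pool) - chunks_before  # only the last chunk changed

# One byte inserted at the front shifts every chunk boundary.
shifted_manifest = store_chunks(b"X" + old_log, pool)
shared_after_shift = set(shifted_manifest) & set(old_manifest)
```

Appending to the log adds a single new chunk to the pool, which is the case
the original poster had in mind. Inserting one byte at the front leaves no
chunk in common with the old version, so fixed-size chunking only helps with
append-mostly files; for small changes elsewhere in a file, a delta against
the previous version should do better.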

-- 
   Les Mikesell
    lesmikesell AT gmail DOT com

