Re: [BackupPC-users] Problems with hardlink-based backups...

2009-08-31 13:17:17
Subject: Re: [BackupPC-users] Problems with hardlink-based backups...
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Mon, 31 Aug 2009 13:11:34 -0400
Jim Leonard wrote at about 21:18:16 -0500 on Sunday, August 30, 2009:
 > dan wrote:
 > > One of the biggest concerns with backuppc that is constantly discussed 
 > > on this list is syncing the backup data between two or more servers.  
 > > Simply reducing the file count by eliminating the hardlinks would allow 
 > > rsync to be used reliably and effectively.  
 > 
 > It's almost as if you guys haven't heard of filesystem-specific dump 
 > utilities.  For such utils (vxdump, ufsdump, zfs send/receive, etc.) the 
 > number of hardlinks isn't a problem.  You can do both full and 
 > incremental dumps, even across separate machines.  This isn't a problem 
 > that needs solving.

I think you are missing some key points.
First, why should a program require its own separate filesystem? This
seems to me an unnatural and kludgey requirement.

Also, unless you use LVM, you will have difficulty growing or moving
the filesystem -- and even then it seems heavy-handed to me to have to
worry about adding LVM volumes. Plus, as you add LVM volumes, the
setup comes to resemble one huge JBOD, with the corresponding risk of
corruption.

I see a lot of advantage in keeping the database portion relatively
small, fast, replicable, and movable. You can then keep and
distribute the files themselves wherever you want, spread across
one or more separate filesystems. The database portion is then
optimized for what a database does best, and the file-storing
portion for what a filesystem does best. Both parts are easily
movable and replicable, and neither depends on or is limited by
hardlinks or other filesystem-dependent functionality.

Don't get me wrong - BackupPC is great, and hardlinks are a great
kludge that at first glance gets you something for nothing. I'm just
saying that hardlinks, while "easy", bring some longer-term
limitations, and that there comes a time when it may be worth
investing in going beyond them.

Personally, I would like to see BackupPC evolve to combine the pooling
functionality, leveraging of rsync, and relative simplicity of the
existing BackupPC with the expandability, portability, and flexibility
of database-based systems like Bacula. I believe that the
combination of a database to store the file attributes and metadata
together with a filesystem to store the pool would be an ideal hybrid.
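
To make the hybrid idea concrete, here is a rough sketch of what I
have in mind, not a description of how BackupPC works today. File
contents go into a content-addressed pool directory exactly once, and
each (backup, path) entry becomes a database row instead of a
hardlink. Everything named here (the `files` table, `store_file`,
`open_catalog`, the use of SQLite, and SHA-256 as the pool key) is my
own assumption for illustration.

```python
import hashlib
import os
import sqlite3
import tempfile


def open_catalog(db_path=":memory:"):
    """Open the (hypothetical) metadata catalog, creating the table if needed."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " backup_id INTEGER, path TEXT, mode INTEGER, size INTEGER, digest TEXT,"
        " PRIMARY KEY (backup_id, path))"
    )
    return db


def store_file(db, pool_dir, backup_id, path, data, mode=0o644):
    """Write contents to the pool once; record the metadata as a database row."""
    digest = hashlib.sha256(data).hexdigest()
    pool_path = os.path.join(pool_dir, digest[:2], digest)
    if not os.path.exists(pool_path):
        # New content: it is written to the pool exactly once.
        os.makedirs(os.path.dirname(pool_path), exist_ok=True)
        with open(pool_path, "wb") as fh:
            fh.write(data)
    # One row per (backup, path) takes the place of one hardlink per (backup, path).
    db.execute(
        "INSERT INTO files (backup_id, path, mode, size, digest)"
        " VALUES (?, ?, ?, ?, ?)",
        (backup_id, path, mode, len(data), digest),
    )
    return digest


def count_pool_files(pool_dir):
    """Count the regular files actually stored under the pool directory."""
    return sum(len(names) for _, _, names in os.walk(pool_dir))


def demo():
    """Three catalog entries with identical contents share one pool file."""
    with tempfile.TemporaryDirectory() as pool_dir:
        db = open_catalog()
        store_file(db, pool_dir, 1, "/etc/motd", b"hello\n")
        store_file(db, pool_dir, 2, "/etc/motd", b"hello\n")      # unchanged file
        store_file(db, pool_dir, 2, "/etc/motd.bak", b"hello\n")  # duplicate content
        rows = db.execute("SELECT COUNT(*) FROM files").fetchone()[0]
        return rows, count_pool_files(pool_dir)
```

With a layout along these lines, the pool is an ordinary tree of
singly-linked files that rsync can copy without tracking hardlinks,
and the catalog is a single database file that can be replicated
separately.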

 > 
 > I feel like the whole "we need an SQL/hybrid" solution discussion is 
 > happening because you aren't aware of better ways to do things.  Just 
 > because a filesystem is a database doesn't mean it would be "better" to 
 > replace it with a "better" database.

Well, did you ever consider that maybe there is a reason that people
keep returning to the same issues and are not satisfied with the current
BackupPC approach?
One issue is the pool replication issue that you cite, and many of us
are not satisfied with the answer of "just use ZFS" or "just block
copy the filesystem".

Another equally important limitation is that expanding the attrib
files to include Linux extended attributes or Windows ACLs (or any
ACLs) is kludgey, which is probably why it hasn't been done. And I
don't see how you can have a complete backup solution if you can't
back up all the associated file attributes.
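
As a sketch of why a database makes this easier: extended attributes
and ACL entries can be stored as extra rows keyed by file, so a new
kind of attribute needs no change to an on-disk attrib format. The
schema and the names below (`files`, `file_attrs`, `add_file`,
`attrs_for`) are hypothetical, and the sample attribute values are
made up for illustration.

```python
import sqlite3


def make_catalog():
    """Create an in-memory catalog with a per-file attribute table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (file_id INTEGER PRIMARY KEY, path TEXT UNIQUE)")
    # 'kind' distinguishes attribute families, e.g. 'xattr' or 'acl'.
    db.execute(
        "CREATE TABLE file_attrs ("
        " file_id INTEGER REFERENCES files(file_id),"
        " kind TEXT, name TEXT, value BLOB,"
        " PRIMARY KEY (file_id, kind, name))"
    )
    return db


def add_file(db, path, attrs):
    """Insert a file and any number of (kind, name, value) attributes."""
    cur = db.execute("INSERT INTO files (path) VALUES (?)", (path,))
    file_id = cur.lastrowid
    db.executemany(
        "INSERT INTO file_attrs (file_id, kind, name, value) VALUES (?, ?, ?, ?)",
        [(file_id, kind, name, value) for kind, name, value in attrs],
    )
    return file_id


def attrs_for(db, path):
    """Return all stored attributes for a path, sorted by kind and name."""
    rows = db.execute(
        "SELECT a.kind, a.name, a.value FROM file_attrs a"
        " JOIN files f ON f.file_id = a.file_id"
        " WHERE f.path = ? ORDER BY a.kind, a.name",
        (path,),
    )
    return rows.fetchall()
```

A file with no extended attributes simply has no rows in
`file_attrs`, so ordinary files cost nothing extra.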

 > 
 > For anyone thinking that working with giant multi-gigabyte BLOBs in a 
 > database is the right way to go, I suggest you actually attempt it 
 > yourself and see what happens.  I'm backing up my HD video production 
 > rig with BackupPC, and although such a machine (Windows, 16T of storage, 
 > most video files are at least 50G in size) is outside of the intent of 
 > BackupPC, it actually works.  If BackupPC were to rely on an SQL 
 > database, it would greatly shrink the potential userbase.

You are attacking a straw man. No one has ever suggested
"multi-gigabyte BLOBs in a database." The database would consist only
of the filenames, links, attrib data, and other backup-related metadata. I
would imagine that in most cases this would be at most a couple of
gigabytes, even if you have millions of files in your pool.
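
A back-of-envelope version of that estimate, as a sketch: the per-row
byte counts below are my guesses, not measurements of any real
database, and `estimate_catalog_bytes` is just an illustrative helper.

```python
def estimate_catalog_bytes(num_rows, avg_path_bytes=100, attr_bytes=40,
                           digest_bytes=32, row_overhead_bytes=28):
    """Rough catalog size: one metadata row per backed-up file entry.

    All per-row sizes are assumed values chosen for illustration.
    """
    bytes_per_row = avg_path_bytes + attr_bytes + digest_bytes + row_overhead_bytes
    return num_rows * bytes_per_row


# Ten million entries at an assumed 200 bytes per row.
size = estimate_catalog_bytes(10_000_000)
print(f"{size / 1e9:.1f} GB")
```

Under those assumptions the catalog stays in the low gigabytes, and
the file contents themselves never enter the database.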

 > -- 
 > Jim Leonard (trixter AT oldskool DOT org)            http://www.oldskool.org/
 > Help our electronic games project:           http://www.mobygames.com/
 > Or check out some trippy MindCandy at     http://www.mindcandydvd.com/
 > A child borne of the home computer wars: http://trixter.wordpress.com/
 > 
 > ------------------------------------------------------------------------------
 > Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
 > trial. Simplify your report design, integration and deployment - and focus 
 > on 
 > what you do best, core application coding. Discover what's new with 
 > Crystal Reports now.  http://p.sf.net/sfu/bobj-july
 > _______________________________________________
 > BackupPC-users mailing list
 > BackupPC-users AT lists.sourceforge DOT net
 > List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
 > Wiki:    http://backuppc.wiki.sourceforge.net
 > Project: http://backuppc.sourceforge.net/
 > 
