BackupPC-users

Re: [BackupPC-users] Problems with hardlink-based backups...

2009-08-30 01:18:36
Subject: Re: [BackupPC-users] Problems with hardlink-based backups...
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Sun, 30 Aug 2009 01:15:03 -0400
dan wrote at about 10:30:17 -0600 on Sunday, August 23, 2009:
 > Speed.  BackupPC is constrained by I/O performance.  One bottleneck is
 > that the storage volume must be a single filesystem because of
 > hardlinks.  It has been measured a number of times on this mailing list that
 > I/O is the major bottleneck for BackupPC.  Getting faster hardware certainly
 > helps, but the reliance on a single filesystem for all data is a bottleneck
 > for performance as well as an irritation when upgrading storage, as you
 > either need to add additional RAID arrays (as expanding a RAID is not
 > generally an option) or just use JBOD with LVM or something.  Not ideal.
 > 
 > My solution is to break the backup scheme into smaller chunks and have a
 > number of BackupPC servers, each handling a set number of clients.  The issues
 > here are complexity, as I need to admin a number of servers, and loss of the
 > file de-duping.  In my organization, like many others, the clients have many
 > absolutely identical files.  4 backup machines means that a massive amount
 > of data is duplicated 4 times PLUS whatever redundancy is in the RAID.
 > 
 > A hybrid platform can use the filesystem's strengths and a database's
 > strengths and not have most of the weaknesses.
 > 
 > 
 > My example was a simplistic one.  Sure, MD5 can have some collisions, so
 > either use MD5+SHA1 or just do SHA2.  You would need to store a few more pieces
 > of data, but I think it would be hard to argue against MySQL being many orders
 > of magnitude faster at finding data than a filesystem, just like it is hard to
 > argue against a filesystem being many times faster at simply storing files and
 > even faster at storing large files.
 > 
 > Other benefits of the hybrid system are that the files can be on different
 > volumes than the database.  In fact, because you store each file's location on
 > disk in the database, you could store files on many different disks, with no
 > issues with hardlinks.  Because of this, you could put two BackupPC machines
 > together in a cluster and each instance of backuppc would look at the same
 > database (or replicated data on their own database) and be able to do online
 > replication of the filestore on other servers.  They could automatically
 > duplicate these files on their own local file store and because there are
 > not millions of hardlinks to worry about, rsync can actually be useful in
 > syncing up file stores to other BackupPC machines.  Sure, you will still have
 > a lot of files, but you will have far fewer files for rsync to track.  rsync
 > can handle a lot of files.  With BackupPC, rsync actually has to track every
 > instance of every file from each host and each backup number, plus the pool.
 > Without the hardlink pooling, rsync would only have to see each file once.
 > 

The hybrid system also has many other advantages including:
- Allows BackupPC to work on OSes/filesystems that don't support
  Unix-type hardlinks, such as Windoze
- Allows for more expandable, robust, and faster storage of
  metadata. Continuing to expand attrib files to include ACLs and
  other extended attributes will just make the hack messier and
  slower.
- Allows for more granular security and access controls to backups
