Re: [BackupPC-users] Backing up a BackupPC server

Peter Walter wrote at about 06:27:35 -0400 on Tuesday, June 2, 2009:
 > I have read with interest various threads on this list concerning 
 > methods of how to back up a backuppc server to a remote file system over 
 > the internet. My impression from reading the threads is that there is no 
 > *good* way - that rsync is a poor choice if you have many hardlinks, and 
 > methods like copying a "snapshot"  of a block-level device are 
 > inefficient if only a relatively small proportion of the data changes. I 
 > have tried both methods, and am not satisfied with the performance and 
 > efficiency of either. In addition, BackupPC is not compatible with 
 > 'cloud' storage systems - at least the ones I have looked at do not seem 
 > to support hardlinks.
 > 
 > As a Linux newbie, I have only a partial understanding of the technology 
 > underlying Linux and BackupPC, but I get the impression that the problem 
 > with a rsync-like solution is that processing hardlinks is very 
 > expensive in terms of cpu time and memory resources. This may be a 
 > stupid question, but, if hardlinks are the problem, has any thought been 
 > given to adding to BackupPC an option to use some form of database 
 > (text, SQL or otherwise) to associate hashes to files, instead? It seems 
 > to me that using hardlinks is in fact using that feature of the file 
 > system *as* a database, a use that does not appear to be optimal ... if 
 > I have misunderstood, please educate me :-)
 > 
 > Peter
 > 

Indeed this has been discussed many times before ;) -- see the archives.

That being said, I agree that using a database to store both the
pool references (currently hardlinks) and the metadata currently kept
in the attrib files would be a more elegant, extensible, and
platform-independent solution, though presumably it would require a
major rewrite of BackupPC.

I certainly understand why BackupPC uses hardlinks: they allow for an
easy way to do the pooling and, as you suggest, in a sense use the
filesystem as a rudimentary database.
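
As a toy illustration of that idea (a sketch only, in Python rather
than BackupPC's Perl; the directory names, the SHA-256 hash and the
store() helper are assumptions made up for this example, not
BackupPC's real layout or digest), identical content is written once
to a hash-named pool file and every backup tree just hardlinks to it.
The link count on the pool file then acts as the "reference count":

```python
import hashlib
import os
import tempfile


def store(pool_dir, backup_path, data):
    """Store data at backup_path as a hardlink to a hash-named pool file."""
    # Hypothetical pooling key: SHA-256 of the full contents.
    digest = hashlib.sha256(data).hexdigest()
    pool_path = os.path.join(pool_dir, digest)
    if not os.path.exists(pool_path):
        # First time this content is seen: write it to the pool once.
        with open(pool_path, "wb") as f:
            f.write(data)
    os.makedirs(os.path.dirname(backup_path), exist_ok=True)
    # The backup tree entry is only another name for the pool file.
    os.link(pool_path, backup_path)
    return pool_path


def demo():
    """Back up the same content from two hosts and inspect the pool."""
    with tempfile.TemporaryDirectory() as root:
        pool = os.path.join(root, "pool")
        os.makedirs(pool)
        data = b"same contents on two hosts\n"
        p1 = store(pool, os.path.join(root, "pc", "hostA", "0", "motd"), data)
        p2 = store(pool, os.path.join(root, "pc", "hostB", "0", "motd"), data)
        # One pool name plus two backup names share a single inode.
        return p1 == p2, os.stat(p1).st_nlink, len(os.listdir(pool))
```

The catch is that this "database" lives entirely in inodes and
directory entries, so pool and backup trees must sit on one filesystem,
and any tool copying them has to track every inode to preserve the
links.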

On the other hand, as I and others have mentioned before, moving to a
database would bring the following advantages:

1. Platform and filesystem independence -- BackupPC would no longer
   depend on the specific hard link behaviors of Linux and its
   associated filesystems.

2. It would be easier to extend the attrib notion to store extended
   attributes whether for Linux (e.g., selinux attributes), Windows
   (e.g., ACL attributes) or any other OS.

3. The pool could be split among multiple disks and filesystems since
   it would no longer depend on hard-link behavior.

4. Backing up BackupPC backups would be much easier and faster since
   you would no longer have hard links to worry about -- just back up
   the database and any portion of the pool that you want to.

5. The whole system would be more elegant and extensible since all
   types of metadata could be stored in the database rather than being
   stored in various files in the BackupPC tree. For example,
         - You wouldn't need the kludge of file mangling
         - Checksums could be stored in the database rather than being
           appended in a non-standard way to the end of the file
         - File-level encryption could easily be added
         - Alternative file-level compression schemes could easily be
           supported
         - The host-specific config data (and maybe even all the config
           data) could be stored in tables rather than in individual
           config files
         - The 'backups' file could also be stored as a table

6. Presumably a database architecture would also make it easier to
   have more granular control over user access and permissions at the
   feature and file level.
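
To make the list above a bit more concrete, here is a minimal sketch
of what such a catalog might look like, using Python's sqlite3 module.
Everything in it is an assumption for illustration: the table and
column names, the SHA-256 digest, and the helper functions are
hypothetical and are not part of BackupPC. The pool table replaces the
hardlinks, the location column lets pool files live on different
disks, and the xattrs table holds per-OS attributes:

```python
import hashlib
import sqlite3

# Hypothetical schema; names are illustrative only.
SCHEMA = """
CREATE TABLE pool (
    digest   TEXT PRIMARY KEY,   -- content hash, one row per unique file
    size     INTEGER NOT NULL,
    location TEXT NOT NULL       -- which disk/filesystem holds the data
);
CREATE TABLE files (
    host       TEXT NOT NULL,
    backup_num INTEGER NOT NULL,
    path       TEXT NOT NULL,    -- original name, no mangling needed
    mode       INTEGER,
    mtime      INTEGER,
    digest     TEXT NOT NULL REFERENCES pool(digest),
    PRIMARY KEY (host, backup_num, path)
);
CREATE TABLE xattrs (
    host       TEXT NOT NULL,
    backup_num INTEGER NOT NULL,
    path       TEXT NOT NULL,
    name       TEXT NOT NULL,    -- e.g. an SELinux label or ACL key
    value      BLOB,
    PRIMARY KEY (host, backup_num, path, name)
);
"""


def open_catalog(db_path=":memory:"):
    """Create a catalog database and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def add_file(conn, host, backup_num, path, data,
             mode=0o644, mtime=0, location="disk1"):
    """Record one backed-up file; content is pooled by its digest."""
    digest = hashlib.sha256(data).hexdigest()
    # Only the first occurrence of a given content creates a pool row.
    conn.execute(
        "INSERT OR IGNORE INTO pool (digest, size, location) VALUES (?, ?, ?)",
        (digest, len(data), location),
    )
    conn.execute(
        "INSERT INTO files (host, backup_num, path, mode, mtime, digest) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (host, backup_num, path, mode, mtime, digest),
    )
    return digest


def ref_count(conn, digest):
    """Number of backed-up files pointing at a pool entry."""
    row = conn.execute(
        "SELECT COUNT(*) FROM files WHERE digest = ?", (digest,)
    ).fetchone()
    return row[0]
```

With something like this, the same file backed up from two hosts gives
one pool row and two files rows, and ref_count() plays the role that
the hardlink count plays today. Copying the server elsewhere would
then mean dumping the database and copying ordinary pool files, with
no hardlinks to preserve.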

The challenge, though, is that doing this right (i.e., in a way that
is both elegant and extensible) would require a substantial, if not
almost complete, rewrite of BackupPC, and I'm not sure that Craig (or
anybody else for that matter) is willing to sign up for that...

Still, it would be awesome to combine the simplicity and pooling
structure of BackupPC with the flexibility of a database
architecture...

