Re: [Bacula-users] [Bacula-devel] Idea/suggestion for dedicated disk-based sd

2010-04-08 02:40:51
From: Kern Sibbald <kern AT sibbald DOT com>
To: bacula-devel AT lists.sourceforge DOT net
Date: Thu, 8 Apr 2010 08:39:04 +0200
Hello,

I haven't seen the original messages, so I am not sure I understand the 
full concept here, and my remarks may not be pertinent.

However, from what I see, this is basically similar to what BackupPC does.  The 
big problem I have with it is that it does not scale well to thousands of 
machines.

If I were thinking about changing the disk Volume format, I would start by 
looking at how git handles storing objects, and whether git can scale to 
handle a machine with 40 million file entries.
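
For readers unfamiliar with it, here is a rough sketch of how git stores a
file as a "loose" blob object: the content is hashed together with a small
header, and the object is written zlib-compressed under a directory named
after the first two hex digits of the hash. This is an illustration only,
not Bacula code; the function names and the objects_dir parameter are
hypothetical, and real git also uses packfiles, which are not shown.

```python
import hashlib
import os
import zlib


def store_blob(objects_dir: str, data: bytes) -> str:
    """Store data as a git-style loose blob and return its hex object id."""
    # git hashes a small header plus the content: b"blob <size>\0<content>"
    payload = b"blob " + str(len(data)).encode("ascii") + b"\0" + data
    oid = hashlib.sha1(payload).hexdigest()
    # Objects are fanned out by the first two hex digits: <dir>/ab/cdef...
    path = os.path.join(objects_dir, oid[:2], oid[2:])
    if not os.path.exists(path):  # identical content is stored only once
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(zlib.compress(payload))
    return oid


def load_blob(objects_dir: str, oid: str) -> bytes:
    """Read a blob back by object id, stripping the header."""
    path = os.path.join(objects_dir, oid[:2], oid[2:])
    with open(path, "rb") as f:
        payload = zlib.decompress(f.read())
    # The header ends at the first NUL byte; everything after it is content.
    _header, _sep, data = payload.partition(b"\0")
    return data
```

Because the object name is derived from the content, two clients backing up
the same file would produce a single stored object without any hard links.
Whether this layout holds up at tens of millions of entries is exactly the
open question.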

One thing is sure: unless someone finds a new way of implementing hard links, 
you will never see Bacula using hard links in the volumes. That is a sure way 
to make your machine unbootable if you scale large enough.  Just back up enough 
clients with BackupPC and one day you will find that fsck no longer works.  I 
suspect that it will require only a couple hundred million hard links before a 
Linux machine will no longer boot.

Regards,

Kern

On Wednesday 07 April 2010 22:15:24 Phil Stracchino wrote:
> On 04/07/10 12:06, Robert LeBlanc wrote (in bacula-users):
> > So still thinking about this, is there any reason to not have a
> > hierarchical file structure for disk based backup rather than a
> > serialized stream? Here are my thoughts; any comments are welcome, to
> > have a good discussion about this.
> >
> > SD_Base_Dir
> >     +- PoolA
> >     +- PoolB
> >         +- JobID1
> >         +- JobID2
> >             +- Clientinfo.bacula  (1)
> >             +- Original File Structure  (2)
> >                 +- ClientFileA
> >                 +- ClientFileA.bacula  (3)
> >                 +- ClientFileB
> >                 +- ClientFileB.bacula
> >                 +- ClientDirA
> >                 +- ClientDirA.bacula
> >
> > (1) Bacula serial file that holds information similar to the block
> >     header.
> > (2) The file structure from the client is maintained and repeated
> >     here, which allows browsing of files outside of Bacula.
> > (3) Bacula serial file that holds information similar to the Unix
> >     file attribute package.
> >
> > Although it's great to reuse code, I think something like this would
> > be very beneficial to disk-based backups. This would help increase dedup
> > rates and some file systems like btrfs and ZFS may be able to take
> > advantage of linked files (there has been some discussion on the btrfs
> > list about things like this). This would also allow it to reside on
> > any file system, as all the ACLs and related information are serialized
> > in separate files, which keeps unique data out of the blocks of possibly
> > duplicated data. I think we could even reuse a lot of the
> > serialization code, so it would just differ in how it writes the
> > stream of data.
>
> After having thought about this a bit, I believe the idea has
> significant merit.  Tape and disk differ significantly enough that there
> is no conceptual reason not to have separate tape-specific and
> disk-specific SDs.  So long as the storage logically looks the same from
> the point of view of other daemons, the other daemons don't need to know
> that the underlying storage architecture is different.  Creating a
> hierarchical disk SD in this fashion that appears to the rest of Bacula
> exactly the same as the existing SD does, and yet takes advantage of the
> features offered by such an implementation, will not necessarily be a
> trivial problem.  It's a pretty major project and, if approved, wouldn't
> happen right away.
>
> The major problem I see, architecturally speaking, is that this would
> currently break both migration and copy jobs between volumes on the new
> disk-only SD and volumes of any kind on the traditional SD, because
> Bacula does not yet support copy or migration between different SDs.
> At this time, both source and destination devices are required to be on
> the same SD.

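To make the layout Robert proposes in the quoted message more concrete, here
is a minimal sketch of writing one client file into a Pool/JobID tree with a
".bacula" sidecar holding its attributes. It is only an illustration under
stated assumptions: the function name and parameters are hypothetical, and
the sidecar is written as JSON purely for readability, whereas the proposal
calls for Bacula's own serialization format.

```python
import json
import os
import shutil


def backup_file(sd_base: str, pool: str, job_id: str,
                client_root: str, rel_path: str) -> str:
    """Copy one file into <sd_base>/<pool>/<job_id>/<rel_path> plus a sidecar.

    rel_path must be relative to client_root (hypothetical convention).
    """
    src = os.path.join(client_root, rel_path)
    dst = os.path.join(sd_base, pool, job_id, rel_path)
    os.makedirs(os.path.dirname(dst), exist_ok=True)

    # The data file holds only the file's bytes, so identical client files
    # produce identical blocks that a deduplicating file system can share.
    shutil.copyfile(src, dst)

    # Per-file attributes go in a separate sidecar, keeping the unique
    # metadata out of the data blocks. JSON is an assumption made here.
    st = os.stat(src)
    attrs = {
        "mode": st.st_mode,
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "mtime": st.st_mtime,
    }
    with open(dst + ".bacula", "w") as f:
        json.dump(attrs, f)
    return dst
```

Note that nothing in this sketch uses hard links; any sharing of identical
data is left to the underlying file system.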

