Re: [Bacula-users] Include Dir Containing?

2011-03-03 13:43:37
Subject: Re: [Bacula-users] Include Dir Containing?
From: Christian Manal <moenoel AT informatik.uni-bremen DOT de>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 03 Mar 2011 19:40:38 +0100
On 03.03.2011 17:20, Bob Hetzel wrote:
> 
>> From: Christian Manal <moenoel AT informatik.uni-bremen DOT de>
>> Subject:     
>> To: bacula-users AT lists.sourceforge DOT net
>> Message-ID: <4D6CB79B.3070001 AT informatik.uni-bremen DOT de>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> On 26.02.2011 04:52, Dan Langille wrote:
>>>> On 2/25/2011 5:49 AM, Christian Manal wrote:
>>>>>> On 22.02.2011 14:06, Christian Manal wrote:
>>>>>>>> On 22.02.2011 13:45, Marc Schiffbauer wrote:
>>>>>>>>>> * Christian Manal wrote on 22.02.11 at 12:43:
>>>>>>>>>>>> On 22.02.2011 12:26, Phil Stracchino wrote:
>>>>>>>>>>>>>> On 02/22/11 06:07, Christian Manal wrote:
>>>>>>>>>>>>>>>> That's right, but not what I need. I want to include the directory
>>>>>>>>>>>>>>>> containing the specific file, not the file itself like it is shown in
>>>>>>>>>>>>>>>> the examples.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To give an example: By default, nothing under
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     /export/home
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> will be backed up. Now a user "foo" creates a file ".backmeup" in his
>>>>>>>>>>>>>>>> home directory or a subdirectory of it. For example
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     /export/home/foo/important/.backmeup
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The next backup should then include the directory
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     /export/home/foo/important
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There is not, to my knowledge, any built-in functionality to do this at
>>>>>>>>>>>>>> this time.  You'd have to use a script to generate the fileset on demand.
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I feared as much. Thanks for the replies anyway.
>>>>>>>>>>
>>>>>>>>>> Christian,
>>>>>>>>>>
>>>>>>>>>> before Bacula had the "Exclude Dir Containing" feature I used
>>>>>>>>>> something very similar to your example, like this:
>>>>>>>>>>
>>>>>>>>>> File = "\\|sh -c 'for D in /home; do find $D -xdev -name .BACULA_NO_BACKUP \
>>>>>>>>>>          -type f -printf \"%h\\n\"; done | tee /root/bacula_excluded_dirs.log'"
>>>>>>>>>>
>>>>>>>>>> That worked very well over years.
>>>>>>>>>>
>>>>>>>>>> So I think something like that for include should work well for you.
>>>>>>>>>>
>>>>>>>>>> File = "\\|sh -c 'for D in /home; do find $D -xdev -name .BACULA_BACKUP \
>>>>>>>>>>          -type f -printf \"%h\\n\"; done | tee /root/bacula_included_dirs.log'"
>>>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks. That saves me the work of putting a working script together
>>>>>>>> myself :-)
>>>>>>
>>>>>> Hi again,
>>>>>>
>>>>>> this turned out to be not viable for my setup. Running that find command
>>>>>> in a shell on my fileserver takes about 9 hours for a ZFS dataset of
>>>>>> about 620 GiB with far more than 10 million files. And having a 9 hour
>>>>>> overhead in my backups just to create the fileset isn't acceptable.
>>>>>>
>>>>>> So I'm thinking about solving this another way, by letting the users
>>>>>> create their own filesets by putting relative paths into a dotfile in
>>>>>> the root of their homedir.
>>>>>>
>>>>>> To give an example: User "foo" puts a file '.backuprc' in his homedir
>>>>>> containing the following:
>>>>>>
>>>>>>     important/stuff
>>>>>>     other/important/stuff
>>>>>>     mostly/unimportant/stuff/importantfile.txt
>>>>>>     mostly/unimportant/stuff/importantfile2.txt
>>>>>>
>>>>>> which would be collected by something like this:
>>>>>> (a first test run with '-name .bashrc' took only about 10 minutes)
>>>>>>
>>>>>>     find /export/home -maxdepth 2 -type f -name .backuprc \
>>>>>>       -exec sh -c '/path/to/sanitize-paths.pl "$1" < "$1"' sh {} \;
>>>>>>
>>>>>> where 'sanitize-paths.pl' filters empty lines and comments, appends the
>>>>>> relative path to the absolute path of the homedir, makes sure the users
>>>>>> don't pull any shenanigans with this (like putting '../../../' as a
>>>>>> path) and also informs them when they have invalid lines in their list.
>>>>>>
>>>>>> The output would look like this:
>>>>>>
>>>>>>     /export/home/foo/important/stuff
>>>>>>     /export/home/foo/other/important/stuff
>>>>>>     /export/home/foo/mostly/unimportant/stuff/importantfile.txt
>>>>>>     /export/home/foo/mostly/unimportant/stuff/importantfile2.txt
>>>>>>
>>>>>>
>>>>>> But now I'm concerned about the potential size of the fileset. I have
>>>>>> over 3000 homedirs in that filesystem, which could result in some
>>>>>> ten-thousand lines and more. Will Bacula handle that or do I have to
>>>>>> expect performance issues or even crashes?
>>>>
>>>> Try it.  Find out.  Sorry, but I really don't know.  It is easily tested
>>>> though.  Without involving your users.  Create a test case.
>>>>
>> I just did that. I made a test setup in VirtualBox, created 3000 "home
>> directories" and filled them randomly with some dirs and files (around
>> 140,000 files total). The script-generated fileset had about 18,000
>> lines. I noticed no real problems during my test runs.
>>
>>
>> Regards,
>> Christian Manal
>>
>>
>>
> 
> Here's one issue you're going to run into if you do it by "including" 
> rather than excluding:
> First, either way you're going to have to use "ignore fileset changes = yes".
> If you don't do that you'll find that you get a full backup every single
> time.  Usually that's not what is wanted.

I am aware of that.


> Now suppose you do a full backup on March 1.  On March 10, a user then 
> decides some old files that weren't being backed up before should be backed 
> up.  The files are all timestamped prior to March 1.  The end result is 
> that they won't get backed up until the next full backup.  If you do full 
> backups frequently you might not care much about it but you should be aware 
> of it.

Does it really work that way? I would have thought Bacula actually
notices if the fileset changes and backs up new files and directories
regardless of their time of last change.

If that's not the case, I have to include this tidbit in the end-user
documentation so they know to "touch" older files they add to the fileset.
Thanks for that info.
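
As a rough illustration of that "touch" step, here is a minimal sketch. It
assumes incrementals select files by timestamp, as described above.
'mark_for_backup' is a made-up helper name for this example, not something
Bacula ships. Note that it overwrites the original modification times.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bacula): refresh the timestamps of a path
# and, if it is a directory, of every file and directory beneath it, so that
# a timestamp-based incremental sees them as changed.
# Usage: mark_for_backup /export/home/foo/important
mark_for_backup() {
    target=$1
    if [ ! -e "$target" ]; then
        echo "mark_for_backup: no such file or directory: $target" >&2
        return 1
    fi
    # -xdev stays on one filesystem; symlinks are skipped by the type filter;
    # touch -c never creates files that do not exist.
    find "$target" -xdev \( -type f -o -type d \) -exec touch -c {} +
}
```

Touching only the directory itself would not be enough, since the files
inside keep their old timestamps; that is why the sketch walks the tree.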


> If you go about it the other way--backing up everything by default and only 
> skipping things somebody thought at least a little about--you're going to be 
> less likely to skip important stuff.

Well, by letting the users decide what to back up, I am no longer
responsible if something important is not included, as long as
everything is well documented.
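
For the documentation side, the filtering of a per-user '.backuprc' described
earlier in the thread could be sketched roughly as below. This is not the
actual 'sanitize-paths.pl' (its source is not shown here); the function name
and the exact rejection rules are assumptions made for illustration. It does
not check whether the paths exist or whether symlinks lead outside the homedir.

```shell
#!/bin/sh
# Sketch only: a hypothetical stand-in for the sanitizing script.
# Usage: sanitize_backuprc /export/home/foo/.backuprc
# Prints absolute paths on stdout and complaints about bad lines on stderr.
sanitize_backuprc() {
    rcfile=$1
    homedir=$(dirname "$rcfile")
    lineno=0
    while IFS= read -r line || [ -n "$line" ]; do
        lineno=$((lineno + 1))
        # Strip leading and trailing whitespace.
        line=$(printf '%s\n' "$line" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
        case $line in
            ''|'#'*) continue ;;    # skip empty lines and comments
        esac
        case /$line/ in
            //*|*/../*)
                # Reject absolute paths and any '..' path component.
                echo "$rcfile:$lineno: invalid path: $line" >&2
                continue ;;
        esac
        # Prepend the homedir to the relative path.
        printf '%s/%s\n' "$homedir" "$line"
    done < "$rcfile"
}
```

In a real setup the stderr messages would presumably be mailed to the user
rather than just printed.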


> Either way you do this though, by excluding lots of stuff dynamically, or 
> including lots of stuff dynamically, you're going to find at some point in 
> the future that you missed something and data will be lost forever due to 
> user error.  In many environments, that's not worth the risk.  YMMV.

The terms of service our users accept when they receive their account
state that they are responsible for their own backup. We backed up
everything anyway since we could afford it, but it is starting to strain
our capacities. So we are just removing that "safety net" while still
providing resources the users are not actually entitled to, hoping that
the backups will go down to a manageable size.


Regards,
Christian Manal
