Re: [BackupPC-users] rsync never starts transferring files (but does something)

2012-11-20 16:08:38
From: John Rouillard <rouilj-backuppc AT renesys DOT com>
To: Bowie Bailey <Bowie_Bailey AT BUC DOT com>
Date: Tue, 20 Nov 2012 21:07:09 +0000
On Tue, Nov 20, 2012 at 03:34:02PM -0500, Bowie Bailey wrote:
> On 11/20/2012 3:13 PM, John Rouillard wrote:
> > On Tue, Nov 20, 2012 at 09:46:33AM -0500, Bowie Bailey wrote:
> >> On 11/19/2012 4:35 PM, John Rouillard wrote:
> >>> What may also work is to use excludes to do your sharding. I have 4
> >>> "hosts" now with different excludes. All of them back up the same share:
> >> That seems a bit overly complex.  Wouldn't it be easier to use includes?
> >>
> >> # include subdirectories starting with a, b, or c case insensitive
> >>      $Conf{BackupFilesOnly} = {
> >>        '/home1' => [ "/[A-Ca-c]*/**" ],
> >>      };
> >>
> >>      # include subdirectories starting with d...m case insensitive
> >>      $Conf{BackupFilesOnly} = {
> >>        '/home1' => [ "/[D-Md-m]*/**" ],
> >>      };
> >>
> >>      # include subdirectories starting with n...z case insensitive
> >>      $Conf{BackupFilesOnly} = {
> >>        '/home1' => [ "/[N-Zn-z]*/**" ],
> >>      };
> >>      # exclude your problem case
> >>      $Conf{BackupFilesExclude} = {
> >>        '/home1' => [ "- /user/**" ],
> >>      };
> > IIRC BackupFilesOnly and BackupFilesExclude interact in very weird
> > ways.  I think you can only choose one method.
> >   
> >>      # back up problem user and other misc directories (non-alphabetic first char)
> >>      $Conf{BackupFilesExclude} = {
> >>        '/home1' => [ "+ /user/**", "- /[A-Za-z]*/**" ],
> >>      };
> >>
> >> This way, it is much more obvious what is being backed up by each host.
> > That is true but.....
> >   
> >> This is off the top of my head and not tested, so it may need to be
> >> tweaked a bit.
> > But what happens when somebody creates a directory starting with '.',
> > '!'  or some unicode character that you didn't put in your include
> > range?
> 
> Then they get backed up by the last host which IS done as an exclude.
> 
> > With exclusion you will get those directories (multiple times but you
> > will get them). With inclusion you must be sure that there is no
> > chance at all of having a directory created that you have not included
> > otherwise you have no backup. There is also no error that you have no
> > backup.  Granted expanding the inclusion to all possible initial
> > characters is possible, but IMO more likely to fail.
> >
> > Also character classes depend on the language settings. While I expect
> > everybody to use LANG=C that is probably a stupid assumption. In
> > Estonian [A-Z] is not the same as it is with LANG=C (see
> > http://en.wikipedia.org/wiki/Estonian_alphabet first listing). While
> > it's unlikely you would trip over it, you still need to realize the
> > issue as it leads to a silent failure to back up data.  So you need to
> > test every possible filename that can possibly be included.
> >
> > If you are using exclusion, you still have the same character class
> > issue, but since you are excluding those files, some host will not
> > have that range/character class excluded and the files will get backed
> > up. So it fails safely - the data is backed up.
> >
> > Hence I claim the testing is easier, you just run 10 or so test cases
> > (control character, punctuation, chars > 128, things beginning with T
> > if you are Estonian ...) that are not in any of the exclusion ranges
> > and verify that it gets backed up.
> 
> The first set of hosts backs up their specified range (a-c,d-m,n-z).  
> The last host backs up everything else (exclude [a-z]).  Any strange 
> characters, punctuation and such will be picked up by the last host.  Since 
> the last host only excludes what was included in the other hosts, I 
> don't see an issue.  Language settings should not matter since any given 
> character will either match [a-zA-Z] or fail to match.  If it matches, 
> it will be picked up by the includes in the first few hosts.  If it 
> doesn't match, it will be picked up by the last host.

You are assuming that [A-Za-z] is the same as [A-Ca-cD-Md-mN-Zn-z].
You are correct AFAIK in the C locale. I don't feel comfortable making
the same claim in any other locale. E.g., there could be a C caret
after C and before D that is included in the exclude list [A-Z] but
not in the other set.

To work around that, this should work:

     # back up problem user and other misc directories (non-alphabetic first char)
     $Conf{BackupFilesExclude} = {
       '/home1' => [ "+ /user/**", "- /[A-Ca-c]*/**",
                     "- /[D-Md-m]*/**", "- /[N-Zn-z]*/**", ],
     };

The reason I could exclude all alphabetics in my 4th host setup was
because it only needed to back up exactly one directory. The other
'hosts' handled the backup of strange directories/files.
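
For comparison, an exclusion-only layout for the three alphabetic
hosts could look roughly like the sketch below. It is untested. The
host names (home1-ac, home1-dm, home1-nz) and the client name
"fileserver" are placeholders I made up, and I am assuming each
fragment lives in that host's own per-host config file with
ClientNameAlias pointing all of them at the same machine. The
patterns are the same ones used above. The separate host for the
problem user is not shown.

```perl
# home1-ac (hypothetical host): back up everything except d...z
$Conf{ClientNameAlias} = 'fileserver';    # placeholder client name
$Conf{BackupFilesExclude} = {
  '/home1' => [ "- /[D-Md-m]*/**", "- /[N-Zn-z]*/**" ],
};

# home1-dm (hypothetical host): back up everything except a...c and n...z
$Conf{ClientNameAlias} = 'fileserver';    # placeholder client name
$Conf{BackupFilesExclude} = {
  '/home1' => [ "- /[A-Ca-c]*/**", "- /[N-Zn-z]*/**" ],
};

# home1-nz (hypothetical host): back up everything except a...m
$Conf{ClientNameAlias} = 'fileserver';    # placeholder client name
$Conf{BackupFilesExclude} = {
  '/home1' => [ "- /[A-Ca-c]*/**", "- /[D-Md-m]*/**" ],
};
```

Each host excludes only the ranges owned by the other hosts, so a
name that matches none of the ranges in the current locale is backed
up by all three hosts rather than by none.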

I agree this method looks promising (except for the mixing of
BackupFilesExclude and BackupFilesOnly).

-- 
                                -- rouilj

John Rouillard       System Administrator
Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111
