BackupPC-users

[BackupPC-users] Grouping by network connectivity (aka replacement queue mechanism)

2009-03-19 17:29:48
From: John Rouillard <rouilj-backuppc AT renesys DOT com>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Thu, 19 Mar 2009 21:23:00 +0000
This was originally part of:

 Subject:  Re: [BackupPC-users] FS and backuppc performance
 In-Reply-To: <49C29C96.4030506 AT gmail DOT com>

I am starting a new thread on this rather than hijacking the original
thread.

On Thu, Mar 19, 2009 at 02:27:18PM -0500, Les Mikesell wrote:
> Carl Wilhelm Soderstrom wrote:
> > 
> > Backuppc will use all the processor, ram, and disk speed you give it. I've
> > not had a box where they weren't all pegged. I tend to limit concurrent
> > backups to 2; maybe 3 or 4 on a really high-end box (multiple processors
> > and a proven fast disk array); to control disk-head thrashing.
> 
> One thing I think is missing from backuppc that amanda has had for years 
> is a concept of grouping (or excluding...) by network connectivity.  I 
> have a mix of local and remote targets and would like to be able to 
> control concurrency to permit 1 or 2 local backups plus separate limits 
> for each independent WAN path.

I would like this too. I currently use semaphore (a shell script, see
the note at the end) to create a set of available slots, and lock a
slot for the duration of the backup using the pre/post dump commands.

Most of our hosts are named after the site they are at:

  box1.site1.example.com
  box1.site2.example.com
  box2.site3.example.com

etc. With semaphore I create one resource pool for each remote site
based on how many parallel backups I am willing to allow from that
site:

  Semaphore site1 has 20 resources.
    Resource 0 is available.
    Resource 1 is available.
    Resource 2 is available.
    Resource 3 is available.
    Resource 4 is available.
    Resource 5 is available.
    Resource 6 is available.
    Resource 7 is available.
    Resource 8 is available.
    Resource 9 is available.
    Resource 10 is available.
    Resource 11 is available.
    Resource 12 is available.
    Resource 13 is available.
    Resource 14 is available.
    Resource 15 is available.
    Resource 16 is available.
    Resource 17 is available.
    Resource 18 is available.
    Resource 19 is available.

  Semaphore site2 has 2 resources.
    Resource 0 is available.
    Resource 1 is taken by PID 29224.

  Semaphore site3 has 2 resources.
    Resource 0 is available.
    Resource 1 is available.
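
A minimal sketch of how a hook script could pick the pool for a host,
assuming the site is always the second label of the host name as in
the examples above. The function name site_for_host and the POOL_SIZE
table are hypothetical; they are not part of CheckQueue or semaphore.

```python
def site_for_host(host: str) -> str:
    """Return the site label, assumed to be the second DNS label."""
    labels = host.split(".")
    if len(labels) < 3:
        raise ValueError(f"cannot derive a site from {host!r}")
    return labels[1]


# Pool sizes copied from the semaphore listing above.
POOL_SIZE = {"site1": 20, "site2": 2, "site3": 2}

for h in ("box1.site1.example.com", "box2.site3.example.com"):
    site = site_for_host(h)
    print(h, "->", site, "pool of", POOL_SIZE[site])
```

Hosts that do not follow the naming scheme raise an error here; a real
script would more likely fall back to a default pool.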

Using some home-grown scripts (runUserCmds, CheckQueue), I set:

  $Conf{DumpPreUserCmd}     = '/etc/BackupPC/bin/runUserCmds -t $type'
    . ' -c $client -H $host -P $cmdType CheckQueue';
  $Conf{DumpPostUserCmd}    = '/etc/BackupPC/bin/runUserCmds -t $type'
    . ' -c $client -H $host -P $cmdType CheckQueue';

which locks one of the available resources in the site's pool when it
is run as the DumpPreUserCmd and unlocks it when it is run as the
DumpPostUserCmd. If it can't lock a resource, it exits with exit code
1, and because:

  $Conf{UserCmdCheckStatus} = 1;

is enabled in the config, the host is skipped for that cycle.
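
The lock/unlock behaviour can be sketched as follows. This is not the
CheckQueue script and does not use the ksh semaphore tool; it swaps in
plain lock files created with O_EXCL, and it records the host name as
the slot owner (an assumption, since the pre and post commands run as
separate processes). The names acquire, release and hook are
hypothetical.

```python
import os
import tempfile


def acquire(pool_dir, size, owner):
    """Claim the first free slot; return its index, or None if all are taken."""
    os.makedirs(pool_dir, exist_ok=True)
    for slot in range(size):
        path = os.path.join(pool_dir, f"slot{slot}")
        try:
            # O_EXCL makes creation atomic: only one process can win a slot.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue
        with os.fdopen(fd, "w") as f:
            f.write(owner)
        return slot
    return None


def release(pool_dir, size, owner):
    """Free the slot recorded for owner; return True if one was found."""
    for slot in range(size):
        path = os.path.join(pool_dir, f"slot{slot}")
        try:
            with open(path) as f:
                held_by = f.read()
        except FileNotFoundError:
            continue
        if held_by == owner:
            os.remove(path)
            return True
    return False


def hook(cmd_type, host, pool_dir, size):
    """Return the exit code the hook should hand back to BackupPC."""
    if cmd_type == "DumpPreUserCmd":
        # Exit code 1 makes BackupPC skip the host when
        # $Conf{UserCmdCheckStatus} is enabled.
        return 0 if acquire(pool_dir, size, host) is not None else 1
    if cmd_type == "DumpPostUserCmd":
        release(pool_dir, size, host)
    return 0


demo_pool = os.path.join(tempfile.mkdtemp(), "site2")
for name in ("box1", "box2", "box3"):
    print(name, "pre ->", hook("DumpPreUserCmd", name, demo_pool, 2))
```

With a pool of 2, the third host gets exit code 1 and would be skipped
until one of the first two runs its post command.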

So it is doable in BackupPC without any core changes and the upside of
this is that you can group by factors other than remote site. The
downside is that the log file shows:

  2009-03-02 08:50:07 DumpPreUserCmd returned error status 256... exiting

every time the host is scheduled to be backed up but is unable to
reserve a slot (status 256 is the raw wait status, i.e. exit code 1
shifted left by 8 bits). Also you can have backups fail when they are
starved for resources.

For example, one thing I have to watch is bandwidth usage. My plan for
handling that is to allocate bandwidth in 64KB/s (512Kb/s) chunks, and
use the CheckQueue script to determine what the bw limit is for the
given host (by scanning /etc/BackupPC/pc/hostname.pl or config.pl).
Then I just reserve the number of chunks needed to cover that
bandwidth.

So if I have a site that is bw limited to 2Mb/s (approx 4 chunks), I
will allocate 4 resources in the pool for the site.

If one of the hosts (one_mb) at that site has a bwlimit of 1Mb/s, then
it won't run unless there are at least 2 free resources. So it can
only start when no more than 2 512Kb/s hosts are running.
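
The chunk arithmetic above, as a small sketch. It assumes 1Mb/s is
treated as 1024Kb/s and that limits are rounded up to whole chunks;
chunks_needed and can_start are hypothetical helper names.

```python
CHUNK_KBITS = 512  # one chunk = 512 Kb/s = 64 KB/s


def chunks_needed(bwlimit_kbits: int) -> int:
    """Round a bandwidth limit up to a whole number of chunks."""
    return -(-bwlimit_kbits // CHUNK_KBITS)  # ceiling division


def can_start(bwlimit_kbits: int, free_chunks: int) -> bool:
    """A host may start only if its whole reservation fits."""
    return chunks_needed(bwlimit_kbits) <= free_chunks


site_pool = chunks_needed(2 * 1024)  # 2 Mb/s site
one_mb = chunks_needed(1024)         # 1 Mb/s host
print("site pool:", site_pool, "chunks; one_mb needs:", one_mb)

# With three 512 Kb/s hosts running, only one chunk is free.
print("one_mb can start:", can_start(1024, site_pool - 3))
```

This is also where the starvation comes from: as long as 512Kb/s hosts
keep at least three chunks busy, one_mb never finds two free at once.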

Semaphore does support fair queuing, where nothing queued after one_mb
will run until one_mb has run. This guarantees that one_mb will get
run at some point. However, this doesn't work with BackupPC's queuing
mechanism.  With

  $Conf{MaxBackups} = 8;

to keep the load on the system reasonable, any backup that has been
started and is queued waiting for a resource uses one of these 8
slots. So I could have 7 jobs waiting on a resource for site2 even
though site1 and site3 have plenty of resources available. The only
way I can see around this is to set:

  $Conf{MaxBackups} = 10000;

or some such number, and have an additional queue:

  Semaphore actual_running_backups has 8 resources.
    Resource 0 is available.
    Resource 1 is taken by PID 29224.
    Resource 2 is available.
    Resource 3 is available.
    Resource 4 is available.
    Resource 5 is available.
    Resource 6 is available.
    Resource 7 is available.

so basically BackupPC will queue a host that needs a backup, and the
control of how many are actually running is totally external to
BackupPC. I haven't tried this yet, but I think it will work.
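
To illustrate the idea (not to show that it works with BackupPC, which
is untested), here is a small simulation. It assumes each job waits
for its site slot first and only then for one of the 8 global slots,
and it uses Python threads and threading.Semaphore in place of the ksh
semaphore tool. The job mix and the sleep standing in for a dump are
made up.

```python
import threading
import time

SITE_SLOTS = {"site1": 20, "site2": 2, "site3": 2}
GLOBAL_SLOTS = 8  # plays the role of actual_running_backups

site_sem = {s: threading.Semaphore(n) for s, n in SITE_SLOTS.items()}
global_sem = threading.Semaphore(GLOBAL_SLOTS)

lock = threading.Lock()
running = {"total": 0, **{s: 0 for s in SITE_SLOTS}}
peak = dict(running)


def backup(site):
    # Wait for the site slot first, so a job stuck behind a busy site
    # does not hold a global slot while it waits.
    with site_sem[site]:
        with global_sem:
            with lock:
                for key in ("total", site):
                    running[key] += 1
                    peak[key] = max(peak[key], running[key])
            time.sleep(0.01)  # stand-in for the actual dump
            with lock:
                for key in ("total", site):
                    running[key] -= 1


jobs = [threading.Thread(target=backup, args=(site,))
        for site in ["site2"] * 7 + ["site1"] * 10 + ["site3"] * 3]
for t in jobs:
    t.start()
for t in jobs:
    t.join()
print("peak concurrency:", peak)
```

The seven site2 jobs queue on the site2 pool without taking global
slots, so site1 and site3 jobs can still run, while the total never
exceeds 8.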

(BTW, semaphore is a ksh implementation of semaphores written by John
Spurgeon under the GPL. I have made a fixed copy available at:
http://www.cs.umb.edu/~rouilj/shell_semaphore as the original magazine
it was published in no longer has it available.)

-- 
                                -- rouilj

John Rouillard
System Administrator
Renesys Corporation
603-244-9084 (cell)
603-643-9300 x 111
