Subject: Re: [BackupPC-users] backup of backuppc and schedule, is it archive?
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Fri, 12 Mar 2010 08:15:34 -0600
Sylvain Viart - Gmail wrote:

>>> Backing up some primary backuppc servers on a secondary backuppc.
>>> One of the primaries itself holds 86 hosts.
>>>     
>> Are you trying to back up the primary backuppc servers complete with 
>> history, or do you just need the latest full from each target in this copy?
>>   
> Just the last full.
> 
> Following your advice, I've changed my strategy to provide a quicker way 
> to restore any host's data directly from the secondary.
> By backing up /var/local/machine, I have all the final host data living 
> there on the secondary.

Maybe it would be easier to reverse the concept and do a straight rsync to an 
intermediate disk location at the primary site, keeping only one copy there, 
letting the remote backuppc copy that and keep all the history.  That has the 
downside of having to write your own rsync scripts for the local copy or 
finding something else, but would be a more efficient approach since you don't 
have to copy in and out of backuppc's storage format all the time.
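
The "local copy" script can be about this small -- untested, and the staging 
path and host list here are just placeholders you'd adapt:

    #!/bin/sh
    # Sketch: mirror each target host into a plain on-disk staging tree
    # at the primary site, keeping exactly one current copy per host.
    STAGE=/srv/stage                     # assumed staging area
    for h in web1 web2 db1; do           # replace with your real host list
        # -aH preserves perms/owners/hardlinks; --delete keeps the copy current
        rsync -aH --delete --numeric-ids \
            --exclude /proc --exclude /sys --exclude /dev \
            "root@$h:/" "$STAGE/$h/"
    done

The remote backuppc then backs up $STAGE with rsync and keeps all the history, 
and nothing ever has to be unpacked from backuppc's storage format.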

>>> Strategy:
>>> Using a BackupPC_tarCreate loop on the primary, called from a
>>> DumpPreUserCmd on the secondary backup server.
>>> Then the secondary itself backs up the copy of every extracted host.
>>>     
>> This seems like you are adding a bottleneck compared to just backing up 
>> the targets directly from the primary and secondary backuppc servers.
>>   
> 
> Backing up a final host twice should be avoided, because:
> 
>     * I need to keep the config in sync between the primary and secondary
>       backup servers.
>     * the backup job is too heavy for the final host; I prefer to use
>       the copy which lives on the primary backup.
> 
> What is the bottleneck?

You are making the primary backuppc server copy everything in and out of 
backuppc's storage format - and I was thinking you were feeding the tar 
extract straight to the secondary, which needs bandwidth for the complete 
copy.  If you extract on the primary, at least you can use rsync from the 
secondary.
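
For the extract-on-the-primary variant, the loop could be as small as this -- 
untested, and the install path, share name and staging directory are all 
assumptions for your setup:

    #!/bin/sh
    # Sketch: extract the most recent backup of each host into a plain
    # tree on the primary, which the secondary then pulls with rsync.
    STAGE=/var/local/machine             # staging path from your mail
    for h in web1 web2 db1; do           # replace with your real host list
        mkdir -p "$STAGE/$h"
        # -n -1 selects the most recent backup; -s names the share
        /usr/share/backuppc/bin/BackupPC_tarCreate -h "$h" -n -1 -s / . \
            | tar -xp -C "$STAGE/$h"
    done

Run it as the backuppc user, or as root if you need ownership restored (see 
your permission problem below).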

>> If you hit the hosts directly you could use rsync.  Maybe you could use 
>> the same ssh identity key and just script the config updates to 
>> propagate changes as you add hosts to the primaries.
>>
>>   
> Hum, yes...
> But what about my earlier problem, the blackout-period sync between 
> primary and secondary?
> I don't want a final host to be backed up by both backuppc servers at 
> the same time; IO would be really poor!

That could be as simple as having non-overlapping times that aren't in the 
blackout.  That's assuming that you have time away from the server's peak load 
to complete 2 runs.  I'm used to 'business' type use patterns where you have 
all night.  If your servers have heavy international use that might not be the 
case.
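
Concretely, you'd give the two servers different blackout windows in 
config.pl -- the hours below are made up, and remember a blackout only kicks 
in after $Conf{BlackoutGoodCnt} consecutive good pings:

    # Primary's config.pl: runs may start only after 19:30
    $Conf{BlackoutPeriods} = [
        { hourBegin => 7.0, hourEnd => 19.5, weekDays => [0, 1, 2, 3, 4, 5, 6] },
    ];

    # Secondary's config.pl: runs may start only after 23:00, assuming
    # the primary's runs are finished by then (you have to verify that)
    $Conf{BlackoutPeriods} = [
        { hourBegin => 7.0, hourEnd => 23.0, weekDays => [0, 1, 2, 3, 4, 5, 6] },
    ];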

> I'm still interested in a scheduler simulator! :-)
> Maybe some developer could point me at it in the backuppc code?

Typically the schedule is mostly driven by how long it took the previous hosts 
to complete, because of the concurrency limit.
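
The knobs behind that are just these two in config.pl (shown with what I 
believe are the stock defaults):

    $Conf{WakeupSchedule} = [1..23];  # hours when BackupPC queues up hosts
    $Conf{MaxBackups}     = 4;        # concurrent backups; the rest wait

so a host's actual start time floats with how long everything queued ahead of 
it takes.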

> I can tolerate poor IO on the primary backuppc, which is dedicated to 
> backups.
> Can I use a DumpPreUserCmd, or some other check, to schedule a deferred 
> backup on a given host?

The ping command would be a better place to decide to defer a run.
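
BackupPC pings the host before each run and simply skips it until the next 
wakeup if the ping fails, so a wrapper like this (hypothetical name and lock 
file) gives you a cheap deferral mechanism:

    #!/bin/sh
    # /usr/local/bin/busy-ping: pretend the host is down while the
    # primary is busy, so BackupPC defers and retries at the next wakeup.
    [ -e /var/run/backuppc-extract.lock ] && exit 1
    exec /bin/ping -c 1 "$1"

    # and in config.pl:
    #   $Conf{PingCmd} = '/usr/local/bin/busy-ping $host';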

> For now my script should work, except for a small permission problem during 
> tar extract on the secondary.
> I'm going to perform everything as root, which should keep the users' ids 
> and the original permissions, as backed up from the final host.

If the primary has time and disk space to complete the backups and extracts, it 
looks like it might work.
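
For the permission problem: extracting as root with GNU tar's -p and 
--numeric-owner should do it.  --numeric-owner stops tar from remapping 
uids/gids through the secondary's /etc/passwd, which is the usual cause of 
mangled ownership when the two machines' accounts differ:

    tar -xp --numeric-owner -f host.tar -C /var/local/machine/host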

-- 
   Les Mikesell
    lesmikesell AT gmail DOT com

