BackupPC-users

Re: [BackupPC-users] Noted Observations & Complaints Using BackupPC for 5 months

2010-04-22 11:57:59
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: backuppc-users AT lists.sourceforge DOT net
Date: Thu, 22 Apr 2010 10:55:51 -0500
On 4/22/2010 5:53 AM, Saturn2888 wrote:
>
> I am listing the observations and complaints of mine after using BackupPC for 
> 5 months.

First, I'd recommend joining the backuppc email list instead of posting 
on backupcentral's forum because it works better for the people likely 
to answer questions.

> My system is running Ubuntu Hardy and has been kept up with the latest 
> updates and patches. I upgraded from 3.0.0 to 3.1.0 within my first few 
> months and have kept this computer solely a BackupPC machine. I've since 
> gone through a few hard drive configurations which I will list here:
> - 3x640GB in hardware RAID3 using an IDE RAID card with SATA ports

RAID3 is an unusual choice.  Any reason for using it?

> - 3x250GB in Linux software RAID5 using mdadm and onboard SATA ports

RAID5 is a bad idea for write speed, since each small write also has to 
update parity.

> - 1x2TB using onboard SATA ports

> All of those configurations have assured me it is not the speed of the drives 
> themselves which have been giving me any issues.

How much RAM do you have?  Filesystem buffering helps a lot with 
apparent disk speed.

> ./01\.
> From here, I noticed the load of my computer consistently at or over 4, and I 
> don't think I've ever seen the processor usage go under 100% unless BackupPC 
> isn't doing anything.

Most ways of showing Linux processor usage include iowait time, and the 
most likely culprit is a process waiting for a disk head to move most of 
the time.  You could verify that with iostat.  Samba fulls will xfer the 
whole content, then replace anything that already existed in the pool 
with hardlinks.
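
If iostat is not installed, the same check can be sketched by hand from 
two readings of the aggregate "cpu" line in /proc/stat.  This is only an 
illustration of the iowait idea above, not anything BackupPC provides; 
the function names and the sample counter values are made up.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into jiffy counters."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_share(before, after):
    """Fraction of elapsed CPU time spent in iowait between two samples."""
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    # Fields: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already counted in user/nice, so only sum the first 8.
    deltas = [y - x for x, y in zip(a[:8], b[:8])]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return deltas[4] / total  # index 4 is iowait


# Hypothetical samples taken a few seconds apart during a backup.
BEFORE = "cpu  100 0 50 800 50 0 0 0"
AFTER = "cpu  150 0 80 900 270 0 0 0"

print("iowait share: {:.0%}".format(iowait_share(BEFORE, AFTER)))
```

On a real server you would read the first line of /proc/stat twice with 
a short sleep in between.  A large iowait share alongside a high load 
average points at the disks rather than the processor.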

> ./02\.
> I later added in a few always-on Linux machines configured with Rsyncd.

Rsync/rsyncd transfer the entire directory tree listing before starting 
to copy data.  So having a huge number of files can affect performance, 
especially if the backuppc server does not have enough RAM to hold the 
listings for the concurrently running backups.  After the listing, it 
only transfers changes, so other than for the first run the xfer speed 
statistics don't mean a lot.  Just look at the elapsed time.

> ./03\.
> I eventually ended up with 18 machines configured in BackupPC with more being 
> added as I configure them. In those 18 are a few websites of mine for which 
> the servers are remote. Those websites I am backing up using the Rsync 
> configuration. The Rsync configuration with these websites works exactly as 
> it should. One of the sites had 11GB of data and while that took forever to 
> download on my connection, incremental updates are speedy.
>
> The full updates are speedy too. Backup number 61 on this website took only 
> 6.4 minutes for 2.5GB of files. This is because those 2.5GB were already 
> backed up in the last full and therefore no longer needed to be redownloaded. 
> It is my understanding that this is how BackupPC is supposed to work. While 
> the very first full took around 160 minutes, the second was quite a huge 
> difference, only taking a bit more than three times as long as the previous 
> incrementals.

With rsync/rsyncd, only changes are copied, so directory trees that 
aren't huge and whose files have few changes will go quickly.  
Incremental runs skip over files with matching timestamps and lengths; 
fulls do a block checksum comparison, which limits them to the disk 
read speed at best but won't transfer much more data.
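
The difference between the two checks can be sketched roughly like 
this.  It is a simplified illustration, not BackupPC's or rsync's actual 
code: real rsync uses a rolling checksum so blocks can match at any 
offset, while this sketch only compares aligned blocks, and the block 
size, hash choice, and function names are assumptions for the example.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose so the example is easy to follow


def incremental_would_skip(client_meta, server_meta):
    """Incremental-style check: skip the file if mtime and size match."""
    return (client_meta["mtime"] == server_meta["mtime"]
            and client_meta["size"] == server_meta["size"])


def block_digests(data):
    """Digest of each fixed-size block; both sides must read the whole file."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]


def blocks_to_send(client_data, server_data):
    """Full-style check: indexes of client blocks the server lacks."""
    server = block_digests(server_data)
    client = block_digests(client_data)
    return [i for i, digest in enumerate(client)
            if i >= len(server) or digest != server[i]]


same_meta = {"mtime": 1271951751, "size": 12}
print(incremental_would_skip(same_meta, dict(same_meta)))
print(blocks_to_send(b"aaaabbbbcccc", b"aaaaXXXXcccc"))
```

The metadata check never opens the file, which is why incrementals are 
cheap.  The block comparison has to read every byte on both ends even 
when only one block ends up being sent.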

>
> ./04\.
> After some time, I noticed my Samba backups were missing files.

These should be noted in the XferLOG along with the reason, which is 
probably windows permissions or file locking.

> ./05\.
>
> I noticed each incremental seemed to take longer than the one before 
> it and that the two Linux boxes began to take longer and longer to
> backup as well.

Incrementals are based on the last full - so they should be expected to 
take longer each time until you do a full.
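
A toy model of why that happens, assuming each incremental is taken 
relative to the last full as described above.  The file names and the 
function are invented for illustration.

```python
def incremental_contents(daily_changes):
    """Return what each successive incremental has to pick up.

    daily_changes is a list of sets, one per day since the last full,
    naming the files that changed on that day.
    """
    changed_since_full = set()
    result = []
    for changed_today in daily_changes:
        # Everything changed since the full is picked up again,
        # not just what changed since yesterday's incremental.
        changed_since_full |= set(changed_today)
        result.append(sorted(changed_since_full))
    return result


days = [{"mail.mbox"}, {"app.log"}, {"mail.mbox", "db.dump"}]
for number, files in enumerate(incremental_contents(days), start=1):
    print("incremental", number, "covers", len(files), "file(s)")
```

The count only goes back down after the next full resets the baseline.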

> ./06\.
> The logs seem to suck really bad. I can't seem to find out which new files 
> are being created

You must be looking at the wrong log.  Look at the XferLOG in the table 
under the backup summary on the host page.  And the 'Error' link there 
should have those missing samba files listed, although if you hit an 
unreadable directory you won't see the files missed under it.

> What's worse, no matter how much more stuff I put in my excludes, more data 
> seems to be downloaded.

Perhaps your syntax is wrong.  Do you still see the excluded files 
appearing in the backups?  If you post a specific example to the mail 
list, someone who understands it better than I do will probably answer 
(include the xfer method involved).

> ...it was at a point where there was at least 1GB of growth a night from when 
> it was a 160GB pool to a 210GB pool when it began jumping half-dozens of 
> gigabytes during the period of a week.

One possible problem is large files that change daily, even if the 
changes are small (databases, growing logs, unix mailbox format files, 
tar files you are updating elsewhere, etc.).  If the content does not 
match exactly, it will become a new pool entry.  There's not much you 
can do about this.
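
A rough sketch of the effect, using a whole-content hash as a stand-in 
for the pool lookup.  BackupPC's real pool hashing differs in detail, 
and the Pool class here is hypothetical, but the consequence is the 
same: identical content is stored once, and any change makes a new 
entry.

```python
import hashlib


class Pool:
    """Toy content-addressed pool: one stored copy per distinct content."""

    def __init__(self):
        self.entries = {}  # digest -> content

    def store(self, content):
        """Store content; return True if it needed a new pool entry."""
        digest = hashlib.sha256(content).hexdigest()
        is_new = digest not in self.entries
        if is_new:
            self.entries[digest] = content
        return is_new

    def size(self):
        """Total bytes held by the pool."""
        return sum(len(c) for c in self.entries.values())


pool = Pool()
log_day1 = b"x" * 1000
log_day2 = log_day1 + b"y"      # one extra byte appended overnight

print(pool.store(log_day1))     # first copy
print(pool.store(log_day1))     # unchanged file from another backup
print(pool.store(log_day2))     # slightly grown file
print(pool.size())
```

A 1-byte change to a 1000-byte file roughly doubles what the pool holds 
for it, which is how a few large, slowly changing files can add 
gigabytes a night.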

> I wish I could figure this out. Everything I'm seeing is attributed to 
> Rsyncd. Samba was so fast and Rsync seems to work just fine but Rsyncd is 
> entirely botched.

In theory, rsync has more overhead than rsyncd because it is the same 
thing plus ssh as the transport.  So, if rsyncd is slower there is 
something wrong with your particular implementation.  Or it is just a 
coincidence that the rsyncd instances are processing much larger 
directory trees.

> All of my machines show a high amount of CPU usage whenever BackupPC 
> begins an Rsyncd backup anywhere from 33% on my most-powerful rig (a 
> quad-core) to 95-100% on my netbook.

That's partly from being implemented in Perl and partly from counting 
iowait as CPU.  Do you have the --checksum-seed=32761 option enabled for 
rsync?

> Tonight, to do a final test, I've set 8 backups to go in tandem.

It is rarely productive to run more than 2 concurrently on a fast LAN 
unless you have a well tuned many-disk RAID array.  You are just making 
the disk seek bottleneck worse.

-- 
   Les Mikesell
    lesmikesell AT gmail DOT com
