BackupPC-users

Re: [BackupPC-users] hardware and configuration recommendations for speed?

2010-05-25 17:18:55
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: backuppc-users AT lists.sourceforge DOT net
Date: Tue, 25 May 2010 16:17:10 -0500
On 5/25/2010 1:59 PM, Frank J. Gómez wrote:
>
> It is definitely not the initial run.  The problem I've had with this
> particular client is that any full backup takes ages; the incremental
> backups work smoothly until the next full backup is scheduled.  If the
> laptop stays in the office long enough for the next full backup to
> complete, then incrementals work fine; otherwise, the client seems to
> get stuck (for months, sometimes) trying to complete the partial.

That could be a local problem on the target.  The difference is that
incremental runs only look at the directory entries and skip any file
whose timestamp and length match the copy in the previous full (or the
previous incremental, if you are using levels), while full runs do a
complete read and block checksum compare.  Maybe the machine is just
really slow at disk access or is doing retries to recover from errors
in those files.
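
To make the difference concrete, here is a rough Python sketch of the two
kinds of check.  It is not BackupPC's or rsync's actual code: the Entry
record, needs_transfer() and file_digest() are hypothetical names, and a
whole-file MD5 stands in for rsync's per-block checksums.

```python
import hashlib
import os
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """What the previous backup recorded for one file (hypothetical layout)."""
    size: int
    mtime_ns: int
    digest: Optional[str] = None


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Read the whole file and return its MD5 hex digest."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_transfer(path: str, previous: Optional[Entry], full: bool) -> bool:
    """Decide whether a file has to be sent again."""
    if previous is None:
        return True  # never backed up before
    st = os.stat(path)
    if not full:
        # Incremental: metadata only, the file contents are never read.
        return (st.st_size, st.st_mtime_ns) != (previous.size, previous.mtime_ns)
    # Full: every file is read from disk and compared by content.
    return file_digest(path) != previous.digest
```

The point is the cost on the client: the incremental branch is one stat()
per file, while the full branch reads every byte, so a slow or failing
disk only hurts during fulls.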

> I don't think I have checksum caching enabled; I'll try that.

Checksum caching saves the uncompress and checksum computation on the
server side.  The remote side still reads everything.  The other common
problem with rsync is that the file list for the entire tree is read
and held in memory at both ends as they walk through and compare
contents, so you can have problems with many files and a small amount
of RAM.
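
If you want a feel for whether the file list could be a problem, a small
Python sketch like this counts the entries under a tree and multiplies by
a per-entry cost.  The 100 bytes per entry default is only an assumed
placeholder, not a measured figure for any particular rsync version, and
both function names are made up for illustration.

```python
import os


def count_entries(root: str) -> int:
    """Count every file and directory below root (root itself excluded)."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total


def estimated_file_list_bytes(entries: int, bytes_per_entry: int = 100) -> int:
    """Rough memory estimate; bytes_per_entry is an assumption to adjust."""
    return entries * bytes_per_entry
```

Run count_entries() on the share being backed up and compare the estimate
with the free RAM on both the client and the server.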

> Question (perhaps an obvious one): What is the benefit of doing periodic
> full backups?  What would be the downside of getting the full backup
> once and then doing incrementals going forward?

Fulls rebuild the comparison tree.  If you don't use incremental levels,
each incremental is done against the previous full, transferring more
each time.  If you do use incremental levels, the server has to traverse
multiple directories to merge the contents for comparison, which also
becomes progressively slower.  I've always thought it would be useful to
be able to decouple the --ignore-times option that is added to full
rsync runs (hardcoded in Rsync.pm) from the tree rebuilds that are a
side effect of fulls.
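
The merge cost with incremental levels can be pictured with a small
Python sketch.  This is only an illustration of the idea, not BackupPC's
on-disk format or code: each backup is modelled as a dict from path to
attributes, None marks a deleted file, and merged_view() is a
hypothetical name.

```python
from typing import Dict, List, Optional, Tuple

# path -> (size, mtime); None means "deleted in this backup"
Snapshot = Dict[str, Optional[Tuple[int, int]]]


def merged_view(full: Snapshot, incrementals: List[Snapshot]) -> Snapshot:
    """Overlay each incremental, oldest first, on top of the full."""
    view: Snapshot = dict(full)
    for inc in incrementals:
        for path, attrs in inc.items():
            if attrs is None:
                view.pop(path, None)  # file was removed at this level
            else:
                view[path] = attrs    # newer copy replaces the older one
    return view
```

Every extra level adds another pass over another tree before the
comparison can start, which is why a fresh full eventually pays off.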

>     Considering how cheap disks are these days, I like simple raid1 mirrors
>     where they are practical for the total size you need.
>
>
> Could I gain any speed advantages by writing to more than one disk
> (striping, I guess they call it)?  It seems like more disk heads might
> be able to do the job faster.

Maybe, but probably not much unless your files are typically very large 
so the writes span the striped cylinders.  With smaller files the time 
is mostly taken by seeks to the directory/inode/data locations anyway. 
Spending the money on more RAM to increase the disk cache would probably 
be more effective.
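
A back-of-the-envelope model in Python shows why.  All of the numbers
below are assumptions picked for illustration (10 ms per seek, 3 seeks
per file, 80 MB/s per disk), and the model assumes striping speeds up
only the data transfer, not the seeks.

```python
def write_time_seconds(
    file_bytes: int,
    seeks_per_file: int = 3,       # assumed: directory, inode, data
    seek_ms: float = 10.0,         # assumed average seek time
    throughput_mb_s: float = 80.0, # assumed per-disk streaming rate
    stripes: int = 1,
) -> float:
    """Seek time plus transfer time for one file under this simple model."""
    seek_time = seeks_per_file * seek_ms / 1000.0
    transfer_time = file_bytes / (throughput_mb_s * 1_000_000 * stripes)
    return seek_time + transfer_time


def striping_speedup(file_bytes: int, stripes: int = 2) -> float:
    """How many times faster the striped write is than a single disk."""
    return write_time_seconds(file_bytes) / write_time_seconds(
        file_bytes, stripes=stripes
    )
```

With these assumptions a 20 KB file gains well under 1% from a second
stripe because the seeks dominate, while a 1 GB file comes close to the
full 2x.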

-- 
   Les Mikesell
    lesmikesell AT gmail DOT com

_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/