BackupPC-users

Re: [BackupPC-users] Why does backuppc transfer files already in the pool

2010-08-28 14:23:04
Subject: Re: [BackupPC-users] Why does backuppc transfer files already in the pool
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: backuppc-users AT lists.sourceforge DOT net
Date: Sat, 28 Aug 2010 13:21:00 -0500
On 8/28/10 10:18 AM, martin f krafft wrote:
> Hello,
>
> Using rsync+ssh as my transfer method, I find that backuppc, when
> backing up a new host, transfers all files, even if specific files
> are already in the pool.
>
> The docs[0] say:
>
>    As BackupPC_tarExtract extracts the files from smbclient or tar,
>    or as rsync or ftp runs, it checks each file in the backup to see
>    if it is identical to an existing file from any previous backup of
>    any PC. It does this without needed to write the file to disk. If
>    the file matches an existing file, a hardlink is created to the
>    existing file in the pool.

That can only happen after the file exists on the server.  I think there is
some special magic to avoid writing tmp files where the whole file fits in
memory, though.

> 0. http://backuppc.sourceforge.net/faq/BackupPC.html#backuppc_operation
>
> This is in contrast to what I am experiencing. Is backuppc fetching
> the file into memory to compare it with the pool from there?

Rsync comparisons work against the same-named file in the previous full run (or
incremental, if you are using incremental levels).  Because of the hardlinks,
this will normally also be the right pool file.
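
To illustrate the hardlink point, here is a minimal Python sketch.  It is not
BackupPC code, and the file names are made up for the example.  It only shows
that a hardlinked "previous backup" entry and a "pool" entry are the same
file on disk, so comparing against one is comparing against the other.

```python
import os
import tempfile

# Hypothetical layout: one "pool" entry and one entry in a previous backup.
root = tempfile.mkdtemp()
pool_file = os.path.join(root, "pool_entry")
prev_backup = os.path.join(root, "prev_full_etc_motd")

with open(pool_file, "wb") as f:
    f.write(b"same content\n")

# The previous backup holds a hardlink to the pool entry, not a copy.
os.link(pool_file, prev_backup)

pool_stat = os.stat(pool_file)
prev_stat = os.stat(prev_backup)

# Both names refer to the same inode, and the link count reflects both names.
same_file = os.path.samefile(pool_file, prev_backup)
link_count = pool_stat.st_nlink
```

Looking the file up by name in the previous backup therefore reaches the
pooled data without any hash lookup at all.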

> Thinking about it, this is what needs to happen because it needs to
> have the file to determine its hash.
>
> But shouldn't it just need to transfer the 1st and 8th 128k chunk to
> determine the hash? Or is that hashing function only used once
> the whole file has been transferred?

No, it compares the whole file, and it has to in order to deal with hash
collisions.
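
A rough Python sketch of why a partial hash alone is not enough.  This is not
BackupPC's implementation: the digest below is a simplified stand-in (length
plus the 1st and 8th 128 KiB chunks, as in the question), and the function
names and the in-memory "pool" dict are invented for the example.

```python
import hashlib

CHUNK = 128 * 1024  # 128 KiB, the chunk size mentioned in the question


def partial_digest(data: bytes) -> str:
    """Hash the length plus the 1st and 8th 128 KiB chunks (simplified)."""
    h = hashlib.md5()
    h.update(str(len(data)).encode())
    h.update(data[:CHUNK])
    h.update(data[7 * CHUNK:8 * CHUNK])  # empty slice for short files
    return h.hexdigest()


def find_in_pool(pool: dict, data: bytes):
    """Return the identical pooled content, or None if nothing truly matches."""
    for candidate in pool.get(partial_digest(data), []):
        if candidate == data:  # full comparison: needs the whole file
            return candidate
    return None


def add_to_pool(pool: dict, data: bytes) -> bool:
    """Store data unless identical content is pooled; True if stored."""
    if find_in_pool(pool, data) is not None:
        return False
    pool.setdefault(partial_digest(data), []).append(data)
    return True


# Two 2 MiB files that differ only in a byte outside the hashed chunks.
a = bytes(2 * 1024 * 1024)
b = bytearray(a)
b[-1] = 1
b = bytes(b)

pool = {}
add_to_pool(pool, a)
same_digest = partial_digest(a) == partial_digest(b)  # digests collide
b_found = find_in_pool(pool, b) is not None           # full compare rejects
```

Here the two files share a partial digest but are not identical, so only the
full comparison keeps the second one from being linked to the wrong pool file.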

Keep in mind that what you want to happen only matters in the unusual case
that an exact copy exists in the pool but not in the previous backup of this
machine.  As soon as the file is copied in the first full for the machine you
are watching, you'll have the hardlink in the right place for the subsequent
rsync runs to find it by name and not copy it again.  In practice this doesn't
matter much.  I suppose it could if you had a low-bandwidth remote connection
to a location where you make the same change to a large file on many machines.

-- 
    Les Mikesell
     lesmikesell AT gmail DOT com


