Jeffrey J. Kosowsky wrote at about 10:48:08 -0500 on Sunday, January 23, 2011:
> Looking at MakeFileLink, I realized that it compares the full file
> length including the first byte header and the potential rsync digest
> trailer.
>
> So, if one starts using rsync and then switches to another transport
> method, won't you get duplicate pool files since new comparisons will
> compare a (new) compressed file without rsync digest to an old pool
> file with rsync digest? (these files will have the same partial file
> md5sum names in the cpool but will have different suffixes)
>
> This could be a common use case if for example one is backing up mixed
> Linux and Windows machines with common files using rsync, and smb
> transport, respectively.
>
> Unless I am missing something, this seems like a potentially major
> source of pool data duplication.
>
> In my jLib.pm, I have written a function zcompare2 that only compares
> the compressed zlib data between the first byte header and the
> potential rsync digest trailer. Since the zlib data envelope never
> changes this gives pool matches even if one file is straight zlib
> (first byte =0x78) and the other is rsync digest (first byte = 0xd6 or
> 0xd7).
>
> I have also, btw, written a slightly streamlined version of compare
> (which I call jcompare) that strips out some unnecessary code for
> binary files and also works better with weird filenames.
>
> I have combined these changes plus the ones mentioned in my earlier
> thread on MakeFileLink to create a new jMakeFileLink function that
> uses jcompare for non-compressed (pool) files and zcompare2 for
> compressed (cpool) files.
>
>
>
Here is my version of MakeFileLink, which:
1. Includes the code efficiency changes outlined in the other thread
2. Uses jcompare or zcompare2 depending on whether the file is compressed
3. Renames the new file before unlinking it, so the change can be undone
   if the link fails

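use File::Path qw(mkpath);
use File::Temp qw(mktemp);

sub jMakeFileLink2
{
    my($bpc, $name, $d, $newFile, $compress) = @_;
    my($i, $poolBase, $rawFile);

    return -1 unless -f $name;
    # Look up the pool path once; only the _$i suffix changes per iteration.
    return -2 unless defined($poolBase = $bpc->MD52Path($d, $compress));
    my $compare = $compress > 0 ? \&zcompare2 : \&jcompare;
    for ( $i = -1 ; ; $i++ ) {
        # Build each candidate from the base path so that suffixes do not
        # accumulate (base, base_0, base_1, ... rather than base_0_1).
        $rawFile = $i >= 0 ? "${poolBase}_$i" : $poolBase;
        if ( -f $rawFile ) {
            if ( (stat(_))[3] < $bpc->{Conf}{HardLinkMax}
                    && !$compare->($name, $rawFile) ) {
                # Same contents: move the new file aside first so it can
                # be restored if the hard link fails.
                my $tempname = mktemp("$name.XXXXXXXXXXXXXXXX");
                return -5 unless rename($name, $tempname);
                unless ( link($rawFile, $name) ) {
                    rename($tempname, $name);    # Restore
                    return -3;
                }
                unlink($tempname);
                return 1;
            }
        } elsif ( $newFile && (stat($name))[3] == 1 ) {
            # No pool file at this slot: add the new file to the pool.
            $rawFile =~ m{(.*)/};
            mkpath($1, 0, 0777) unless -d $1;
            return -4 unless link($name, $rawFile);
            return 2;
        } else {
            return 0;
        }
    }
}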
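
For anyone without jLib.pm at hand, the idea behind zcompare2 (compare only the bytes between the first byte header and any rsync digest trailer) can be sketched roughly as below. This is a minimal illustration, not the zcompare2 from jLib.pm: payload_compare is a hypothetical name, and it assumes the caller already knows each file's trailer length in bytes (0 for a plain zlib file). Working out that length from the file itself is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the zcompare2 from jLib.pm.
# Returns 0 if the payloads match, 1 if they differ, -1 on I/O error
# (0 == match, to fit the !$compare->(...) test in jMakeFileLink2).
# $trailerA/$trailerB are assumed trailer lengths supplied by the caller.
sub payload_compare
{
    my($fileA, $trailerA, $fileB, $trailerB) = @_;

    return -1 unless -f $fileA && -f $fileB;

    # Payload length = file size - 1 header byte - trailer length.
    my $lenA = (-s $fileA) - 1 - $trailerA;
    my $lenB = (-s $fileB) - 1 - $trailerB;
    return 1 if $lenA != $lenB || $lenA < 0;

    open(my $fhA, '<:raw', $fileA) or return -1;
    open(my $fhB, '<:raw', $fileB) or return -1;
    seek($fhA, 1, 0);    # skip the first byte header
    seek($fhB, 1, 0);

    my $left = $lenA;
    while ( $left > 0 ) {
        my $want = $left < 65536 ? $left : 65536;
        my($bufA, $bufB);
        my $gotA = read($fhA, $bufA, $want);
        my $gotB = read($fhB, $bufB, $want);
        return -1 unless defined($gotA) && defined($gotB)
                      && $gotA == $want && $gotB == $want;
        return 1 if $bufA ne $bufB;
        $left -= $want;
    }
    return 0;
}
```

With this shape, a plain zlib file and an rsync-digest file holding the same compressed data compare as equal once the trailer length of the second is passed in, even though their first bytes and total sizes differ.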