Jeffrey,
As always, thanks.
Did you forget to increment jLib.pm to 0.4.2? The comment in the
library says 0.4.2, but the code still has "$VERSION = '0.4.1';".
Would you mind including these as attachments, rather than inline? It would
save us some cut-and-paste.
Regards,
Tyler
On 2013-03-01 23:30, backuppc AT kosowsky DOT org wrote:
> Here is the latest (updated) version of BackupPC_copyPcPool along with
> the jLib.pm library which it uses.
>
> It is much, much better than my initial version(s) released in 2011, as
> pointed out in my email from yesterday. However, it does not yet
> include the "incremental" backup/transfer option that I referred to
> yesterday. Also, it currently requires ~190 bytes/pool entry (for
> o(million) pool elements) to cache the pool inodes. If the -P (pack)
> flag is set, this is reduced to ~140 bytes/pool entry. I intend to
> implement a tighter packing of about 30 bytes/pool entry, at the
> expense of lookup speed, in a future version.
>
> If anyone has trouble with line-wrapping errors caused by my emailing
> the plain text, I can send it as an attached file.
>
> I am very interested in hearing people's feedback and experiences with
> this program...
>
> ------------------------------------------------------------------------
> BackupPC_copyPcPool.pl
>
> #!/usr/bin/perl
> #============================================================= -*-perl-*-
> #
> # BackupPC_copyPcPool.pl: Reliably & efficiently copy pool and pc tree
> #
> # DESCRIPTION
> # See below for detailed description of what it does and how it works
> #
> # AUTHOR
> # Jeff Kosowsky
> #
> # COPYRIGHT
> # Copyright (C) 2011-2013 Jeff Kosowsky
> #
> # This program is free software; you can redistribute it and/or modify
> # it under the terms of the GNU General Public License as published by
> # the Free Software Foundation; either version 2 of the License, or
> # (at your option) any later version.
> #
> # This program is distributed in the hope that it will be useful,
> # but WITHOUT ANY WARRANTY; without even the implied warranty of
> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> # GNU General Public License for more details.
> #
> # You should have received a copy of the GNU General Public License
> # along with this program; if not, write to the Free Software
> # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> #
> #========================================================================
> #
> # Version 0.2.1, February 2013
> #
> # CHANGELOG:
> # 0.1.1 (Feb 2011)
> # 0.1.2 (May 2011)
> # 0.1.3 (Sep 2011)
> # 0.1.4 (Jan 2013) Added options & error checking to set backuppc uid & guid
> # when not available from passwd & group files, respectively
> # 0.2.0 (Feb 2013) Initial release of new version using full in-memory I-node
> # (hash) caching rather than I-node pooling via file system
> # tree
> # 0.2.1 (Feb 2013) Added packed option for I-node cache hash index & values
> # Added md5sum output option
> #========================================================================
> #
> # DESCRIPTION
> #
> # 1. The program first recurses through the pool/cpool to create a
> # perl hash cache (%InodeCache) indexed by inode of all the pool
> # entries. This subsequently allows for rapid lookup and association of
> # each linked pc file with the corresponding stored pool
> # file. Optionally, the cache can be written out to a file as a list
> # of "<inode> <pool md5sum>" pairs. This inode cache file can then be
> # read in rapidly to populate the cache on subsequent runs, assuming
> # the pool hasn't changed in the interim.
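> # (To make step 1 concrete, here is a minimal sketch of the same idea
> # in Python rather than Perl, with a simplified pool layout; the
> # function name and walk are illustrative only, not part of the script:)

```python
import os

def build_inode_cache(pool_dir):
    """Walk a pool tree and map each pool file's inode number to its
    pool file name, so that pc files (hard links into the pool) can
    later be resolved with a single stat() plus one hash lookup."""
    cache = {}
    for root, _dirs, files in os.walk(pool_dir):
        for name in files:
            st = os.stat(os.path.join(root, name))
            cache[st.st_ino] = name
    return cache
```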
> #
> # 2. The program then runs through the pc tree (or a designated
> # subtree thereof) to create a single long list of all of the
> # directories, hard links, and zero-length files
> # For hard links, the entry is:
> # <path-to-pool-file> <path-to-pc-file>
> # For directories, the entry is:
> # D <owner> <group> <mode> <mtime> <path-to-directory>
> # For zero-length files, the entry is:
> # Z <owner> <group> <mode> <mtime> <path-to-file>
> # For 'errors' (e.g., non-linked, non-zero length files), the entry is:
> # X <owner> <group> <mode> <mtime> <path-to-file>
> # Comments can be indicated by:
> # #
>
> # Note: all paths are relative to TopDir
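> # (A small illustration of the entry layout above, in Python; the
> # helper name and the D/Z handling are mine, not the script's:)

```python
import os
import stat

def format_entry(path):
    """Produce one .links-style line for a directory ("D") or a
    zero-length file ("Z"):
        <kind> <owner> <group> <mode> <mtime> <path>
    (The "X" error entries described above use the same layout.)"""
    st = os.lstat(path)
    kind = "D" if stat.S_ISDIR(st.st_mode) else "Z"
    return " ".join([kind, str(st.st_uid), str(st.st_gid),
                     "%04o" % (st.st_mode & 0o7777),
                     str(int(st.st_mtime)), path])
```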
>
> # Any (non-zero length) pc files that are not linked (i.e., only one
> # hard link) or that are not found in the InodeCache may optionally be
> # corrected and linked to the pool; if this results in a new pool
> # entry, then that entry is added to the InodeCache too. Files that are
> # not corrected are listed for separate examination and manual
> # correction or backup.
>
> # The output consists of 3 files:
>
> # <outfile>.links is the file consisting of the list of links,
> # directories and zero-length files described above
>
> # <outfile>.nopool is a listing of the normal top-level log and info
> # files that are not linked and need to be copied
> # over directly
>
> # <outfile>.nolinks is a listing of the non-linked, non-zero length
> # files that either couldn't be linked or that you
> # decided not to link
> #
>
> # If the --Writecache|-W option is set, a 4th output file is generated:
>
> # <outfile>.icache is a 2 column listing consisting of
> # <inode number> <pool file name> pairs.
> # If the --writecache|-W option is repeated, then a
> # 3rd column is added consisting of the md5sum of
> # the (uncompressed) contents of the pool file.
> # One can then check for pool dups by running:
> # sort -k 3 <outfile>.icache | uniq -f 2 -D
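> # (The same duplicate check can be done without sort/uniq; a sketch in
> # Python, assuming the 3-column icache format described above:)

```python
from collections import defaultdict

def find_pool_dups(icache_lines):
    """Group 3-column '<inode> <poolname> <md5sum>' icache lines by
    md5sum and return only the groups with more than one pool entry,
    i.e. the pool duplicates."""
    by_md5 = defaultdict(list)
    for line in icache_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and "#pool"/"#cpool" markers
        _inode, poolname, md5sum = line.split()
        by_md5[md5sum].append(poolname)
    return {h: names for h, names in by_md5.items() if len(names) > 1}
```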
>
> # NOTE: Backuppc on the source machine should be disabled (or a
> # static snapshot used) during the entire backup process
>
> # 3. Restoring then consists of the following steps:
>
> # A. Copy over the pool/cpool any way you like, including
> # rsync but without having to worry about hard links
>
> # B. Copy over the non-pooled files (<outfile>.nopool) any way you want,
> # e.g., rsync, tar, etc.
>
> # C. Run this program again using the -r|--restore flag and the
> # <outfile>.nolinks as the input
>
> # D. Optionally copy over the non-linked files in <outfile>.nolinks
>
> # NOTE: Backuppc on the target machine should be disabled (or a
> # snapshot used) during the entire restore process
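> # (Step C is performed by the script's own --restore mode; purely to
> # illustrate what replaying the links file involves, here is a
> # simplified Python sketch that recreates directories, zero-length
> # files, and hard links; owner/group/mtime restoration is omitted:)

```python
import os

def replay_links(lines, topdir):
    """Replay .links-style entries under topdir: recreate directories
    (D), zero-length files (Z) and hard links; skip comments and X
    (non-linked) entries, which get copied separately."""
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#") or fields[0] == "X":
            continue
        if fields[0] in ("D", "Z"):
            _kind, _uid, _gid, mode, _mtime, path = fields[:6]
            dest = os.path.join(topdir, path)
            if fields[0] == "D":
                os.makedirs(dest, exist_ok=True)
            else:
                open(dest, "w").close()
            os.chmod(dest, int(mode, 8))
        else:  # "<path-to-pool-file> <path-to-pc-file>" hard-link entry
            pool, pc = fields[:2]
            os.link(os.path.join(topdir, pool), os.path.join(topdir, pc))
```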
>
> # Selected features:
> # --gzip|-g flag creates compressed output/input
> # --stdio writes/reads to stdio allowing you to pipe the backup
> # stage directly into the restore stage
> # --fixlinks|-f fixes any missing/broken pc links
> # --topdir|-t allows setting of alternative topdir
> # --dryrun|-d doesn't make any changes to pool or pc tree
> # --verbose|-v verbosity (repeat for higher verbosity)
> #
> #========================================================================
>
> use strict;
> use warnings;
> use File::Path;
> use File::Glob ':glob';
> use Getopt::Long qw(:config no_ignore_case bundling);
> use Fcntl; #Required for RW I/O masks
> use Digest::MD5; #For pool file md5sums (typically already loaded by BackupPC::Lib)
> use Switch;
>
> use lib "/usr/share/BackupPC/lib";
> use BackupPC::FileZIO;
> use BackupPC::Lib;
> use BackupPC::jLib v0.4.2; # Requires version >= 0.4.2
> use BackupPC::Attrib qw(:all);
>
> no utf8;
>
> #use Data::Dumper; #JJK-DEBUG
> #use Devel::Size qw(size total_size); #JJK-DEBUG
>
> select(STDERR); #Auto-flush (i.e., unbuffer) STDERR
> $| = 1;
> select(STDOUT);
>
> #Variables
> my $bpc = BackupPC::Lib->new or die "BackupPC::Lib->new failed\n";
> my $md5 = Digest::MD5->new;
> my $attr; #Re-bless for each backup since may have different compress level
> my $MaxLinks = $bpc->{Conf}{HardLinkMax};
> my $CREATMASK = O_WRONLY|O_CREAT|O_TRUNC;
> my %InodeCache=(); #Cache for saving pool/cpool inodes
> my @nonpooled = ();
> my @backups = ();
> my @compresslvls=(); #Pool to use for each backup;
> my ($compresslvl, $ptype);
>
> my $backuppcuser = $bpc->{Conf}{BackupPCUser}; #User name
> my $backuppcgroup = $bpc->{Conf}{BackupPCUser}; #Assumed to be same as user
>
> my $directories = 0;
> my $totfiles = 0; #Total of next 6 variables
> my $zerolengthfiles = 0;
> my $existinglink_pcfiles = 0;
> my $fixedlink_pcfiles = 0;
> my $unlinkable_pcfiles = 0;
> my $unlinked_pcfiles = 0;
> my $unlinked_nonpcfiles = 0;
> my $unlinked_files = 0; #Total of above three
> my $missing_attribs = 0;
>
> #GetOptions defaults:
> my $cleanpool;
> $dryrun=0; #Global variable defined in jLib.pm (do not use 'my') #JJK-DEBUG
> #$dryrun=1; #Set to 1 to make dry run only
> my $fixlinks=0;
> my $Force;
> my $gzip;
> my $giddef = my $giddef_def = getgrnam($backuppcgroup); #GID
> $giddef_def = "<unavailable>" unless defined $giddef_def;
> my $outfile;
> my $Overwrite;
> my $Pack;
> my $pool = my $pool_def = 'both';
> my $restore;
> my $Readcache;
> my $Skip;
> my $stdio;
> my $TopDir = my $TopDir_def = $bpc->{TopDir};
> my $uiddef = my $uiddef_def = getpwnam($backuppcuser); #UID
> $uiddef_def = "<unavailable>" unless defined $uiddef_def;
> my $verbose = my $verbose_def = 2;
> my $noverbose;
> my $Writecache;
>
> usage() unless(
> GetOptions(
> "cleanpool|c" => \$cleanpool, #Remove orphan pool entries
> "dryrun|d!" => \$dryrun, #1=dry run
> "fixlinks|f" => \$fixlinks, #Fix unlinked/broken pc files
> "Force|F" => \$Force, #Override stuff...
> "gid|G=i" => \$giddef, #default gid if not in group
> file
> "gzip|g" => \$gzip, #Compress files
> "outfile|o=s" => \$outfile, #Output file (required)
> "Overwrite|O" => \$Overwrite, #OVERWRITES existing files/dirs
> (restore)
> "pool|p=s" => \$pool, #Pool (pool|cpool|both)
> "Pack|P" => \$Pack, #Pack the InodeCache hash
> "restore|r" => \$restore, #Restore rather than backup
> "Readcache|R=s" => \$Readcache, #Read InodeCache from file
> "Skip|S" => \$Skip, #Skip existing backups (restore)
> "stdio|s" => \$stdio, #Print/read to/from stdout/stdin
> "topdir|t=s" => \$TopDir, #Location of TopDir
> "uid|U=i" => \$uiddef, #default gid if not in group
> file
> "verbose|v+" => \$verbose, #Verbosity (repeats allowed)
> "noverbose" => \$noverbose, #Shut off all verbosity
> "Writecache|W+" => \$Writecache,#Write InodeCache out to file
> "help|h" => \&usage,
> ));
>
> if($restore) {
> usage() if $cleanpool || $fixlinks || $outfile ||
> $Readcache || $Writecache ||
> (!$stdio && @ARGV < 1);
> } else {
> usage() if ! defined($outfile) || ($pool !~ /^(pool|cpool|both)$/) ||
> ($Readcache && $Writecache) || $Overwrite;
> }
>
> $verbose = 0 if $noverbose;
> my $DRYRUN = $dryrun ? "**DRY-RUN** " : "";
>
> die "Error: 'backuppc' not found in password file, so must enter default UID on commandline (--uid|-U option)...\n" unless defined $uiddef;
> die "Error: 'backuppc' not found in group file, so must enter default GID on commandline (--gid|-G option)...\n" unless defined $giddef;
> my (%USERcache, %GROUPcache, %UIDcache, %GIDcache);
> $UIDcache{$uiddef} = $backuppcuser;
> $GIDcache{$giddef} = $backuppcgroup;
> $USERcache{$backuppcuser} = $uiddef;
> $GROUPcache{$backuppcgroup} = $giddef;
>
> ############################################################################
> if($TopDir ne $TopDir_def) {
> #NOTE: if we are not using the TopDir in the config file, then we
> # need to manually override the settings of BackupPC::Lib->new
> # which *doesn't* allow you to set TopDir (even though it seems so
> # from the function definition, it gets overwritten later when the
> # config file is read)
> $TopDir =~ s|//*|/|g; #Remove any lurking double slashes
> $TopDir =~ s|/*$||g; #Remove trailing slash
> $bpc->{TopDir} = $TopDir;
> $bpc->{Conf}{TopDir} = $TopDir;
>
> $bpc->{storage}->setPaths({TopDir => $TopDir});
> $bpc->{PoolDir} = "$bpc->{TopDir}/pool";
> $bpc->{CPoolDir} = "$bpc->{TopDir}/cpool";
> }
>
> %Conf = $bpc->Conf(); #Global variable defined in jLib.pm (do not use 'my')
> #############################################################################
> #By convention for the rest of this program, we will assume that all
> #directory variables have a trailing "/". This should save us some
> #efficiency in terms of not having to always add back the slash.
> $TopDir .= "/"; $TopDir =~ s|//*|/|g;
> my $pcdir = 'pc/';
>
> die "TopDir = '$TopDir' doesn't exist!\n" unless -d $TopDir;
> die "TopDir = '$TopDir' doesn't look like a valid TopDir!\n"
> unless -d "$TopDir/pool" && -d "$TopDir/cpool" && -d "$TopDir/${pcdir}";
>
> system("$bpc->{InstallDir}/bin/BackupPC_serverMesg status jobs >/dev/null 2>&1");
> unless(($? >>8) == 1) {
>     die "Dangerous to run when BackupPC is running!!! (use '--Force' to override)\n"
>         unless $Force ||
>             (stat("$TopDir/"))[0] . "." . (stat(_))[1] ne
>             (stat("$TopDir_def/"))[0] . "." . (stat(_))[1]; #different fs & inode
>     warn "WARNING: May be dangerous to run when BackupPC is running!!!\n";
>     #Warn but don't die if we *appear* to be in a different TopDir
> }
>
> ############################################################################
> if(defined $restore) {
> do_restore(@ARGV);
> exit; #DONE
> }
> ############################################################################
> #Open files for reading/writing
> #NOTE: do before chdir to TopDir
> my $sfx = $gzip ? ".gz" : "";
> my $linksfile = "${outfile}.links${sfx}";
> my $nopoolfile = "${outfile}.nopool${sfx}";
> my $nolinksfile = "${outfile}.nolinks${sfx}";
>
> die "ERROR: '$linksfile' already exists!\n" if !$stdio && -e $linksfile;
> die "ERROR: '$nopoolfile' already exists!\n" if -e $nopoolfile;
> die "ERROR: '$nolinksfile' already exists!\n" if -e $nolinksfile;
>
> my $outpipe = $gzip ? "| gzip > " : "> ";
>
> if($stdio) {
> open(LINKS, $gzip ? "| gzip -f" : ">&STDOUT"); #Redirect LINKS to STDOUT
> } else {
> open(LINKS, $outpipe . $linksfile)
> or die "ERROR: Can't open '$linksfile' for writing!($!)\n";
> }
>
> open(NOPOOL, $outpipe . $nopoolfile)
> or die "ERROR: Can't open '$nopoolfile' for writing!($!)\n";
> open(NOLINKS, $outpipe . $nolinksfile)
> or die "ERROR: Can't open '$nolinksfile' for writing!($!)\n";
>
> my ($read_Icache, $write_Icache);
> if($Pack) { #Pack InodeCache entries
> $read_Icache = \&read_Icache_pack;
> $write_Icache = \&write_Icache_pack;
> } else { #Don't pack InodeCache entries
> $read_Icache = \&read_Icache_nopack;
> $write_Icache = \&write_Icache_nopack;
> }
>
> if($Readcache) { #Read in existing InodeCache from file
> open(RCACHE, $gzip ? "/bin/zcat $Readcache |" : "< $Readcache") or
> die "ERROR: Can't open '$Readcache' for reading!($!)\n";
> } elsif($Writecache) {
> my $wcachefile = "${outfile}.icache${sfx}";
> die "ERROR: '$wcachefile' already exists!\n" if -e $wcachefile;
> open(WCACHE, $outpipe . $wcachefile)
> or die "ERROR: Can't open '$wcachefile' for writing!($!)\n";
> }
>
> ############################################################################
> chdir($TopDir); #Do this so we don't need to worry about distinguishing
> #between absolute and relative (to TopDir) pathnames
> #Everything following opening the files occurs in TopDir
> ############################################################################
> initialize_backups();
>
> warn "**Populating Icache...\n" if $verbose>=1;
> my $toticache=0;
> if($Readcache) { #Read in existing InodeCache from file
> $ptype = $Conf{CompressLevel} ? "cpool" : "pool";
> while(<RCACHE>) {
> if(/^(\d+)\s+(\S+)/) {
> &$write_Icache($ptype, $1, $2);
> $toticache++;
> } elsif(/^#(c?pool)\s*$/) {
> $ptype = $1;
> } else {
> warn "'$_' is not a valid cache entry\n" unless /^#/;
> }
> }
> close RCACHE;
> } else { # Create InodeCache
> if($pool eq "both") {
> $toticache += populate_icache("pool");
> $toticache += populate_icache("cpool");
> } else {
> $toticache += populate_icache($pool);
> }
> close WCACHE if $Writecache;
> }
> warn "*Icache entries created=$toticache\n" if $verbose >=2;
> if(defined(&total_size)){ #JJK-DEBUG: Print out sizes used by InodeCache
>     warn sprintf("InodeCache total size: %d\n", total_size(\%InodeCache));
>     warn sprintf("Pool cache entries: %d\tSize: %d\tTotal_size: %d\n",
>                  (scalar keys %{$InodeCache{"pool"}}),
>                  size($InodeCache{"pool"}),
>                  total_size($InodeCache{"pool"}))
>         if defined $InodeCache{"pool"};
>     warn sprintf("Cpool cache entries: %d\tSize: %d\tTotal_size: %d\n",
>                  (scalar keys %{$InodeCache{"cpool"}}),
>                  size($InodeCache{"cpool"}),
>                  total_size($InodeCache{"cpool"}))
>         if defined $InodeCache{"cpool"};
> }
>
> exit unless @backups;
>
> warn "**Recording nonpooled top level files...\n" if $verbose>=1;
> foreach (@nonpooled) { #Print out the nonpooled files
> printf NOPOOL "%s\n", $_;
> }
> close(NOPOOL);
>
> warn "**Recording linked & non-linked pc files...\n" if $verbose>=1;
> my $lastmachine = '';
> foreach (@backups) {
> m|$pcdir([^/]*)|;
> $lastmachine = $1 if $1 ne $lastmachine;
>
> warn "* Recursing through backup: $_\n" if $verbose>=1;
> $compresslvl = shift(@compresslvls);
> $attr = BackupPC::Attrib->new({ compress => $compresslvl });
> #Reinitialize this jLib global variable in case new compress level
> $ptype = ($compresslvl > 0 ? "cpool" : "pool");
> m|(.*/)(.*)|;
> find_pool_links($1, $2);
> }
> close(LINKS);
> close(NOLINKS);
>
> ##############################################################################
> #Print summary & concluding message:
> printf STDERR "\nDirectories=%d\tTotal Files=%d\n",
>     $directories, ($totfiles + scalar(@nonpooled));
> printf STDERR "Link files=%d\t [Zero-length=%d, Hardlinks=%d (Fixed=%d)]\n",
>     ($zerolengthfiles+$existinglink_pcfiles+$fixedlink_pcfiles),
>     $zerolengthfiles, ($existinglink_pcfiles+$fixedlink_pcfiles),
>     $fixedlink_pcfiles;
> printf STDERR "Non-pooled Toplevel=%d\n", scalar(@nonpooled);
> printf STDERR "Non-pooled Other=%d [Valid-pc=%d (Failed-fixes=%d), Invalid-pc=%d]\n",
>     $unlinked_files, ($unlinked_pcfiles+$unlinkable_pcfiles),
>     $unlinkable_pcfiles, $unlinked_nonpcfiles;
> printf STDERR "PC files missing attrib entry=%d\n", $missing_attribs;
>
> my ($rsyncnopool, $rsyncnolinks);
> if($gzip) {
>     $gzip = "-g ";
>     $rsyncnopool = "zcat $nopoolfile | rsync -aOH --files-from=- $TopDir <newTopDir>";
>     $rsyncnolinks = "zcat $nolinksfile | rsync -aOH --files-from=- $TopDir <newTopDir>";
> } else {
>     $gzip = "";
>     $rsyncnopool = "rsync -aOH --files-from=$nopoolfile $TopDir <newTopDir>";
>     $rsyncnolinks = "rsync -aOH --files-from=$nolinksfile $TopDir <newTopDir>";
> }
>
> print STDERR <<EOF;
> ------------------------------------------------------------------------------
> To complete copy, do the following as user 'root' or 'backuppc':
> 1. Copy/rsync over the pool & cpool directories to the new TopDir
> (note this must be done *before* restoring '$linksfile')
> For rsync:
> rsync -aO $TopDir\{pool,cpool\} <newTopDir>
>
> 2. Copy/rsync over the non-pooled top level files ($nopoolfile)
> For rsync:
> $rsyncnopool
>
> 3. Restore the pc directory tree, hard links and zero-sized files:
> $0 -r ${gzip}[-t <newTopDir>] $linksfile
>
> 4. Optionally, copy/rsync over the non-linked pc files ($nolinksfile)
> For rsync:
> $rsyncnolinks
>
> ------------------------------------------------------------------------------
>
> EOF
> exit;
>
> ##############################################################################
> ##############################################################################
> #SUBROUTINES:
>
> sub usage
> {
> print STDERR <<EOF;
>
> usage: $0 [options] -o <outfile> [<relpath-to-pcdir> <relpath-to-pcdir>...]
>        $0 [options] -r <restorefile> [<relpath-to-pcdir> <relpath-to-pcdir>...]
>
> where the optional <relpath-to-pcdir> arguments are paths relative to the pc
> tree of form: 'host' or 'host/share' or 'host/share/n' or 'host/share/n/.../'
> If no optional <relpath-to-pcdir> arguments are given then the entire pc tree
> is backed up or restored.
>
> Options: [Common to copy & restore]
> --dryrun|-d Dry-run - doesn\'t change pool or pc trees
> Negate with: --nodryrun
> --Force|-F Overrides various checks (e.g., if BackupPC running
> or if directories present)
> --gid|-G <gid> Set group id for backuppc user
> [Default = $giddef_def]
> --gzip|-g Pipe files to/from gzip compression
> --topdir|-t Location of TopDir.
> [Default = $TopDir_def]
> Note you may want to change from default for example
> if you are working on a shadow copy.
> --uid|-U <uid> Set user id for backuppc user
> [Default = $uiddef_def]
> --verbose|-v Verbose (repeat for more verbosity)
> [Default level = $verbose_def]
> Use --noverbose to turn off all verbosity
> --help|-h This help message
>
> Options: [Copy only]
> --cleanpool|-c If orphaned files (nlinks=1) found when populating
> Icache, remove them (and renumber chain as needed)
> NOTE: Orphans shouldn\'t exist if you have just run
> BackupPC_nightly
> --fixlinks|-f Attempt to link valid pc files back to the pool
> if not already hard-linked
> NOTE: this changes files in the pc and\/or pool
> of your source too!
> --outfile|-o [outfile] Required stem name for the 3 output files
> <outfile>.nopool
> <outfile>.links
> <outfile>.nolinks
> --pool|-p [pool|cpool|both] Pools to include in Icache tree
> [Default = $pool_def]
> --Pack|-P Pack both the indices and values of the InodeCache.
> This saves approximately 52 bytes per pool entry.
> For o(1 million) entries, this reduces usage from
> ~190 bytes/entry to ~140 bytes/entry
> --Readcache|-R [file] Read in previously written inode pool cache from
> file. This allows resuming previously started
> backups
> Only use this option if the pool has not changed.
> Note: can\'t be used with --Writecache|-W
> --stdio|-s Print the directory tree, links and zero-sized files
> to stdout so it can be piped directly to another copy
> of the program running --restore
> NOTE: Status, Warnings, and Errors are printed to
> stderr.
> --Writecache|-W Write inode pool cache to file <outfile>.icache
> as a 2 column list consisting of:
> <inode number> <pool entry>
> If repeated, include md5sums of the (uncompressed)
> pool file contents as a 3rd column (this is slow)
> If argument path is '-', then quit after writing
> cache.
> Note: can\'t be used with --Readcache|-R
> Options: [Restore only]
> --Overwrite|-O OVERWRITE existing files & directories (Dangerous)
> --Skip|-S SKIP existing backups
> --stdio Read the directory tree, links and zero-sized files
> from stdin so it can be piped directly from another
> copy of the program running in the create mode.
> Note <restorefile> should not be used in this case
> For example, the following pipe works:
> $0 -t <source TopDir> [-g] --stdio | $0 -t <dest TopDir> [-g] -r --stdio
>
> OVERVIEW:
> First create an inode-indexed cache (hash) of all pool entries.
> Note the cache may optionally be written out to a file for later re-use
> with the --Readcache|-R flag.
>
> Then, recurse through the paths specified relative to the pc tree or if no
> paths specified then the entire pc tree.
> - If the file is a (non-pooled) top-level log file, then write its path
> relative to pcdir out to <outfile>.nopool
> Note this includes all non-directories above the share level plus the
> backupInfo files that are covered by the input paths
> - If the file is a directory, zero length file, or an existing hard link to
> the tree, then write it out to <outfile>.links
> - If the file is not hard-linked but is a valid non-zero length pc file
> (f-mangled and present in the attrib file) and --fixlinks is selected,
> then try to link it properly to the appropriate pool.
> - Otherwise, add it to <outfile>.nolinks
>
> NOTE: TO ENSURE INTEGRITY OF RESULTS IT IS IMPORTANT THAT BACKUPPC IS NOT
> RUNNING (use --Force to override)
>
> NOTE: you should run BackupPC_nightly before running this program so that
> no unnecessary links are backed up; alternatively, set the --cleanpool
> option which will remove orphan pool entries.
>
> EOF
> exit(1);
> }
>
> #Glob to determine what is being backed up.
> #Make sure each backup seems valid and determine its compress level
> #Collect top-level non-pooled files
> sub initialize_backups
> {
> if (!@ARGV) { # All backups at backup number level
> # TopDir/pc/<host>/<nn>
> @backups = bsd_glob("${pcdir}*/*"); #2 levels down;
> @nonpooled = grep(! -d , bsd_glob("${pcdir}*")); #All non-directories
> } else { # Subset of backups
> return if($Writecache && $ARGV[0] eq '-'); #Don't copy pc tree
> foreach(@ARGV) {
> my $backupdir = $pcdir . $_ . '/';
> $backupdir =~ s|//*|/|g; #Remove redundant slashes
> $backupdir =~ s|/\.(?=/)||g; #Replace /./ with /
> die "ERROR: '$backupdir' is not a valid pc tree subdirectory\n"
>     if $pcdir eq $backupdir || !-d $backupdir;
> if($backupdir =~ m|^\Q${pcdir}\E[^/]+/$|) { #Hostname only
> push(@backups, bsd_glob("${backupdir}*"));
> } else { # At share level or below (backup number or below)
> push(@backups, ${backupdir});
> }
> }
> @backups = keys %{{map {$_ => 1} @backups}}; #Eliminate dups
> @nonpooled = keys %{{map {$_ => 1} @nonpooled}}; #Eliminate dups
> }
>
> push(@nonpooled, grep(! -d, @backups)); #Non-directories
> @backups = grep(-d , @backups); #Directories
> push(@nonpooled, grep(m|/backupInfo$|, @backups)); #backupInfo not pooled
> #Note everything else *should* be pooled - if not, we will catch it later as an error
> @backups = grep(!m|/backupInfo$|, @backups);
>
> foreach(@backups) {
> s|/*$||; #Remove trailing slash
> m|^(\Q${pcdir}\E[^/]+/[^/]+)(/)?|;
> die "Backup '$1' does not contain a 'backupInfo' file\n"
> unless -f "$1/backupInfo";
> push(@nonpooled, "$1/backupInfo") unless $2; #Don't include if
> share level or below
> my $compress = get_bakinfo("$1","compress");
> push(@compresslvls, $compress);
> my $thepool = $compress > 0 ? 'cpool' : 'pool';
> die "Backup '$1' links to non-included '$thepool'\n"
> if $pool ne 'both' && $pool ne $thepool;
> }
> }
>
> #Iterate through the 'pooldir' tree and populate the InodeCache hash
> sub populate_icache
> {
> my ($fpool) = @_;
> my (@fstat, $dh, @dirlist);
> my $icachesize = 0;
>
> return 0 unless bsd_glob("$fpool/[0-9a-f]"); #No entries in pool
> print WCACHE "#$fpool\n" if $Writecache;
> my @hexlist = ('0', '1', '2', '3', '4', '5', '6', '7',
>                '8', '9', 'a', 'b', 'c', 'd', 'e', 'f');
> my ($idir,$jdir,$kdir);
> print STDERR "\n* Populating Icache with '$fpool' branch:[0-f] "
>     if $verbose >=2;
> foreach my $i (@hexlist) {
>     print STDERR "$i " if $verbose >=2;
>     $idir = $fpool . "/" . $i . "/";
>     foreach my $j (@hexlist) {
>         $jdir = $idir . $j . "/";
>         foreach my $k (@hexlist) {
>             $kdir = $jdir . $k . "/";
>             unless(opendir($dh, $kdir)) {
>                 warn "Can't open pool directory: $kdir\n" if $verbose>=4;
>                 next;
>             }
>             my @entries = grep(!/^\.\.?$/, readdir($dh)); #Exclude dot files
>             closedir($dh);
>             warn "POOLDIR: $kdir (" . ($#entries+1) . " files)\n"
>                 if $verbose >=3;
>
>             if($cleanpool) { #Remove orphans & renumber first, and then
>                 #create the Icache after orphan deletion & chain renumbering
>                 #NOTE: we need to do this before populating the InodeCache
>                 #since pool deletions & renumbering would change the cache
>                 my @poolorphans =
>                     grep(-f $kdir . $_ && (stat(_))[3] < 2, @entries);
>                 foreach (sort {poolname2number($b) cmp poolname2number($a)}
>                          @poolorphans) { #Reverse sort to minimize moves
>                     my $pfile = $kdir . $_;
>                     my $result = delete_pool_file($pfile);
>                     if($verbose >=1) {
>                         $result == 1 ?
>                             warn "WARN: Deleted orphan pool entry: $pfile\n" :
>                             warn "ERROR: Couldn't properly delete orphan pool entry: $pfile\n";
>                     }
>                 }
>                 if(@poolorphans && opendir($dh, $kdir)) {
>                     #Reload entries since orphans were renumbered/removed
>                     #(must reopen: the directory handle was closed above)
>                     @entries = grep(!/^\.\.?$/, readdir($dh));
>                     closedir($dh);
>                 }
>             }
>             foreach (@entries) {
>                 #Go through all entries in terminal pool branch
>                 @fstat = stat($kdir . $_);
>                 #Note: 0=dev, 1=ino, 2=mode, 3=nlink, 4=uid, 5=gid, 6=rdev
>                 #7=size, 8=atime, 9=mtime, 10=ctime, 11=blksize, 12=blocks
>                 if($fstat[3] >= 2) { #Valid (non-orphan) pool entry
>                     #Note: No sense in creating icache entries for orphans
>                     &$write_Icache($fpool, $fstat[1], $_);
>                     if($Writecache && $Writecache >= 2) {
>                         printf WCACHE "%d %s %s\n", $fstat[1], $_,
>                             zFile2FullMD5($bpc, $md5, $kdir . $_);
>                     } elsif($Writecache) {
>                         printf WCACHE "%d %s\n", $fstat[1], $_;
>                     }
>                     $icachesize++;
>                 } else { #Orphan pool entry
>                     warn "WARN: Orphan pool entry: $_\n" if $verbose>=1;
>                 }
>                 warn "WARN: Zero-size pool entry: $_\n"
>                     if $fstat[7] == 0 && $verbose>=1;
>             }
>         } # kdir
>     } # jdir
> } # idir
> print STDERR "\n" if $verbose >=2;
> return $icachesize;
> }
>
> #Recursively go through pc tree
> sub find_pool_links
> {
> my ($dir, $filename) = @_;
>
> my $dh;
> my $file = $dir . $filename;
> if(-d $file) {
> my @fstat = stat(_); #Note last stat was -d $file
> #Print info to signal & re-create directory:
> #D <user> <group> <mod> <atime> <mtime> <file name>
> print_file_reference("D", $file, \@fstat);
> $directories++;
> opendir($dh,$file);
> my @contents = readdir($dh);
> closedir($dh);
> foreach (@contents) {
> next if /^\.\.?$/; # skip dot files (. and ..)
> find_pool_links($file . '/', $_); #Recurse
> }
> }
> else { #Not a directory
> my @fstat = stat(_); #Note last stat was -d $file
> $totfiles++;
> if($fstat[7] == 0) { #Zero size file
> print_file_reference("Z", $file, \@fstat);
> $zerolengthfiles++;
> return;
> } elsif($fstat[3] > 1) { #More than one link
>     if(defined(my $pooltarget = &$read_Icache($ptype, $fstat[1]))) {
>         #Valid hit: hard link found in InodeCache
>         $existinglink_pcfiles++;
>         $pooltarget =~ /(.)(.)(.)/;
>         print_hardlink_reference("$ptype/$1/$2/$3/$pooltarget", $file);
>         return;
>     }
> }
>
> if($file =~ m|^\Q${pcdir}\E[^/]+/[^/]+/backupInfo$|) {
>     $totfiles--;
>     return; #backupInfo already taken care of in 'nonpooled'
> }
>
> #Non-zero sized file not in InodeCache that is not 'backupInfo'
> if($filename =~ /^f|^attrib$/ && -f $file) {
>     #VALID pc file if f-mangled or attrib file
>     if( $filename =~ /^f/ &&
>         !( -f "$dir/attrib" &&
>            $attr->read($dir,"attrib") == 1 &&
>            defined($attr->get($bpc->fileNameUnmangle($filename))))) {
>         #Attrib file error if attrib file missing or
>         #unmangled name not an element of the attrib file
>         warn "ERROR: $file (inode=$fstat[1], nlinks=$fstat[3]) VALID pc file with MISSING attrib entry\n";
>         $missing_attribs++;
>     }
>     if($fixlinks) {
>         if(fix_link($file, \@fstat) == 1) {
>             $fixedlink_pcfiles++;
>             return;
>         } else {$unlinkable_pcfiles++;}
>     } else {
>         warn "ERROR: $file (inode=$fstat[1], nlinks=$fstat[3]) VALID pc file NOT LINKED to pool\n";
>         $unlinked_pcfiles++;
>     }
> } else {
>     warn "ERROR: $file (inode=$fstat[1], nlinks=$fstat[3]) INVALID pc file and UNLINKED to pool\n";
>     $unlinked_nonpcfiles++;
> }
> #ERROR: Not linked/linkable to pool (and not zero-length file or directory)
> print_file_reference("X", $file, \@fstat);
> printf NOLINKS "%s\n", $file;
> $unlinked_files++;
> }
> }
>
> #Try to fix link by finding/creating new pool-link
> sub fix_link
> {
> my ($filename, $fstatptr) = @_;
>
> my ($plink, @pstat);
> my ($md5sum, $result) = zFile2MD5($bpc, $md5, $filename, 0, $compresslvl);
> $result = jMakeFileLink($bpc, $filename, $md5sum, 2, $compresslvl, \$plink)
>     if $result > 0;
> #Note we choose NewFile=2 since if we are fixing, we always want to make the link
> if($result > 0) { #(true if both above calls succeed)
> $plink =~ m|.*(\Q${ptype}\E.*/(.*))|;
> $plink = $1; #Path relative to TopDir
> if($dryrun) {
> @pstat = @$fstatptr;
> } else {
> @pstat = stat($plink);
> &$write_Icache($ptype, $pstat[1], $2);
> }
> print_hardlink_reference($plink, $filename);
> if($verbose >=2) {
>     warn "${DRYRUN}NOTICE: pool entry '$plink' (inode=$fstatptr->[1]) missing from Icache and added back\n"
>         if $result == 3;
>     warn sprintf("${DRYRUN}NOTICE: %s (inode=%d, nlinks=%d) was fixed by linking to *%s* pool entry: %s (inode=%d, nlinks=%d)\n",
>                  $filename, $fstatptr->[1], $fstatptr->[3],
>                  ($result == 1 ? "existing" : "new"),
>                  $plink, $pstat[1], $pstat[3])
>         if $result != 3;
> }
> return 1;
> }
> warn "ERROR: $filename (inode=$fstatptr->[1], nlinks=$fstatptr->[3], md5sum=$md5sum) VALID pc file FAILED FIXING by linking to pool\n";
> return 0;
> }
>
> sub print_hardlink_reference
> {
> my ($poolname, $filename) = @_;
>
> #<poolname> <filename>
> printf LINKS "%s %s\n", $poolname, $filename; #Print hard link
> }
>
> sub print_file_reference
> {
> my ($firstcol, $filename, $fstatptr) = @_;
>
> #<firstcol> <user> <group> <mod> <mtime> <filename>
> printf(LINKS "%s %s %s %04o %u %s\n",
> $firstcol, UID($fstatptr->[4]), GID($fstatptr->[5]),
> $fstatptr->[2] & 07777, $fstatptr->[9], $filename);
> }
>
> ###############################################################################
> ##Read/write InodeCache
>
> #Note that without packing, InodeCache hash requires about 186
> #bytes/entry ('total_size') to store each entry with about 74 bytes
> #required to store the hash structure ('size') alone and an additional
> #exactly 112 bytes to store the 32-char hex string (this assumes chains
> #are sparse enough not to affect the average storage). The 74 bytes
> #for the hash structure consists of about 8 bytes for the inode data
> #plus 48 bytes for the overhead of the scalar variable storing the
> #index plus ~22 bytes for the magic of hash storage and the pointer to
> #the hash value. The additional 112 bytes consists of 64 bytes to
> #store the unpacked 32-char hex string plus 48 bytes of overhead for
> #the scalar variable storing the hash value.
>
> #If we pack the value, we can reduce the hex string storage from 64
> #bytes to 16 bytes by storing 2 hex chars per byte. This reduces the
> #total storage by 48 bytes/entry from 186 to 138 bytes/entry.
>
> #If we pack the index as an unsigned 32-bit long (V), then we can
> #save an additional 4 bytes, reducing the average storage to about 134
> #bytes/entry. But, we need to make sure that the highest inode number
> #is less than 2^32 (~4.3 billion). If we pack the index as an unsigned
> #64-bit quad (Q), then we use 8 bytes for the index data and we are
> #back to a total 138 bytes/entry.
>
> #Note these averages are based on a hash size (i.e. pool size) of 1
> #million entries. The 74 bytes/entry for the hash structure alone will
> #increase (slowly) as the size of the hash increases but the storage
> #size for the hash value (112 bytes unpacked, 64 packed) does not
> #increase.
>
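[The byte accounting above is easy to sanity-check in isolation. A minimal standalone sketch of the `pack("H32V", ...)` round trip used by the Icache routines below — the digest value is illustrative, and this snippet is not part of the module:]

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pack a 32-char hex digest plus a 32-bit chain index into 20 bytes
# (16 + 4), then recover both -- the same H32V template that the
# write_Icache_pack/read_Icache_pack pair uses.
my $hex   = "d41d8cd98f00b204e9800998ecf8427e";
my $chain = 7;

my $packed = pack("H32V", $hex, $chain);
printf "packed into %d bytes\n", length($packed);   # 20 bytes of payload
print join("_", unpack("H32V", $packed)), "\n";     # "<hex digest>_<chain>"
```

[So the 64 bytes of unpacked hex-string data collapse to 20 bytes of payload; the remaining per-entry cost is Perl's scalar and hash overhead, as the comment block explains.]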
> sub read_Icache_nopack
> {
> $InodeCache{$_[0]}{$_[1]}; #No packing of values & indices
> }
>
> sub read_Icache_pack
> {
> # join("_", unpack("H32V", $InodeCache{$_[0]}{$_[1]})); #Pack values only
>
> #Also pack indices:
> join("_", unpack("H32V", $InodeCache{$_[0]}{pack("V",$_[1])})); #32-bit indices
> # join("_", unpack("H32V", $InodeCache{$_[0]}{pack("Q",$_[1])})); #64-bit indices
> #Note: shouldn't need 64 bits for inodes since they are ino_t
> #which are typedef'd as longs
> }
>
> sub write_Icache_nopack
> {
> $InodeCache{$_[0]}{$_[1]} = $_[2]; #No packing of values & indices
> }
>
> sub write_Icache_pack
> {
> my ($pool, $index, $value) = @_;
>
> $value =~ m|(.{32})(.(.*))?|;
> #Pack values only
> # $InodeCache{$pool}{$index} = defined($3) ? pack("H32V", $1, $3) : pack("H32", $1);
>
> #Also pack indices:
> $InodeCache{$pool}{pack("V",$index)} = defined($3) ? pack("H32V", $1, $3) : pack("H32", $1); #32-bit indices
>
> # $InodeCache{$pool}{pack("Q",$index)} = defined($3) ? pack("H32V", $1, $3) : pack("H32", $1); #64-bit indices
> #Note shouldn't need 64 bits for inodes since they are ino_t which
> #are typedef'd as longs
> }
>
> ###############################################################################
> # Return user name corresponding to numerical UID with caching
> sub UID
> {
> unless(exists($UIDcache{$_[0]})) {
> unless(defined($UIDcache{$_[0]} = getpwuid($_[0]))) {
> printf STDERR "Warning: '%d' not found in password file, writing numerical UID instead...\n", $_[0];
> $UIDcache{$_[0]} = $_[0];
> }
> }
> return $UIDcache{$_[0]};
> }
>
> # Return group name corresponding to numerical GID with caching
> sub GID
> {
> unless(exists($GIDcache{$_[0]})) {
> unless(defined($GIDcache{$_[0]} = getgrgid($_[0]))) {
> printf STDERR "Warning: '%d' not found in group file, writing numerical GID instead...\n", $_[0];
> $GIDcache{$_[0]} = $_[0];
> }
> }
> return $GIDcache{$_[0]};
> }
>
> # Return numerical UID corresponding to user name with caching
> sub USER
> {
> unless(exists($USERcache{$_[0]})) {
> unless(defined($USERcache{$_[0]} = getpwnam($_[0]))) {
> printf STDERR "Warning: '%s' not found in password file, writing default backuppc UID (%d) instead...\n", $_[0], $uiddef;
> $USERcache{$_[0]} = $uiddef;
> }
> }
> return $USERcache{$_[0]};
> }
>
> # Return numerical GID corresponding to group name with caching
> sub GROUP
> {
> unless(exists($GROUPcache{$_[0]})) {
> unless(defined($GROUPcache{$_[0]} = getgrnam($_[0]))) {
> printf STDERR "Warning: '%s' not found in group file, writing default backuppc GID (%d) instead...\n", $_[0], $giddef;
> $GROUPcache{$_[0]} = $giddef;
> }
> }
> return $GROUPcache{$_[0]};
> }
> ###############################################################################
> ###############################################################################
> sub do_restore
> {
> my $currbackup = "";
> my $formaterr = 0;
> my $ownererr = 0;
> my $permserr = 0;
> my $mkdirerr = 0;
> my $mkzeroerr = 0;
> my $mklinkerr = 0;
> my $filexsterr = 0;
> my $utimerr = 0;
> my $newdir = 0;
> my $newzero = 0;
> my $newlink = 0;
> my $skipped = 0;
>
> ###############################################################################
> if($stdio) {
> open(LINKS, $gzip ? "/bin/zcat - |" : "<& STDIN");
> } else {
> my $restorefile = shift @_;
> open(LINKS, $gzip ? "/bin/zcat $restorefile |" : "< $restorefile") or
> die "ERROR: Can't open '$restorefile' for reading! ($!)\n";
> }
> my @backuplist = @_;
> ###############################################################################
> chdir($TopDir); #Do this so we don't need to worry about distinguishing
> #between absolute and relative (to TopDir) pathnames
> ###############################################################################
> die "ERROR: pool directories empty!\n"
> unless $Force || bsd_glob("{cpool,pool}/*");
>
> my(@skiplist, $skiplist); #Backups to skip...
> if($Skip) {
> @skiplist = grep(@{[bsd_glob("$_/f*")]}, bsd_glob("${pcdir}*/[0-9]*"));
> #NOTE: glob must be evaluated in list context because stateful...
> } elsif(!$Overwrite && grep(-d, bsd_glob("${pcdir}*/[0-9]*/f*"))) {
> die "ERROR: pc directory contains existing backups!\n(use --Skip to skip or --Overwrite to OVERWRITE)\n"
> }
> $skiplist = "\Q" . join("\E|\Q", @skiplist) . "\E" if @skiplist;
>
> my $backuplist; #Backups to selectively backup...
> $backuplist = "\Q" . join("\E|\Q", @backuplist) . "\E" if @backuplist;
>
> umask 0000; #No permission bits disabled
> my $time = time; #We will use this for setting atimes.
> my @dirmtimes =(); #Stack consisting of paired "dir, mtime" entries
> my $matchback = 1;
> my ($line);
>
> LINE: while($line = <LINKS>) {
> chomp $line;
>
> unless($line =~ m|^[a-f0-9DZX#]|) { #First character test
> print STDOUT "ERR_CHAR1: $line\n";
> warn sprintf("ERROR: Illegal first line character: %s\n", substr($line,0,1))
> if $verbose >= 1;
> next LINE; #NOTE: next without the LINE label would go to next switch case
> }
>
> switch ($&) { #Go through cases...
> case 'D' {
> unless($line =~ m|^D +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +(\Q${pcdir}\E.*)|) {
> print STDOUT "ERR_DFRMT $line\n";
> $formaterr++;
> next LINE;
> }
> #NOTE: 1=uname 2=group 3=mode 4=mtime 5=dirpath
> print STDERR "$1|$2|$3|$4|$5|\n" if $verbose >= 4;
> my $user = USER($1);
> my $group = GROUP($2);
> my $mode = oct($3);
> my $mtime = $4;
> my $dir = $5; $dir =~ s|/*$||; #Remove trailing slash
> $dir =~ m|(.*)/|;
> my $pdir = $1; #parent dir
>
> #Don't restore if on skiplist or if not on backuplist
> if(($skiplist && $dir =~ m|^($skiplist)|) ||
> ($backuplist && $dir !~ m|^pc/($backuplist)|)) {
> $matchback = 0;
> next LINE;
> } else {
> $matchback = 1;
> }
>
> if($verbose >= 1) {
> $dir =~ m|pc/([^/]*/[^/]*)|;
> if($1 ne $currbackup) {
> $currbackup = $1;
> warn "RESTORING: $currbackup\n";
> }
> }
>
> #Look at @dirmtimes stack to see if we have backed out of
> #top directory(ies) on stack and can now set mtime
> my $lastdir; #If in new part of tree, set mtimes for past dirs
> while(defined($lastdir = shift(@dirmtimes)) &&
> $dir !~ m|^\Q$lastdir\E|) {
> jutime($time, shift(@dirmtimes), $lastdir);
> }
> unshift(@dirmtimes, $lastdir) if $lastdir; #Put back last one
>
> if(! -e $dir) { #Make directory (nothing in the way)
> unless( -d $pdir || jmake_path $pdir){
> #Make parent directory if doesn't exist
> #NOTE: typically shouldn't be necessary since list is
> #created with directories along the way
> print STDOUT "ERR_MKDIR $line\n";
> $mkdirerr++;
> next LINE;
> }
> #Make directory & set perms
> unless(mkdir $dir, $mode) {
> print STDOUT "ERR_MKDIR $line\n";
> $mkdirerr++;
> next LINE;
> }
> } elsif(-d $dir) { #Directory already exists
> #Set perms
> unless(jchmod $mode, $dir) {
> print STDOUT "ERR_PERMS $line\n";
> $permserr++;
> next LINE;
> }
> } else { #Non-directory in the way, abort line
> print STDOUT "ERR_DEXST $line\n";
> $filexsterr++;
> next LINE;
> }
>
> #Set ownership
> unless(jchown $user, $group, $dir){
> print STDOUT "ERR_OWNER $line\n";
> $ownererr++;
> next LINE;
> }
> #Put dir mtime on stack
> unshift(@dirmtimes, $mtime); #We need to set dir mtime
> unshift(@dirmtimes, $dir); #when done adding files to dir
>
> $newdir++;
> }
>
> case 'Z' {
> next LINE unless $matchback;
> unless($line =~ m|^Z +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +((\Q${pcdir}\E.*/)(.*))|) {
> print STDOUT "ERR_ZFRMT $line\n";
> $formaterr++;
> next LINE;
> }
> #NOTE: 1=uname 2=group 3=mode 4=mtime 5=fullpath 6=dir 7=file
> print STDERR "$1|$2|$3|$4|$5|$6|$7\n" if $verbose >= 4;
> my $user = USER($1);
> my $group = GROUP($2);
> my $mode = oct($3);
> my $mtime = $4;
> my $file = $5;
> my $dir = $6;
> my $name = $7;
>
> #Check if file exists and not overwritable
> if(-e $file && !($Overwrite && -f $file && unlink($file))) {
> print STDOUT "ERR_ZEXST $line\n";
> $filexsterr++;
> next LINE;
> }
>
> unless( -d $dir || jmake_path $dir){ #Make dir if doesn't exist
> #NOTE: typically shouldn't be necessary since list is
> #created with directories along the way
> print STDOUT "ERR_MKDIR $line\n";
> $mkdirerr++;
> next LINE;
> }
> #Create zero-length file with desired perms
> unless(sysopen(ZERO, $file, $CREATMASK, $mode) && close(ZERO)){
> print STDOUT "ERR_MKZER $line\n";
> $mkzeroerr++;
> next LINE;
> }
> unless(jchown $user, $group, $file){ #Set ownership
> print STDOUT "ERR_OWNER $line\n";
> $ownererr++;
> next LINE;
> }
> unless(jutime $time, $mtime, $file) { #Set timestamps
> $utimerr++;
> next LINE;
> }
> $newzero++;
> }
>
> case 'X' {$skipped++; next LINE; } #Error line
>
> case '#' {next LINE; } #Comment line
>
> else { #Hard link is default switch case
> next LINE unless $matchback;
> unless($line =~ m|(c?pool/[0-9a-f]/[0-9a-f]/[0-9a-f]/[^/]+) +((\Q${pcdir}\E.*/).*)|) {
> print STDOUT "ERR_LFRMT $line\n";
> $formaterr++;
> next LINE;
> }
> print STDERR "$1|$2|$3\n" if $verbose >= 4; #1=source 2=target 3=targetdir
> my $source = $1;
> my $target = $2;
> my $targetdir = $3;
>
> if(-e $target && !($Overwrite && -f $target && unlink($target))){
> print STDOUT "ERR_LEXST $line\n";
> $filexsterr++; #Target exists and not overwritable
> next LINE;
> }
>
> unless(-d $targetdir || jmake_path $targetdir){
> #Make targetdir if doesn't exist
> #NOTE: typically shouldn't be necessary since list is
> #created with directories along the way
> print STDOUT "ERR_MKDIR $line\n";
> $mkdirerr++;
> next LINE;
> }
>
> unless(jlink $source, $target) { #Make link
> print STDOUT "ERR_MKLNK $line\n";
> $mklinkerr++;
> next LINE;
> }
> $newlink++
> }
> } #SWITCH end
> } # WHILE reading lines
>
> #Set mtimes for any remaining directories in @dirmtimes stack
> while(my $dir = shift(@dirmtimes)){
> jutime($time, shift(@dirmtimes), $dir);
> }
>
> ###############################################################################
> #Print results:
>
>
> printf("\nSUCCESSES: Restores=%d [Dirs=%d Zeros=%d Links=%d] Skipped=%d\n",
> ($newdir+$newzero+$newlink), $newdir, $newzero, $newlink, $skipped);
> printf("ERRORS: TOTAL=%d Format=%d Mkdir=%d Mkzero=%d Mklink=%d\n",
> ($formaterr + $mkdirerr + $mkzeroerr + $mklinkerr + $filexsterr + $ownererr + $permserr + $utimerr),
> $formaterr, $mkdirerr, $mkzeroerr, $mklinkerr);
> printf(" File_exists=%d Owner=%d Perms=%d Mtime=%d\n\n",
> $filexsterr, $ownererr, $permserr, $utimerr);
>
> exit; #RESTORE done
> }
> ###############################################################################
>
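[For reference, the copy-file lines that do_restore parses look like this. A minimal sketch using the same 'D' (directory) regex shape — the host/backup names are hypothetical, and "pc/" stands in for $pcdir:]

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One hypothetical directory ('D') line: user group mode mtime dirpath
my $pcdir = "pc/";
my $line  = "D backuppc backuppc 0755 1362183000 pc/myhost/123/fetc";

if ($line =~ m|^D +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +(\Q${pcdir}\E.*)|) {
    # Same field meanings as in do_restore: 1=uname 2=group 3=mode 4=mtime 5=dirpath
    printf "user=%s group=%s mode=%04o mtime=%u dir=%s\n",
           $1, $2, oct($3), $4, $5;
}
```

[The 'Z' (zero-length file) lines carry the same five fields with a file path, hard-link lines are just "<pool path> <pc path>", and 'X'/'#' lines are skipped.]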
> ------------------------------------------------------------------------
> jLib.pm
>
> #============================================================= -*-perl-*-
> #
> # BackupPC::jLib package
> #
> # DESCRIPTION
> #
> # This library includes various supporting subroutines for use with BackupPC.
> # Some of the routines are variants/extensions of routines originally written
> # by Craig Barratt as part of the main BackupPC release.
> #
> # AUTHOR
> # Jeff Kosowsky
> #
> # COPYRIGHT
> # Copyright (C) 2008-2013 Jeff Kosowsky
> #
> # This program is free software; you can redistribute it and/or modify
> # it under the terms of the GNU General Public License as published by
> # the Free Software Foundation; either version 2 of the License, or
> # (at your option) any later version.
> #
> # This program is distributed in the hope that it will be useful,
> # but WITHOUT ANY WARRANTY; without even the implied warranty of
> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> # GNU General Public License for more details.
> #
> # You should have received a copy of the GNU General Public License
> # along with this program; if not, write to the Free Software
> # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> #
> #========================================================================
> #
> # Version 0.4.1, released January 2013
>
> # CHANGELOG
> # 0.4.0 (Jan 2011)
> # 0.4.1 (Jan 2013) Added MD52RPath, is_poollink
> # Added jchmod, jchown
> # 0.4.2 (Feb 2013) Made write_attrib more robust
> #
> #========================================================================
>
> package BackupPC::jLib;
>
> use strict;
> use vars qw($VERSION);
> $VERSION = '0.4.1';
>
> use warnings;
> use File::Copy;
> use File::Glob ':glob';
> use File::Path;
> use File::Temp;
> use Fcntl; #Required for RW I/O masks
>
> use BackupPC::Lib;
> use BackupPC::Attrib;
> use BackupPC::FileZIO;
> use Data::Dumper; #Just used for debugging...
>
> no utf8;
>
> use constant _128KB => 131072;
> use constant _1MB => 1048576;
> use constant LINUX_BLOCK_SIZE => 4096;
> use constant TOO_BIG => 2097152; # 1024*1024*2 (2MB)
>
> require Exporter;
> our @ISA = qw(Exporter);
> our @EXPORT = qw(
> %Conf $dryrun $errorcount
> LINUX_BLOCK_SIZE TOO_BIG
> printerr warnerr firstbyte
> zFile2MD5 zFile2FullMD5
> link_recursively_topool
> jfileNameUnmangle
> getattr read_attrib count_file_attribs
> get_bakinfo get_file_attrib get_attrib_value get_attrib_type
> write_attrib write_file_attrib
> attrib
> MD52RPath
> is_poollink GetPoolLink jMakeFileLink
> poolname2number renumber_pool_chain delete_pool_file
> run_nightly
> jcompare zcompare zcompare2
> delete_files jtouch
> jchmod jchown jcopy jlink junlink jmkdir jmkpath jmake_path
> jrename jrmtree jutime
> );
>
> #Global variables
> our %Conf;
> our $errorcount=0;
> our $dryrun=1; #global variable - set to 1 to be safe -- should be set to
> #0 in actual program files if you don't want a dry-run.
> #None of the routines below will change/delete/write actual
> #file data if this flag is set. The goal is to do everything but
> #the final write to assist with debugging or cold feet :)
>
> sub printerr
> {
> print "ERROR: " . $_[0];
> $errorcount++;
> }
>
> sub warnerr
> {
> $|++; # flush printbuffer first
> warn "ERROR: " . $_[0];
> $errorcount++;
> }
>
> # Returns the first byte of a file.
> # If $coding is undefined or 0, return it as an unpacked 2-char hexadecimal
> # string. Otherwise, return it as a binary byte.
> # Return -1 on error.
> # Useful for checking the type of compressed file/checksum coding.
> sub firstbyte {
> my ($file, $coding) = @_;
> my $fbyte='';
> sysopen(my $fh, $file, O_RDONLY) || return -1;
> $fbyte = -1 unless sysread($fh, $fbyte, 1) == 1;
> close($fh);
> if (! defined($coding) || $coding == 0) {
> $fbyte = unpack('H*',$fbyte); # Unpack as 2 char hexadecimal string
> }
> else {
> $fbyte = vec($fbyte, 0, 8); # Return as binary byte
> }
> return $fbyte;
> }
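[A quick illustration of the two return forms (hex string vs. binary byte). 0x78 is the typical zlib deflate header byte at the start of a compressed BackupPC file, which is why the first byte is useful for checking checksum coding:]

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $byte = "\x78";                 # typical first byte of a zlib stream
print unpack('H*', $byte), "\n";   # hex-string form ("78"), as firstbyte($f) returns
print vec($byte, 0, 8), "\n";      # binary-byte form (120), as firstbyte($f, 1) returns
```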
>
> # Compute the MD5 digest of a compressed file. This is the compressed
> # file version of the Lib.pm function File2MD5.
> # For efficiency we don't use the whole file for big files
> # - for files <= 256K we use the file size and the whole file.
> # - for files <= 1M we use the file size, the first 128K and
> # the last 128K.
> # - for files > 1M, we use the file size, the first 128K and
> # the 8th 128K (ie: the 128K up to 1MB).
> # See the documentation for a discussion of the tradeoffs in
> # how much data we use and how many collisions we get.
> #
> # Returns the MD5 digest (a hex string) and the file size if it succeeds
> # (or "" and an error code if it fails).
> # Note the return for a zero-size file is ("", 0).
> #
> # If $size < 0 then calculate size of file by fully decompressing
> # If $size = 0 then first try to read corresponding attrib file
> # (if it exists), if doesn't work then calculate by fully decompressing
> # If $size > 0 then use that as the size of the file
> #
> # If $compresslvl is undefined then use the default compress level from
> # the config file
>
> sub zFile2MD5
> {
> my($bpc, $md5, $name, $size, $compresslvl) = @_;
>
> my ($fh, $rsize, $filesize, $md5size);
>
> return ("", -1) unless -f $name;
> return ("", 0) if (stat(_))[7] == 0; #Zero size file
> $compresslvl = $Conf{CompressLevel} unless defined $compresslvl;
> unless (defined ($fh = BackupPC::FileZIO->open($name, 0, $compresslvl))) {
> printerr "Can't open $name\n";
> return ("", -1);
> }
>
> my ($datafirst, $datalast);
> my @data;
> #First try to read up to the first 128K (131072 bytes)
> if ( ($md5size = $fh->read(\$datafirst, _128KB)) < 0 ) { #First 128K
> printerr "Can't read & decompress $name\n";
> return ("", -1);
> }
>
> if ($md5size == _128KB) { # If full read, continue reading up to 1st MB
> my $i=0;
> #Read in up to 1MB (_1MB), 128K at a time and alternate between 2 data buffers
> while ( ($rsize = $fh->read(\$data[(++$i)%2], _128KB)) == _128KB
> && ($md5size += $rsize) < _1MB ) {}
> $md5size +=$rsize if $rsize < _128KB; # Add back in partial read
> $datalast = ($i > 1 ?
> substr($data[($i-1)%2], $rsize, _128KB-$rsize) : '')
> . substr($data[$i%2], 0, $rsize); #Last 128KB (up to 1MB)
> }
>
> $size = 0 unless defined $size;
> if($size > 0) { #Use given size
> $filesize = $size;
> } elsif($md5size < _1MB) { #Already know the size because read it all (note don't do <=)
> $filesize = $md5size;
> } elsif($compresslvl == 0) { #Not compressed, so: size = actual size
> $filesize = -s $name;
> }elsif($size == 0) { # Try to find size from attrib file
> $filesize = get_attrib_value($name, "size");
> if(!defined($filesize)) {
> warn "Can't read size of $name from attrib file so calculating manually\n";
> }
> }
> if(!defined($filesize)) { #No choice but continue reading to find size
> $filesize = $md5size;
> while (($rsize = $fh->read(\($data[0]), _128KB)) > 0) {
> $filesize +=$rsize;
> }
> }
> $fh->close();
>
> $md5->reset();
> $md5->add($filesize);
> $md5->add($datafirst);
> $md5->add($datalast) if defined($datalast);
> return ($md5->hexdigest, $filesize);
> }
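[The digest construction (file size added first, then the sampled data) can be shown standalone. Incremental add() calls hash the concatenation, so the size acts as a prefix on the data; the values here are illustrative only:]

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# zFile2MD5 feeds the file size, then the sampled data, into one digest.
my $md5 = Digest::MD5->new;
$md5->add(5);          # file size first
$md5->add("hello");    # then the (first chunk of) uncompressed data
my $pool_digest = $md5->hexdigest;

# Incremental adds are equivalent to hashing the concatenated string:
print $pool_digest eq md5_hex("5hello") ? "equivalent\n" : "different\n";
```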
>
> #Compute md5sum of the full data contents of a file.
> #If the file is compressed, calculate the md5sum of the inflated
> #version (using the zlib routines to uncompress the stream). Note this
> #gives the md5sum of the FULL file -- not just the partial file md5sum
> #above.
> sub zFile2FullMD5
> {
> my($bpc, $md5, $name, $compresslvl) = @_;
>
> my $fh;
> my $data;
>
> $compresslvl = $Conf{CompressLevel} unless defined $compresslvl;
> unless (defined ($fh = BackupPC::FileZIO->open($name, 0, $compresslvl))) {
> printerr "Can't open $name\n";
> return -1;
> }
>
> $md5->reset();
> while ($fh->read(\$data, 65536) > 0) {
> $md5->add($data);
> }
>
> return $md5->hexdigest;
> }
>
> # Like MakeFileLink but for existing files where we don't have the
> # digest available, so compute the digest and then call MakeFileLink.
> # For each file, check if the file exists in $bpc->{TopDir}/pool.
> # If so, remove the file and make a hardlink to the file in
> # the pool. Otherwise, if the newFile flag is set, make a
> # hardlink in the pool to the new file.
> #
> # Returns 0 if a link should be made to a new file (ie: when the file
> # is a new file but the newFile flag is 0).
> # Returns 1 if a link to an existing file is made,
> # Returns 2 if a link to a new file is made (only if $newFile is set)
> # Returns negative on error.
> sub zMakeFileLink
> {
> my($bpc, $md5, $name, $newFile, $compress) = @_;
>
> $compress = $Conf{CompressLevel} unless defined $compress;
>
> my ($d,$ret) = zFile2MD5($bpc, $md5, $name, 0, $compress);
> return -5 if $ret < 0;
> $bpc->MakeFileLink($name, $d, $newFile, $compress);
> }
>
>
> # Use the following to create a new file link ($copy) that links to
> # the same pool file as the original ($orig) file does (or
> # should). i.e. this doesn't assume that $orig is properly linked to
> # the pool. This is computationally more costly than just making the
> # link, but will avoid further corruption whereby you get archive
> # files with multiple links to each other but no link to pool.
>
> # First, check if $orig links to the pool and if not create a link
> # via MakeFileLink. Don't create a new pool entry if newFile is zero.
> # If a link either already exists from the original to the pool or if
> # one was successfully created, then simply link $copy to the same
> # pool entry. Otherwise, just copy (don't link) $orig to $copy
> # and leave it unlinked to the pool.
>
> # Note that checking whether $orig is linked to the pool is
> # cheaper than running MakeFileLink since we only need the md5sum
> # checksum.
> # Note we assume that the attrib entry for $orig (if it exists) is
> # correct, since we use that as a shortcut to determine the filesize
> # (via zFile2MD5)
> # Returns 1 if a link to an existing file is made,
> # Returns 2 if a link to a new file is made (only if $newFile is set)
> # Returns 0 if file was copied (either because MakeFileLink failed or
> # because newFile=0 and no existing pool match)
> # Returns negative on error.
> sub zCopyFileLink
> {
> my ($bpc, $orig, $copy, $newFile, $compress) = @_;
> my $ret=1;
> $compress = $Conf{CompressLevel} unless defined $compress;
> my $md5 = Digest::MD5->new;
> my ($md5sum, $md5ret) = zFile2MD5($bpc, $md5, $orig, 0, $compress);
>
> #If $orig is already properly linked to the pool (or linkable to pool after running
> #MakeFileLink on $orig) and HardLinkMax is not exceeded, then just link to $orig.
> if($md5ret > 0) { #If valid md5sum and non-zero length file (so should be linked to pool)...
> if((GetPoolLinkMD5($bpc, $orig, $md5sum, $compress, 0) == 1 || #If pool link already exists
> ($ret = $bpc->MakeFileLink($orig, $md5sum, $newFile, $compress)) > 0) #or creatable by MakeFileLink
> && (stat($orig))[3] < $bpc->{Conf}{HardLinkMax}) { # AND (still) less than max links
> return $ret if link($orig, $copy); # Then link from copy to orig
> }
> }
> if(copy($orig, $copy) == 1) { #Otherwise first copy file then try to link the copy to pool
> if($md5ret > 0 && ($ret = $bpc->MakeFileLink($copy, $md5sum, $newFile, $compress)) > 0) {
> return 2; #Link is to a new copy
> }
> printerr "Failed to link copied file to pool: $copy\n";
> return 0; #Copy but not linked
> }
> die "Failed to link or copy file: $copy\n";
> return -1;
> }
>
> # Copy $source to $target, recursing down directory trees as
> # needed. The 4th argument, if non-zero, means (for files) use
> # zCopyFileLink to make sure that everything is linked properly to the
> # pool; otherwise, just do a simple link. The 5th argument, $force,
> # will erase a target directory if already present; otherwise an error
> # is signalled. The final 6th argument is the CompressionLevel, which
> # can be left out and will then be calculated from
> # $bpc->Conf{CompressLevel}.
>
> # Note the recursion is set up so that the return is negative if
> # error, positive if consistent with a valid pool link (i.e. zero
> # length file or directories are consistent too), and zero if
> # successful but not consistent with a valid pool. The overall result
> # is -1 if there is any error and otherwise the AND'ed result of the
> # operations. That means if the overall result is positive then the
> # whole tree is successfully linked to the pool, so the next time
> # around we can use a simple linking (i.e. $linkcheck=0). Note $source
> # and $target must be full paths from root. Note: if files are not
> # compressed properly then you won't be able to link them to pool.
> sub link_recursively_topool
> {
> my ($bpc, $source, $target, $linkcheck, $force, $compress) = @_;
> my $ret=1;
> die "Error: '$source' doesn't exist" unless -e $source;
> if (-e $target) {
> die "Error can't overwrite: $target (unless 'force' set)\n"
> unless $force;
> die "Error can't remove: $target ($!)\n" unless rmtree($target, 0, 0);
> }
> if (-d $source) {
> die "Error: mkdir failed to create new directory: $target ($!)\n"
> unless jmkdir($target);
> opendir( my $dh, $source) || die "Error: Could not open dir $source ($!)\n";
> foreach my $elem (readdir($dh)) {
> next if $elem =~ /^\.\.?$/; # skip dot files (. and ..)
> my $newsource = "$source/$elem";
> my $newtarget = "$target/$elem";
> my $newret = link_recursively_topool($bpc, $newsource, $newtarget, $linkcheck, $force, $compress);
> if ($newret < 0) { #Error so break out of loop & return
> closedir $dh;
> return -1
> }
> $ret = 0 if $newret == 0; #note $ret stays at 1 only if no elements return -1 or 0
> }
> closedir $dh;
> return $ret;
> }
> elsif ($dryrun) {return 1} # Return before making changes to filesystem
> elsif ( ! -s $source) { # zero size
> copy($source, $target); #No link since zero size
> }
> elsif ($linkcheck) { #Makes sure first that source properly linked to pool
> return(zCopyFileLink($bpc, $source, $target, 1, $compress));
> }
> else {#Otherwise, OK to perform simple link to source
> return (link($source, $target) == 1 ? 1 : -1)
> }
> }
>
> sub get_bakinfo
> {
> my ($bakinfofile, $entry) = @_;
> our %backupInfo = ();
>
> $bakinfofile .= "/backupInfo";
> warn "Can't read $bakinfofile\n" unless -f $bakinfofile;
>
> unless (my $ret = do $bakinfofile) { # Load the backupInfo file
> if ($@) {
> warn "couldn't parse $bakinfofile: $@\n";
> }
> elsif (!defined $ret) {
> warn "couldn't do $bakinfofile: $!\n";
> }
> elsif (! $ret) {
> warn "couldn't run $bakinfofile\n";
> }
> }
> my $value = $backupInfo{$entry};
> warn "$bakinfofile is empty or missing entry for '$entry'\n"
> unless defined $value;
> return $value;
> }
>
> # Note: getattr($attr) =$attr->{files}
> # getattr($attr, $file) =$attr->{files}{$file}
> # getattr($attr, $file, $attribute) =$attr->{files}{$file}{$attribute}
>
> sub getattr
> {
> my($attr, $fileName, $Attribute) = @_;
> return $attr->{files}{$fileName}{$Attribute} if ( defined($Attribute) );
> return $attr->{files}{$fileName} if ( defined($fileName) );
> return $attr->{files};
> }
>
>
> #Reads in the attrib file for directory $_[1] (and optional alternative
> #attrib file name $_[2]) and stores it in the hashref $_[0] passed to
> #the function.
> #Returns -1 and a blank $_[0] hash ref if the attrib file doesn't exist
> #already (not necessarily an error).
> #Dies if the attrib file exists but can't be read in.
> sub read_attrib
> {
> #Note: $_[0] = hash reference to attrib object
> #SO DO NOT USE A LOCAL VARIABLE FOR IT (i.e. don't do my $attr = $_[0])
> $_[0] = BackupPC::Attrib->new({ compress => $Conf{CompressLevel} });
>
> # unless (defined $_[1]) { #JJK: DEBUGGING
> # print "read_attrib: \$_[1] undefined\n";
> # print Dumper @_;
> # }
> return -1 unless -f attrib($_[1], $_[2]);
> #This is not necessarily an error because dir may be empty
>
> $_[0]->read($_[1],$_[2]) or
> die "Error: Cannot read attrib file: " . attrib($_[1],$_[2]) . "\n";
>
> return 1;
> }
>
> #Same as Lib.pm fileNameUnmangle but doesn't require
> #unnecessary '$bpc'
> sub jfileNameUnmangle {
> my($name) = @_;
>
> $name =~ s{/f}{/}g;
> $name =~ s{^f}{};
> $name =~ s{%(..)}{chr(hex($1))}eg;
> return $name;
> }
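[The three substitutions above undo BackupPC's name mangling (each path component gets an 'f' prefix, with special characters %XX-escaped). For example, using the same substitutions on a sample mangled path:]

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same substitutions as jfileNameUnmangle, shown on a sample mangled path.
sub unmangle {
    my ($name) = @_;
    $name =~ s{/f}{/}g;                  # drop the 'f' prefix on each component
    $name =~ s{^f}{};                    # ...and on the first one
    $name =~ s{%(..)}{chr(hex($1))}eg;   # decode %XX escapes (here %2e -> '.')
    return $name;
}

print unmangle("fetc/fresolv%2econf"), "\n";   # prints "etc/resolv.conf"
```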
>
> sub count_file_attribs
> {
> my ($attrib) = @_;
> return( scalar (keys (%{$attrib->get()})));
> }
>
> # Get attrib entry for $fullfilename. The corresponding hash is both returned
> # and also fills the hash reference (optionally) passed via $fileattrib.
> # If attrib file not present, return -1 (which may not be an error)
> # Returns -2 if not a mangled file
> # Dies if error
> sub get_file_attrib
> {
> my ($fullfilename, $fileattrib) = @_;
> $fullfilename =~ m{(.+)/(.+)}; #1=dir; $2=file
> return -2 unless defined $2;
>
> return -1 if read_attrib(my $attr, $1) < 0;
>
> %{$fileattrib} = %{$attr->{files}{jfileNameUnmangle($2)}};
> #Note unmangling removes initial 'f' AND undoes char mapping
> }
>
> # Returns value of attrib $key for $fullfilename (full path)
> # If not a mangled file, or the attrib file is not present, or there is no
> # entry for the specified key for the given file, then return 'undef'
> sub get_attrib_value
> {
> my ($fullfilename, $key) = @_;
> $fullfilename =~ m{(.+)/(.+)}; #1=dir; $2=file
>
> return undef unless defined $2;
> return undef if read_attrib(my $attr, $1) < 0;
> return $attr->{files}{jfileNameUnmangle($2)}{$key};
> #Note this returns undefined if key not present
> #Note unmangling removes initial 'f' AND undoes char mapping
> }
>
> # Returns value of attrib type key for $fullfilename (full path)
> # If attrib file present but filename not an entry, return -1 [not an error if file nonexistent]
> # If no attrib file (but directory exists), return -2 [not an error if directory empty]
> # If directory non-existent, return -3
> # If attrib file present but not readable, return -4 [true error]
> # Note there may be an entry even if file non-existent (e.g. type 10 = delete)
> sub get_attrib_type
> {
> my ($fullfilename) = @_;
> $fullfilename =~ m{(.+)/(.+)}; #1=dir; $2=file
>
> # unless (defined $1) { #JJK: DEBUGGING
> # print "get_attrib_type: \$1 undefined\n";
> # print Dumper @_;
> # }
>
> return -3 unless -d $1;
> return -2 unless -f attrib($1);
> return -4 unless read_attrib(my $attr, $1) >= 0;
> my $type = $attr->{files}{jfileNameUnmangle($2)}{type};
> #Note unmangling removes initial 'f' AND undoes char mapping
> return (defined($type) ? $type : -1);
> }
>
> # Write out $attrib hash to $dir/attrib (or $dir/$attfilename) and
> # link appropriately to pool.
> # Non-zero 4th argument $poollink creates link to pool (using MakeFileLink)
> # after writing the attrib file.
> # 5th argument tells what to do if no files in $attrib
> # (0=return error, 1=delete attrib file and return success)
> # 6th argument is an optional alternative name for the attrib file itself
> # Note does an unlink first since if there are hard links, we don't want
> # to modify them
> # Returns positive if successful, 0 if not
> # Specifically, 1 if linked to existing, 2 if linked to new,
> # 3 if written without linking, 4 if (empty) attrib file deleted
>
> sub write_attrib
> {
> my ($bpc, $attrib, $dir, $poollink, $delempty, $attfilename) = @_;
> die "Error: Cannot write to directory: $dir\n" unless -w $dir;
>
> # unless (defined $dir) { #JJK: DEBUGGING
> # print "write_attrib: \$dir undefined";
> # print Dumper @_;
> # }
>
> my $attfilepath = attrib($dir, $attfilename);
>
> if(count_file_attribs($attrib) == 0 && -e $attfilepath && $delempty) {
> #Delete existing empty attrib file
> die "Error: could not unlink empty attrib file: $attfilepath\n"
> unless(junlink($attfilepath));
> return 4;
> }
>
> #First (safely) write out new attrib file to directory
> my $tempname = mktemp("attrib.XXXXXXXXXXXXXXXX");
> unless($attrib->write($dir, $tempname) == 1 &&
> (! -e $attfilepath || junlink($attfilepath)) &&
> jrename("$dir/$tempname", $attfilepath)) {
> unlink("$dir/$tempname") if -e "$dir/$tempname";
> die "Error: could not write attrib file: $attfilepath\n";
> }
> unlink("$dir/$tempname") if $dryrun;
>
> my $ret=3;
> if ($poollink) {
> my $data = $attrib->writeData;
> my $md5 = Digest::MD5->new;
> my $digest;
> if(($digest = $bpc->Buffer2MD5($md5, length($data), \$data)) eq -1
>    || ($ret = jMakeFileLink($bpc, $attfilepath, $digest, 2,
>                             $Conf{CompressLevel})) <= 0) {
>     printerr "Can't link attrib file to pool: $attfilepath ($ret)\n";
> }
> }
> return $ret;
> }
>
> # Write out $fileattrib for $file (without the mangle) to $dir/$attfilename
> # (or to the default attribute file for $dir if $attfilename is undef)
> # Reads in existing attrib file if pre-existing
> # 4th argument $poollink says whether to write to file (0) or link to
> # pool (using MakeFileLink).
> # Returns positive if successful, 0 if not
> # Specifically, 1 if linked to existing, 2 if linked to new,
> # 3 if written without linking
> sub write_file_attrib
> {
> my ($bpc, $dir, $file, $fileattrib, $poollink, $attfilename) = @_;
> #Note $fileattrib is a hashref
> my $ret=0;
>
> read_attrib(my $attr, $dir, $attfilename); #Read in existing
> #attrib file if it exists
> $ret = write_attrib($bpc, $attr, $dir, $poollink, 0, $attfilename)
> if $attr->set($file, $fileattrib) > 0;
>
> # unless (defined $dir) { #JJK: DEBUGGING
> # print "write_file_attrib: \$dir undefined\n";
> # print Dumper @_;
> # }
>
> die "Error writing '$file' entry to attrib file: "
>     . attrib($dir, $attfilename) . "\n" unless $ret > 0;
> return $ret;
> }
>
> sub attrib
> {
> return (defined($_[1]) ? "$_[0]/$_[1]" : "$_[0]/attrib");
> }
>
> #
> # Given an MD5 digest $d and a compress flag, return the relative
> # path (to TopDir)
> #
> sub MD52RPath
> {
> my($bpc, $d, $compress, $poolDir) = @_;
>
> return if ( $d !~ m{(.)(.)(.)(.*)} );
> $poolDir = ($compress ? "cpool" : "pool")
> if ( !defined($poolDir) );
> return "$poolDir/$1/$2/$3/$1$2$3$4";
> }
>
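> [Editor's note: for illustration, the digest-to-path mapping can be
> exercised standalone. The sketch below inlines the routine without the
> unused $bpc argument -- a simplification, not the library interface.]

```perl
# Standalone sketch of MD52RPath: the first three hex digits of the
# digest become nested directory levels under pool/ or cpool/.
sub md52rpath {
    my ($d, $compress) = @_;
    return if $d !~ m{(.)(.)(.)(.*)};    # need at least 4 chars
    my $poolDir = $compress ? "cpool" : "pool";
    return "$poolDir/$1/$2/$3/$1$2$3$4";
}

print md52rpath("4e1243bd22c66e76c2ba9eddc1f91394", 1), "\n";
# cpool/4/e/1/4e1243bd22c66e76c2ba9eddc1f91394
```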
> # Modified version of MakeFileLink including:
> # 1. Efficiency/clarity improvements
> # 2. Calls GetPoolLink to find candidate link targets.
> # 3. For non-compressed files, uses my improved jcompare comparison algorithm
> # 4. For compressed files, uses zcompare2 which compares only the compressed
> #    data sans first-byte header & potential rsync digest trailer. This allows
> #    for matches even if one file has an rsync digest and the other does not
> # 5. Moves file before unlinking in case the subsequent link fails and needs
> #    to be undone
> # 6. Added 6th input parameter to return pointer to the pool link name
> # 7. Extended meaning of newFile flag:
> #    0 = Don't create new pool file (as before)
> #    1 = Create new pool file IF no other links to source file
> #        (this was the previous behavior for whenever newFile was set)
> #    >1 = Create new pool file EVEN if source file has more than one link
> #         (this will by extension link other things linked to the source
> #         also to the pool -- which means that the pool might not be clean
> #         if it links to things outside of the pc directory -- so
> #         use carefully)
> # 8. Includes 'j' versions of file routines to allow dryrun
> # 9. Added check to see if already in pool and if so returns 3
>
> # For each file, check if the file exists in $bpc->{TopDir}/pool.
> # If so, remove the file and make a hardlink to the file in
> # the pool. Otherwise, if the newFile flag is set, make a
> # hardlink in the pool to the new file.
> #
> # Returns 0 if a link should be made to a new file (ie: when the file
> # is a new file but the newFile flag is 0).
> # JJK: or if newFile == 1 and $name has nlinks > 1
> # Returns 1 if a link to an existing file is made,
> # Returns 2 if a link to a new file is made (only if $newFile is set)
> # Returns 3 if it first finds that the file is already linked to the appropriate pool chain
> # Returns negative on error.
>
> sub jMakeFileLink
> {
> my($bpc, $name, $d, $newFile, $compress, $linkptr) = @_;
>
> my $poollink;
> my $result=GetPoolLinkMD5($bpc, $name, $d, $compress, 1, \$poollink);
> $$linkptr = $poollink if defined($linkptr) && $result > 0;
>
> if($result == 1){ #Already linked to the $d pool chain
> return 3;
> }elsif($result == 2) { #Matches existing, linkable pool file
> my $tempname = mktemp("$name.XXXXXXXXXXXXXXXX");
> return -5 unless jrename($name, $tempname); #Temporarily save
> if(!jlink($poollink, $name)) { #Link pool to source
> jrename($tempname, $name); #Restore if can't link
> return -3;
> }
> junlink($tempname); #Safe to remove the original
> return 1;
> }elsif($result == 3) { #No link or match
> if(defined($newFile) &&
> ($newFile > 1 || ($newFile == 1 && (stat($name))[3] == 1 )))
> {
> $poollink =~ m|(.*)/|;
> jmkpath($1, 0, 0777) unless -d $1 ;
> return -4 unless jlink($name, $poollink);
> return 2;
> } else { #New link should have been made but told not to
> return 0;
> }
> }else {
> return -6; #Error from GetPoolLink call
> }
> }
>
> # GetPoolLink
> # GetPoolLinkMD5
> #Find the pool/cpool file corresponding to file $name.
> #1. First iterates entire chain to see if *same inode* is present. I.e. if
> # already linked to the chain. If so, it returns the first instance.
> # Return = 1 and $Poolpathptr = name of the hard-linked match
> #2. If $compareflg is set, then iterate through again this time looking for
> # file *content* matches (this is much slower).
> # If so, it returns the first instance with Nlinks < HardLinkMax
> # Return = 2 and $Poolpathptr = name of the content match
> #3. Finally, if not linked (and also not matched if $compareflg set)
> # Return=3 and $$poolpathptr = first empty chain
> #Note: Return zero if zero size file
> # Return negative if error.
> #Note: if chain has multiple copies of the file, then it returns the
> #first linked match if present; if none and $compareflag is set, then
> #the first content match
> sub GetPoolLink
> {
> my($bpc, $md5, $name, $compress, $compareflg, $poolpathptr) = @_;
>
> $compress = $bpc->{Conf}{CompressLevel} unless defined $compress;
>
> my ($md5sum, $ret) = defined($compress) && $compress > 0 ?
>     zFile2MD5($bpc, $md5, $name, 0, $compress) :
>     $bpc->File2MD5($md5, $name);
>
> return 0 if $ret == 0; #Zero-sized file
> return -3 unless $ret > 0;
>
> GetPoolLinkMD5($bpc, $name, $md5sum, $compress, $compareflg,
> $poolpathptr);
> }
>
> sub GetPoolLinkMD5
> {
> my($bpc, $name, $md5sum, $compress, $compareflg, $poolpathptr) = @_;
> my($poolbase, $i);
>
> return -1 unless -f $name;
> my $inode = (stat(_))[1]; #Uses stat from -f
> return 0 if (stat(_))[7] == 0; #Zero-size (though shouldn't really
> #happen since md5sum input is not defined for zero-sized files)
>
> $compress = $bpc->{Conf}{CompressLevel} unless defined $compress;
>
> return -2 unless
> defined($poolbase = $bpc->MD52Path($md5sum, $compress));
>
> #1st iteration looking for matching inode
> $$poolpathptr = $poolbase;
> for($i=0; -f $$poolpathptr; $i++) { #Look for linked copy (inode match)
> return 1 if ((stat(_))[1] == $inode);
> $$poolpathptr = $poolbase . '_' . $i; #Iterate
> }
>
> return 3 unless $compareflg; #No inode match
>
> #Optional 2nd iteration looking for matching content
> my $compare = defined($compress) && $compress > 0 ? \&zcompare2 :
> \&jcompare;
> $$poolpathptr = $poolbase;
> for(my $j=0; $j<$i; $j++ ) { #Look for content match
> return 2 if (stat($$poolpathptr))[3] <
> $bpc->{Conf}{HardLinkMax} &&
> !$compare->($name, $$poolpathptr);
> $$poolpathptr = $poolbase . '_' . $j; #Iterate
> }
> # No matching existing pool entry - $$poolpathptr is first empty
> # chain element
> return 3;
> }
>
> #Test if $file is a valid link to the pool tree determined by $compresslvl.
> #Returns the name of the pool file link if it is valid, 0 otherwise.
> #Size is optional and is as described for zFile2MD5
> #Note this is similar to GetPoolLink but faster though less powerful
> sub is_poollink
> {
> my ($bpc, $md5, $file, $size, $compresslvl)= @_;
>
> return 0 unless -f $file;
>
> $compresslvl = $Conf{CompressLevel} unless defined $compresslvl;
> my ($filemd5, $ret) = zFile2MD5($bpc, $md5, $file, $size, $compresslvl);
> return 0 unless $ret > 0;
>
> my $finode = (stat($file))[1];
> my $poolpath = $bpc->MD52Path($filemd5, $compresslvl);
>
> my %poolhash = map {(stat($_))[1] => $_} bsd_glob($poolpath . "_*");
> $poolhash{(stat(_))[1]} = $poolpath if -f $poolpath; #Add poolpath
> #Note we can't add $poolpath to the glob in case it doesn't exist
> #since glob will return a non-existing non-pattern match entry as-is
>
> foreach (keys %poolhash) {
> if($_ == $finode) {
> $poolhash{$_} =~ m|.*/(.*)|;
> return $1;
> }
> }
> return 0;
> }
>
>
> #Convert pool name to constant length string consisting
> #of 32 hex digits for the base plus 6 (left) zero padded digits for
> #the chain suffix (suffixes are incremented by 1 so that no suffix
> #records as 0). Note this accommodates chains up to 10^6 long.
> #Use a number bigger than 6 if you have longer chains
> #Useful if you want to order (e.g., sort) pool file names numerically
> sub poolname2number
> {
> $_[0] =~ m|(.*/)?([^_]*)(_(.*))?|;
> my $suffix = defined($4) ? $4+1 : 0;
> return sprintf("%s%06s", $2, $suffix)
> }
>
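> [Editor's note: a quick illustration of the encoding, as a standalone
> copy of the routine above.]

```perl
# poolname2number maps a pool file name to a fixed-width sortable key:
# the digest base plus a 6-digit suffix field (suffix + 1, so that
# "no suffix" sorts as 000000 ahead of "_0", which becomes 000001).
sub poolname2number {
    $_[0] =~ m|(.*/)?([^_]*)(_(.*))?|;
    my $suffix = defined($4) ? $4 + 1 : 0;
    return sprintf("%s%06s", $2, $suffix);
}

print poolname2number("cpool/a/b/c/abc123"), "\n";    # abc123000000
print poolname2number("cpool/a/b/c/abc123_0"), "\n";  # abc123000001
```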
> # Renumber pool chain holes starting at $firsthole and continue up the chain
> # to fill up to $maxholes holes (which need not be contiguous).
> # If $maxholes is not specified it defaults to 1 (sufficient to cover one
> # deletion -- i.e. hole -- which may be $firsthole itself)
> # If $firsthole exists, it is an error. Use delete_pool_file instead,
> # if you want to delete first before renumbering.
> # Return 1 on success; Negative on failure
> sub renumber_pool_chain
> {
> my ($firsthole, $maxholes) = @_;
>
> $maxholes = 1 unless defined $maxholes;
>
> my ($base, $sufx);
> if($firsthole =~ m|(.*_)([0-9]+)|) {
> $base = $1;
> $sufx = $2;
> }else{
> $base = $firsthole . "_";
> $sufx = -1; #Corresponds to no suffix
> }
>
> my $nextsufx = $sufx+1;
> my $nextfile = $base . $nextsufx;
> while($nextsufx - $sufx <= $maxholes) {
> if(-e $nextfile) { #Non-hole so move/renumber
> if(-e $firsthole || ! jrename($nextfile, $firsthole)) {
> warn "Error: Couldn't renumber pool file: $nextfile --> $firsthole\n";
> return -2;
> }
> # print "Renumbering: $nextfile --> $firsthole\n";
> $firsthole = $base . (++$sufx); #Move hole up the chain
> }
> $nextfile = $base . (++$nextsufx);
> }
> return 1;
> }
>
> # Delete pool file (if it exists) and regardless renumber
> # chain to fill hole left by file. Fill up to $maxholes
> # above $file (including the hole left by $file) where
> # $maxholes defaults to 1.
> # If $file doesn't exist, then it just fills the hole
> # left by $file.
> # Return 1 on success; Negative on failure
> sub delete_pool_file
> {
> my ($file, $maxholes) = @_;
>
> if(-e $file && !junlink($file)) { #Try to delete if exists
> warn "Error: Couldn't unlink pool file: $file\n";
> return -3;
> }
> return(renumber_pool_chain($file,$maxholes)); #Returns -1/-2 on fail
> }
>
> # Run BackupPC_nightly
> sub run_nightly
> {
> my ($bpc) = @_;
> my $err = $bpc->ServerConnect($Conf{ServerHost}, $Conf{ServerPort});
> if ($err) {
> printerr "BackupPC_nightly: can't connect to server ($err)...\n";
> return($err);
> }
> if ((my $reply = $bpc->ServerMesg("BackupPC_nightly run")) eq "ok\n" ) {
> $bpc->ServerMesg("log $0: called for BackupPC_nightly run...");
> print "BackupPC_nightly scheduled to run...\n";
> return 0;
> }
> else {
> printerr "BackupPC_nightly ($reply)...\n";
> return $reply;
> }
> }
>
>
> # Rewrite of compare function (in File::Compare) since it messes up on
> # weird filenames with spaces and things (since it combines the "<"
> # and the filename in a single string). Also streamline the code by
> # getting rid of extra fluff and removing code for comparing text
> # files while we are at it. The basic algorithm remains the classic one
> # for comparing 2 raw files.
> # Returns 0 if equal, 1 if not equal. And negative if error.
> sub jcompare {
> my ($file1,$file2,$size) = @_;
> my ($fh1open, $fh2open, $fh1size);
> my $ret=0;
>
> local (*FH1, *FH2);
> unless (($fh1open = open(FH1, "<", $file1)) &&
> ($fh2open = open(FH2, "<", $file2))) {
> $ret = -1;
> goto compare_return;
> }
> binmode FH1;
> binmode FH2;
> if (($fh1size = -s FH1) != (-s FH2)) {
> $ret=1;
> goto compare_return;
> }
>
> unless (defined($size) && $size > 0) {
> $size = $fh1size;
> $size = LINUX_BLOCK_SIZE if $size < LINUX_BLOCK_SIZE;
> $size = TOO_BIG if $size > TOO_BIG;
> }
>
> my $data1 = my $data2 = '';
> my ($r1,$r2);
> while(defined($r1 = read(FH1,$data1,$size)) && $r1 > 0) {
> unless (defined($r2 = read(FH2,$data2,$r1)) && $r2 == $r1
> && $data2 eq $data1) {
> $ret=1;
> goto compare_return;
> }
> }
> $ret=1 if defined($r2 = read(FH2,$data2,LINUX_BLOCK_SIZE)) && $r2 > 0;
> $ret = -2 if $r1 < 0 || $r2 < 0; # Error on read
>
> compare_return:
> close(FH1) if $fh1open;
> close(FH2) if $fh2open;
> return $ret;
> }
>
> # Version of compare for BackupPC compressed files. Compares inflated
> # (uncompressed) versions of files. The BackupPC::FileZIO subroutine
> # is used to read in the files instead of just raw open & read. Note
> # this version of compare is relatively slow since you must use zlib
> # to decompress the streams that are being compared. Also, there is no
> # shortcut to just first compare the filesize since the size is not
> # known until you finish reading the file.
> sub zcompare {
> my ($file1, $file2, $compress)=@_;
> my ($fh1, $fh2);
> my $ret=0;
>
> $compress = 1 unless defined $compress;
> unless ((defined($fh1 = BackupPC::FileZIO->open($file1, 0, $compress))) &&
>         (defined($fh2 = BackupPC::FileZIO->open($file2, 0, $compress)))) {
> $ret = -1;
> goto zcompare_return;
> }
> my $data1 = my $data2 = '';
> my ($r1, $r2);
> while ( ($r1 = $fh1->read(\$data1, 65536)) > 0 ) {
> unless ((($r2 = $fh2->read(\$data2, $r1)) == $r1)
> && $data1 eq $data2) {
> $ret=1;
> goto zcompare_return;
> }
> }
> $ret = 1 if ($r2 = $fh2->read(\$data2, LINUX_BLOCK_SIZE)) > 0;
> #see if anything is left...
> $ret = -1 if $r1 < 0 || $r2 < 0; # Error on read
>
> zcompare_return:
> $fh1->close() if defined $fh1;
> $fh2->close() if defined $fh2;
> return $ret;
> }
>
> # Second alternative version that combines the raw speed of jcompare
> # with the ability to compare compressed files in zcompare. This
> # routine effectively should compare two compressed files just as fast
> # as the raw jcompare. The compare algorithm strips off the first byte
> # header and the appended rsync checksum digest info and then does a
> # raw compare on the intervening raw compressed data. Since no zlib
> # inflation (decompression) is done it should be faster than the
> # zcompare algorithm which requires inflation. Also since zlib
> # compression is well-defined and lossless for any given compression
> # level and block size, the inflated versions are identical if and
> # only if the deflated (compressed) versions are identical, once the
> # header (first byte) and trailers (rsync digest) are stripped
> # off. Note that only the data between the header and trailers have
> # this unique mapping. Indeed, since BackupPC only adds the checksum
> # the second time a file is needed, it is possible that the compressed
> # value will change with time (even though the underlying data is
> # unchanged) due to differences in the envelope. Note, to some extent
> # this approach assumes that the appended digest info is correct (at
> # least to the extent of properly indicating the block count and hence
> # compressed data size that will be compared)
> sub zcompare2 {
> my ($file1,$file2,$size) = @_;
> my ($fh1, $fh2, $fh1open, $fh2open);
> my $Too_Big = 1024 * 1024 * 2;
> my $ret=0;
>
> unless (($fh1open = open($fh1, "<", $file1)) &&
> ($fh2open = open($fh2, "<", $file2))) {
> $ret = -1;
> goto zcompare2_return;
> }
> binmode $fh1;
> binmode $fh2;
>
> my $fh1size = -s $fh1;
> my $fh2size = -s $fh2;
>
> my $data1 = my $data2 = '';
> unless (read($fh1, $data1, 1) == 1 && # Read first byte
> read($fh2, $data2, 1) == 1) {
> $ret = -1;
> goto zcompare2_return;
> }
> if (vec($data1, 0, 8) == 0xd6 || vec($data1, 0, 8) == 0xd7) {
> return -2 unless ( defined(seek($fh1, -8, 2)) );
> return -3 unless ( read($fh1, $data1, 4) == 4 );
> $fh1size -= 20 * unpack("V", $data1) + 49;
> # [0xb3 separator byte] + 20 * NUM_BLOCKS + DIGEST_FILE(32) + DIGEST_INFO(16)
> # where each of NUM_BLOCKS is a 20 byte block digest consisting of
> # 4 bytes of Adler32 and the full 16 byte (128 bit) MD4 file digest
> # (checksum).
> # DIGEST_FILE is 2 copies of the full 16 byte (128 bit) MD4 digest
> # (one for protocol <=26 and the second for protocol >=27)
> # DIGEST_INFO is metadata consisting of a pack("VVVV") encoding of the
> # block size (should be 2048), the checksum seed, length of the
> # block digest (NUM_BLOCKS), and the magic number (0x5fe3c289).
> # Each is a 4 byte integer.
> }
>
> if (vec($data2, 0, 8) == 0xd6 || vec($data2, 0, 8) == 0xd7) {
> return -2 unless ( defined(seek($fh2, -8, 2)) );
> return -3 unless ( read($fh2, $data2, 4) == 4 );
> $fh2size -= 20 * unpack("V", $data2) + 49;
> }
>
> if ($fh1size != $fh2size) {
> $ret=1;
> goto zcompare2_return;
> }
>
> seek($fh1,1,0) ; #skip first byte
> seek($fh2,1,0) ; #skip first byte
>
>
> my $bytesleft=--$fh1size; #Reduce by one since skipping first byte
> $size = $fh1size unless defined($size) && $size > 0 && $size < $fh1size;
> $size = $Too_Big if $size > $Too_Big;
>
> my ($r1,$r2);
> while(defined($r1 = read($fh1,$data1,$size)) && $r1 > 0) {
>     unless (defined($r2 = read($fh2,$data2,$r1)) && $data2 eq $data1) {
>         #No need to test if $r1 == $r2 since both files should be the
>         #same size; if not, the $data2 eq $data1 test will pick it up
>         $ret=1;
>         goto zcompare2_return;
>     }
> $bytesleft -=$r1;
> $size = $bytesleft if $bytesleft < $size;
> }
> $ret = 1 if defined($r2 = read($fh2,$data2,$size)) && $r2 > 0;
> #Note: The above line should rarely be executed since both files are the same size
> $ret = -2 if $r1 < 0 || $r2 < 0; # Error on read
>
> zcompare2_return:
> close($fh1) if $fh1open;
> close($fh2) if $fh2open;
> return $ret;
> }
>
> #Routine to remove files (unlink) or directories (rmtree)
> # $fullfilename is full path name to file or directory
> # Returns the count from unlink/rmtree; the names of the deleted files
> # are captured in the scalar ref $dirfileref if defined
> sub delete_files
> {
> my ($fullfilename, $dirfileref) = @_;
> my $listdeleted = defined($dirfileref);
> my $ret = -1;
> my $output;
> die "Error: '$fullfilename' doesn't exist or not writeable\n"
> unless -w $fullfilename;
> if(-f $fullfilename) {
> $ret = junlink($fullfilename);
> }
> elsif(-d $fullfilename) {
> $$dirfileref = "" if $listdeleted;
> #redirect standard out to capture the output of rmtree
> open(local *STDOUT, '>', $dirfileref) if $listdeleted;
> $ret = jrmtree($fullfilename, $listdeleted, 1);
> }
> return $ret;
> }
>
> #Touch command (modified version of Unix notion)
> #First argument is the (full)filepath.
> #Zero-length file is created if non-existent.
> #If existent and second argument zero or not defined then access
> # and modification times are updated.
> #Returns 1 on success, -1 on error;
> sub jtouch
> {
> return 1 if $dryrun;
> if(! -e $_[0]) { #Create if non-existent
> return -1 unless defined(sysopen(my $fh, $_[0], O_CREAT));
> close $fh;
> }elsif(!$_[1]) {
> my $time = time();
> utime($time, $time, $_[0]);
> }
> return 1;
> }
>
> #Simple wrappers to protect filesystem when just doing dry runs
> sub jchmod
> {
> return 1 if $dryrun;
> chmod @_;
> }
>
> sub jchown
> {
> return 1 if $dryrun;
> chown @_;
> }
>
> sub jcopy
> {
> return 1 if $dryrun;
> copy @_;
> }
>
> sub jlink
> {
> return 1 if $dryrun;
> link $_[0], $_[1];
> }
>
> sub junlink
> {
> return 1 if $dryrun;
> unlink @_;
> }
>
> sub jmkdir
> {
> return 1 if $dryrun;
> mkdir @_;
> }
>
> sub jmkpath
> {
> return 1 if $dryrun;
> mkpath @_;
> }
>
> sub jmake_path
> {
> return 1 if $dryrun;
> mkpath @_;
> }
>
> sub jrename
> {
> return 1 if $dryrun;
> rename $_[0], $_[1];
> }
>
> sub jrmtree
> {
> return 1 if $dryrun;
> rmtree @_;
> }
>
> sub jutime
> {
> return 1 if $dryrun;
> utime @_;
> }
>
> 1;
>
>
> ------------------------------------------------------------------------------
> Everyone hates slow websites. So do we.
> Make your web apps faster with AppDynamics
> Download AppDynamics Lite for free today:
> http://p.sf.net/sfu/appdyn_d2d_feb
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users AT lists.sourceforge DOT net
> List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki: http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/
>
--
Tyler J. Wagner
Chief Technology Officer, Talia Ltd
tyler AT talia DOT net, http://www.talia.net
+44 790 863 1442, +44 203 3181 503
--
"There is no 'eastern' and 'western' medicine. There's 'medicine' and
then there's 'stuff that has not been proven to work.'"
-- Maki Naro, "The Red Flags of Quackery, v2.0", Sci-ence.org