I have seen several people asking about how to delete files from their
backups.
I am contributing this *beta* perl script for input as a potential
solution.
It allows you to remove files and or directories cleanly from one or
more backups (including adjusting the attrib file entry properly - I
hope, see below).
Options include:
- Specify: Host (or wildcard)
- Specify: Backup number (or wildcard)
- Specify: Share (or wildcard/blank)
- Allow removal of directories (like -r flag for rm)
- Optionally use non-mangled path names
- Force removal of entries in subsequent incrementals, even if
type changes from file to directory or vica-versa (see below)
- Quiet (don't show deletions)
- Trial run (print output but don't actually delete
files/directories and don't write to attrib files
- Debug mode (extra printing and for now implies trial run)
I have mostly been running this in debug mode but would like to get
input from others on both the approach (see below) and the code (see
below). Note I have only a casual knowledge of Perl so my code doesn't
have the beauty of a true monk. It also uses one external shell
command (find) since with Perl find, it is a minor PITA to implement
maxdepth (and I was lazy).
Also, I would love for people to try it (keep it on debug mode or use
it on scratch backups until you are comfortable that it doesn't have
"bad" side effects). [Note I have it hard-coded to dry run for
now. Just search for the line with "JJK" in it to comment out]
Here is the "logic" flow for the approach.
- If file/directory exists in selected backup, then delete the
file from this backup.
- Next we need to figure out what to do with the attrib entry for
the deleted file/directory. If it is a directory, then just
delete the entry from the attrib file of the parent directory.
If we deleted a file, then we need to look back at its
antecedent backups to determine what to do. If the file exists
in a previous backup without an intervening "type=10" (delete)
entry in the attrib database or without an intervening
directory with the same name, then we need to change the
current file's attrib entry to type=10 (deleted file) to signal
this is the first time file should be treated as
deleted. Otherwise just delete the file attribute entry from
the attrib file
- Finally, delete all versions of the file/directory in incremental
backups that are its children unless the type changes from file
to directory (or vica-versa). This is done primarily for
caution to prevent unwittingly causing a single file removal to
trigger a large directory delete (and it can be overriden).
In this case, we are safe to just delete the attrib entry for
any files/directories that we delete.
------------------------------------------------------------------------------------------
#!/usr/bin/perl
use strict;
use File::Path;
use lib "/usr/share/BackupPC/lib";
use BackupPC::Lib;
use BackupPC::Attrib qw(:all);
use BackupPC::FileZIO;
use Getopt::Std;
use Data::Dumper; #Just used for debugging...
my $DEBUG=''; #Overrides debug option if set to non-empty ''
die("BackupPC::Lib->new failed\n") if ( !(my $bpc = BackupPC::Lib->new) );
my $TopDir = $bpc->TopDir();
(my $root = "$TopDir/pc") =~ s|//*|/|g;
my %Conf = $bpc->Conf();
my $attrib = BackupPC::Attrib->new({ compress => $Conf{CompressLevel} });
my %opts;
if ( !getopts("h:n:s:rmfqtd:", \%opts) || @ARGV < 1 ) {
print STDERR <<EOF;
usage: $0 [options] files/directories...
Required options:
-h host Host (or - for all) from which path is offset
-n backupNum Full backup number (or - for all) from which path is offset
WARNING: this will also delete the file/directory from all
subsequent incrementals
Optional options:
-s shareName Share name (or - for all) from which path is offset
-r Allow directories to be removed too
-m Paths are unmangled (i.e. apply mangle to paths)
-f Force children to be removed even if changes from file
to directory (or vica-versa)
-q Don\'t show deletions
-t Trial run -- do everything but deletions
-d level Turn on debug level (if undefined in code)
EOF
exit(1);
}
if ( $opts{h} !~ /^([\w\.\s-]+)$/
|| $opts{h} =~ m{(^|/)\.\.(/|$)} ) {
print(STDERR "$0: bad host name '$opts{h}'\n");
exit(1);
}
my $host = my $hostregex = $opts{h};
if ( $opts{n} !~ /^(-|\d+)$/ ) {
print(STDERR "$0: bad dump number '$opts{n}'\n");
exit(1);
}
my $num = my $numregex = $opts{n};
my $share = my $shareregex = $opts{s};
my $mangle = $opts{m};
my $force = $opts{f};
my $rmdir = $opts{r};
my $quiet = $opts{q};
my $dryrun = $opts{t};
($DEBUG eq '') && ($DEBUG=$opts{d});
#$DEBUG && ($dryrun=1); #Uncomment if you want DEBUG to imply dry run
$dryrun=1; #JJK: Temporarily set to always dry-run (paranoia)
($hostregex eq '-' ) && ($hostregex = "[^/]+");
($numregex eq '-' ) && ($numregex = "[0-9]+");
if ($shareregex eq '-' || $shareregex eq '') {$shareregex = "f[^/]+";}
(my $findpath = "$hostregex/$numregex/$shareregex") =~ s|//*|/|g;
my @branches = `find $root -maxdepth 3 -regex \"$root/$findpath\"`;
chop(@branches); #Note you could do this in native perl using preprocess but
it is messy
my @Ante = my @Post =();
if ($num ne '-') {
if ($host eq '-') {
die "Error: wildcard for host requires wilcard for backupNum\n";
}
else {
(-d "$root/$host/$num") || die "Error: Backup $num on host
'$host' doesn't exist\n";
find_backup_heirarchy("$root/$host", $num, \@Ante, \@Post);
#These are the backups with lower and higher incremental
numbers (starting at 0).
#@Ante is in descending order from $backupNum
#@Post is in ascending order from $backupNum
$DEBUG > 1 && (print "ANTE[$num]: " . join(" ", @Ante) ."\n");
$DEBUG > 1 && (print "POST[$num]: " . join(" ", @Post) ."\n\n");
}
}
my @filelist=();
my $warnflg=0;
while (@branches) {
my @files = @ARGV;
while (@files) { #Start to delete each file/directory (and adjust
antecedants & descendants)
my $delflag=0;
$mangle && ($files[0] = $bpc->fileNameMangle($files[0]));
$DEBUG > 2 && $mangle && print "Mangle: $files[0]\n";
(my $file ="$branches[0]/$files[0]") =~ m|(.+)/(.+)|;
my $dir = $1;
shift @files;
$DEBUG > 2 && print "LOOKING AT: $file\n";
( -d $dir ) || next; #Nothing to do since parent directory
doesn't exist
if (-d $file) {
unless ($rmdir) {
$warnflg || warn "Can't delete directories
unless set -r flag\n";
$warnflg=1;
next;
}
}
else { # If not a dir, we first need look at earlier backups to
see if a file (not dir) exists
#with same name without an intervening same name
directory or deleted file attrib.
#If so instead of deleting the attrib entry for the
current file, we will set the type
#to 10 (delete) and blank the other (no longer
meaningful) entries.
my @ante =@Ante;
while(@ante) { #Note not executed if we set the
wildcard for numBackups since then @ante=()
(my $prevback = $file) =~
s|^($root/$hostregex)/([^/]*)/|$1/$ante[0]/|;
$DEBUG >1 && print "ANTE[$ante[0]] $prevback [" . (-f $prevback ? "F" : (-d
$prevback ? "D":"-")) . "]\n";
if (-f $prevback ) { #File exists in previous
backup so better set delete attrib
$delflag = 1;
last;
}
(-d $prevback) && last; #Directory exists with
same name in prev backup so no need to adjust attrib
( get_attrib($file,"type") == 10) && last;
#Already marked as deleted
shift @ante; # Continue looping at look at
earlier antecdents...
}
}
# Next, delete and/or adjust delete attrib for current file we
are looking at..
$DEBUG >1 && print "CURRENT[$num] $file [" . (-f $file ? "F" : (-d $file ?
"D":"-")) . "][" . ($delflag ? "DELATRIB":"-") ."]\n";
if (-r $file ) { #If file/dir exists, delete and then remove
attrib or mark delete attrib
$DEBUG && print" delete_file_and_attrib($file, $rmdir, $delflag, !$quiet)\n";
delete_file_and_attrib($file, $rmdir, $delflag,
!$quiet);
}
elsif ($delflag ==1) { #If file doesn't exist, mark removal by
setting delete attrib if $delflag
$DEBUG && print" set_delete_attrib($file, !$quiet)\n";
set_delete_attrib($file, !$quiet);
}
#Finally, delete file/directory from later incremental backups
(applies to both files & dirs)
my @post = @Post;
while (@post) { #Again note if wildcard set for numBackups then
@post=() so this is skipped
#and files will be deleted through main loop
(my $nextback = $file) =~
s|^($root/$hostregex)/([^/]*)/|$1/$post[0]/|;
$DEBUG >1 && print "POST[$post[0]] $nextback [" . (-f $nextback ? "F" : (-d
$nextback ? "D":"-")) . "]\n";
#$DEBUG >1 && print "POST:$post[0]: $nextback [" . (-r $nextback ? "R":"-") .
"]\n";
if (-r $nextback ) { #Exists in successor branch
unless ($force || (-f $file) == -f ($nextback))
{
#File subsequently replaced by directory or vica versa - be
cautious
warn "Warning: Not removing '$nextback'
(and its children) since file type changes\n";
last;
}
else {
$DEBUG && print" delete_file_and_attrib($nextback, $rmdir, 0, !$quiet)\n";
delete_file_and_attrib($nextback,
$rmdir, 0, !$quiet);
}
}
shift @post; #Continue looping and looking at further
descedants...
}
}
shift @branches;
}
exit;
sub get_attrib
{
my ($fullfilename, $attribname) = @_;
$fullfilename =~ m{(.+)/(.+)};
my $dir=$1;
my $file=$2;
(-f "$dir/attrib" ) || return 0; #This is not necessarily an error
because file may be empty
($attrib->read($dir, "attrib")) || die "Error: Cannot read attrib file
for '$dir'\n";
$attrib->{files}{$file}{type};
}
# Routine to delete file and adjust corresponding attribute file
# $fullfilename is full path name to file or directory
# $rmdirflg if 1, allows for removal of directories too
# $delttrib if 1 then changes attrib to deleted file type rather than
# deleting attribute (only applies to files)
# $listdeletes if 1 then lists deleted files and/or modified attrib entries
sub delete_file_and_attrib
{
my ($fullfilename, $rmdirflg, $delattrib, $listdeletes) = @_;
$fullfilename =~ m|(.*)/f(.*)|;
my $isdir = (-d $fullfilename) ? 1 : 0;
my $dir = $1;
my $file = $2;
($isdir && $rmdirflg ne "1") && die "Error: can't delete directories
unless rmdirflg set\n";
( -w $fullfilename) || die "Error: '$fullfilename' doesn't exist or not
writeable\n";
( -w "$dir/attrib") || die "Error: Cannot write to attrib file for
'$dir'\n";
($attrib->read($dir, "attrib")) || die "Error: Cannot read attrib file
for '$dir'\n";
(defined ($attrib->get($file))) || die "Error: '$file' not found in
attrib file\n";
if (remove($fullfilename, $listdeletes) > 0 ) { # Remove file/directory
if ($delattrib == 1 && !$isdir) {
$attrib->set($file, { # Set file to deleted type
type => 10,
mode => 0,
uid => 0,
gid => 0,
size => 0,
mtime => 0,
});
$DEBUG >1 && print "Setting type=10 (deleted file) attrib for $fullfilename\n";
}
else { # Remove attrib entry for file
$attrib->delete($file);
$DEBUG >1 && print "Removing attrib entry for: f$file\n";
}
$dryrun || ($attrib->write($dir, "attrib")) || die "Error
writing to attribute file for $dir\n";
return 0;
}
return 1;
}
sub remove
{
$_ = $_[0];
my $ret=0;
if (-f ) {
($dryrun ? $ret = 1 : ($ret = unlink));
($ret == 1) && $_[1] && print "unlink $_\n";
}
elsif (-d ) {
if($dryrun) {
print "deleting directory tree starting at: $_ \n";
$ret=1;
}
else {
$ret = rmtree($_, $_[1], 1);
}
}
return $ret;
}
sub set_delete_attrib
{
my ($fullfilename, $listdeletes) = @_;
$fullfilename =~ m|(.*)/f(.*)|;
my $dir = $1;
my $file = $2;
if (-e "$dir/attrib") {
( -w "$dir/attrib") || die "Error: Cannot write to attrib file
for '$dir'\n";
($attrib->read($dir, "attrib")) || die "Error: Cannot read
attrib file for '$dir'\n";
}
else { #Possible it doesn't exist if directory is empty (which occurs
frequently with incrementals)
$attrib = BackupPC::Attrib->new({ compress =>
$Conf{CompressLevel} });
}
$attrib->set($file, { # Set file to deleted type
type => 10,
mode => 0,
uid => 0,
gid => 0,
size => 0,
mtime => 0,
});
$DEBUG > 3 && print Dumper($attrib->get($file));
$dryrun || ($attrib->write($dir, "attrib")) || die "Error writing to
attribute file for $dir\n";
$listdeletes && print "Setting type=10 (deleted file) attrib for
$fullfilename\n";
return 0;
}
sub find_backup_heirarchy
{
my ($path, $baknum, $antecedants, $descendants) = @_;
my %backupInfo;
$path =~ s|//*|/|g;
my @baks = `find $path -maxdepth 1 -regex \"$path/[0-9]+\" -printf
"%f\n"`;
chop(@baks);
my @before = my @after = ();
while (@baks) {
if ($baks[0] <= $baknum) {
push (@before, $baks[0]);
}
else {
push (@after, $baks[0]);
}
shift @baks;
}
@before = reverse(sort numerically @before);
shift(@before); # remove $baknum from beginning;
@after = sort numerically @after;
my $oldlevel = my $baklevel = get_baklevel("$path/$baknum");
while(@before) {
my $newlevel = get_baklevel("$path/$before[0]");
if ($newlevel < $oldlevel) {
push(@$antecedants, $before[0]);
($newlevel == 0 ) && last;
$oldlevel = $newlevel;
}
shift @before;
}
while(@after) {
my $newlevel = get_baklevel("$path/$after[0]");
($newlevel == $baklevel ) && last;
if ($newlevel > $baklevel) {
push(@$descendants, $after[0]);
}
shift @after;
}
}
sub numerically {$a <=> $b}
sub get_baklevel {
our %backupInfo = ();
my $bakinfo = "$_[0]/backupInfo";
(-f $bakinfo ) || warn "Can't read $bakinfo\n";
unless (my $ret = do $bakinfo) {
warn "couldn't parse $bakinfo: $@" if $@;
warn "couldn't do $bakinfo: $!" unless defined $ret;
print "couldn't run $bakinfo" unless $ret;
}
if ( !keys(%backupInfo) || !defined($backupInfo{num}) ) {
warn "$bakinfo is empty\n";
}
return $backupInfo{level};
}
-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
|