On Thu, Oct 14, 2004 at 09:27:17AM -0700, Paul Schmidt wrote:
> Hello,
>
> Since I have a filesystem that is larger than my 40GB tapes, I
> use the gnutar exclude list feature to back it up. Since this
> method is somewhat error-prone (it's easy to forget things or
> exclude too much), I was wondering if anyone had a script to show
> me all of the files on my filesystem that are NOT covered by my
> disklist file entries.
>
> The large data I am backing up is on a partition that also has smaller
> data directories on it too. The large data is all under a common
> path, and is broken down into subdirectories which themselves DO fit
> on a tape, such as:
>
> /somepath/
> smalldir/
> othersmalldir/
> largedir/
> A/
> bunch of files that all fit on a tape
> B/
> bunch of files that all fit on a tape
> ...
> anothersmalldir/
>
>
> The way I have this set up is /somepath/ has a disklist entry, with an
> exclude file that specifies ./largedir/*. This gathers all of the
> small data dirs at once.
>
> Then, each directory under largedir has its own disklist entry, such
> as /somepath/largedir/A for example. This is the error prone part.
>
> The A, B, etc. directories don't change TOO often, but they're not
> static. I am looking for a tool to make it easier to verify that all
> the necessary disklist entries have been made and that none of the
> important data (anywhere on my server) has been accidentally left out.
>
> Any suggestions for how I can do my configuration better that might
> prevent some of these issues would be appreciated as well.
>
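For reference, a setup like the one you describe might look something
like this in disklist/amanda.conf terms (hostname, dumptype names, and
the exclude-file path below are all made up; adjust to your config):

```
# disklist: one DLE for the small stuff, one per largedir subdir
myhost  /somepath             somepath-excl-tar
myhost  /somepath/largedir/A  user-tar
myhost  /somepath/largedir/B  user-tar

# amanda.conf: dumptype carrying the exclude list for the parent DLE
define dumptype somepath-excl-tar {
    user-tar
    exclude list "/usr/local/etc/amanda/exclude.somepath"
}

# /usr/local/etc/amanda/exclude.somepath contains:
./largedir/*
```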
YMMV, but I think this would work; I'm assuming you are indexing.
Compare the files listed in the index(es) with a find on the
parent (highest level) DLE directory. For example, suppose
you are trying to back up /somepath and everything under it
in several DLEs. Here is a plan for a shell script.
1) Create a temporary file of the entire tree

     cd /somepath
     find . -xdev > /tmp/some_current   # -xdev keeps find from crossing filesystems
2) Create a temporary file of all the files in the indexes

     cd <your index directory>
     # note the names of the most recent level 0 of each
     # dle of interest, probably _somepath*/*_0.gz.
     # this could be automated - I think - with something like
     cp /dev/null /tmp/some_index   # create or empty the file first
     for dir in _somepath*
     do
         # the last name in sorted order is the most recent level 0
         ls "$dir"/*_0.gz | tail -1
     done |
     while read idxfile
     do
         # uncompress and combine each of the above into a single
         # temporary file; -c writes to stdout and keeps the .gz intact
         gzip -dc "$idxfile" >> /tmp/some_index
     done
3) Some editing, possibly with sed, will be needed to make the
   find output and the index data match: index paths begin with "/"
   and directories end with a "/", while find prints "./path" with
   no trailing slash. - I think - this might work (changing the
   last lines of the automation above):

     gzip -dc "$idxfile"
     done |
     sed -e 's,^,.,' -e 's,/$,,' > /tmp/some_index
4) Sort the two temporary files

     sort -o /tmp/some_current /tmp/some_current
     sort -o /tmp/some_index /tmp/some_index
5) Use comm to determine what is missing or added

     # files in both lists
     comm -12 /tmp/some_current /tmp/some_index
     # files in the find output only (i.e. not covered by any DLE)
     comm -23 /tmp/some_current /tmp/some_index
     # files in the indexes only
     comm -13 /tmp/some_current /tmp/some_index
HTH
jon
--
Jon H. LaBadie jon AT jgcomp DOT com
JG Computing
4455 Province Line Road (609) 252-0159
Princeton, NJ 08540-4322 (609) 683-7220 (fax)