Amanda-Users

Re: Amanda, separable but related problems, cross architecture.

2009-07-03 15:37:22
From: Paul Bijnens <paul.bijnens AT xplanation DOT com>
To: Brian Cuttler <brian AT wadsworth DOT org>, amanda-users AT amanda DOT org
Date: Fri, 03 Jul 2009 21:22:59 +0200
On 07/03/2009 05:49 PM, Brian Cuttler wrote:
I am performing dumps from Solaris 10 x86, using amanda 2.6.1
to an SL24/LTO4.

The following is chronological; some of the issues are separable.

The particular client with the issue is a Mac server with over
250 Gig of data.

The dumps were taking a _lot_ of time, so I tried to divide the DLE
into 2 separate DLEs. One of those was still large, so I tried to
divide it into 2 again.

Rather than dumping /, I was dumping /Users and /trel; trel was _large_,
over 250 Gig by itself.

trel:/trel cssadmin$ ls
Active Staff Folders    Funded Programs         Quality Assurance
Applications            Graphpad                References
Archived Studies        Images                  Research Studies
Archives                LINC                    Temporary Items
DRC+ PE07 Blood         Lab Operations          Trace Elements
DRCII PE10 Urine        Lead Poisoning          Trash
DRCII PE12 Research     NYS PT Program
External PT

I attempted the following in the disklist:

trel   /Users       comp-user-tar
#trel   /trel        user-tar
trel   /trelAM /trel   {
        comp-user-tar
        exclude "[N-Z]*"
        }

trel   /trelNZ /trel   {
        comp-user-tar
        exclude "[A-M]*"
        }

Which naturally, since these were 'new' DLEs, attempted level 0 dumps
on both.

However the combined size of the two new DLEs was not the expected 250 Gig
but only about 50 Gig.


Those exclude statements are not doing what you expect.

There is an asymmetry in Amanda's use of include and
exclude, due to how gtar implements the two features.

For gnutar, the arguments for "exclude" are glob patterns
that are applied to each file *part* (basename).

Assuming a file tree like:

./
./References/
./References/Zdir/
./References/Adir/
./References/Adir/Zfile
./Archives/
./Archives/Ydir/
./Archives/Bdir/

then using gnutar like this shows the files which will be
included:

  $ tar -cvf /dev/null --exclude="A*" .
  ./
  ./References/
  ./References/Zdir/

You noticed? Not only "Archives", but also the subfolder "Adir" matches
the exclude pattern (on the basename of the pathname) and is omitted.

But anchoring the pattern to the start of the path works, just like
in the shell:

  $ tar -cvf /dev/null --exclude="./A*" .
  ./
  ./References/
  ./References/Zdir/
  ./References/Adir/
  ./References/Adir/Zfile

So in your exclude scheme you exclude ALL the files and folders that
start with an uppercase "A-M" in one case and "N-Z" in the other,
at every level of the tree. That means that each time you will miss
files that happen to be in a subfolder of the other class!
Consider a folder like "References/Adir": its contents will be
omitted both times. When excluding "[A-M]*" the folder Adir itself
is omitted, and when excluding "[N-Z]*" "References" is excluded
as a whole.
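
The gap can be reproduced with a small Python sketch. It does not call
gnutar; is_excluded() is a simplified model, written for this illustration,
of an unanchored exclude pattern that may match any run of whole path
components. The tree is the example tree from above, and the function
and variable names are made up for this sketch.

```python
from fnmatch import fnmatchcase

# Example tree from the message, relative to the top of the DLE.
TREE = [
    ".",
    "./References",
    "./References/Zdir",
    "./References/Adir",
    "./References/Adir/Zfile",
    "./Archives",
    "./Archives/Ydir",
    "./Archives/Bdir",
]


def is_excluded(path, pattern):
    """Simplified model (an assumption, not gnutar's code): an unanchored
    exclude pattern may match any run of whole path components."""
    parts = path.split("/")
    for start in range(len(parts)):
        for end in range(start + 1, len(parts) + 1):
            if fnmatchcase("/".join(parts[start:end]), pattern):
                return True
    return False


def kept(pattern):
    """Paths that survive a single exclude pattern."""
    return {p for p in TREE if not is_excluded(p, pattern)}


def missed(exclude_for_am, exclude_for_nz):
    """Paths that end up in neither of the two DLEs."""
    return sorted(set(TREE) - kept(exclude_for_am) - kept(exclude_for_nz))


# Unanchored patterns, as in the original disklist.
print(missed("[N-Z]*", "[A-M]*"))
# ['./Archives/Ydir', './References/Adir', './References/Adir/Zfile']

# Patterns anchored with "./" only look at the top level.
print(missed("./[N-Z]*", "./[A-M]*"))
# []
```

With the unanchored patterns three entries of this tiny tree fall into
neither dump; on a real 250 Gig tree the same effect adds up quickly.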

The implementation of "include" in gnutar is completely different.
First gnutar does NOT consider the arguments as globs. Actually
the "include" is just like naming multiple toplevel folders (like the "."
in my example above).
Having a glob on the other hand is a nice feature, so Amanda actually
expands the arguments as globs (limited to only 1 level!) and hands
the expanded list to gnutar.
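
A rough Python sketch of that one-level expansion, assuming it behaves
like an ordinary shell-style glob run in the top-level folder. This is
not Amanda's code; expand_include() is a name invented here, and the
scratch directory only recreates the example tree from above.

```python
import glob
import os
import tempfile


def expand_include(top, pattern):
    """Expand one include pattern against the entries directly under `top`.

    A "*" in a glob never crosses a "/", so "./[A-M]*" only ever sees
    top-level names.
    """
    hits = glob.glob(os.path.join(glob.escape(top), pattern))
    return sorted("./" + os.path.relpath(hit, top) for hit in hits)


with tempfile.TemporaryDirectory() as top:
    # Recreate the example tree in a scratch directory.
    for sub in ("References/Zdir", "References/Adir",
                "Archives/Ydir", "Archives/Bdir"):
        os.makedirs(os.path.join(top, sub))
    open(os.path.join(top, "References", "Adir", "Zfile"), "w").close()

    am_args = expand_include(top, "./[A-M]*")
    nz_args = expand_include(top, "./[N-Z]*")

print(am_args)  # ['./Archives']
print(nz_args)  # ['./References']
```

"References/Adir" does not show up in the A-M list: only the top-level
name decides, and everything below a selected folder travels with it.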

That is why this would work as you intended (changing some other
details that I find easier):

  trel   /trel/./_AM_  /trel   {
          comp-user-tar
          include "./[A-M]*"
          }

  trel   /trel/./_NZ_  /trel   {
          comp-user-tar
          include "./[N-Z]*"
          }

  trel  /trel/./_REST_  /trel  {
          comp-user-tar
          exclude append "./[A-M]*"
          exclude append "./[N-Z]*"
          }

- I use the self-documenting notation "/top/level/folder/./_mnemonic_": the
whole top-level folder first, followed by a separator "/./", followed by a
mnemonic that summarizes the contents.
- The exclude patterns *always* start with "./" to anchor them
to the top level, which restores the symmetry with the include syntax as
implemented by Amanda.
- And I *always* add a _REST_ that carefully uses "exclude append" with exactly
the same patterns as the includes (which, anchored, then indeed have the same
meaning).
You never know how inventive people (or you yourself) can be when creating
unforeseen top-level folders: lowercase, numbers, high-ASCII codes, spaces,
control chars etc.
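
A quick way to convince yourself that such a scheme covers everything
exactly once is to run the top-level names through the patterns. The
Python sketch below uses the names from the "ls" output quoted above,
plus three hypothetical names (marked in the code) to exercise _REST_.
It only models the pattern matching and does not read the disklist.

```python
from fnmatch import fnmatchcase

# Top-level names from the "ls" listing in the original message.
LISTED = [
    "Active Staff Folders", "Funded Programs", "Quality Assurance",
    "Applications", "Graphpad", "References",
    "Archived Studies", "Images", "Research Studies",
    "Archives", "LINC", "Temporary Items",
    "DRC+ PE07 Blood", "Lab Operations", "Trace Elements",
    "DRCII PE10 Urine", "Lead Poisoning", "Trash",
    "DRCII PE12 Research", "NYS PT Program", "External PT",
]
# Hypothetical names, NOT from the message, to exercise _REST_.
HYPOTHETICAL = ["lowercase notes", "2009 samples", "_scratch"]

INCLUDES = {"_AM_": "./[A-M]*", "_NZ_": "./[N-Z]*"}


def assign(name):
    """Return the DLEs that would pick up one top-level name."""
    path = "./" + name
    hits = [dle for dle, pat in INCLUDES.items() if fnmatchcase(path, pat)]
    # _REST_ excludes exactly the include patterns, so it takes the leftovers.
    return hits or ["_REST_"]


assignment = {name: assign(name) for name in LISTED + HYPOTHETICAL}
rest = sorted(n for n, dles in assignment.items() if dles == ["_REST_"])

# Every name must land in exactly one DLE.
print(all(len(dles) == 1 for dles in assignment.values()))  # True
print(rest)  # ['2009 samples', '_scratch', 'lowercase notes']
```

None of the folders in the listing end up in _REST_ today; it is there
for the names nobody has invented yet.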


--
Paul