Amanda-Users

Re: size estimate puts very high load on my LDAP server

2007-11-23 12:10:26
Subject: Re: size estimate puts very high load on my LDAP server
From: Cyrille Bollu <Cyrille.Bollu AT fedasil DOT be>
To: John E Hein <jhein AT timing DOT com>
Date: Fri, 23 Nov 2007 18:03:19 +0100

Hi,

Well, thank you for this deep analysis.

I quite agree with you except on one point. What you say about the advantages and disadvantages of such a feature is right but, in the case of DLE size estimation, I really can't think of any advantage in having tar look up user names, that is, in not using the --numeric-owner option. And the estimate is precisely what causes such a high load on my LDAP server: once DLE size estimation is over (about 60 minutes... well, it's backing up around 1.5 TB :-) everything goes better (the backup itself takes around 4 hours :-)
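
To make the "ls -l" versus "ls -ln" comparison concrete, here is a minimal sketch of the difference between printing an owner by name and by number. It is only an illustration: format_owner() is a name I made up, and this is not code taken from gtar or amanda. The point is that the by-name branch calls getpwuid(), which goes through the name service switch and can therefore end up on the LDAP server when the answer is not cached, while the numeric branch never asks anybody.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Write the owner of a file into buf, either as a name or as a number.
 * Hypothetical helper for illustration; not gtar or amanda source.
 */
void format_owner(uid_t uid, int numeric, char *buf, size_t len)
{
    if (!numeric) {
        /* getpwuid() consults the configured name services (files,
         * LDAP, ...), so an uncached lookup may reach the LDAP server. */
        struct passwd *pw = getpwuid(uid);
        if (pw != NULL) {
            snprintf(buf, len, "%s", pw->pw_name);
            return;
        }
    }
    /* Numeric mode, or a uid with no name: print the number itself. */
    snprintf(buf, len, "%lu", (unsigned long)uid);
}
```

Calling format_owner(st.st_uid, 1, buf, sizeof buf) on the result of a stat() would be the "ls -ln" style. With the last-but-two argument set to 0 you get the "ls -l" style, and one possible name lookup per call.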

What do amanda developers think? Wouldn't the following patch to amanda-2.5.2b1 do the trick?

$ diff -C 2 client-src/sendsize.c client-src/sendsize_new.c
*** client-src/sendsize.c       Fri Nov 23 17:53:56 2007
--- client-src/sendsize_new.c   Fri Nov 23 17:53:19 2007
***************
*** 1987,1990 ****
--- 1987,1993 ----
      my_argv[i++] = "--file";
      my_argv[i++] = "/dev/null";
+ #ifdef GNUTAR
+     my_argv[i++] = "--numeric-owner";
+ #endif
      my_argv[i++] = "--directory";
      my_argv[i++] = dirname;

I hope that this will be my (first) modest contribution to this wonderful piece of code ;-)

Best regards,

Cyrille Bollu
Systems Manager
Fedasil - ICT
tel: +32.2.213.43.49
gsm: +32.478.23.08.15


PS: I'm already using nscd on my clients for LDAP caching, but that doesn't suffice. My LDAP server might well be badly configured, but then again I'm backing up around 2,000,000 files (probably more)...

John E Hein <jhein AT timing DOT com> wrote on 22/11/2007 20:31:24:

> Cyrille Bollu wrote at 17:53 +0100 on Nov 22, 2007:
>  > Is there something I can do to prevent "tar" from looking up usernames when
>  > it's estimating the size of the DLEs (like when you do "ls -ln" instead of
>  > "ls -l")?
>  >
>  > It seems that this process puts a very high load on my LDAP server... Such
>  > a high load that I'm planning to install a new LDAP slave only for the
>  > backup
>  >
>  > Any clue?
>
> From gtar docs...
>
> `--numeric-owner'
>      This option will notify `tar' that it should use numeric user and
>      group IDs when creating a `tar' file, rather than names.
>
> I'm not sure amanda supports passing arbitrary args to gtar.  In
> client-src/sendbackup-gnutar.c, the mechanism used to optionally
> support --atime-preserve was a compile (configure) time option.  There
> used to be a --enable-gnutar-atime-preserve option to configure, but
> that looks like it has disappeared (it was never really an advisable
> option).
>
> If there is a run-time way to add options to the gtar invocation, I
> don't know about it, but I'm sure someone will chime in if there is.
>
>
> Maybe we should just turn on --numeric-owner by default.  I can't
> think of any good reason why we shouldn't.  On extraction for restore,
> it won't really help to have the username in the archive.  I don't
> think gtar supports translating username to a different uid if the uid
> differs on the extracted system.
>
> Hmmm... after testing, it seems that gtar does look up the username on
> extraction and change the uid accordingly.  If the username on the
> system where you untar has a different uid than on the system where
> the archive was created, gtar will extract and chown a file such that
> the extracted file has the new uid (unless you extract with
> --numeric-owner).
>
> For amanda, however, where you typically restore in order to recreate
> a system exactly as it was before, that seems to be an unnecessary
> option.  But I suppose I could see a case where the uid for user "joe"
> has changed (for whatever reason) and he wants a file of his restored
> from a year ago before the uid change.  In that case, he'd probably
> want the file to still be owned by joe even though the BOFH changed
> his uid on him.
>
> In any case, allowing --numeric-owner to be optionally used seems like
> a reasonable thing for amanda to support.
>
> Beyond the scope of the immediate question, but related... what about
> ldap name service caching?  I'll admit my lack of knowledge in this
> area, but that seems like one possible way to help take some of the
> load off the server.
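
For reference, the extraction behaviour John describes above (gtar looks up the archived username and uses the local uid for it, unless you extract with --numeric-owner) can be sketched roughly like this. restore_uid() is a hypothetical helper, not gtar source. The fallback to the archived uid when the name is unknown on the local system is my assumption and is not something tested in this thread.

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Pick the uid an extracted file should be chown'ed to.
 * archived_name and archived_uid are the values stored in the archive.
 * Hypothetical helper for illustration; not gtar source.
 */
uid_t restore_uid(const char *archived_name, uid_t archived_uid,
                  int numeric_owner)
{
    if (!numeric_owner && archived_name != NULL && archived_name[0] != '\0') {
        /* Name lookup: "joe" gets whatever uid joe has on this system. */
        struct passwd *pw = getpwnam(archived_name);
        if (pw != NULL)
            return pw->pw_uid;
    }
    /* --numeric-owner, or (assumed) unknown name: keep the archived uid. */
    return archived_uid;
}
```

With numeric_owner set, the archived uid is returned without any lookup, which is the "recreate the system exactly as it was" case. Without it, a renumbered "joe" still ends up owning his restored file.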