Amanda-Users

Re: HUGE estimation time

2004-08-25 05:47:20
Subject: Re: HUGE estimation time
From: Sven Rudolph <rudsve AT drewag DOT de>
To: Geert Uytterhoeven <geert AT linux-m68k DOT org>
Date: 25 Aug 2004 11:45:46 +0200
Geert Uytterhoeven <geert AT linux-m68k DOT org> writes:

> On Tue, 17 Aug 2004, Sven Rudolph wrote:
> > Joshua Baker-LePain <jlb17 AT duke DOT edu> writes:
> > > On Thu, 29 Jul 2004 at 11:10am, Narada Hess wrote
> > > > I was having estimation timeout failures, so based on advice from this
> > > > group (thanks), I increased the etimeout value in amanda.conf from 600 to
> > > > 6000. Yay, now my backups work! But I am frankly astonished at the fact
> > > > that estimation took almost twice as long as the actual dump. Is this
> > > > normal, or is there some way to speed this up?
> > >
> > > That depends.  The estimate phase with tar (which from below is what I
> > > assume you're using) does tar cf /dev/null (among some other flags).  So
> > > it just stats the files, it doesn't actually read them off the disk.  This
> > > is generally *very* fast.  But if you have a case pathological to your
> > > filesystem (e.g. *lots* of very small files), it can be slowed down
> > > immensely.
> >
> > I once watched top while I had the problem here. There is a phase in
> > GNU tar where it does no disc access but eats one CPU. Looks like it
> > is trying to sort the index of all files (probably sorting by inode in
> > order to find hardlinks).
> >
> > So the case is not pathological to the filesystem but to GNU tar. cpio
> > avoids this; it simply writes the files out twice.
> 
> So using cpio would expand my backups by a huge factor? I have a file system
> with ca. 5 million hard links for about 150000 separate files.

I was partially wrong: this depends on the chosen cpio format. The
duplication occurs with GNU cpio using the old format, but not with
newc. I believe I saw it on an old HP-UX cpio too. So the interesting
question is whether GNU cpio with newc handles this problem more
efficiently than GNU tar.
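
To get a feeling for how many hard links a file system really holds
before comparing the two tools, something like the following Python
sketch might help. It is only an illustration and makes no claim about
how GNU tar or GNU cpio work internally: it groups regular files by
(device, inode) in a dictionary rather than sorting. The function names
group_hard_links and summarize are made up for this example.

```python
import os
import stat
from collections import defaultdict


def group_hard_links(root):
    """Map (st_dev, st_ino) to the paths under root that share that inode.

    Only regular files with a link count above 1 are recorded, so files
    without extra hard links cost no memory.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)
    return groups


def summarize(root):
    """Return (distinct hard-linked inodes, names pointing at them)."""
    groups = group_hard_links(root)
    names = sum(len(paths) for paths in groups.values())
    return len(groups), names
```

Calling summarize("/some/path") on the file system in question gives the
number of distinct hard-linked inodes and the number of names that refer
to them. Links that live outside the scanned tree are not counted.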

        Sven
