ADSM-L

Subject: Re: ADSM versus Arcserve and Backup Exec
From: Gene Mangum <gmangum AT UMICH DOT EDU>
Date: Tue, 22 Sep 1998 21:10:59 -0400
IBM started talking about the DNLC (directory name lookup cache),
but we don't think that's the problem, since we don't see the bad
performance with a simple "ls" (with no flags).   Why do you say
that the Incore Inode Table is not used for "ls -l"?   Information
from the inodes is read and displayed.   We haven't found very
much info about the Incore Inode Table.   If you have a reference
where we can get more info, I would really appreciate it.   We found
only one reference to it in a book about v3.2, and it didn't give
much detail.

We see a dramatic increase in response time once we exceed a certain
ratio of #files to amount of system memory.   We ran tests with
increasing numbers of files and plotted the response time.   The line
for a Solaris system was slightly worse than linear, while there was
a sharp upward curve for the AIX line.   The knee of the curve is
farther out for a system with more physical memory.

We wrote test programs which just did "stat" calls against the files,
and we saw the same behavior as "ls -l", find . -print, Sysback/6000,
and 2 application programs.
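For reference, the kind of test program we mean is just a loop of stat()
calls with a timer around it; a rough sketch (not our exact code, and the
paths and names are placeholders) looks something like this:

    /* stat-scan.c: time stat() calls against every entry in a directory.
     * Minimal sketch only -- error handling kept to a bare minimum.
     * Build: cc -o stat-scan stat-scan.c
     * Run:   ./stat-scan /path/to/big/directory
     */
    #include <stdio.h>
    #include <dirent.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <time.h>

    int main(int argc, char *argv[])
    {
        DIR *dp;
        struct dirent *de;
        struct stat sb;
        char path[1024];
        long count = 0;
        time_t start, end;

        if (argc != 2) {
            fprintf(stderr, "usage: %s directory\n", argv[0]);
            return 1;
        }
        if ((dp = opendir(argv[1])) == NULL) {
            perror("opendir");
            return 1;
        }

        start = time(NULL);
        while ((de = readdir(dp)) != NULL) {
            snprintf(path, sizeof(path), "%s/%s", argv[1], de->d_name);
            if (stat(path, &sb) == 0)   /* the call that crawls on AIX */
                count++;
        }
        end = time(NULL);
        closedir(dp);

        printf("stat()ed %ld files in %ld seconds\n",
               count, (long)(end - start));
        return 0;
    }

Running that against directories of increasing size is how we plotted the
response-time curves mentioned above.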

--
Gene Mangum
University of Michigan Medical Center


On Sun, 20 Sep 1998, Peter Gathercole wrote:

> If I understand what you are referring to, you mean the Incore Inode Table.
> This should not be used by the stat (or statx on AIX) system call, as used
> by ls -l, but only for open files. Did IBM actually confirm your assumption?
>
> Peter Gathercole
> Open Systems Consultant
>
> Gene Mangum wrote:
>
> > I am not talking about a general problem with 1000's of files.   AIX
> > has a very specific problem when the number of files exceeds the inode
> > table.   I'm talking 100% CPU for hours to process 100,000 files.
> > Depending on the amount of physical memory, this problem can kick in
> > in the tens-of-thousands range.
> >
> > The application currently runs on a Sun 670MP, and performance with
> > this many files is acceptable.
> >
> > We ran tests on AIX, Solaris, and Linux.   Linux won :-), Solaris did
> > OK, and AIX ran for hours at 100% CPU.
> >
> > --
> > Gene Mangum
> > University of Michigan Medical Center
> >
> > On Wed, 16 Sep 1998, Richard Sims wrote:
> >
> > > >> We had a situation with one puny C20 with 256MB of memory where they
> > > >> had architected the application to write image files (50K to 200K
> > > >> each) into one single directory.  Unfortunately, they tracked around
> > > >> 1.5 million files in a 90 day period.
> > > >
> > > >I believe the extremely poor performance was due to a design problem
> > > >with JFS.   We've battled with this.   We think it's due to an in-core
> > > >i-node table being filled.   We found that when the number of files
> > > >in a single directory exceeds the size of this table (the size of the
> > > >table is computed at boot time based on the amount of physical memory),
> > > >reading i-nodes will peg the CPU and take a looooooong time.
> > >
> > > >We opened a crit-sit (critical situation) with IBM because of a new
> > > >application which will have 1.5 million files in one directory.
> > > >Their only solution so far is to either rewrite the application or
> > > >buy a Sun.
> > >
> > > This is an old issue which comes up about every two months on the List.
> > > It's not an exotic problem, but merely that traditional file system
> > > directories are primitive data structures which are extremely inefficient
> > > and bog down when you attempt to keep more than about a thousand files
> > > in a single directory.  That's why there are subdirectories.  Try to have
> > > your directory structure similar to an equilateral triangle and you will
> > > enjoy much better performance.  And, no, buying a Sun is not the
> > > solution: I know from experience that the same situation applies there.
> > >
> > > Rather than take up the issue with IBM, it would be more appropriate to
> > > take it up with the people who mis-designed the application which tries
> > > to keep such an unreasonable number of files within one directory level -
> > > they don't seem to have the benefit of experience to appreciate the
> > > impact that has.  A conventional directory is not a database: it performs
> > > poorly if you attempt to use it as such.
> > >
> > >   Richard Sims, Boston University OIT
> > >
> > >
>
>
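
P.S.  For what it's worth, the subdirectory fan-out Richard describes above
can be as simple as hashing the filename into a couple of directory levels.
A rough sketch follows; the two-level layout and the fan-out of 256 are
arbitrary choices for illustration, not taken from the application in
question:

    /* fanout.c: spread many files across subdirectories instead of one
     * flat directory, e.g. "image12345.jpg" -> "/images/a7/3c/image12345.jpg".
     */
    #include <stdio.h>

    /* cheap string hash; anything reasonably uniform will do */
    static unsigned hash(const char *s)
    {
        unsigned h = 5381;
        while (*s)
            h = h * 33 + (unsigned char)*s++;
        return h;
    }

    /* build a two-level path under 'root' for file 'name' */
    static void fanout_path(const char *root, const char *name,
                            char *buf, size_t buflen)
    {
        unsigned h = hash(name);
        snprintf(buf, buflen, "%s/%02x/%02x/%s",
                 root, (h >> 8) & 0xff, h & 0xff, name);
    }

    int main(void)
    {
        char path[1024];
        fanout_path("/images", "image12345.jpg", path, sizeof(path));
        printf("%s\n", path);
        return 0;
    }

With 256 x 256 leaf directories, even 1.5 million files works out to only a
few dozen per directory, which is well inside the range where a linear
directory scan stays cheap.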