ADSM-L

Subject: Re: ADSM versus Arcserve and Backup Exec
From: Gene Mangum <gmangum AT UMICH DOT EDU>
Date: Tue, 22 Sep 1998 21:10:59 -0400
IBM started talking about the DNLC (directory name lookup cache),
but we don't think that's the problem, since we don't see the bad
performance with a simple "ls" (with no flags).   Why do you say
that the Incore Inode Table is not used for "ls -l"?   Information
from the inodes is read and displayed.   We haven't found very
much info about the Incore Inode Table.   If you have a reference
where we can get more info, I would really appreciate it.   We found
only one reference to it in a book about v3.2, and it didn't give
much detail.

We see a dramatic increase in response time once we exceed a certain
ratio of #files to amount of system memory.   We ran tests with
increasing numbers of files and plotted the response time.   The line
for a Solaris system was slightly worse than linear, while there was
a sharp upward curve for the AIX line.   The knee of the curve is
farther out for a system with more physical memory.

We wrote test programs which just did "stat" calls against the files,
and we saw the same behavior as "ls -l", find . -print, Sysback/6000,
and 2 application programs.
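For reference, the kind of test program we mean is just a loop of stat()
calls with a timer around it; a rough sketch (not our exact code, and the
paths and names are placeholders) looks something like this:

    /* stat-scan.c: time stat() calls against every entry in a directory.
     * Minimal sketch only -- error handling kept to a bare minimum.
     * Build: cc -o stat-scan stat-scan.c
     * Run:   ./stat-scan /path/to/big/directory
     */
    #include <stdio.h>
    #include <dirent.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <time.h>

    int main(int argc, char *argv[])
    {
        DIR *dp;
        struct dirent *de;
        struct stat sb;
        char path[1024];
        long count = 0;
        time_t start, end;

        if (argc != 2) {
            fprintf(stderr, "usage: %s directory\n", argv[0]);
            return 1;
        }
        if ((dp = opendir(argv[1])) == NULL) {
            perror("opendir");
            return 1;
        }

        start = time(NULL);
        while ((de = readdir(dp)) != NULL) {
            snprintf(path, sizeof(path), "%s/%s", argv[1], de->d_name);
            if (stat(path, &sb) == 0)   /* the call that crawls on AIX */
                count++;
        }
        end = time(NULL);
        closedir(dp);

        printf("stat()ed %ld files in %ld seconds\n",
               count, (long)(end - start));
        return 0;
    }

Running that against directories of increasing size is how we plotted the
response-time curves mentioned above.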

--
Gene Mangum
University of Michigan Medical Center


On Sun, 20 Sep 1998, Peter Gathercole wrote:

> If I understand what you are referring to, you mean the Incore Inode Table.
> This should not be used by the stat (or statx on AIX) system call, as used
> by ls -l, but only for open files. Did IBM actually confirm your assumption?
>
> Peter Gathercole
> Open Systems Consultant
>
> Gene Mangum wrote:
>
> > I am not talking about a general problem with 1000's of files.   AIX
> > has a very specific problem when the number of files exceeds the inode
> > table.   I'm talking 100% CPU for hours to process 100,000 files.
> > Depending on the amount of physical memory, this problem can kick in
> > in the tens-of-thousands range.
> >
> > The application currently runs on a Sun 670MP, and performance with
> > this many files is acceptable.
> >
> > We ran tests on AIX, Solaris, and Linux.   Linux won :-), Solaris did
> > OK, and AIX ran for hours at 100% CPU.
> >
> > --
> > Gene Mangum
> > University of Michigan Medical Center
> >
> > On Wed, 16 Sep 1998, Richard Sims wrote:
> >
> > > >> We had a situation with one puny C20 with 256MB of memory where they
> > > >> had architected the application to write image files (50K to 200K
> > > >> each) into one single directory.  Unfortunately, they tracked around
> > > >> 1.5 million files in a 90 day period.
> > > >
> > > >I believe the extremely poor performance was due to a design problem
> > > >with JFS.   We've battled with this.   We think it's due to an in-core
> > > >i-node table being filled.   We found that when the number of files
> > > >in a single directory exceeds the size of this table (the size of the
> > > >table is computed at boot time based on the amount of physical memory),
> > > >reading i-nodes will peg the CPU and take a looooooong time.
> > >
> > > >We opened a crit-sit (critical situation) with IBM because of a new
> > > >application which will have 1.5 million files in one directory.
> > > >Their only solution so far is to either rewrite the application or
> > > >buy a Sun.
> > >
> > > This is an old issue which comes up about every two months on the List.
> > > It's not an exotic problem, but merely that traditional file system
> > > directories are primitive data structures which are extremely inefficient
> > > and bog down when you attempt to keep more than about a thousand files
> > > in a single directory.  That's why there are subdirectories.  Try to have
> > > your directory structure similar to an equilateral triangle and you will
> > > enjoy much better performance.  And, no, buying a Sun is not the
> > > solution: I know from experience that the same situation applies there.
> > >
> > > Rather than take up the issue with IBM, it would be more appropriate to
> > > take it up with the people who mis-designed the application which tries
> > > to keep such an unreasonable number of files within one directory level -
> > > they don't seem to have the benefit of experience to appreciate the
> > > impact that has.  A conventional directory is not a database: it performs
> > > poorly if you attempt to use it as such.
> > >
> > >   Richard Sims, Boston University OIT
> > >
> > >
>
>
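
P.S.  For what it's worth, the subdirectory fan-out Richard describes above
can be as simple as hashing the filename into a couple of directory levels.
A rough sketch follows; the two-level layout and the fan-out of 256 are
arbitrary choices for illustration, not taken from the application in
question:

    /* fanout.c: spread many files across subdirectories instead of one
     * flat directory, e.g. "image12345.jpg" -> "/images/a7/3c/image12345.jpg".
     */
    #include <stdio.h>

    /* cheap string hash; anything reasonably uniform will do */
    static unsigned hash(const char *s)
    {
        unsigned h = 5381;
        while (*s)
            h = h * 33 + (unsigned char)*s++;
        return h;
    }

    /* build a two-level path under 'root' for file 'name' */
    static void fanout_path(const char *root, const char *name,
                            char *buf, size_t buflen)
    {
        unsigned h = hash(name);
        snprintf(buf, buflen, "%s/%02x/%02x/%s",
                 root, (h >> 8) & 0xff, h & 0xff, name);
    }

    int main(void)
    {
        char path[1024];
        fanout_path("/images", "image12345.jpg", path, sizeof(path));
        printf("%s\n", path);
        return 0;
    }

With 256 x 256 leaf directories, even 1.5 million files works out to only a
few dozen per directory, which is well inside the range where a linear
directory scan stays cheap.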