ADSM-L

Re: ADSM versus Arcserve and Backup Exec

1998-09-16 14:42:33
From: Gene Mangum <gmangum AT UMICH DOT EDU>
Date: Wed, 16 Sep 1998 14:42:33 -0400
I am not talking about a general problem with thousands of files.   AIX
has a very specific problem when the number of files exceeds the size of
the in-core inode table.   I'm talking 100% CPU for hours to process
100,000 files.
Depending on the amount of physical memory, this problem can kick in
in the tens-of-thousands range.

The application currently runs on a Sun 670MP, and performance with
this many files is acceptable.

We ran tests on AIX, Solaris, and Linux.   Linux won :-), Solaris did
OK, and AIX ran for hours at 100% CPU.

--
Gene Mangum
University of Michigan Medical Center


On Wed, 16 Sep 1998, Richard Sims wrote:

> >> We had a situation with one puny C20 with 256MB of memory where they had
> >> architected the application to write image files (50K to 200K each)
> >> into one single directory.  Unfortunately, they tracked around 1.5
> >> million files in a 90 day period.
> >
> >I believe the extremely poor performance was due to a design problem
> >with JFS.   We've battled with this.   We think it's due to an in-core
> >i-node table being filled.   We found that when the number of files
> >in a single directory exceeds the size of this table (the size of the
> >table is computed at boot time based on the amount of physical memory)
> >reading i-nodes will peg the CPU and take a looooooong time.
>
> >We opened a crit-sit (critical situation) with IBM because of a new
> >application which will have 1.5 million files in one directory.
> >Their only solution so far is to either rewrite the application or
> >buy a Sun.
>
> This is an old issue which comes up about every two months on the List.
> It's not an exotic problem, but merely that traditional file system
> directories are primitive data structures which are extremely inefficient
> and bog down
> when you attempt to keep more than about a thousand files in a single
> directory.  That's why there are subdirectories.  Try to have your directory
> structure similar to an equilateral triangle and you will enjoy much better
> performance.  And, no, buying a Sun is not the solution: I know from
> experience that the same situation applies there.
>
> Rather than take up the issue with IBM, it would be more appropriate to take
> it up with the people who mis-designed the application which tries to keep
> such an unreasonable number of files within one directory level - they don't
> seem to have the benefit of experience to appreciate the impact that has.
> A conventional directory is not a database: it performs poorly if you attempt
> to use it as such.
>
>   Richard Sims, Boston University OIT
>
>
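
The subdirectory advice in the quoted message can be sketched in a few lines. This is only an illustration, not anything either poster described: the function name bucketed_path, the root /data/images, the file name, and the choice of an MD5 hash with two levels of two hex characters are all assumptions made for the example. The idea is to derive the subdirectory from a hash of the file name so that files spread evenly and no single directory grows large.

```python
import hashlib
from pathlib import PurePosixPath


def bucketed_path(root, filename, levels=2, width=2):
    """Return root/<h0>/<h1>/.../filename.

    Each <hN> is a `width`-character slice of the hex MD5 digest of the
    file name, so the same name always maps to the same subdirectory.
    (Hypothetical helper; names and defaults are assumptions.)
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # Take consecutive slices of the digest, one per directory level.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, filename)


if __name__ == "__main__":
    # Hypothetical root and file name, for illustration only.
    print(bucketed_path("/data/images", "scan000001.img"))
```

With the defaults there are 256 * 256 = 65,536 leaf directories, so 1.5 million files would average about 23 per directory, and no directory at the upper levels holds more than 256 entries. The application has to compute the same path on both write and read, which is why the bucket is derived from the name rather than assigned at random.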