Subject: Re: DLEs with large numbers of files
From: Michael Loftis <mloftis AT wgops DOT com>
To: Ross Vandegrift <ross AT kallisti DOT us>, amanda-users AT amanda DOT org
Date: Tue, 02 May 2006 16:00:31 -0600


--On May 2, 2006 5:49:47 PM -0400 Ross Vandegrift <ross AT kallisti DOT us> wrote:

> Hello everyone,
>
> I recognize that this isn't really related to Amanda, but I thought
> I'd see if anyone has a good trick...
>
> A number of DLEs in my Amanda configuration have a huge number of
> small files (sometimes hardlinks and symlinks, sometimes just copies),
> often in the millions.  Of course this is a classic corner case, and
> these DLEs can take a very long time to back up or restore.

I use estimated sizes and tar on these types of DLEs. Dump may be faster if you can get away with it, but realistically both are limited by what amounts to a stat() call per file to ascertain modification times. You can try a filesystem with better small-file performance, or upgrade your storage hardware to support more IOPS.
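
If you want to see that metadata floor for yourself, a quick walk-and-stat sketch like the one below (hypothetical, not part of Amanda) will tell you roughly how many stat()s per second the filesystem can sustain on a given DLE:

#!/usr/bin/env python
# Rough sketch: walk a tree and count lstat() calls per second.
# Both dump and tar must do about this much metadata work just to
# decide which files changed, so this is a floor on backup time for
# a DLE before a single data byte is read.
import os
import sys
import time

def stat_walk(root):
    count = 0
    start = time.time()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))  # ~one metadata IOP
            except OSError:
                continue  # entry vanished mid-walk; skip it
            count += 1
    elapsed = time.time() - start
    print("%d entries in %.1fs (%.0f stats/sec)" %
          (count, elapsed, count / max(elapsed, 0.001)))

if __name__ == "__main__":
    stat_walk(sys.argv[1] if len(sys.argv) > 1 else ".")

Run it against the DLE's top directory while the box is otherwise quiet. If you only get a few hundred stats/sec, a million-file DLE will spend the better part of an hour on metadata alone, no matter which dump program you pick.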


> Currently, they are mostly using dump (which will usually report
> 1-3MiB/s throughput).  Is there a possible performance advantage to
> using tar instead?
>
> On some of our installations I have bumped up the data timeouts.  I've
> got one as high as 5400 seconds.  I suspect a reasonable maximum is
> very installation dependent, but if anyone has thoughts, I'd love to
> hear them.

> Thanks for any ideas!
>
> --
> Ross Vandegrift
> ross AT kallisti DOT us
>
> "The good Christian should beware of mathematicians, and all those who
> make empty prophecies. The danger already exists that the mathematicians
> have made a covenant with the devil to darken the spirit and to confine
> man in the bonds of Hell."
>         --St. Augustine, De Genesi ad Litteram, Book II, xviii, 37
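
For reference, the timeouts Ross mentions live in amanda.conf. The excerpt below is only a sketch (parameter names per the amanda.conf man page; the values are illustrative, and defaults vary by version):

# amanda.conf excerpt -- illustrative values, not recommendations
etimeout 600     # seconds allowed per DLE for the size estimate
dtimeout 5400    # seconds of dumper inactivity before a data timeout
ctimeout 30      # seconds amcheck waits for each client connection

Keep in mind that raising dtimeout only papers over the slowness rather than fixing it, so it's worth confirming you're actually IOPS-bound first.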




--
"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler
