ADSM-L

Re: Performance Large Files vs. Small Files

2001-02-14 11:52:38
From: Thomas Denier <Thomas.Denier AT MAIL.TJU DOT EDU>
Date: Wed, 14 Feb 2001 11:53:00 -0500
> Does anyone have a TECHNICAL reason why I can backup 30GB of 2GB files that
> are stored in one directory so much faster than 30GB of 2kb files that are
> stored in a bunch of directories?
>
> I know that this is the case, I just would like to find out why.  If the
> amount of data is the same and the Network Data Transfer Rate is the same
> between the two backups, why does it take the TSM server so much longer to
> process the files being sent by the larger amount of files in multiple
> directories?
>
> I sure would like to have the answer to this.  We are trying to complete an
> incremental backup an NT Server with about 3 million small objects
> (according to TSM) in many, many folders and it can't even get done in 12
> hours.  The actual amount of data transferred is only about 7GB per night.
> We have other backups that can complete 50GB in 5 hours but they are in one
> directory and the # of files is smaller.

As others have pointed out, the update activity to the TSM database is the
major factor. Even so, your performance is a bit poorer than I would expect.

We have a mail server with about 2.5 million files with an aggregate size of
about 45 GB. We migrated this server to new hardware about a month ago. The
first backup after the cutover took about 16 hours to back up the entire
population of files. Subsequent backups have been taking about an hour and a
half to inspect the entire population and back up about 100,000 files. We
recently discovered an error in the include/exclude file. Fixing the error
triggered a TSM database update per file for most of the files on the system,
but no increase in the number of files transferred. Doing 2.2 million updates
and the usual 100,000 backups took ten hours.

Your note indicates that you are backing up about a quarter of your data
every night. If that implies backing up about a quarter of the files,
extrapolation from our performance figures would predict a time in the
neighborhood of five hours. You may need to do some performance tuning to
make database updates faster, as at least one other respondent has suggested.

If you are still using a TSM 3.1 client you should probably upgrade to a TSM
3.7 or 4.1 client. We changed from a 3.1 to a 3.7 client when we migrated our
mail server and saw a substantial performance improvement. Part of this was
undoubtedly due to the new hardware configuration, but I am reasonably sure
that some of the improvement resulted from overlapping scanning for new and
changed files with copying of new and changed files.
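
For anyone who wants to redo the five-hour estimate with their own numbers,
here is a rough sketch of one way to arrive at it. The post does not spell out
the method, so this is an assumption: it takes the average per-file cost from
the 16-hour full backup of 2.5 million files and applies it to a quarter of
the 3 million objects in the question. The per-file figure lumps data transfer
and database updates together, and all names in the code are made up for
illustration.

```python
# Figures quoted in the reply (mail server, first backup after cutover).
FULL_BACKUP_HOURS = 16.0
FULL_BACKUP_FILES = 2_500_000

# Figures from the original question (NT server).
TARGET_FILES = 3_000_000
CHANGED_FRACTION = 0.25  # "about a quarter", assumed to apply to the file count


def seconds_per_file(hours: float, files: int) -> float:
    """Average wall-clock seconds per file for a backup run.

    Assumption: one lumped cost per file, covering both the data transfer
    and the database update for that file.
    """
    return hours * 3600.0 / files


def estimate_hours(files_to_back_up: int, per_file_seconds: float) -> float:
    """Scale the per-file cost linearly to a different number of files."""
    return files_to_back_up * per_file_seconds / 3600.0


per_file = seconds_per_file(FULL_BACKUP_HOURS, FULL_BACKUP_FILES)
changed_files = int(TARGET_FILES * CHANGED_FRACTION)
estimated = estimate_hours(changed_files, per_file)

print(f"per-file cost: {per_file * 1000:.1f} ms")
print(f"files backed up per night: {changed_files:,}")
print(f"estimated backup time: {estimated:.1f} hours")
```

With the figures above this works out to roughly 23 ms per file and about 4.8
hours for 750,000 files. It leaves out the time spent inspecting unchanged
files, so treat it as a ballpark figure only.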