ADSM-L

Re: 15,000,000 + files on one directory backup

2005-06-18 17:57:23
Subject: Re: 15,000,000 + files on one directory backup
From: Ted Byrne <ted.byrne AT ADELPHIA DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sat, 18 Jun 2005 17:57:16 -0400
I would second Bill's addition of poorly-architected applications to
Richard's list of issues that should be (but are often not) addressed, or
even considered.  At another customer, we and the customer's sysadmins were
bedeviled by a weblog analysis application (which shall remain nameless)
that chose to store its data on the filesystem, using the date of the log
data as a directory under which the data was stored (as well as the
associated reports, I believe).  The explanation we were given was that
they had chosen to do this for application performance reasons; it was
apparently quicker than using a DBMS.

This decision, although it made random access of data quicker, had horrible
implications for backup as the log data and reports accumulated over time;
recovery was even worse.  Aggravating the situation was the insistence by
the "application owner" that ALL historical log data absolutely had to be
maintained in this inside-out database format.  Just getting a count of
files and directories on this drive (via selecting Properties from the
context menu) took something on the order of 9 hours to complete.  The
volume of data, in GB, was really not that large - something on the order
of 100 GB.  All of their problems managing the data stemmed entirely from
the large number of files and directories.
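
For anyone who wants to put numbers on a tree like this, here is a minimal
sketch in Python that reports the file count, directory count, and total
bytes in one pass.  The function name count_tree and the example path are
my own placeholders, not anything from the application in question, and
the walk still has to touch every directory entry, so expect the run time
to track the number of files rather than the number of gigabytes.

```python
import os


def count_tree(root):
    """Return (files, dirs, total_bytes) for everything under root.

    Hypothetical helper for illustration; symlinks are not followed.
    """
    files = dirs = total_bytes = 0
    stack = [root]  # explicit stack avoids recursion limits on deep trees
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        dirs += 1
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        files += 1
                        total_bytes += entry.stat(follow_symlinks=False).st_size
        except OSError:
            # Skip directories that cannot be read (permissions, races).
            continue
    return files, dirs, total_bytes


if __name__ == "__main__":
    # Placeholder path; substitute the drive or directory being measured.
    n_files, n_dirs, n_bytes = count_tree(r"D:\weblogs")
    print(f"{n_files} files, {n_dirs} directories, {n_bytes / 2**30:.1f} GiB")
```

A small average file size (total bytes divided by file count) is the
warning sign: it means per-file overhead, not data volume, will dominate
backup and restore times.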

When the time came to replace the server hardware and upgrade the
application, they had extreme difficulty migrating the historical data from
the old server to the new.  They did finally succeed in copying it across,
but it took days and days of around-the-clock network traffic to complete.

Addressing the ramifications of this type of design decision after the fact
is difficult at best.  If at all possible, we need to prevent it from
occurring in the first place.

Ted