ADSM-L

Re: Thoughts on Monthly Archives

2004-07-16 12:07:54
From: asr AT UFL DOT EDU
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 16 Jul 2004 12:07:39 -0400
==> In article <s0f7c1a7.077 AT health-es2.health.qld.gov DOT au>, Steve Harris 
<Steve_Harris AT HEALTH.QLD.GOV DOT AU> writes:


> at 1% , 1-(0.99**30), or about .25
> at 2%,  1-(0.98**30) , or about .45
> at 3%,  1-(0.97**30), or about .60   (Please feel free to correct my maths if 
> I'm wrong - probability was never my strong point)

I think your math is good; I'd add, though, that there's a -GREAT- deal of
locality of reference.  In other words, if 3% of your files change a day,
then for user filespace probably 95% of the files that change tomorrow will be
the files that changed yesterday, and so on.  I think that 25 - 30% changing
per month for userdir space is probably about right.
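
For anyone who wants to check the quoted arithmetic, here is a minimal sketch.
It assumes, as the quoted figures do, that each day is an independent draw with
no locality at all, so it gives the worst case rather than what you'd really
see.  The function name is just one I made up for illustration.

```python
def prob_changed_at_least_once(daily_change_rate: float, days: int = 30) -> float:
    """Chance that a given file changes at least once in `days` days.

    Assumes every day is independent (no locality of reference), which
    overstates the monthly change fraction for real user filespaces.
    """
    return 1.0 - (1.0 - daily_change_rate) ** days


# Daily change rates from the quoted message: 1%, 2%, 3%.
for rate in (0.01, 0.02, 0.03):
    # Comes out at roughly 0.26, 0.45 and 0.60 respectively.
    print(f"{rate:.0%} daily -> {prob_changed_at_least_once(rate):.2f} monthly")
```

With strong locality the real monthly figure sits well under these numbers,
which is where my 25 - 30% guess comes from.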


> Now, of course it's often the same files which change day after day, so real
> experience should be better than this, but at the time, I decided that the
> overhead of maintaining two TSMs (and two clients per node) wasn't worth the
> benefit, and went with archives.

But I would disagree with your logic.

In place of the incremental possibilities, which would have led you at worst
above to backing up 60% every month, you're choosing to back up 100% every
month, without fail.

I think that this represents a substantially more costly strategy.

In fact, given the monthly-for-five-years figure and your numbers above, it's
about twice as expensive in facilities to archive monthly as to run
incrementals with similar retention characteristics; if my guess is closer to
correct, it's three to four times the cost.
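
Here is a rough sketch of how those ratios fall out.  The assumptions are mine:
60 monthly points kept (monthly for five years), the incremental scheme stores
one full baseline plus only the changed fraction for each later month, and
facilities cost scales with stored volume.  It ignores database overhead and
expiration details, and the function name is purely illustrative.

```python
def archive_to_incremental_ratio(monthly_change_fraction: float, months: int = 60) -> float:
    """Stored volume of monthly full archives relative to incrementals.

    Volumes are in units of "one full copy of the filespace".
    Assumption: incrementals keep one baseline plus the changed
    fraction for each of the remaining months.
    """
    archive_volume = months * 1.0
    incremental_volume = 1.0 + (months - 1) * monthly_change_fraction
    return archive_volume / incremental_volume


# 0.60 is the worst case from the quoted numbers; 0.25 - 0.30 is my guess.
for fraction in (0.60, 0.30, 0.25):
    ratio = archive_to_incremental_ratio(fraction)
    print(f"{fraction:.0%} monthly change: archives cost about {ratio:.1f}x")
```

Under these assumptions the 60% case lands somewhat under 2x, and the 25 - 30%
range lands between 3x and 4x.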

For a small amount of data, this may be cheaper than the human organizational
time to build two sets of nodes; but by the time you get to even medium size, I
think it would be pretty expensive.

- Allen S. Rout