==> In article <s0f7c1a7.077 AT health-es2.health.qld.gov DOT au>, Steve Harris
<Steve_Harris AT HEALTH.QLD.GOV DOT AU> writes:
> at 1% , 1-(0.99**30), or about .25
> at 2%, 1-(0.98**30) , or about .45
> at 3%, 1-(0.97**30), or about .60 (Please feel free to correct my maths if
> I'm wrong - probability was never my strong point)
I think your math is good; I'd add, though, that there's a -GREAT- deal of
locality of reference. In other words, if 3% of your files change a day,
then for user filespace probably 95% of the files that change tomorrow will be
the files that changed yesterday, and so on. I think that 25 - 30% per month
for userdir space is probably about right.
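
For anyone who wants to check the quoted figures, here is a minimal sketch
in Python. The function name is made up for illustration, and it assumes each
day is an independent trial, which is exactly the assumption that locality of
reference breaks, so treat the results as an upper bound for user filespace.

```python
def fraction_changed(daily_rate, days=30):
    """Chance that a given file changes at least once within `days` days.

    Assumes every day is an independent trial with the same change rate,
    i.e. no locality of reference (a simplifying assumption).
    """
    return 1 - (1 - daily_rate) ** days


# Daily change rates taken from the quoted message: 1%, 2% and 3%.
for rate in (0.01, 0.02, 0.03):
    print(f"{rate:.0%} per day -> {fraction_changed(rate):.2f} per month")
```

With locality of reference the real monthly figure should come in below what
this model reports, since the same files account for most of the daily churn.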
> Now, of course it's often the same files which change day after day, so real
> experience should be better than this, but at the time, I decided that the
> overhead of maintaining two TSMs (and two clients per node) wasn't worth the
> benefit, and went with archives.
But I would disagree with your logic.
In place of incrementals, which by your numbers above would at worst have had
you backing up 60% every month, you're choosing to back up 100% every
month, without fail.
I think that this represents a substantially more costly strategy.
In fact, given the monthly-for-five-years figure and your numbers above, it's
about twice as expensive in facilities to archive monthly as to run
incrementals with similar retention characteristics (100% versus at most 60%
per month). If my guess of 25 - 30% is closer to correct, it's three to four
times the cost.
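
A rough sketch of where those multipliers come from, again in Python. It
assumes facilities cost scales linearly with the volume of data stored each
month, which is my simplification and not a TSM figure; the function name is
hypothetical.

```python
def archive_cost_ratio(monthly_changed_fraction):
    """Cost of a full monthly archive relative to monthly incrementals.

    Assumes cost is proportional to data volume stored per month: the
    archive stores 100%, the incrementals store only the changed fraction.
    """
    return 1.0 / monthly_changed_fraction


# 60% is the worst case from the quoted numbers; 25-30% is the guess
# for user filespace once locality of reference is taken into account.
for label, fraction in (
    ("quoted worst case", 0.60),
    ("guess, high end", 0.30),
    ("guess, low end", 0.25),
):
    print(f"{label}: {archive_cost_ratio(fraction):.2f}x")
```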
For a small amount of data, this may be cheaper than the human organizational
time to build two sets of nodes; but by the time you get even medium size, I
think it would be pretty expensive.
- Allen S. Rout