Re: Thoughts on Monthly Archives

Allen, 

Your post got me thinking about just why I decided what I did, and now I 
remember :)

There was another management requirement: all production servers must be backed 
up in full once per year, and that snapshot must be kept "forever". There is no 
reasoning with this; it's one of those stupid mandates that applies to the whole 
of the state government, and if data cannot be categorized, then it 
must be kept.

I think you'll recognise that having a third TSM Server for the yearly backup 
isn't really an option, so an archive mechanism is the only one that will work 
for that.  Backupsets weren't an option when this was set up as they were very 
new and not mature (IMHO they're still not mature because I can't stack them, 
track them in a reasonable way, or use them for TDP data).  Having an archive 
mechanism for monthlies means just changing the archive mgmtclass once a year 
and voila, your monthly becomes a yearly.

Given that management loves the idea of permanent retention, database size is 
obviously not a problem.  Hence the argument comes down to ease of maintenance, 
and the monthly archive is better at that. I have a Perl script that generates 
the backup schedules with correct domain statements based on the current backup 
filespaces every month, and each February we generate with the "eternal" 
mgmtclass.  It's mostly automated.
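
To illustrate the February switch described above, here is a minimal sketch in 
Python. It is not the Perl script itself, only an assumption about how the 
management class could be chosen by month. The class names MONTHLY and ETERNAL 
and both function names are hypothetical; -archmc is the client option normally 
used to pick a management class for an archive.

```python
import datetime

# Hypothetical management class names: "ETERNAL" stands in for the
# "eternal" class mentioned above, "MONTHLY" is an assumed name.
MONTHLY_CLASS = "MONTHLY"
ETERNAL_CLASS = "ETERNAL"


def archive_mgmtclass(run_date):
    """Return the management class for the archive taken on run_date.

    Assumption: the February archive is the one kept "forever".
    """
    return ETERNAL_CLASS if run_date.month == 2 else MONTHLY_CLASS


def archive_options(run_date):
    """Build the option string that selects the archive management class."""
    return f"-archmc={archive_mgmtclass(run_date)}"


print(archive_options(datetime.date(2004, 2, 1)))
print(archive_options(datetime.date(2004, 7, 1)))
```

Only the class name changes from month to month, which is why the yearly copy 
costs no extra maintenance.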

Steve.


>>> asr AT UFL DOT EDU 17/07/2004 2:07:39 >>>
==> In article <s0f7c1a7.077 AT health-es2.health.qld.gov DOT au>, Steve Harris 
<Steve_Harris AT HEALTH.QLD.GOV DOT AU> writes:


> at 1% , 1-(0.99**30), or about .25
> at 2%,  1-(0.98**30) , or about .45
> at 3%,  1-(0.97**30), or about .60   (Please feel free to correct my maths if 
> I'm wrong - probability was never my strong point)

I think your math is good; I'd add, though, that there's a -GREAT- deal of
locality of reference. In other words, if 3% of your files change a day,
then for user filespace probably 95% of the files that change tomorrow will be
the files that changed yesterday, and so on.  I think that 25 - 30% for userdir
space is probably about right.
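
For anyone who wants to check the quoted figures, here is a minimal Python
sketch. It assumes, as the quoted estimate does, that every file changes
independently with the same probability each day over a 30-day month; the
function name is only for illustration. It does not model the locality of
reference discussed above.

```python
def fraction_changed(daily_rate, days=30):
    """Chance that a file changes at least once in `days` days,
    assuming independent daily changes at a fixed rate."""
    return 1 - (1 - daily_rate) ** days


for rate in (0.01, 0.02, 0.03):
    print(f"{rate:.0%} per day -> {fraction_changed(rate):.2f} per month")
```

This gives roughly 0.26, 0.45 and 0.60, in line with the quoted
approximations.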


> Now, of course it's often the same files which change day after day, so real
> experience should be better than this, but at the time, I decided that the
> overhead of maintaining two TSMs (and two clients per node) wasn't worth the
> benefit, and went with archives.

But I would disagree with your logic.

In place of the incremental possibilities, which would have led you at worst
above to backing up 60% every month, you're choosing to back up 100% every
month, without fail.

I think that this represents a substantially more costly strategy.

In fact, given the monthly-for-five-years figure and your numbers above, it's
about twice as expensive in facilities to archive monthly as to run
incrementals with similar retention characteristics. If my guess is closer to
correct, it's three to four times the cost.
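
One way to reproduce those ratios is the rough Python sketch below. The model
is an assumption and may not be the exact reasoning behind the figures above:
it compares 60 monthly full archives against one initial full backup followed
by 59 monthly increments, each holding only the fraction of data that changed
in that month. The function name is hypothetical.

```python
def relative_cost(monthly_change, months=60):
    """Ratio of data stored by monthly full archives to data stored by
    one full backup plus monthly increments (assumed model)."""
    archive_volume = months * 1.0  # a full copy every month
    incremental_volume = 1.0 + (months - 1) * monthly_change  # one full, then changes only
    return archive_volume / incremental_volume


for change in (0.60, 0.30, 0.25):
    print(f"{change:.0%} monthly change -> archives store {relative_cost(change):.1f}x as much")
```

Under these assumptions the 60% worst case comes out at a bit under twice
the storage, and 25 - 30% comes out between three and four times.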

For a small amount of data, this may be cheaper than the human organizational
time to build two sets of nodes; but by the time you get to even medium size, I
think it would be pretty expensive.

- Allen S. Rout


