>> On Thu, 9 Nov 2006 11:00:32 -0600, Wesley Smith <Wesley.Smith AT LA DOT GOV>
>> said:
> I've seen the report as currently generated and noted a number of
> problems with it. The report runs at a scheduled time rather than
> having job triggers that would kick it off after the successful
> completion of all backups. As a result, the report will show
> backups that started but without showing that they have completed.
> On some days, there will be very few of these. On other days, quite
> a few. Throwing stuff like that into the mix of the real errors and
> other "pseudo errors" and you find yourself trying to chase down a
> lot of non-errors.
We smooth out some of these problems by using the abstraction "How
long has it been since I got a good backup?" instead of "Did anything
fail?". We set a threshold (usually one day) and allow a one-day grace
period; this means that a single transient failure doesn't attract
notice, but two successive ones do.
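As a rough illustration of that check (a minimal sketch, not our actual
script): assume you already have, for each node, the timestamp of its
last good backup. The function name, the dictionary layout, and the
node names below are all hypothetical.

```python
from datetime import datetime, timedelta

def stale_nodes(last_good, now,
                threshold=timedelta(days=1),
                grace=timedelta(days=1)):
    """Return the nodes whose last good backup is older than
    threshold + grace. A node with no good backup at all (None)
    is always reported."""
    limit = threshold + grace
    return sorted(
        node for node, ts in last_good.items()
        if ts is None or now - ts > limit
    )

# Hypothetical data: node name -> time of last good backup.
now = datetime(2006, 11, 9, 12, 0)
last_good = {
    "alpha": now - timedelta(hours=10),  # backed up last night
    "beta": now - timedelta(hours=34),   # missed one night: inside grace
    "gamma": now - timedelta(hours=58),  # missed two nights: reported
    "delta": None,                       # never had a good backup
}

print(stale_nodes(last_good, now))
```

With the defaults, one missed night stays quiet and the second missed
night puts the node on the list, which is the behaviour described
above.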
In general, repeatedly identifying irrelevant errors is something a
TSM admin spends a lot of time on. If you can push that identification
and determination onto the person actually responsible for the server
being backed up, that's well worth it.
> I will be passing along to the appropriate people that perhaps there
> is some additional filtering that could be done to these reports to
> reduce their size to something that is more manageable. I'm hoping
> that we will be able to come up with some filtering and scripting
> aids that will help to automate this process as much as possible and
> reduce to a minimum the need for the Tivoli support person to spend
> a lot of time every day just reviewing the night's work.
Send them here, too.
- Allen S. Rout