Howdy, TSM folks.
So I think I've gotten to the bottom of a performance issue I've been
seeing recently (and a crash!) and I wanted to compare notes.
Cutting to the chase, I've been seeing obnoxious log consumption on one
of my TSM servers every night recently, and once a few months ago it got
bad enough that it blew out the log and crashed with the 'gotta extend
the log' situation. Routine recovery, but irritating.
At the moment, I'm attributing the problem to a growing population of v6
clients which, when doing system state backups, are "reassigning"
objects from the previous backup to the current one.
Now, I understand why they're doing this: It gets us back into the
incremental zone, from the miserable 'system state :== full' situation.
But the law of unintended consequences enters, stage left.
Since much (most?) of the system state is in fact static, that means
that each machine is going to reassign most of its system state.
How fast can it do that? How fast is your database? That's how fast.
Last night I watched what felt like a normally busy evening, and the
log-full percentage was growing before my eyes; as in, wait a minute
and see three percent advance, and that's on an 11692 MB log.
So I've got my trigger set at 60%, but it's blowing through the
remaining 40 like nothing. I get to the 'server log is [foo] full,
delaying transactions' state on a regular basis.
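If anybody wants to watch the same thing on their own server, this is
roughly what I've been staring at from an admin session (5.x-era
command names, from memory; check your Administrator's Reference):

```
tsm> query log format=detailed
   (watch Pct Util and Max. Pct Util climb during the backup window)
tsm> query logvolume
   (sanity-check how much recovery-log space is actually defined)
```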
As a band-aid, I'm going to talk to the customers and see if some of
this population of machines can rationally be excluded from SYSTEM STATE
backups: Offhand, I think their DR plans don't include BMR from my TSM.
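For the machines we agree to exclude, I'm thinking of something like
this in the client options file (Windows B/A client DOMAIN syntax,
from memory; the leading asterisk is a comment line, and please verify
against the client manual before trusting me):

```
* dsm.opt: back up all local drives, but skip system state
DOMAIN ALL-LOCAL -systemstate
```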
But that's short-term thinking.
What are you-all doing about this? Increasing number of DB incrs
between fulls? Something else?
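If the answer is "more incrementals between fulls," I assume that
means fiddling the DB backup trigger, something on the order of the
following (parameter names from memory, so treat as a sketch and
check QUERY DBBACKUPTRIGGER output before and after):

```
tsm> update dbbackuptrigger logfullpct=60 numincremental=6
tsm> query dbbackuptrigger format=detailed
```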
- Allen S. Rout