I had the same issue on my 5.5 servers backing up 6.2.x.x client system
states, and was able to remedy the situation by moving my database and
log files to faster drives.
Historically, my TSM servers have not had sufficient local storage to
accommodate these volumes (as well as cached disk storage) so they lived
on an EMC Clariion, which had both SATA and Fibre drives. At some point
during a SAN upgrade it appears that the TSM storage was moved to the
slower SATA drives.
I had been chasing pinned logs and SLOW system state backups for months,
and last week, as a test, we decided to move the TSM database, logs, and
disk pools to an EMC VMAX. Since then log usage has not been over 20%
and system state backups are completing with almost no lag time.
By comparison, for the previous 3-4 months the log files had been at
85-98 percent utilization every night, which, as you probably know,
usually ends with tasks being canceled to prevent a total system failure.
While I realize that not everyone has the option of faster storage,
I thought it important to re-emphasize how the storage layout
and disk speed for the TSM database/log storage can affect server
performance.
Before the test/change, system state backups would appear to hang for
several hours while calculating what needs to be backed up, and any
large files would pin the log.
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Allen S. Rout
Sent: Monday, April 16, 2012 2:22 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: [ADSM-L] Objects Assigned vs. Your Database.
Howdy, TSM folks.
So I think I've gotten to the bottom of a performance issue I've been
seeing recently (and a crash!) and I wanted to compare notes.
Cutting to the chase, I've been seeing obnoxious log consumption on one
of my TSM servers every night recently, and once a few months ago it got
bad enough that it blew out the log and crashed with the 'gotta extend
the log' situation. Routine recovery, but irritating.
At the moment, I'm attributing the problem to a growing population of v6
clients which, when doing system state backups, are "reassigning"
objects from the previous backup to the current one.
Now, I understand why they're doing this: It gets us back into the
incremental zone, from the miserable 'system state :== full' situation.
But the law of unintended consequences comes in stage left.
Since much (most?) of the system state is in fact static, that means
that each machine is going to reassign most of its system state.
How fast can it do that? How fast is your database? That's how fast.
Last night I watched what felt like a normally busy evening, and the
log-full percentage was growing before my eyes; as in, wait a minute
and see three percent advance, and that's on an 11692MB log.
So I've got my trigger set at 60%, but it's blowing through the
remaining 40 like nothing. I get to the 'server log is [foo] full,
delaying transactions' state on a regular basis.
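To put rough numbers on that, here is a back-of-the-envelope sketch in Python using only the figures quoted above (an 11692MB log, about three percent per minute, a trigger at 60%). It assumes the growth rate stays constant, which it almost certainly does not on a real server, and the function names are made up for illustration rather than taken from any TSM tooling.

```python
# Figures quoted in the message above.
LOG_SIZE_MB = 11692        # recovery log size
GROWTH_PCT_PER_MIN = 3.0   # observed: roughly three percent per minute
TRIGGER_PCT = 60.0         # log-full percentage at which the trigger fires


def growth_mb_per_min(log_size_mb, pct_per_min):
    """Log consumption in MB per minute for a given percent-per-minute rate."""
    return log_size_mb * pct_per_min / 100.0


def minutes_until_full(current_pct, pct_per_min):
    """Minutes until the log reaches 100%, assuming a constant growth rate."""
    return (100.0 - current_pct) / pct_per_min


rate_mb = growth_mb_per_min(LOG_SIZE_MB, GROWTH_PCT_PER_MIN)
headroom_min = minutes_until_full(TRIGGER_PCT, GROWTH_PCT_PER_MIN)
print(f"{rate_mb:.0f} MB/min, {headroom_min:.1f} min from trigger to full")
# prints: 351 MB/min, 13.3 min from trigger to full
```

In other words, at the observed rate whatever the trigger kicks off has on the order of a quarter of an hour to relieve the log before it fills.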
As a band-aid, I'm going to talk to the customers and see if some of
this population of machines can rationally be excluded from SYSTEM STATE
backups: Offhand, I think their DR plans don't include BMR from my TSM.
But that's short-term thinking.
What are you-all doing about this? Increasing number of DB incrs
between fulls? Something else?
- Allen S. Rout