Can this db2 param be updated while tsm6 is up?

TonyB

ADSM.ORG Senior Member, Sydney, Australia
Hi,

Bit of a race condition here...as some of you may have gathered from previous posts I'm in test for tsm6.1(.3.4)...

Situation:

  • db2 active log filesystem extended to 20 GiB
  • TSM "activelogsize" is 7680 (7.5ish GiB) (ho ho, what was I finkin?)
  • DB2 primary log count = 15
  • DB2 secondary log count = 0
  • 10 million object import node operation (delayed commit) is likely going to "fill" the active log in...ooh, let's say 4 hours

The question is - can I update the db2 config, allowing N secondary logs to be dynamically created...without TSM chucking a huge wobbly? Is this even needed? Certainly the DB2 snap indicates that it has way less than 20 GiB of known space...but hey maybe it will dynamically allocate a primary/secondary log based on free filesystem space. Beats me.
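For the record, this is roughly what I was contemplating (a sketch only: TSMDB1 is the instance database name on my test box, and whether LOGSECOND actually takes effect online depends on your DB2 level, so verify before leaning on it):

```shell
# Check the current log geometry, then allow DB2 to spill into secondary
# log extents. LOGSECOND is listed as online-configurable (unlike LOGPRIMARY,
# which needs a deactivate/activate) -- confirm against your DB2 level.
db2 connect to TSMDB1
db2 get db cfg for TSMDB1 | grep -i 'LOGPRIMARY\|LOGSECOND\|LOGFILSIZ'
db2 update db cfg for TSMDB1 using LOGSECOND 5 immediate
```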

I'd be pleased to hear if anyone has done this already...I'm bored of bouncing this TSM instance...rebuilding it will only take a few mins so breakage is no real issue.

Cheers,

T
 
Update 1: TSM doesn't immediately chuck a wobbly...

Code:
    Total          Used          Free
Space(MB)     Space(MB)     Space(MB)
---------     ---------     ---------
    7,680      5,781.36      4,928.64

Update 2: I'm now more confused.

So...you know the story so far. Ran the instance using a single secondary log...but forgot to disable the dbbackup trigger. At the end of the 2nd last primary log extent (14 of 15) the trigger went off and an incremental backup initiated.

At this point I'm having to revise my supposition that the import was running as a single (commit-/rollback-able) unit of work. Sure, the active logs had nearly all filled (14 of 15)...but the incremental backup caused them to be flushed out of the active log filesystem (presumably the db2logmgr task did it). According to my feeble brain that wouldn't have happened if they were still active.

So..err...no real news here then. Not exactly sure when the dbbackup trigger is *supposed* to go off (not well documented).

Bah.

Has anyone else managed to frob their instance this badly, or is it just me?
 
OK...so it did break after all.

It would appear that the import operation is treated as a single unit of work. It failed with an interesting error.

The DB2 layer reported a log-exhausted condition...amusingly, though, both the TSM admin interface and the DB2 sysibmadm tables reported that there was plenty of active log space (halfway through its first extent of 15).

So - my conjecture is that the triggered incremental backup cleared down the active logs BUT the unit of work still wouldn't fit within the total active log space. Neither the database manager, the database nor the TSM instance crashed - but the import operation did go belly-up. I still can't see why the db2 log manager removed the active logs given that the unit of work was still active - way beyond my current DB2 knowledge.

In a nutshell - triggered backups aren't useful to protect individual large units of work. They may do some good in protecting the server as a whole (not proven, but likely), but they also generate a misleading condition/error under certain conditions.

Addendum...I guess it's just me :eek:

edit: thought I'd add some more specific data

When the transaction aborted the environment state was:

actlog messages

Code:
ANR0130E dbieval.c(840): Server LOG space exhausted.
ANR0162W Supplemental database diagnostic information: -1:40003:-1224 ([IBM][CLI Driver] SQL1224N  The database manager is not able to accept new requests, has terminated all requests in progress, or has terminated your particular request due to an error or a force interrupt.  SQLSTATE=55032)
SQL error

SQL1224N... likely issued because the import task exceeded the DB2 max_log parameter (maximum primary log space for a single unit of work)... seems to be set to 90%, either by DB2 default or by TSM config ops.
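If anyone wants to check the same knobs on their own instance, they live in the db cfg (sketch; TSMDB1 is my test database name):

```shell
# MAX_LOG caps one unit of work as a percentage of primary log space;
# NUM_LOG_SPAN caps the number of log extents one UOW may span (0 = no limit).
db2 connect to TSMDB1
db2 get db cfg for TSMDB1 | grep -i 'MAX_LOG\|NUM_LOG_SPAN'
```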

DB2 log utilisation at crash time

Code:
@PTSMP3D1 / PTSMP3D1@TSMDB1 > select FIRST_ACTIVE_LOG as "First", CURRENT_ACTIVE_LOG as "Current", LAST_ACTIVE_LOG as "Last" from sysibmadm.snapdetaillog

First                Current              Last                
-------------------- -------------------- --------------------
                 159                  159                  173
TSM view of log utilisation at crash time

Code:
tsm: PTSMP3D1>q log

    Total          Used          Free
Space(MB)     Space(MB)     Space(MB)
---------     ---------     ---------
    7,680        274.41      7,375.59
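For comparison, sysibmadm.SNAPDB gives an aggregate view of the same thing, plus the application pinning the log tail (column names as at DB2 9.5; worth double-checking on your level):

```shell
# Total log used/available, and the application holding the oldest
# open transaction -- i.e. the one preventing log reclamation.
db2 connect to TSMDB1
db2 "select TOTAL_LOG_USED, TOTAL_LOG_AVAILABLE, APPL_ID_OLDEST_XACT from sysibmadm.snapdb"
```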
 
A postscript for the curious...

I've this persistent conjecture that the import operation is treated as a single unit of work. Doesn't seem to be subject to parameters like movebatchsize etc.

I can't seem to prove it though...which is a pity. I've tried updating the global database manager monitor to include UOW stats, and polled the connected applications - none of which have any collected UOW log space utilisation. Applications can en/disable their own UOW stats though - so it's just possible that the TSM devs are deliberately turning this monitor off within the dsmserv code.
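In case anyone wants to reproduce the monitor fiddling, what I mean is roughly this (sketch; the UOW switch can also be flipped per-session rather than instance-wide):

```shell
# Turn on the UOW monitor switch instance-wide via DFT_MON_UOW,
# then look for unit-of-work stats in the application snapshot.
db2 update dbm cfg using DFT_MON_UOW ON
db2 connect to TSMDB1
db2 get snapshot for applications on TSMDB1 | grep -i 'uow'
```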

Given the overall logging behaviour though it would seem that a single commit is being executed at the end of the import operation. Interestingly a commit seems to occur even if the import aborts with an SQL log space condition. Presumably the devs have added an exception handler which invokes commit when the abort occurs.

As to why the abort is occurring - I've got a feeling it's down to what constitutes an "active log". The "source of truth" is probably the log buffer within the running DB2 instance. The logger process uses a write-ahead policy to flush log buffer pages into the active log directory, and the same policy causes in-flight active logs to be archived to the arch directory as they become full.

The TSM-level triggered backup clears down the active log directory - but I don't believe it does (or can, by design) clear down the internal log buffer. Once this log buffer fills (or the unit of work occupies 90% of it) you get the abort. Confusingly, at the point of abort the db2 (and TSM) layers are reporting a largely empty active log.

The idea of using secondary logs won't help in this case, by the way - unless there is enough other transactional activity to occupy primary log space.

So - bottom line... using an export/import method of progressive migration between TSM5 and TSM6 has its associated perils. You'll need a large active log. In the absence of any other activity, the environment I'm testing is consuming 15 GiB of active log space to import a 10-megafile node (not looking forward to the one with 44 million files).
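Back-of-envelope scaling from the observed numbers (pure arithmetic, assuming log consumption scales linearly with object count - a big assumption):

```shell
# Observed: ~15 GiB of active log to import a 10M-object node.
# Linear scale-up to the 44M-object node.
observed_gib=15
observed_objs=10000000
target_objs=44000000
est_gib=$(( observed_gib * target_objs / observed_objs ))
echo "estimated active log for ${target_objs} objects: ~${est_gib} GiB"
```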

hth,

Tony
 
After a little pondering it seems unlikely that the entire active log is buffered - too inefficient. One might conjecture, then, that the head LSN plus the volume of transactional work is being compared to max_log. Same result either way.
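Putting numbers on that conjecture (arithmetic only; assumes ACTIVELOGSIZE maps onto 15 equal primary extents and that max_log really is 90%):

```shell
# 7680 MB of primary log space split across 15 extents,
# with a 90% cap on what a single unit of work may hold.
activelog_mb=7680
logprimary=15
per_extent_mb=$(( activelog_mb / logprimary ))   # size of one extent
uow_cap_mb=$(( activelog_mb * 90 / 100 ))        # most one UOW can consume
echo "${per_extent_mb} MB per extent; single-UOW ceiling ~${uow_cap_mb} MB"
```

Which would put the ceiling comfortably below the 10-megafile import's appetite, consistent with the abort.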
 