The behavior you are referring to was
corrected some time ago. There have been two
changes. In '97, the man page for
ovsnmp.conf was updated to match the actual
behavior, which was NOT exponential
increments, but something more arcane as described
in the APAR below. That behavior (decrementing!)
I remember as resulting in a lot of
false alarms that appeared to make
no sense. That behavior was modified in V6.0.1
to just do what you told it to do.
However, the man page for ovsnmp.conf still talks
about increments and decrements. Time
for another update...
(What does increment exponentially, as I
recall, is name resolution timeouts with retries.)
Here's the current behavior, from the
release notes for 6.0.1, in the new features section:
Ping and SNMP Timeout Values
The netmon daemon no longer dynamically adjusts the ping and
SNMP timeout values. These values remain as configured in the
SNMP Options dialog.
And here's the description of the APAR
that tells how it was behaving before that:
APAR - IX68241
TIMEOUT VALUE FOR STATUS POLL IS INCREASED BY 1 SECOND, BUT MAN PAGE
STATES AN EXPONENTIAL ALGORITHM IS USED.
The man page for ovsnmp.conf describes the timeoutInterval for
polling a device. The timeoutInterval value is described to
begin at x and is doubled until the specified timeoutInterval
value is reached on the last retry. This is not how the
timeoutInterval value actually works. The first status poll
to a device uses the timeoutInterval value specified. If the
device replies within this time, then the value is decremented
by 1 second. The value will continue to be decremented until
it reaches either 1 second or the device no longer responds to
the poll within the time. When the device no longer responds
within the time, then the timeoutInterval value is increased
by 1 second, except on the last retry the timeoutInterval value
is increased to the full value specified plus 1 second.
The ovsnmp.conf man page has been updated to correct the description
of the timeoutInterval value.
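
For anyone who wants to see what that pre-6.0.1 adjustment looks like in practice, here is a rough Python sketch of the rule as the APAR text describes it. It is only an illustration, not netmon code: the names next_timeout and simulate are made up, the fixed device response time is an assumption, and treating "the full value specified plus 1 second" as the configured value + 1 is just one reading of the APAR wording.

```python
def next_timeout(current, configured, replied, last_retry=False):
    """Return the timeout (seconds) to use for the next poll.

    Hypothetical helper modelling the APAR IX68241 description.
    """
    if replied:
        # Reply arrived in time: shave one second, but never go below 1.
        return max(1, current - 1)
    if last_retry:
        # Assumed reading: the final retry uses the configured value plus 1.
        return configured + 1
    # Ordinary miss: lengthen the timeout by one second.
    return current + 1


def simulate(configured, response_time, polls):
    """Record (timeout, replied) for a device with a fixed response time."""
    timeout = configured
    history = []
    for _ in range(polls):
        replied = response_time <= timeout
        history.append((timeout, replied))
        timeout = next_timeout(timeout, configured, replied)
    return history


if __name__ == "__main__":
    for timeout, replied in simulate(configured=5, response_time=2.5, polls=6):
        print(timeout, "reply" if replied else "MISS")
```

With a configured timeout of 5 seconds and a device that always answers in 2.5 seconds, the timeout walks down 5, 4, 3, 2, misses at 2, goes back up to 3, and then misses again at 2. Under these assumptions a perfectly healthy device keeps missing polls, which fits the puzzling false alarms mentioned above.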
Cordially,
Leslie A. Clark
IBM Global Services - Systems Mgmt & Networking
(248) 552-4968 Voicemail, Fax, Pager
"Evans, Bill" <Bill.Evans AT hq.doe DOT gov>
Sent by: owner-nv-l AT lists.us.ibm DOT com
06/24/2005 04:49 PM
To: "'nv-l AT lists.us.ibm DOT com'" <nv-l AT lists.us.ibm DOT com>
cc:
Subject: RE: [nv-l] Status Polling
Your description was much clearer than mine.
"NetView compensates by its geometrically increasing waits on retries"
Translation: "Every retry doubles the timeout value" = "a geometric progression".
The NetView for Administrators class also used to warn attendees
not to set the retries and wait time so high that the total exceeded
the polling cycle. Your example of 7 retries with a 1-second timeout
would exceed a two-minute polling cycle. NetView Administration
is not for the arithmetically challenged or those who don't appreciate
the relationships among the tuning values.
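
The arithmetic behind that warning is easy to check. The few lines of Python below are only a sketch under the doubling assumption discussed in this thread; doubling_timeouts is a made-up helper, and the 120-second figure is simply the two-minute polling cycle from the example.

```python
def doubling_timeouts(first_timeout, attempts):
    """Timeouts (seconds) if each attempt doubles the previous one (assumed)."""
    return [first_timeout * 2 ** i for i in range(attempts)]


if __name__ == "__main__":
    polling_cycle = 120  # two-minute polling cycle, in seconds
    timeouts = doubling_timeouts(first_timeout=1, attempts=7)
    total = sum(timeouts)
    print(timeouts)               # [1, 2, 4, 8, 16, 32, 64]
    print(total)                  # 127
    print(total > polling_cycle)  # True
```

The seven waits add up to 127 seconds, which is 7 seconds more than the polling cycle.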
Thanks for clearing up my muddy wording.
Bill Evans
-----Original Message-----
From: owner-nv-l AT lists.us.ibm DOT com [mailto:owner-nv-l AT lists.us.ibm DOT com]
On Behalf Of Barr, Scott
Sent: Friday, June 24, 2005 4:28 PM
To: nv-l AT lists.us.ibm DOT com
Subject: RE: [nv-l] Status Polling
One warning about retries.
Each time you retry the SNMP request or ping, netmon appears to double
the timeout value. So, if you set 7 retries with a 1 second timeout, you
get timeout values of 1, 2, 4, 8, 16, 32, 64 seconds.
If you have a lot of nodes this way, that can cause more issues than it
solves.
One caveat: I've been doing TEC/Framework/ITM for a while, so the way
netmon behaves may have changed some time ago.