Veritas-bu

[Veritas-bu] Qued jobs not being assigned to tapes.

2001-06-29 10:40:14
Subject: [Veritas-bu] Qued jobs not being assigned to tapes.
From: jmeyer AT ptc DOT com (Jonathan Meyer)
Date: Fri, 29 Jun 2001 10:40:14 -0400
Robert,

We had a problem whose symptoms were very similar to the ones you
describe.  It occurred on a single Solaris master server and showed
the following symptoms.

    o Tapes were not unloaded when backups completed.
    o Jobs which were queued never became active.
    o Once the problem started, we could not get anything to run until
      we stopped and restarted all NetBackup processes.

We worked with Veritas support on this issue for a while, and at
first they could not find anything similar.

Eventually, they suggested that similar symptoms can be caused by
running out of reserved ports on the system.  The symptoms above are
not among the documented symptoms of running out of reserved ports,
but apparently running out of these ports can cause unpredictable
behavior.
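
One rough way to check for this condition is to count connections that
sit in TIME_WAIT on a reserved local port (below 1024).  The following
is only a sketch, not something from the technote: the function name
is made up, and it assumes Solaris-style "netstat -an" output, where
the port follows the last dot of the address (for example
10.1.1.5.1021) and the state is the last field.

```shell
#!/bin/sh
# Count TCP connections in TIME_WAIT whose local port is reserved (< 1024).
# Assumption: Solaris-style "netstat -an" lines, with the local address in
# the first field (port after the last dot) and the state in the last field.
count_reserved_time_wait() {
    awk '$NF == "TIME_WAIT" {
        n = split($1, parts, ".")      # the port is the last dotted piece
        port = parts[n] + 0
        if (port > 0 && port < 1024) count++
    }
    END { print count + 0 }'
}

# Usage on a live system (not run here):
#   netstat -an | count_reserved_time_wait
```

A count that stays high while jobs are queued would be consistent with
the reserved port explanation, but it does not prove it.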

If your problem is similar to the one we experienced, there are a
number of possible solutions discussed at
http://seer.support.veritas.com/docs/234618.htm.

The solution we implemented was to reduce tcp_close_wait_interval by
putting the following in our boot sequence.

ndd -set /dev/tcp tcp_close_wait_interval 1000

This change fixed our problem.  This was on a Solaris 2.6 system.  I
am not sure if the same parameter applies to other OS versions.
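
For anyone who wants to do the same, here is a minimal sketch of a
boot script that applies the setting and reads it back.  The script
location (something like /etc/rc2.d/S99nddtune), the function name and
the NDD override variable are my own assumptions for illustration;
only the ndd -set command itself is what we actually ran.

```shell
#!/bin/sh
# Hypothetical boot script; file name and layout are assumptions.
NDD=${NDD:-/usr/sbin/ndd}          # path to ndd, overridable for testing
PARAM=tcp_close_wait_interval      # parameter name as used on Solaris 2.6
VALUE=1000                         # value in milliseconds

set_tcp_param() {
    # Apply the setting, then query it back to confirm it took effect.
    "$NDD" -set /dev/tcp "$PARAM" "$VALUE" || return 1
    current=`"$NDD" /dev/tcp "$PARAM"`
    [ "$current" = "$VALUE" ]
}

# Only attempt the change where ndd is actually present.
if [ -x "$NDD" ]; then
    set_tcp_param || echo "warning: could not set $PARAM" >&2
fi
```

Reading the value back is just a sanity check; settings made with ndd
do not survive a reboot, which is why this has to live in the boot
sequence.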

As the technote warns, this change should not be made lightly and
should be closely observed afterward.  The Solaris default for this
parameter is 240000 milliseconds (4 minutes), so dropping it to 1000
(1 second) is a significant change.

However, those of you who have the Vault product may recognize that
this same change is recommended in the Vault manuals to solve some
problems with duplication on Solaris.  It is my understanding that
this change will probably not cause adverse effects if the systems
are connected by a high speed switched network.

The technote describes other possible solutions, and also some details
about what you might see in your bptm log if this is the problem you
are having.

--------------------------------------------------
Jonathan Meyer
(781)370-6594
UNIX Systems Administrator
Parametric Technology Corporation
--------------------------------------------------

