[Veritas-bu] NBU 4.5 FP5 Activity Monitor Allocation Failed Error
2003-10-30 12:00:11
Subject: [Veritas-bu] NBU 4.5 FP5 Activity Monitor Allocation Failed Error
From: David Rock <dave-bu AT graniteweb DOT com>
Date: Thu, 30 Oct 2003 11:00:11 -0600
* kdeems AT parker DOT com <kdeems AT parker DOT com> [2003-10-30 09:41]:
>
> Hi,
> I am having the same problem in an AIX environment. Did anybody test the
> changes suggested in tech note 237665? My system is running much slower
> since the FP5 upgrade, not to mention the whole system actually locks up.
The problem is stale Java connections between the client and the server.
It has NOTHING to do with settings in the client-side Java. I talked to
Veritas support about cleaning up defunct processes on the master, and
their answer was to identify the parent Java process of each defunct
process and kill it. Most of our defunct processes come from clients
that have been left running for long periods and have either gone stale
on their own or have been cut off by network settings that disconnect
idle connections. Here is an example:
 UID   PID  PPID C    STIME TTY  TIME CMD
root  7069  7068 0 07:08:52 ?    0:00 /usr/openv/netbackup/bin/bpjava-susvc root 0 -1 en_US /usr/openv/java/auth.conf
root  7070  7068 0 07:08:53 ?    0:00 /usr/openv/netbackup/bin/bpjava-susvc root 0 -1 en_US /usr/openv/java/auth.conf
root  7352  7068 0               0:00 <defunct>
root  7347  7068 0               0:00 <defunct>
root  7355  7068 0               0:00 <defunct>
root  7346  7068 0 07:14:16 ?    0:00 /usr/openv/netbackup/bin/bpjava-susvc root 0 -1 en_US /usr/openv/java/auth.conf
root  7084  7068 0 07:08:56 ?    0:00 /usr/openv/netbackup/bin/bpjava-susvc root 0 -1 en_US /usr/openv/java/auth.conf
root  7068     1 0 07:08:52 ?    0:00 /usr/openv/netbackup/bin/bpjava-susvc root 0 -1 en_US /usr/openv/java/auth.conf
root  7108  7068 0               0:01 <defunct>
root 14401  7068 0 07:45:13 ?    0:00 /usr/openv/netbackup/bin/bpjava-susvc root 0 -1 en_US /usr/openv/java/auth.conf
root 14888  7068 0 07:56:39 ?    0:07 /usr/openv/netbackup/bin/bpjava-susvc root 0 -1 en_US /usr/openv/java/auth.conf
root  7079  7068 0 07:08:55 ?    0:00 /usr/openv/netbackup/bin/bpjava-susvc root 0 -1 en_US /usr/openv/java/auth.conf
root  8970  7068 0               0:06 <defunct>
root  7077  7068 0 07:08:55 ?    0:00 /usr/openv/netbackup/bin/bpjava-susvc root 0 -1 en_US /usr/openv/java/auth.conf
These are all the processes that have PPID 7068, along with 7068 itself.
You will notice that a number of them are defunct. PID 7068 is
bpjava-susvc (the Java client connection). If you kill that process, all
the other related child processes will die also.
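To see at a glance which parents are holding defunct children, something
like the following could be used. This is only a minimal sketch and not
anything supplied by Veritas: the function name defunct_by_parent is made
up for illustration, and it assumes that the PPID is the third column and
that defunct rows end in <defunct>, as in the Solaris listing above.

```shell
#!/bin/sh
# Hypothetical helper: count defunct processes per parent PID.
# Reads `ps -ef` style output on stdin, so it can also be run
# against a saved listing instead of the live process table.
# Assumes column 3 is the PPID and defunct rows end in "<defunct>".
defunct_by_parent() {
    awk '$NF == "<defunct>" { count[$3]++ }
         END { for (ppid in count) print ppid, count[ppid] }' | sort -n
}

# Typical use on a live system:
#   ps -ef | defunct_by_parent
```

Each output line is a parent PID followed by the number of defunct
children it has. For the listing above that would be a single line for
PID 7068.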
Here is the Korn shell script I am using to clean up these processes.
Please keep in mind that this is on a Solaris 8 system, so the exact
commands and paths may be a little different on yours. The reason for
the logic at the end is so you don't kill the parent process UNLESS it
is bpjava-susvc. This works well for me, but PLEASE test this on your
own system before putting it into production.
#!/bin/ksh
#
# cleanbpjava.ksh
#
# script to clean up defunct NetBackup java client connections
#
GREP=/usr/xpg4/bin/grep

# Collect the unique parent PIDs (column 3) of all defunct processes
for p in `ps -eaf | $GREP defunct | $GREP -v grep | awk '{ print $3 }' | sort -u`
do
    # Kill the parent only if it is a bpjava-susvc process
    ps -eaf | /usr/local/bin/gawk -v PID=$p 'PID==$2{print}' | $GREP -q bpjava-susvc
    if [ "$?" -eq "0" ]
    then
        kill -9 $p
    fi
done

# End of script
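
If you would rather review the candidates before killing anything, a
dry-run variant along these lines may help. It is a sketch under the same
assumptions as the script above (PID in column 2, PPID in column 3,
defunct rows ending in <defunct>), and the function name
stale_bpjava_parents is hypothetical. It only prints PIDs and never calls
kill itself.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the PIDs of bpjava-susvc processes
# that are the parent of at least one defunct process.
# Reads `ps -ef` style output on stdin and does not kill anything.
stale_bpjava_parents() {
    awk '
        $NF == "<defunct>" { zombie_parent[$3] = 1 }  # PPID of each defunct row
        /bpjava-susvc/     { is_bpjava[$2] = 1 }      # PID of each bpjava-susvc row
        END {
            for (pid in zombie_parent)
                if (pid in is_bpjava) print pid
        }' | sort -n
}

# Typical use on a live system, to review the list first:
#   ps -ef | stale_bpjava_parents
```

Parents of defunct processes that are not bpjava-susvc are left out of
the output, which mirrors the safety check in the script above.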
-- 
David Rock
david AT graniteweb DOT com