Our NBU Master is an HP DL385 G2 with 4GB of memory. We are running NBU 6.5.1
on Solaris 10 x86.
On Monday and again last night the server completely ran out of swap space,
which all but brought the server to its knees. Backups started crawling and
it was almost impossible to issue any commands on the server because it took so
long to respond. On Monday I managed to stop the NBU services
(/etc/init.d/netbackup stop), which greatly improved things, but to make sure, I
also rebooted the server. Everything seemed fine until last night, when I
started seeing backup errors again. The box wasn't hung up to the same extent
that it was on Monday, but a quick look in dmesg showed the following:
Jul 10 21:08:49 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork
inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:08:50 nbumaster01 last message repeated 1 time
Jul 10 21:08:51 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry,
no swap space to grow stack for pid 4721 (nbproxy)
Jul 10 21:08:57 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING:
/etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:08:57 nbumaster01 last message repeated 1 time
Jul 10 21:18:45 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork
inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:19:11 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork
inetd_start method of instance svc:/network/bpcd/tcp:default: Not enough space
Jul 10 21:19:14 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork
inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:19:28 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING:
/etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:19:28 nbumaster01 last message repeated 1 time
Jul 10 21:20:08 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry,
no swap space to grow stack for pid 4722 (bprd)
Jul 10 21:20:51 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork
inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:20:54 nbumaster01 last message repeated 1 time
Jul 10 21:21:52 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry,
no swap space to grow stack for pid 4857 (egrep)
Jul 10 21:21:52 nbumaster01 elfexec: [ID 163280 kern.notice] ps: Cannot map
/lib/ld.so.1
Jul 10 21:21:52 nbumaster01 elfexec: [ID 163280 kern.notice] egrep: Cannot map
/lib/ld.so.1
Jul 10 21:21:52 nbumaster01 last message repeated 1 time
Jul 10 21:21:52 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry,
no swap space to grow stack for pid 4860 (sh)
Jul 10 21:21:52 nbumaster01 elfexec: [ID 163280 kern.notice] ps: Cannot map
/lib/ld.so.1
Jul 10 21:21:52 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry,
no swap space to grow stack for pid 4866 (sh)
Jul 10 21:21:52 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry,
no swap space to grow stack for pid 4863 (ps)
Jul 10 21:21:52 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING:
/etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:25:04 nbumaster01 last message repeated 7 times
Jul 10 21:34:11 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING:
/etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:34:11 nbumaster01 last message repeated 1 time
Jul 10 21:35:03 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork
inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:35:06 nbumaster01 last message repeated 1 time
Jul 10 21:40:14 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING:
/etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:41:07 nbumaster01 last message repeated 4 times
Jul 10 22:19:52 nbumaster01 vmd[962]: [ID 631293 daemon.notice] terminating -
successful (0)
Jul 10 22:19:52 nbumaster01 vmd[962]: [ID 715111 daemon.error] volume daemon
terminating because it received a signal (15)
Jul 10 22:19:52 nbumaster01 vmd[962]: [ID 164182 daemon.error] terminating -
daemon terminated (7)
Jul 10 22:23:10 nbumaster01 vmd[6960]: [ID 617826 daemon.notice] ready for
connections
Restarting the NBU services got all the backups running normally again and the
problem didn't appear again overnight.
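
For anyone who wants to tally which processes and inetd services were hit in a
log like the one above, here is a minimal sketch in Python. It is not part of
NetBackup or Solaris; the function name summarize and the two regular
expressions are my own assumptions, written against the exact message wording
shown above. It does not add in the "last message repeated N times" lines, so
the counts are a lower bound.

```python
import re
from collections import Counter

# Kernel warning as it appears in the log above: captures pid and process name.
STACK_RE = re.compile(r"no swap space to grow stack for pid (\d+) \(([^)]+)\)")
# inetd fork failure as it appears above: captures the SMF instance name.
FORK_RE = re.compile(
    r"Unable to fork inetd_start method of instance (\S+): Not enough space"
)


def summarize(log_text):
    """Return (processes, services) Counters of swap-related failures.

    Hypothetical helper: counts only the two message types matched above.
    """
    # Mail clients wrap long syslog lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    processes = Counter(name for _pid, name in STACK_RE.findall(flat))
    services = Counter(FORK_RE.findall(flat))
    return processes, services
```

Calling summarize() on the text of the saved log and printing
processes.most_common() gives a quick view of whether the NBU daemons
(nbproxy, bprd) or ordinary commands (sh, ps, egrep) were the ones failing.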
Before I start beefing up the box with more memory etc., has anyone heard of
any memory leak issues, particularly with NBU 6.5.1? We had this box as our
master server running NBU 6.0 MP4 on Red Hat for about 18 months prior to the
6.5.1 upgrade with not even the slightest hint of memory-related issues. This
server doesn't do anything else but act as the NBU master, and we have an
identical box working as a media server, again with no issues so far.
Any ideas?
Mark :)
+----------------------------------------------------------------------
|This was sent by mark.glazerman AT spartech DOT com via Backup Central.
+----------------------------------------------------------------------