[Veritas-bu] Memory issues on Solaris 10 x86 Master Server

2008-07-11 13:34:06
Subject: [Veritas-bu] Memory issues on Solaris 10 x86 Master Server
From: mdglazerman <netbackup-forum AT backupcentral DOT com>
To: VERITAS-BU AT mailman.eng.auburn DOT edu
Date: Fri, 11 Jul 2008 09:00:34 -0400
Our NBU Master is an HP DL385 G2 with 4GB of memory.  We are running NBU 6.5.1 
on Solaris 10 x86.

On Monday and again last night the server completely ran out of swap space, 
which all but brought it to its knees.  Backups slowed to a crawl and it was 
almost impossible to issue any commands on the server because it took so long 
to respond.  On Monday I managed to stop the NBU services 
(/etc/init.d/netbackup stop), which greatly improved things, but to make sure 
I also rebooted the server.  Everything seemed fine until last night, when I 
started seeing backup errors again.  The box wasn't hung to the same extent 
as on Monday, but a quick look at dmesg showed the following:

Jul 10 21:08:49 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:08:50 nbumaster01 last message repeated 1 time
Jul 10 21:08:51 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 4721 (nbproxy)
Jul 10 21:08:57 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING: /etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:08:57 nbumaster01 last message repeated 1 time
Jul 10 21:18:45 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:19:11 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork inetd_start method of instance svc:/network/bpcd/tcp:default: Not enough space
Jul 10 21:19:14 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:19:28 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING: /etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:19:28 nbumaster01 last message repeated 1 time
Jul 10 21:20:08 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 4722 (bprd)
Jul 10 21:20:51 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:20:54 nbumaster01 last message repeated 1 time
Jul 10 21:21:52 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 4857 (egrep)
Jul 10 21:21:52 nbumaster01 elfexec: [ID 163280 kern.notice] ps: Cannot map /lib/ld.so.1
Jul 10 21:21:52 nbumaster01 elfexec: [ID 163280 kern.notice] egrep: Cannot map /lib/ld.so.1
Jul 10 21:21:52 nbumaster01 last message repeated 1 time
Jul 10 21:21:52 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 4860 (sh)
Jul 10 21:21:52 nbumaster01 elfexec: [ID 163280 kern.notice] ps: Cannot map /lib/ld.so.1
Jul 10 21:21:52 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 4866 (sh)
Jul 10 21:21:52 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 4863 (ps)
Jul 10 21:21:52 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING: /etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:25:04 nbumaster01 last message repeated 7 times
Jul 10 21:34:11 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING: /etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:34:11 nbumaster01 last message repeated 1 time
Jul 10 21:35:03 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:35:06 nbumaster01 last message repeated 1 time
Jul 10 21:40:14 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING: /etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:41:07 nbumaster01 last message repeated 4 times
Jul 10 22:19:52 nbumaster01 vmd[962]: [ID 631293 daemon.notice] terminating - successful (0)
Jul 10 22:19:52 nbumaster01 vmd[962]: [ID 715111 daemon.error] volume daemon terminating because it received a signal (15)
Jul 10 22:19:52 nbumaster01 vmd[962]: [ID 164182 daemon.error] terminating - daemon terminated (7)
Jul 10 22:23:10 nbumaster01 vmd[6960]: [ID 617826 daemon.notice] ready for connections
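
To make the pattern in the excerpt easier to see, here is a minimal Python 
sketch that tallies the three swap-related message types shown above: stack 
growth failures per process name, inetd fork failures per service, and tmpfs 
"swap space limit exceeded" warnings.  It is not part of NetBackup or 
Solaris; the function name summarise and the SAMPLE text (a few entries 
copied from the excerpt, with wrapped lines rejoined) are only illustrative, 
and it assumes each log entry sits on a single line.

```python
import re
from collections import Counter

# A few entries copied from the excerpt above, with wrapped lines rejoined.
SAMPLE = """\
Jul 10 21:08:49 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork inetd_start method of instance svc:/network/vnetd/tcp:default: Not enough space
Jul 10 21:08:51 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 4721 (nbproxy)
Jul 10 21:08:57 nbumaster01 tmpfs: [ID 518458 kern.warning] WARNING: /etc/svc/volatile: File system full, swap space limit exceeded
Jul 10 21:19:11 nbumaster01 inetd[414]: [ID 702911 daemon.error] Unable to fork inetd_start method of instance svc:/network/bpcd/tcp:default: Not enough space
Jul 10 21:20:08 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 4722 (bprd)
Jul 10 21:21:52 nbumaster01 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 4857 (egrep)
"""

# Patterns are based only on the message wording seen in this excerpt.
STACK_RE = re.compile(r"no swap space to grow stack for pid (\d+) \((\S+)\)")
FORK_RE = re.compile(
    r"Unable to fork inetd_start method of instance (svc:\S+?): Not enough space"
)
TMPFS_RE = re.compile(r"tmpfs: .*swap space limit exceeded")


def summarise(log_text):
    """Return (stack failures by process, fork failures by service, tmpfs count).

    "last message repeated N times" lines are ignored, so the counts are
    lower bounds rather than exact totals.
    """
    stack_failures = Counter()
    fork_failures = Counter()
    tmpfs_full = 0
    for line in log_text.splitlines():
        match = STACK_RE.search(line)
        if match:
            stack_failures[match.group(2)] += 1
            continue
        match = FORK_RE.search(line)
        if match:
            fork_failures[match.group(1)] += 1
            continue
        if TMPFS_RE.search(line):
            tmpfs_full += 1
    return stack_failures, fork_failures, tmpfs_full


if __name__ == "__main__":
    stack, fork, tmpfs = summarise(SAMPLE)
    print("stack growth failures by process:", dict(stack))
    print("inetd fork failures by service:", dict(fork))
    print("tmpfs swap-limit warnings:", tmpfs)
```

Run against the full excerpt, a tally like this shows that ordinary commands 
(sh, ps, egrep) fail alongside the NBU processes (nbproxy, bprd).  That says 
swap was exhausted system-wide, but not which process consumed it.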

Restarting the NBU services got all the backups running normally again, and 
the problem didn't reappear overnight.

Before I start beefing up the box with more memory etc., has anyone heard of 
any memory leak issues, particularly with NBU 6.5.1?  We had this box as our 
master server running NBU 6.0 MP4 on Red Hat for about 18 months prior to the 
6.5.1 upgrade without the slightest hint of memory-related issues.  This 
server does nothing but act as the NBU master, and we have an identical box 
working as a media server, again with no issues so far.

Any ideas?

Mark :)

+----------------------------------------------------------------------
|This was sent by mark.glazerman AT spartech DOT com via Backup Central.
|Forward SPAM to abuse AT backupcentral DOT com.
+----------------------------------------------------------------------


