Re: [Bacula-users] Memory and swap usage, plus duplicated processes
2008-05-28 03:36:53
Hi,
28.05.2008 00:52, Andy Shellam wrote:
Ah, another Nagios-user mailing list person :-)
> Hi,
>
> I have a virtual server which I'm trying to backup with Bacula 2.2.8,
> but it keeps crashing the machine. The machine has 256MB RAM allocated,
> and 512MB swap space (minimal specs I know but it doesn't do a lot!)
>
> While monitoring the server using top, when the backup starts the free
> memory drops steadily from 200MB free down to 2MB, then hovers around
> 2-5MB. The swap space isn't touched. 5 minutes later the machine dies
> with a memory allocation error, still with all 512MB swap space free.
>
> My server provider tells me this is expected, that because it's a single
> process eating into all the RAM, the server can't swap it. Is this
> true?
Hmm... I'm not sure. It might be. But that shouldn't crash the whole
machine. If the FD gets killed, OK, but nothing worse should happen IMO.
If that's an old Linux kernel, the behaviour you see might be
expected; newer ones tend to kill some processes before things get
really dramatic (the OOM killer, OOM being short for "Out Of Memory").
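If you want to check whether the OOM killer was involved, you could search the kernel messages for it. What follows is only a rough sketch: count_oom_lines is a made-up helper name, and the search pattern is my assumption, since the exact wording of these messages differs between kernel versions.

```shell
# Hypothetical helper: count kernel log lines that look like OOM-killer
# activity. Reads log text on stdin and prints the number of matching lines.
# The pattern is an assumption; message wording varies between kernels.
count_oom_lines() {
    grep -Eic 'out of memory|oom-killer'
}

# Possible usage (not run here):
#   dmesg | count_oom_lines
```

After a crash and reboot the dmesg buffer is gone, so you would have to feed it whatever file your syslog writes kernel messages to instead.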
> I have 2 other servers with the same provider, identical in every
> way except they have 512MB RAM and 1GB swap, and they both backup just
> fine.
>
> Another thing that is different is that on this troublesome machine, the
> bacula startup script (/usr/local/bacula/etc/bacula start) starts 3
> bacula-fd processes, but on all my other machines it only starts 1. Is
> there any reason for this?
Are these the same OS versions?
I suspect that the small machine is running an older version of the
(Linux) OS or of the ps program.
The thing is that threads (aka light-weight processes) are sometimes
displayed as separate processes, and sometimes not, depending on the
software you use.
For example, on a reasonably new OS:
arno@elf:~> ps -lfLC bacula-fd
F S UID PID PPID LWP C NLWP PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
1 S root 4718 1 4718 0 2 76 0 - 11968 - 2007 ? 00:00:21 /usr/sbin/bacula-fd -c /etc/bacula/bacula-fd.conf
1 S root 4718 1 4719 0 2 76 0 - 11968 322561 2007 ? 00:00:04 /usr/sbin/bacula-fd -c /etc/bacula/bacula-fd.conf
This displays the threads separately.
arno@elf:~> ps -lfC bacula-fd
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
1 S root 4718 1 0 76 0 - 11968 - 2007 ? 00:00:21 /usr/
This does not, but on older software it would display two processes.
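To see how many real processes are behind the entries, you can compare the PID and LWP columns: threads of one process share a PID but have different LWP ids, as in my first listing above. A small sketch, assuming a procps ps that supports -L and the pid, lwp and comm output fields; summarise_threads is a hypothetical helper, not something Bacula ships.

```shell
# Hypothetical helper: summarise "PID LWP COMMAND" lines read from stdin
# for one command name. Lines sharing a PID are threads of one process.
summarise_threads() {
    # $1 = command name to look for, e.g. bacula-fd
    awk -v name="$1" '
        $3 == name {
            if (!($1 in seen)) { seen[$1] = 1; procs++ }   # new PID seen
            threads++                                      # every line is one LWP
        }
        END { printf "%d process(es), %d thread(s)\n", procs, threads }
    '
}

# Possible usage (not run here):
#   ps -eLo pid=,lwp=,comm= | summarise_threads bacula-fd
```

If the three entries on the troublesome server each show up as their own PID here too, they are probably threads presented as processes by an older thread implementation, not three separately started daemons.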
> On troublesome server (with backup job disabled):
>
> root ~ # ps aux|grep bacula-fd
> root 3160 0.0 0.5 13200 1392 ? Ss 21:55 0:00 /usr/local/bacula/sbin/bacula-fd -u root -g root -v -c /usr/local/bacula/etc/bacula-fd.conf
> root 3162 0.0 0.5 13200 1392 ? S 21:55 0:00 /usr/local/bacula/sbin/bacula-fd -u root -g root -v -c /usr/local/bacula/etc/bacula-fd.conf
> root 3163 0.0 0.5 13200 1392 ? S 21:55 0:00 /usr/local/bacula/sbin/bacula-fd -u root -g root -v -c /usr/local/bacula/etc/bacula-fd.conf
>
> On other servers (with backup job live as normal):
> root ~ # ps aux|grep bacula-fd
> root 11114 0.0 0.3 27208 1628 ? Ssl 19:26 0:00 /usr/local/bacula/sbin/bacula-fd -u root -g root -v -c /usr/local/bacula/etc/bacula-fd.conf
See the "l" in the process state? From my ps(1) man page:
> l is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
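So "Ssl" on your other servers means one multi-threaded process, while "Ss" and "S" on the troublesome one carry no such flag. If you want to test for that flag in a script, something like the following might do. It is only a sketch, and stat_is_multithreaded is a name I made up.

```shell
# Hypothetical helper: succeed if a ps STAT string (e.g. "Ssl") contains
# the lowercase "l" flag that ps uses for multi-threaded processes.
# The match is case-sensitive, so an uppercase "L" does not count.
stat_is_multithreaded() {
    case "$1" in
        *l*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Possible usage (not run here):
#   stat_is_multithreaded "Ssl" && echo "multi-threaded"
# On glibc systems, "getconf GNU_LIBPTHREAD_VERSION" should also tell you
# which thread library (NPTL or LinuxThreads) is in use.
```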
...
Arno
--
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de