Bacula-users

Re: [Bacula-users] Memory and swap usage, plus duplicated processes

2008-05-28 07:21:20
From: Arno Lehmann <al AT its-lehmann DOT de>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 28 May 2008 13:19:59 +0200

Hi,

28.05.2008 11:47, Andy Shellam wrote:
> Hi Arno,
> 
> Thanks for your reply - my comments are inline.
> 
> 
> Quoting Arno Lehmann <al AT its-lehmann DOT de>:
> 
>> Hi,
>>
>> 28.05.2008 00:52, Andy Shellam wrote:
>>
>> Ah, another Nagios-user mailing list person :-)
> 
> Yes, I thought I'd seen your name somewhere else!  I use Nagios to  
> monitor the Bacula processes, hence how my attention was drawn to the  
> fact that this one machine displays 3 processes instead of 1 for  
> bacula-fd :-)
> 
> 
>>> While monitoring the server using top, when the backup starts the free
>>> memory drops steadily from 200MB free down to 2MB, then hovers around
>>> 2-5MB.  The swap space isn't touched.  5 minutes later the machine dies
>>> with a memory allocation error, still with all 512MB swap space free.
>>>
>>> My server provider tells me this is expected, that because it's a single
>>> process eating into all the RAM, the server can't swap it.  Is this
>>> true?
>> Hmm... I'm not sure. Might be. But that shouldn't crash the whole
>> machine. If the FD gets killed - ok. But nothing worse should happen IMO.
> 
> That's what I thought; I can't believe the kernel would allow this to  
> happen.  The start of the kernel stack-trace at the time of the crash  
> is:
> 
> [2362406.679548] BUG: unable to handle kernel paging request at virtual address 00100104
> [2362406.679561]  printing eip:
> [2362406.679564] c0175fce
> [2362406.679568] 086e9000 -> *pde = 00000000:607e5001
> [2362406.679571] 08f38000 -> *pme = 00000000:00000000
> [2362406.679575] Oops: 0002 [#1]
> [2362406.679577] SMP

That's beyond me to analyze.
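
On the memory figures quoted further up (free memory falling to a few
MB while swap stays unused): on Linux that pattern on its own is usually
the page cache filling up as the backup reads files, and the kernel can
drop that cache again when something needs the memory. A minimal sketch
for checking this, assuming a Linux /proc/meminfo with the MemFree,
Buffers and Cached fields; it is not part of Bacula, and the function
names are made up for illustration:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def reclaimable_kb(info):
    """Rough estimate only: free memory plus buffers and page cache.

    Part of "Cached" may not actually be reclaimable, so treat the
    result as an upper bound rather than an exact figure.
    """
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


def report(path="/proc/meminfo"):
    """Print the raw free figure next to the estimate with cache added back."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    print("MemFree:        %d kB" % info.get("MemFree", 0))
    print("Free + cache:   %d kB" % reclaimable_kb(info))
    print("SwapFree:       %d kB" % info.get("SwapFree", 0))
```

Calling report() while the backup runs shows whether the second figure
stays large even though MemFree is tiny. If both collapse, the FD (or
something else) really is consuming the memory.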

...
> Yep, they're all Debian Linux 4.0.  It is a newer machine, so I guess  
> the kernel versions could be different; I'll have a check.  I wouldn't  
> mind betting that it's not fully patched come to think of it, so I'll  
> try that also.
> 
...
> Could this be an indication that the threading library/model used on  
> the smaller machine is different to my other 2?  If this is the case,  
> could this be a contributing factor to the crash?

Possible. I know that there are NPTL and LinuxThreads threading models 
(both implement pthreads), I know that they differ, I know that 
LinuxThreads maps threads to processes, and I know that you can - 
sometimes - select which one to use.

I know nothing about the links between these facts ;-)

In other words - sure, but I wouldn't know.
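
One thing that can at least be compared across the three machines is
which POSIX threads implementation glibc reports (NPTL or the older
LinuxThreads). With LinuxThreads each thread shows up as its own
process, which would be consistent with seeing 3 bacula-fd entries
instead of 1, though that is only a guess here. A small sketch,
assuming a glibc-based Linux system; the helper names are hypothetical:

```python
import os


def pthread_implementation():
    """Return glibc's thread library string, or None if it is unavailable.

    On glibc systems this is the same value that
    `getconf GNU_LIBPTHREAD_VERSION` prints.
    """
    try:
        return os.confstr("CS_GNU_LIBPTHREAD_VERSION")
    except (ValueError, OSError, AttributeError):
        # Name unknown to this libc, or os.confstr missing on this platform.
        return None


def maps_threads_to_processes(version):
    """True if the version string names LinuxThreads (one process per thread)."""
    return version is not None and version.lower().startswith("linuxthreads")


def describe():
    """Print which implementation this machine reports."""
    version = pthread_implementation()
    if version is None:
        print("thread library version not reported on this system")
    elif maps_threads_to_processes(version):
        print("%s: threads appear as separate processes" % version)
    else:
        print("%s: threads share one process ID" % version)
```

Running describe() on each machine and comparing the output would show
whether the smaller one really differs. It says nothing about whether
that difference has anything to do with the crash.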

Perhaps one of the people more familiar with this sort of thing has 
more helpful comments.

Arno

-- 
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de

