Bacula-users

[Bacula-users] bacula-dir virtual memory limit during restore

From: John Kloss <John.Kloss AT jhmi DOT edu>
To: baculausers <bacula-users AT lists.sourceforge DOT net>
Date: Tue, 3 Jun 2008 10:52:34 -0400

Hello,

I am currently running bacula-2.2.8 compiled as a 32-bit binary.
I am using postgresql-8.3.1 as the catalog, also compiled as a 32-bit
binary.
I am running bacula on Solaris 9 in 64-bit mode, which of course
means I can run both 64-bit and 32-bit binaries.

ulimit -v shows unlimited.  I know that's a lie and that the soft limit
is 2GB.  I know that I can change that to 3.5GB or so for a 32-bit
process.  I have done so and then started bacula-dir.
ulimit -d shows unlimited.  I know that's also a lie and that the
default limit is 2GB.  I know that I can change that to 3.5GB or so
for a 32-bit process.  I have done so (along with ulimit -v) and then
started bacula-dir.
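
For anyone who wants to double-check what limits a process really inherits, here is a minimal sketch in Python rather than anything bacula-specific. It assumes a Unix system where the standard `resource` module is available, and that RLIMIT_AS and RLIMIT_DATA correspond to what the shell shows for ulimit -v and ulimit -d. The helper names are made up for illustration. Note that getrlimit reports bytes, while ulimit usually prints kilobytes.

```python
import resource


def describe_limit(value):
    """Render an rlimit value; RLIM_INFINITY is what ulimit shows as 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.2f} GiB"


def report_limits():
    """Return (soft, hard) limits for address space and data segment, where defined."""
    report = {}
    # Assumption: RLIMIT_AS ~ `ulimit -v`, RLIMIT_DATA ~ `ulimit -d`.
    for name in ("RLIMIT_AS", "RLIMIT_DATA"):
        const = getattr(resource, name, None)
        if const is None:  # not every platform defines every limit
            continue
        soft, hard = resource.getrlimit(const)
        report[name] = (describe_limit(soft), describe_limit(hard))
    return report


if __name__ == "__main__":
    for name, (soft, hard) in report_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

Running this from the same shell that starts bacula-dir only shows what a child of that shell would inherit. It says nothing about how the daemon's allocator behaves once it is running.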

I am trying to restore 2.5 terabytes of data composed of 6.5 million  
files.

My process is:

Run bconsole
Choose restore
Choose the most recent backup for a client
Wait for the directory structure to be generated in memory (10
minutes tops; postgres temp files are on a ram disk which makes life
fast)
Choose 'mark *'
Watch bacula churn away for a couple of minutes and then report 6.5
million files marked.
Type done.
See that the prompt never returns.  The restore never happens.
Actually, I don't have time to wait forever so I waited for 36
hours instead and saw that nothing had changed.  No prompt.  No restore.

Bacula-dir consumes up to 2GB of memory and then freezes.  Running  
pmap on the bacula-dir shows that, along with kernel space usage and  
library space and whatever, the heap usage is 2GB.  ulimit was set to  
give bacula-dir 3.5GB.  This gift of memory is apparently ignored.

Thinking this was a 32-bit limit I switched to a 64-bit compile of
bacula and tried the same thing.  This time bacula-dir took 2.5GB of
heap and no more even though I gifted it 10GB (I have 16GB of memory
and it's pretty much for postgres and bacula).  I still never see the
prompt return after a 'mark *; done'.
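
As a rough back-of-the-envelope figure, the heap sizes above can be divided by the number of marked files. This is only a sketch: it assumes the whole heap is taken up by the in-memory restore tree, which I have not verified, and it treats GB as GiB. The function name is made up for illustration.

```python
FILES = 6_500_000  # files marked for restore
GIB = 2**30


def bytes_per_file(heap_bytes, files=FILES):
    """Average heap bytes per marked file, assuming the heap holds only the tree."""
    return heap_bytes / files


for label, heap in (
    ("32-bit run, 2 GiB heap", 2 * GIB),
    ("64-bit run, 2.5 GiB heap", 2.5 * GIB),
):
    print(f"{label}: about {bytes_per_file(heap):.0f} bytes per file")
```

That works out to a few hundred bytes per file in both runs, so the numbers are at least in the range one might expect for a per-file record plus a path name.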

Running 64-bit versions of bacula on Solaris completely hoses any
dates such as last written for media.  I think this is a Solaris
sprintf thing and has nothing to do with bacula.  Regardless, I don't
want to run a 64-bit version of bacula on Solaris and, given the above
limitations, it wouldn't help me anyway.

Running truss -u *:: on the bacula-dir process shows it continually  
spinning in mutex locks and unlocks around memory allocation and frees.

Previous versions of bacula (1.36) were able to restore 5.5 terabytes
of data composed of 9 million files via the above method.  Same
machine, less memory, 32-bit binaries, old version of postgres (8.0).
The new version, as I have compiled and configured it, cannot.

How does one recover 2.5 terabytes and 6.5 million files using the
latest version of bacula?  What am I doing wrong?  Is there any way to
change smartalloc so that it will use 3.5GB of memory (nothing
popped out at me when I scanned the include files)?

I should note that a couple of weeks ago I had a complete system  
failure of my SAN and lost 25 terabytes of data.  Bacula 1.36  
restored all of it.  I got to keep my job.  Thank you bacula.

        John Kloss. <John.Kloss AT jhmi DOT edu>
        IT Manager, Systems Manager
        Institute of Genetic Medicine
        Johns Hopkins Medical Institution


