Re: [Bacula-users] bacula hang waiting for storage

2008-12-03 11:34:53
Subject: Re: [Bacula-users] bacula hang waiting for storage
From: Bob Hetzel <beh AT case DOT edu>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 03 Dec 2008 11:28:01 -0500
Previously, From: Arno Lehmann <al AT its-lehmann DOT de> said...

>> Thread 26 (Thread -1215112304 (LWP 5413)):
>> > #0  0xb7f6a410 in __kernel_vsyscall ()
>> > #1  0xb7adba41 in ___newselect_nocancel () from /lib/libc.so.6
>> > #2  0x080a99b9 in bnet_thread_server (addrs=0x80f9860, max_clients=20, 
>> > client_wq=0x80f64e0, handle_client_request=0x808d536 
>> > <handle_UA_client_request>)
>> >      at bnet_server.c:161
> 
> The above line looks like it might be related to the problem... in 
> general, there's one thread per job running (plus the parent threads), 
> and the variable max_clients might indicate the number of currently 
> active thread servers is exhausted or something...
> 
[snip]

> Ok... there are quite a number of threads that could be console 
> connections. There is a hard limit of the active console connections - 
> it seems possible that you ran into that limit.
> 
> Have you checked how many console connections are currently open?
> 
> IIRC, if you SIGTERM a console, it does not necessarily die... so 
> there could be console processes laying around somewhere, keeping 
> their connections open.
> 
> If you find those and 'kill -9' them, do your new console connections 
> work?
> 
> Arno
> 

I generally operate with no more than two console connections.  I think 
I may have had a hung console connection before doing that traceback, 
which I would have ctrl-c'd to get out of.  Of course, at this point it 
is no longer stalled, and I have restarted the server a couple of times 
since then for other reasons.  If it happens again I'll do the 
traceback and also a "ps -ef".  I think in this case I only had the 
-sd, -dir, and -fd daemons running, though, so was there something else 
you meant?  Also, after I ctrl-c'd the connections I was able to open a 
new console connection and do certain things, but it would hang in the 
same spot if I did a "status storage" or a mount request.

In addition, I don't know if this has any bearing, but here are the 
concurrency values I was operating under:

In bacula-sd.conf:
    Maximum Concurrent Jobs = 20
    (3 drives with spooling turned on)

In bacula-dir.conf:
    Director section:  Maximum Concurrent Jobs = 12
    Job sections:      Maximum Concurrent Jobs = 12
    Storage section:   Maximum Concurrent Jobs = 12
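
As a rough sketch of where those directives would sit in the two files 
(the resource names below are placeholders, not the names from the 
actual setup, and every other directive each resource requires is 
omitted):

```
# bacula-sd.conf (Storage daemon)
Storage {
  Name = example-sd                 # placeholder name
  Maximum Concurrent Jobs = 20
}

# bacula-dir.conf (Director)
Director {
  Name = example-dir                # placeholder name
  Maximum Concurrent Jobs = 12
}

Job {
  Name = "ExampleJob"               # placeholder; repeated in each Job resource
  Maximum Concurrent Jobs = 12
}

Storage {
  Name = ExampleAutochanger         # placeholder name
  Maximum Concurrent Jobs = 12
}
```

Note that the Storage daemon's own limit (20) is higher than the 12 the 
Director will send it, so with these values the Director-side settings 
are the ones that cap concurrency.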

There are 3 drives in my autochanger, and spooling is turned on.  I've 
temporarily set bacula back to using only two drives, since things were 
running more smoothly before I added the 3rd one.


Thanks,

   Bob
