Bacula-users

Re: [Bacula-users] bextract 5.0.3/64bit hangs ? 100% cpu, no result

2011-07-13 00:50:48
From: Pierre Bourgin <pierre.bourgin AT free DOT fr>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 13 Jul 2011 06:47:31 +0200
On 07/12/2011 06:02 PM, Martin Simmons wrote:
>>>>>> On Tue, 12 Jul 2011 16:15:59 +0200 (CEST), Pierre Bourgin said:
>>
>> ----- "Martin Simmons"<martin AT lispworks DOT com>  wrote:
>>
>>>>>>>> On Mon, 11 Jul 2011 11:42:35 +0200 (CEST), Pierre Bourgin said:
>>>>
>>>> Hello,
>>>>
>>>> I have installed bacula 5.0.3 on a CentOS 5.4 x86_64 system (RPM
>>>> x86_64 rebuilt from source) and it has been working great for a year.
>>>>
>>>> After a mistake I made, I need to restore my catalog.
>>>> So I tried to use bextract to restore a 51 MB file from a
>>>> disk-volume file of 20 GB.
>>>> bextract hangs: 100% CPU used, no I/O wait at all.
>>>> After several minutes of run time, I stopped it without any success:
>>>> the restored file was created, but empty.
>>>>
>>>> Since I really need this file, I've tried the 32bit version of
>>>> bextract on the same system: worked fine !
>>>>
>>>> I've tried to debug it with strace, but I'm not clever
>>>> enough to find anything useful in the output.
>>>> (please find the strace files attached to this email)
>>>>
>>>> So I don't know if it's a packaging bug or a bextract bug
>>>> related to the 64bit platform ?
>>>>
>>>> If someone has a clue ...
>>>
>>> To find out where it is looping, attach gdb to the process while it is
>>> hanging (use gdb -p $pidofbextract) and then issue the gdb commands
>>>
>>> thread apply all bt
>>> detach
>>> quit
>>>
>>> Do this a few times to get an idea of how it changes.
>>
>> Hello,
>>
>> Thanks for your help.
>>
>> Once bextract had started, I launched a batched gdb once per minute with
>> the gdb commands you provided.
>> gdb always shows output similar to this (see below):
>> - addresses in the Thread 1 stack are always the same
>> - addresses in the Thread 2 stack: only #0 and #1 differ
>>   (inflate_table() and inflate())
>> - Thread 1: sometimes the call to inflate_table() does not appear
>>
>> # while [ 1 ]; do gdb -p `pgrep bextract` -x gdb.show-backtrace.commands; sleep 4; done
>>
>> ============== gdb sample output ==========================================
>> This GDB was configured as "x86_64-redhat-linux-gnu".
>> Attaching to process 10007
>> Reading symbols from /usr/sbin/bextract...(no debugging symbols found)...done.
>> Reading symbols from /lib64/libacl.so.1...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/libacl.so.1
>> ....
>> Reading symbols from /lib64/libpthread.so.0...done.
>> [Thread debugging using libthread_db enabled]
>> [New Thread 0x2b7ee0469850 (LWP 10007)]
>> [New Thread 0x40af4940 (LWP 10008)]
>> Loaded symbols for /lib64/libpthread.so.0
>> ....
>> Loaded symbols for /lib64/libselinux.so.1
>> Reading symbols from /lib64/libsepol.so.1...done.
>> Loaded symbols for /lib64/libsepol.so.1
>> 0x00002b7ee025a106 in inflate_table () from /usr/lib64/libz.so.1
>>
>> Thread 2 (Thread 0x40af4940 (LWP 10008)):
>> #0  0x0000003a7620dfe1 in nanosleep () from /lib64/libpthread.so.0
>> #1  0x0000003e32a1425b in bmicrosleep (sec=30, usec=0) at bsys.c:63
>> #2  0x0000003e32a40efb in check_deadlock () at lockmgr.c:571
>> #3  0x0000003a76206617 in start_thread () from /lib64/libpthread.so.0
>> #4  0x0000003a75ad3c2d in clone () from /lib64/libc.so.6
>>
>> Thread 1 (Thread 0x2b7ee0469850 (LWP 10007)):
>> #0  0x00002b7ee025a106 in inflate_table () from /usr/lib64/libz.so.1
>> #1  0x00002b7ee0257537 in inflate () from /usr/lib64/libz.so.1
>> #2  0x00002b7ee0252396 in uncompress () from /usr/lib64/libz.so.1
>> #3  0x0000000000406b6f in record_cb ()
>> #4  0x0000000000425298 in read_records ()
>> #5  0x0000000000406438 in main ()
>> ============== gdb sample output ==========================================
>>
>> So is the trouble related to zlib and its use by bextract ?
>
> Nicely done.  It looks like bug 1703:
>
> http://bugs.bacula.org/view.php?id=1703

Thanks for the pointer.
Indeed: it is a bug in bextract when the GZIP compression scheme is used.
Unfortunately, I did not think of checking the Bacula bug database, so the
debugging work was done twice :(

This bug is fixed "only" in the devel branch code.
So I just have to wait some weeks for the 5.2 release or backport the
patch to the 5.0.3 code on my own, right ?

In the meantime, I will avoid anything that involves the use of
bextract :-)
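
For the archive, the backtrace sampling described above can be sketched as a
small script. This is only a sketch under assumptions: the function names and
the command-file path are made up for illustration, and the command file
simply holds the three gdb commands Martin listed. It relies on the standard
gdb options -p, -batch and -x.

```shell
#!/bin/sh
# Hypothetical helper: write the three gdb commands quoted above
# into a command file given as $1.
write_gdb_commands() {
    cat > "$1" <<'EOF'
thread apply all bt
detach
quit
EOF
}

# Hypothetical helper: attach gdb to a process several times and print
# the backtrace of every thread each time.
#   $1 = pid, $2 = number of samples, $3 = seconds between samples,
#   $4 = gdb command file
sample_backtraces() {
    pid=$1; count=$2; interval=$3; cmds=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        gdb -p "$pid" -batch -x "$cmds"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example use (path and numbers are arbitrary):
#   write_gdb_commands /tmp/show-backtrace.gdb
#   sample_backtraces "$(pgrep -x bextract)" 5 60 /tmp/show-backtrace.gdb
```

If the top frames of the busy thread stay inside the same library calls from
one sample to the next, that is a good hint about where the loop is.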

>> Another thing:
>> bextract and bconsole do not have the same entry point for libz.so, and only 
>> for that one;
>> does it mean they do not use libz the same way ?
>
> By "entry point", do you mean the number shown after the filename in the
> output of ldd?  That is the base address in memory and isn't important.

oops ... sorry for this.
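
Since the address is not meaningful, it can be stripped before comparing the
library lists of two binaries. A minimal sketch, assuming the usual ldd output
format where each line ends with an address in parentheses; the function name
is made up for illustration.

```shell
#!/bin/sh
# Hypothetical filter: remove the trailing "(0x...)" load address from
# each line of ldd output read on stdin.
strip_ldd_addresses() {
    sed 's/[[:space:]]*(0x[0-9a-fA-F]*)$//'
}

# Example use (compare the output of the two commands by eye or with diff):
#   ldd /usr/sbin/bextract | strip_ldd_addresses | sort
#   ldd /usr/sbin/bconsole | strip_ldd_addresses | sort
```

With the addresses gone, any remaining difference between the two lists is a
real difference in the libraries being linked.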

>>> If you have debuginfo packages for bacula, then install them first.
>>
>> These package definitions are not provided by the bacula.spec from the
>> bacula-5.0.3 sources.
>> Would you have such a .spec file to generate them ?
>
> Don't worry about it -- some rpm build configurations generate them
> automatically, but maybe yours doesn't.

Using the standard bacula.spec from the sources with rpmbuild on CentOS does
not generate them.


Thanks again for your help !

Regards,

Pierre Bourgin
