Bacula-users

Re: [Bacula-users] bextract 5.0.3/64bit hangs ? 100% cpu, no result

2011-07-12 10:18:55
From: Pierre Bourgin <pierre.bourgin AT free DOT fr>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 12 Jul 2011 16:15:59 +0200 (CEST)
----- "Martin Simmons" <martin AT lispworks DOT com> wrote:

> >>>>> On Mon, 11 Jul 2011 11:42:35 +0200 (CEST), Pierre Bourgin said:
> > 
> > Hello,
> > 
> > I have installed bacula 5.0.3 on a CentOS 5.4 x86_64 system (RPM
> > x86_64 rebuilt from source) and it has been working great for a year.
> > 
> > After a mistake I made, I need to restore my catalog.
> > So I tried to use bextract to restore a 51 MB file from a
> > 20 GB disk volume file.
> > bextract hangs: 100% CPU used, no I/O wait at all.
> > After letting it run for several minutes, I stopped it without any success:
> > the restored file was created, but empty.
> > 
> > Since I really need this file, I tried the 32-bit version of
> > bextract on the same system: it worked fine!
> > 
> > I tried to debug it with strace, but I'm not clever
> > enough to find anything useful in the output.
> > (please find the strace files attached to this email)
> > 
> > So I don't know if it's a packaging bug or a bextract bug
> > related to the 64-bit platform?
> > 
> > If someone has a clue ...
> 
> To find out where it is looping, attach gdb to the process while it is
> hanging (use gdb -p $pidofbextract) and then issue the gdb commands
> 
> thread apply all bt
> detach
> quit
> 
> Do this a few times to get an idea of how it changes.

Hello,

Thanks for your help.

Once bextract had started, I ran gdb in batch mode once per minute with the
gdb commands you provided.
gdb always shows output similar to the sample below:
- the addresses in the Thread 2 stack are always the same
- in the Thread 1 stack, only frames #0 and #1 change (inflate_table()
  and inflate())
- Thread 1: sometimes the call to inflate_table() does not appear

# while [ 1 ]; do gdb -p `pgrep bextract` -x gdb.show-backtrace.commands; sleep 4; done
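
For anyone who wants to repeat this, here is a minimal sketch of the same sampling loop that keeps each backtrace in its own file. The command file name is the one used above; the function name sample_backtraces and the bt.N.txt output names are only assumptions for illustration. gdb's -batch option makes it exit once the command file has been processed.

```shell
#!/bin/sh
# Sketch: sample backtraces of a spinning process at a fixed interval.
# CMDFILE is the file name used in this thread; sample_backtraces and
# the bt.N.txt output names are hypothetical.
CMDFILE=gdb.show-backtrace.commands

# The three gdb commands suggested earlier in the thread.
cat > "$CMDFILE" <<'EOF'
thread apply all bt
detach
quit
EOF

# sample_backtraces PID COUNT INTERVAL
# Attaches gdb in batch mode COUNT times, INTERVAL seconds apart,
# and saves the output of each run to its own file.
sample_backtraces() {
    pid=$1
    count=$2
    interval=$3
    i=1
    while [ "$i" -le "$count" ]; do
        gdb -batch -p "$pid" -x "$CMDFILE" > "bt.$i.txt" 2>&1
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example (not run here): five samples, one minute apart.
# sample_backtraces "$(pgrep -o bextract)" 5 60
```

Saving each sample separately makes it easier to compare the stacks afterwards than scrolling through terminal output.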

============== gdb sample output ==========================================
This GDB was configured as "x86_64-redhat-linux-gnu".
Attaching to process 10007
Reading symbols from /usr/sbin/bextract...(no debugging symbols found)...done.
Reading symbols from /lib64/libacl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libacl.so.1
....
Reading symbols from /lib64/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 0x2b7ee0469850 (LWP 10007)]
[New Thread 0x40af4940 (LWP 10008)]
Loaded symbols for /lib64/libpthread.so.0
....
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libsepol.so.1...done.
Loaded symbols for /lib64/libsepol.so.1
0x00002b7ee025a106 in inflate_table () from /usr/lib64/libz.so.1

Thread 2 (Thread 0x40af4940 (LWP 10008)):
#0  0x0000003a7620dfe1 in nanosleep () from /lib64/libpthread.so.0
#1  0x0000003e32a1425b in bmicrosleep (sec=30, usec=0) at bsys.c:63
#2  0x0000003e32a40efb in check_deadlock () at lockmgr.c:571
#3  0x0000003a76206617 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003a75ad3c2d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x2b7ee0469850 (LWP 10007)):
#0  0x00002b7ee025a106 in inflate_table () from /usr/lib64/libz.so.1
#1  0x00002b7ee0257537 in inflate () from /usr/lib64/libz.so.1
#2  0x00002b7ee0252396 in uncompress () from /usr/lib64/libz.so.1
#3  0x0000000000406b6f in record_cb ()
#4  0x0000000000425298 in read_records ()
#5  0x0000000000406438 in main ()
============== gdb sample output ==========================================
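
To see quickly which frames change between samples, the backtraces can be reduced to thread, frame number and function name. This is only a rough sketch: summarize_bt is a hypothetical helper, and it only handles frames printed in the "ADDRESS in FUNCTION" form shown in the sample above.

```shell
#!/bin/sh
# summarize_bt: read "thread apply all bt" output on stdin and print
# "<thread> <frame> <function>" for each stack frame.
# Frames printed without an address ("#0  main () at ...") are skipped.
summarize_bt() {
    awk '
        # "Thread 2 (Thread 0x... (LWP ...)):" starts a new thread section
        /^Thread [0-9]+ / { thread = $2 }
        # "#1  0x... in bmicrosleep (...)": the function follows "in"
        /^#[0-9]+ / {
            for (i = 2; i < NF; i++) {
                if ($i == "in") { print thread, $1, $(i + 1); break }
            }
        }
    '
}

# Example (file names are hypothetical): count how often each frame
# shows up across all saved samples.
# cat bt.*.txt | summarize_bt | sort | uniq -c | sort -rn
```

Frames that appear in every sample belong to the stable part of the stack; frames with lower counts are the ones that come and go.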

So is the trouble related to zlib and the way bextract uses it?

# rpm -qf /usr/lib64/libz.so.1
zlib-1.2.3-3

My RPM build system has exactly the same version of the -devel package
(unchanged since bacula was built):
zlib-devel-1.2.3-3

On the bacula server, I've updated my zlib package with the most recent one: 
zlib-1.2.3-4.
No difference, the same problem arises with bextract.

I've checked my bacula backups and they are fine:
I restored the bacula.sql file (BackupCatalog job) in seconds with bconsole
and its "restore" command (from the same tape file).

Another thing:
ldd shows a different address for libz.so.1 in bextract and in bconsole, and
only for that library.
Does it mean they do not use libz the same way?

# ldd /usr/sbin/bextract /usr/sbin/bconsole
/usr/sbin/bextract:
        libacl.so.1 => /lib64/libacl.so.1 (0x0000003a7ba00000)
        libbacfind-5.0.3.so => /usr/lib64/libbacfind-5.0.3.so (0x0000003e33200000)
        libbaccfg-5.0.3.so => /usr/lib64/libbaccfg-5.0.3.so (0x0000003e32e00000)
        libbac-5.0.3.so => /usr/lib64/libbac-5.0.3.so (0x0000003e32a00000)
        libz.so.1 => /usr/lib64/libz.so.1 (0x00002b0f663b7000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003a76200000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003a75e00000)
        libssl.so.6 => /lib64/libssl.so.6 (0x0000003a79200000)
        libcrypto.so.6 => /lib64/libcrypto.so.6 (0x0000003a78200000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003a78600000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003a76600000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003a78a00000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003a75a00000)
        libattr.so.1 => /lib64/libattr.so.1 (0x0000003a7aa00000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003a75600000)
        libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x0000003a78e00000)
        libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x0000003a7b200000)
        libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003a79600000)
        libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x0000003a7ae00000)
        libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x0000003a7b600000)
        libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003a79a00000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003a79e00000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003a76a00000)
        libsepol.so.1 => /lib64/libsepol.so.1 (0x0000003a76e00000)
/usr/sbin/bconsole:
        libreadline.so.5 => /usr/lib64/libreadline.so.5 (0x000000311fc00000)
        libncurses.so.5 => /usr/lib64/libncurses.so.5 (0x0000003a7a600000)
        libbaccfg-5.0.3.so => /usr/lib64/libbaccfg-5.0.3.so (0x0000003e32e00000)
        libbac-5.0.3.so => /usr/lib64/libbac-5.0.3.so (0x0000003e32a00000)
        libz.so.1 => /usr/lib64/libz.so.1 (0x00002b688e3de000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003a76200000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003a75e00000)
        libssl.so.6 => /lib64/libssl.so.6 (0x0000003a79200000)
        libcrypto.so.6 => /lib64/libcrypto.so.6 (0x0000003a78200000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003a78600000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003a76600000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003a78a00000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003a75a00000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003a75600000)
        libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x0000003a78e00000)
        libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x0000003a7b200000)
        libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003a79600000)
        libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x0000003a7ae00000)
        libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x0000003a7b600000)
        libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003a79a00000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003a79e00000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003a76a00000)
        libsepol.so.1 => /lib64/libsepol.so.1 (0x0000003a76e00000)
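
A note on reading the output above: the value in parentheses is the address at which the loader mapped the library for that particular run, not an entry point into it, so a different address does not by itself show that the two programs use libz differently. One possible reason for the odd address is that libz is not prelinked while the other libraries are, but that is an assumption and was not checked on this system. A small sketch for comparing only the resolved paths (strip_load_addr is a hypothetical helper):

```shell
#!/bin/sh
# strip_load_addr: drop the trailing "(0x...)" load address from each
# line of ldd output, keeping only "name => resolved path".
strip_load_addr() {
    sed 's/ *(0x[0-9a-f]*)$//'
}

# Example (not run here): if both commands print the same line, both
# programs resolve libz.so.1 to the same file.
# ldd /usr/sbin/bextract | strip_load_addr | grep libz
# ldd /usr/sbin/bconsole | strip_load_addr | grep libz
```

With the addresses removed, the libz.so.1 lines of the two listings above are identical: both point to /usr/lib64/libz.so.1.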


> If you have debuginfo packages for bacula, then install them first.

The bacula.spec shipped with the bacula-5.0.3 sources does not define any
debuginfo packages.
Would you have a .spec file that generates them?

Regards,

Pierre Bourgin
