Re: [Bacula-users] bextract 5.0.3/64bit hangs ? 100% cpu, no result
2011-07-12 10:18:55
----- "Martin Simmons" <martin AT lispworks DOT com> wrote:
> >>>>> On Mon, 11 Jul 2011 11:42:35 +0200 (CEST), Pierre Bourgin said:
> >
> > Hello,
> >
> > I have installed bacula 5.0.3 on a CentOS 5.4 x86_64 system (RPM
> > x86_64 rebuilt from source) and it has been working great for a year.
> >
> > After a mistake I made, I need to restore my catalog.
> > So I tried to use bextract to restore a 51 MB file from a
> > 20 GB disk volume file.
> > bextract hangs: 100% CPU used, no I/O wait at all.
> > After letting it run for several minutes, I stopped it without any success:
> > the restored file was created, but empty.
> >
> > Since I really need this file, I tried the 32-bit version of
> > bextract on the same system: it worked fine!
> >
> > I tried to debug it with strace, but I'm not clever
> > enough to find anything useful in the output.
> > (please find the strace files attached to this email)
> >
> > So I don't know whether it's a packaging bug or a bextract bug
> > related to the 64-bit platform.
> >
> > If someone has a clue ...
>
> To find out where it is looping, attach gdb to the process when it is
> hanging (use gdb -p $pidofbextract) and then issue the gdb commands
>
> thread apply all bt
> detach
> quit
>
> Do this a few times to get an idea of how it changes.
Hello,
Thanks for your help.
Once bextract had started, I ran gdb in batch mode once per minute with the
gdb commands you provided.
gdb always shows output similar to the sample below:
- the addresses in the Thread 1 stack are always the same
- the addresses in the Thread 2 stack: only #0 and #1 differ (inflate_table()
and inflate())
- Thread 1: sometimes the call to inflate_table() does not appear
# while [ 1 ]; do gdb -p `pgrep bextract` -x gdb.show-backtrace.commands ; sleep 4; done
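For anyone repeating this, here is a minimal sketch of how the sampling could be scripted. The contents of gdb.show-backtrace.commands are an assumption (the three gdb commands quoted earlier in this message); the function names write_gdb_commands and sample_backtraces are hypothetical helpers, not part of bacula or gdb.

```shell
#!/bin/sh
# File name taken from the loop above; its contents are assumed to be
# the three gdb commands suggested earlier in the thread.
CMDS=gdb.show-backtrace.commands

# Write the gdb command file.
write_gdb_commands() {
    cat > "$1" <<'EOF'
thread apply all bt
detach
quit
EOF
}

# Take a fixed number of backtrace samples from a running process.
# $1 = process name, $2 = number of samples, $3 = seconds between samples
sample_backtraces() {
    name=$1
    count=$2
    delay=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # pgrep -x matches the process name exactly; keep the first PID only
        pid=$(pgrep -x "$name" | head -n 1)
        if [ -z "$pid" ]; then
            echo "no $name process found" >&2
            return 1
        fi
        # -batch makes gdb exit once the command file has been processed
        gdb -batch -x "$CMDS" -p "$pid"
        i=$((i + 1))
        sleep "$delay"
    done
}

write_gdb_commands "$CMDS"
# Example (not run here): five samples, one minute apart
# sample_backtraces bextract 5 60
```

A bounded loop is used here instead of an endless one so that the sampling stops on its own once enough backtraces have been collected.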
============== gdb sample output ==========================================
This GDB was configured as "x86_64-redhat-linux-gnu".
Attaching to process 10007
Reading symbols from /usr/sbin/bextract...(no debugging symbols found)...done.
Reading symbols from /lib64/libacl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libacl.so.1
....
Reading symbols from /lib64/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 0x2b7ee0469850 (LWP 10007)]
[New Thread 0x40af4940 (LWP 10008)]
Loaded symbols for /lib64/libpthread.so.0
....
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libsepol.so.1...done.
Loaded symbols for /lib64/libsepol.so.1
0x00002b7ee025a106 in inflate_table () from /usr/lib64/libz.so.1
Thread 2 (Thread 0x40af4940 (LWP 10008)):
#0 0x0000003a7620dfe1 in nanosleep () from /lib64/libpthread.so.0
#1 0x0000003e32a1425b in bmicrosleep (sec=30, usec=0) at bsys.c:63
#2 0x0000003e32a40efb in check_deadlock () at lockmgr.c:571
#3 0x0000003a76206617 in start_thread () from /lib64/libpthread.so.0
#4 0x0000003a75ad3c2d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x2b7ee0469850 (LWP 10007)):
#0 0x00002b7ee025a106 in inflate_table () from /usr/lib64/libz.so.1
#1 0x00002b7ee0257537 in inflate () from /usr/lib64/libz.so.1
#2 0x00002b7ee0252396 in uncompress () from /usr/lib64/libz.so.1
#3 0x0000000000406b6f in record_cb ()
#4 0x0000000000425298 in read_records ()
#5 0x0000000000406438 in main ()
============== gdb sample output ==========================================
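To compare many samples without reading each backtrace in full, the two innermost frames of every thread can be pulled out of saved gdb output. This is only a sketch: top_frames is a hypothetical helper, and it assumes frame lines have the "#N ADDRESS in FUNCTION" layout seen in the sample above.

```shell
#!/bin/sh
# Print frames #0 and #1 of each thread from gdb backtrace text on stdin.
# Assumes frame lines look like: "#0 0x... in inflate_table () from ..."
top_frames() {
    awk '
        /^Thread [0-9]+ / { thread = $2 }   # remember the current thread number
        /^#[01] /         { print "thread " thread ": " $1 " " $4 }
    '
}

# Example (not run here), with one saved gdb output per file:
# cat sample-*.txt | top_frames | sort | uniq -c
```

Counting the distinct lines over several samples shows at a glance which frames stay constant and which ones change.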
So is the trouble related to zlib and the way bextract uses it?
# rpm -qf /usr/lib64/libz.so.1
zlib-1.2.3-3
My RPM build system uses exactly the same version of the -devel package
(unchanged since bacula was built):
zlib-devel-1.2.3-3
On the bacula server, I updated my zlib package to the most recent one:
zlib-1.2.3-4.
No difference: the same problem occurs with bextract.
I have checked my bacula backups and they are fine:
I restored the bacula.sql file (BackupCatalog job) with bconsole and its
"restore" command in seconds
(from the same tape file).
Another thing:
in the ldd output, bextract and bconsole do not show the same load address for
libz.so.1, and it is the only library for which this is the case.
Does it mean they do not use libz the same way?
# ldd /usr/sbin/bextract /usr/sbin/bconsole
libacl.so.1 => /lib64/libacl.so.1 (0x0000003a7ba00000)
libbacfind-5.0.3.so => /usr/lib64/libbacfind-5.0.3.so (0x0000003e33200000)
libbaccfg-5.0.3.so => /usr/lib64/libbaccfg-5.0.3.so (0x0000003e32e00000)
libbac-5.0.3.so => /usr/lib64/libbac-5.0.3.so (0x0000003e32a00000)
libz.so.1 => /usr/lib64/libz.so.1 (0x00002b0f663b7000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003a76200000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003a75e00000)
libssl.so.6 => /lib64/libssl.so.6 (0x0000003a79200000)
libcrypto.so.6 => /lib64/libcrypto.so.6 (0x0000003a78200000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003a78600000)
libm.so.6 => /lib64/libm.so.6 (0x0000003a76600000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003a78a00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003a75a00000)
libattr.so.1 => /lib64/libattr.so.1 (0x0000003a7aa00000)
/lib64/ld-linux-x86-64.so.2 (0x0000003a75600000)
libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x0000003a78e00000)
libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x0000003a7b200000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003a79600000)
libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x0000003a7ae00000)
libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x0000003a7b600000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003a79a00000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003a79e00000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003a76a00000)
libsepol.so.1 => /lib64/libsepol.so.1 (0x0000003a76e00000)
/usr/sbin/bconsole:
libreadline.so.5 => /usr/lib64/libreadline.so.5 (0x000000311fc00000)
libncurses.so.5 => /usr/lib64/libncurses.so.5 (0x0000003a7a600000)
libbaccfg-5.0.3.so => /usr/lib64/libbaccfg-5.0.3.so (0x0000003e32e00000)
libbac-5.0.3.so => /usr/lib64/libbac-5.0.3.so (0x0000003e32a00000)
libz.so.1 => /usr/lib64/libz.so.1 (0x00002b688e3de000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003a76200000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003a75e00000)
libssl.so.6 => /lib64/libssl.so.6 (0x0000003a79200000)
libcrypto.so.6 => /lib64/libcrypto.so.6 (0x0000003a78200000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003a78600000)
libm.so.6 => /lib64/libm.so.6 (0x0000003a76600000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003a78a00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003a75a00000)
/lib64/ld-linux-x86-64.so.2 (0x0000003a75600000)
libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x0000003a78e00000)
libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x0000003a7b200000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003a79600000)
libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x0000003a7ae00000)
libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x0000003a7b600000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003a79a00000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003a79e00000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003a76a00000)
libsepol.so.1 => /lib64/libsepol.so.1 (0x0000003a76e00000)
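One way to check whether the differing libz.so.1 address means anything is to run ldd twice on the same binary and compare the address reported for that library. If it changes between two runs of the same binary, the difference between bextract and bconsole may simply come from the loader placing the library at a different address each time; that explanation is an assumption, not something confirmed in this thread. The helper name lib_addr below is hypothetical.

```shell
#!/bin/sh
# Print the load address ldd reports for one library.
# Reads ldd output on stdin; $1 = library name as shown by ldd.
# Assumes each library is on a single line: "NAME => PATH (ADDRESS)".
lib_addr() {
    awk -v lib="$1" '
        $1 == lib {
            gsub(/[()]/, "", $NF)   # strip the parentheses around the address
            print $NF
        }
    '
}

# Example (not run here): compare two runs on the same binary
# ldd /usr/sbin/bextract | lib_addr libz.so.1
# ldd /usr/sbin/bextract | lib_addr libz.so.1
```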
> If you have debuginfo packages for bacula, then install them first.
The bacula.spec shipped with the bacula-5.0.3 sources does not define these
packages.
Would you have a .spec file that generates them?
Regards,
Pierre Bourgin
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users