Bacula-users

Re: [Bacula-users] Client backups crash director until full backup is run -- UPDATE

2009-08-19 14:18:37
Subject: Re: [Bacula-users] Client backups crash director until full backup is run -- UPDATE
From: Corey Shaw <cshaw AT q90 DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 19 Aug 2009 12:12:21 -0600 (MDT)
I ran memtest on our Bacula server last night.  After 14 hours and 8 passes it found no problems.  I'm at the end of my rope here.  I'm trying a new virtual server to see if that fixes the issue.

_____________________
Corey Shaw
Technology Specialist
O. 801.491.0705 (x. 157)
F. 801.491.8774


----- Original Message -----
From: "Corey Shaw" <cshaw AT q90 DOT com>
To: bacula-users AT lists.sourceforge DOT net
Sent: Tuesday, August 18, 2009 2:38:48 PM GMT -07:00 US/Canada Mountain
Subject: Re: [Bacula-users] Client backups crash director until full backup is run -- UPDATE

This is a physical machine.  If absolutely necessary I could run memtest on the box, but I would like to exhaust other options first if possible.  The machine is in our datacenter and I'd like to save myself a couple of trips up there.  What can I say?  I'm lazy. :)

_____________________
Corey Shaw
Technology Specialist
O. 801.491.0705 (x. 157)
F. 801.491.8774


----- Original Message -----
From: "Jean Gobin" <JGobin AT StrozFriedberg DOT com>
To: "Corey Shaw" <cshaw AT q90 DOT com>, bacula-users AT lists.sourceforge DOT net
Sent: Tuesday, August 18, 2009 2:35:16 PM GMT -07:00 US/Canada Mountain
Subject: RE: [Bacula-users] Client backups crash director until full backup is run -- UPDATE

Hello,

Virtual or physical machine?

Is running Memtest on this for a couple of hours an option?

J.

Jean F. Gobin, CCENT, CCNA
Network Engineer
Stroz Friedberg
Tel: 212.542.3175
Mobile: 917.213.2532
Fax: 212.981.6545
32 Avenue of the Americas, 4th Floor, New York, NY 10013
jgobin AT strozfriedberg DOT com
www.strozfriedberg.com

From: Corey Shaw [mailto:cshaw AT q90 DOT com]
Sent: Tuesday, August 18, 2009 4:30 PM
To: bacula-users AT lists.sourceforge DOT net
Subject: [Bacula-users] Client backups crash director until full backup is run -- UPDATE

Version:  3.0.2
OS:  Gentoo

My Bacula director recently decided that it needs to crash randomly after doing backups.  It mostly happens when backups run from the schedule, but sometimes happens when I run backups manually.  This suddenly started happening on August 11.  Up to that point I had been running 3.0.1 for about a month without problems.  I upgraded to 3.0.2 to see if it would fix the problem, but it didn't.

I have tried rebuilding the MySQL database as well as re-compiling in case I missed something, but neither of those ideas worked either.

Any light that people can shed on the subject would be very helpful.  It looks like the crash is inside libbac.so.1 (a segfault in sm_realloc_pool_memory).  Using gdb, I got the following output:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f889599c950 (LWP 5933)]
0x00007f889ad68622 in sm_realloc_pool_memory () from /usr/lib/libbac.so.1

Thread 14 (Thread 0x7f889699e950 (LWP 5935)):
#0  0x00007f8898894b92 in select () from /lib/libc.so.6
#1  0x00007f889ad71f09 in tls_bsock_readn () from /usr/lib/libbac.so.1
#2  0x00007f889ad56b25 in BSOCK::recv () from /usr/lib/libbac.so.1
#3  0x000000000041d820 in ?? ()
#4  0x0000000000429248 in ?? ()
#5  0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#6  0x00007f889889b48d in clone () from /lib/libc.so.6
#7  0x0000000000000000 in ?? ()

Thread 13 (Thread 0x7f889599c950 (LWP 5933)):
#0  0x00007f889ad68622 in sm_realloc_pool_memory () from /usr/lib/libbac.so.1
#1  0x00007f889ad68e72 in pm_strcat () from /usr/lib/libbac.so.1
#2  0x00007f889b3a9757 in db_get_int_handler () from /usr/lib/libbacsql.so.1
#3  0x00007f889b3a30d9 in db_sql_query () from /usr/lib/libbacsql.so.1
#4  0x00007f889b3a98ba in db_accurate_get_jobids () from /usr/lib/libbacsql.so.1
#5  0x000000000041204a in ?? ()
#6  0x00000000004125b6 in ?? ()
#7  0x0000000000421cb5 in ?? ()
#8  0x0000000000423e58 in ?? ()
#9  0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#10 0x00007f889889b48d in clone () from /lib/libc.so.6
#11 0x0000000000000000 in ?? ()

Thread 12 (Thread 0x7f889619d950 (LWP 5932)):
#0  0x00007f8899aac181 in nanosleep () from /lib/libpthread.so.0
#1  0x00007f889ad51fb6 in bmicrosleep () from /usr/lib/libbac.so.1
#2  0x00000000004244b1 in ?? ()
#3  0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#4  0x00007f889889b48d in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7f889719f950 (LWP 5897)):
#0  0x00007f8898894b92 in select () from /lib/libc.so.6
#1  0x00007f889ad71f09 in tls_bsock_readn () from /usr/lib/libbac.so.1
#2  0x00007f889ad56b25 in BSOCK::recv () from /usr/lib/libbac.so.1
#3  0x0000000000446bcd in ?? ()
#4  0x00007f889ad7a642 in workq_server () from /usr/lib/libbac.so.1
#5  0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#6  0x00007f889889b48d in clone () from /lib/libc.so.6
#7  0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7f88979a0950 (LWP 5894)):
#0  0x00007f8899aa903d in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x00007f889ad7a047 in watchdog_thread () from /usr/lib/libbac.so.1
#2  0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#3  0x00007f889889b48d in clone () from /lib/libc.so.6
#4  0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f88987ca950 (LWP 5893)):
#0  0x00007f8898894b92 in select () from /lib/libc.so.6
#1  0x00007f889ad54a0e in bnet_thread_server () from /usr/lib/libbac.so.1
#2  0x0000000000446b3c in ?? ()
#3  0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#4  0x00007f889889b48d in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f889b9d1700 (LWP 5889)):
#0  0x00007f8899aac181 in nanosleep () from /lib/libpthread.so.0
#1  0x00007f889ad51fb6 in bmicrosleep () from /usr/lib/libbac.so.1
#2  0x000000000042fb37 in ?? ()
#3  0x000000000040ee2c in ?? ()
#4  0x00007f88987e95c6 in __libc_start_main () from /lib/libc.so.6
#5  0x000000000040cba9 in ?? ()
#6  0x00007fffa39e3918 in ?? ()
#7  0x000000000000001c in ?? ()
#8  0x0000000000000005 in ?? ()
#9  0x00007fffa39e4191 in ?? ()
#10 0x00007fffa39e41a6 in ?? ()
#11 0x00007fffa39e41a9 in ?? ()
#12 0x00007fffa39e41ac in ?? ()
#13 0x00007fffa39e41af in ?? ()
#14 0x0000000000000000 in ?? ()

#0  0x00007f889ad68622 in sm_realloc_pool_memory () from /usr/lib/libbac.so.1
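
For anyone reading a trace like this one: the thread that took the SIGSEGV is the one named on the "[Switching to Thread ...]" line, which here is LWP 5933, i.e. Thread 13. Below is a minimal Python sketch of picking that thread out of gdb output in this format. The function name crashing_thread and the regular expressions are illustrative assumptions (they are not part of Bacula or gdb), and the sample text is an abbreviated copy of the output above.

```python
import re

# Abbreviated excerpt of the gdb output quoted in this message.
GDB_OUTPUT = """\
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f889599c950 (LWP 5933)]
0x00007f889ad68622 in sm_realloc_pool_memory () from /usr/lib/libbac.so.1

Thread 14 (Thread 0x7f889699e950 (LWP 5935)):
#0  0x00007f8898894b92 in select () from /lib/libc.so.6
#1  0x00007f889ad71f09 in tls_bsock_readn () from /usr/lib/libbac.so.1

Thread 13 (Thread 0x7f889599c950 (LWP 5933)):
#0  0x00007f889ad68622 in sm_realloc_pool_memory () from /usr/lib/libbac.so.1
#1  0x00007f889ad68e72 in pm_strcat () from /usr/lib/libbac.so.1
#2  0x00007f889b3a9757 in db_get_int_handler () from /usr/lib/libbacsql.so.1
#3  0x00007f889b3a30d9 in db_sql_query () from /usr/lib/libbacsql.so.1
#4  0x00007f889b3a98ba in db_accurate_get_jobids () from /usr/lib/libbacsql.so.1
#5  0x000000000041204a in ?? ()
"""

# Hypothetical patterns written for the line shapes seen in this trace only.
SWITCH_RE = re.compile(r"^\[Switching to Thread \S+ \(LWP (\d+)\)\]")
THREAD_RE = re.compile(r"^Thread (\d+) \(Thread \S+ \(LWP (\d+)\)\):")
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+ in (\S+) \(\)(?: from (\S+))?")


def crashing_thread(text):
    """Return (thread_number, frames) for the thread gdb switched to, or None.

    frames is a list of (function, library) tuples; library is None when
    gdb printed no "from ..." part (e.g. unresolved "??" frames).
    """
    crashed_lwp = None
    threads = {}    # LWP id -> (thread number, list of frames)
    current = None  # LWP id of the thread whose frames are being read
    for raw in text.splitlines():
        line = raw.strip()
        m = SWITCH_RE.match(line)
        if m:
            crashed_lwp = m.group(1)
            continue
        m = THREAD_RE.match(line)
        if m:
            current = m.group(2)
            threads[current] = (int(m.group(1)), [])
            continue
        m = FRAME_RE.match(line)
        if m and current is not None:
            threads[current][1].append((m.group(2), m.group(3)))
    return threads.get(crashed_lwp)


number, frames = crashing_thread(GDB_OUTPUT)
named = [f"{func} ({lib})" for func, lib in frames if func != "??"]
print(f"Thread {number}: " + " <- ".join(named))
```

Read innermost frame first, the named frames show the fault being reached through db_accurate_get_jobids, db_sql_query, db_get_int_handler and pm_strcat. The "??" frames come from the director binary itself, where gdb had no symbols to resolve.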

_____________________
Corey Shaw
Technology Specialist
O. 801.491.0705 (x. 157)
F. 801.491.8774

 

_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users