Bacula-users

Re: [Bacula-users] one system always failing backup -- logs, SD debug, config files

2008-12-17 02:36:10
Subject: Re: [Bacula-users] one system always failing backup -- logs, SD debug, config files
From: Jo Rhett <jrhett AT netconsonance DOT com>
To: Ryan Novosielski <novosirj AT umdnj DOT edu>
Date: Tue, 16 Dec 2008 23:33:16 -0800
On Dec 15, 2008, at 9:31 PM, Ryan Novosielski wrote:
> Sorta looks like something related to the catalog to me more than the
> system. Probably what I would do is a dbcheck and/or DB
> consistency/repair (I skimmed but I don't see anyplace obvious where the

Actually, this host was the 1st host scheduled for a backup with the  
brand new bacula system, and it failed but all others succeeded.  So  
the first failure was with a non-initialized catalog ;-)

> DB type was listed) and make sure there are no odd temporary files on
> either the client or the server, particularly any that might have the
> wrong ownership somehow (like owned by root whereas your current runtime
> user may well not be root).

bacula-fd runs as root.  Nothing non-standard here.

> You may have already thought of all of that, but if not, it's somewhere
> to start. I'm assuming you already would have checked for any TCP/IP
> oddities (netstat, etc.). As for the results always being the same, is
> there any "same spot" that the job fails? I'm not sure how obvious it
> would be as I bet most of the numbers change every time.

I can't seem to find any debug options which tell me what is happening  
when it fails.  I wish I could :-(

> Also, what else might this job have in common with other jobs that DO
> work so we can rule it out? Is the media different or the same or what
> have you?

Nope.  All systems have the same pool, same filesystems, same backup  
client package, same OS and same hardware.  The only difference in  
their Job setups is the fd-name.

The only real difference is that this host has significantly more
files to back up than any other host.  It's our repository of RRD
files, and we have thousands of them.  I would assume the problem lies
there, but I can't find any debug options to figure out what the FD
is doing when it fails.

FWIW: a constant ping running between the hosts fails when the backup  
fails.   But again, flat network and nothing between the hosts and no  
other networking problems.  An SSH session between the hosts is not  
terminated and I can type at it during the moments the ping is  
failing.  Points at a network problem, but there's no evidence to  
identify it.
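
One way to get evidence would be to timestamp the outage so it can be
lined up against the job log.  A minimal sketch, not something I have
run against this setup: it probes a TCP port once a second and
collapses the failures into windows.  The function names are made up
for illustration, and port 9102 is only the stock bacula-fd default,
assumed here rather than taken from my config.

```python
import socket
import time
from typing import List, Tuple


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def outage_windows(samples: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Collapse (timestamp, ok) samples into (first_failure, last_failure) windows."""
    windows = []
    start = last = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts          # first failed probe of a new window
            last = ts
        elif start is not None:
            windows.append((start, last))
            start = last = None
    if start is not None:           # still failing when sampling stopped
        windows.append((start, last))
    return windows


def monitor(host: str, port: int = 9102, interval: float = 1.0,
            count: int = 60) -> List[Tuple[float, float]]:
    """Probe host:port `count` times and return the outage windows seen."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), tcp_probe(host, port)))
        time.sleep(interval)
    return outage_windows(samples)
```

Running monitor() from the director host for the length of the job and
comparing the returned windows with the failure time in the job report
would at least show whether TCP to the FD drops at the same moment the
ping does, or whether only ICMP is affected.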

-- 
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source  
and other randomness


