Bacula-users

Re: [Bacula-users] Volumes in Error

2013-01-31 08:42:38
Subject: Re: [Bacula-users] Volumes in Error
From: Bill Arlofski <waa-bacula AT revpol DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 31 Jan 2013 08:37:56 -0500
Hi Jean-François Leroux,

This may not be the cause of your problems, but I had these same problems a
very short while back at a client's site. As it turned out, the filesystem on
which the file volumes existed was (very) corrupted.

The problems I saw (mismatches between the catalog and the file size, volumes
being marked in error, etc.) crept in a little at a time.
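
For anyone who wants to catch this kind of size mismatch early, here is a
minimal sketch in Python. It is only an illustration, not something Bacula
ships: the function name, the catalog_sizes mapping (volume name to the byte
count the catalog records, exported however you like) and the volume directory
are all assumptions of mine.

```python
import os


def find_size_mismatches(catalog_sizes, volume_dir):
    """Compare catalog byte counts against file sizes on disk.

    catalog_sizes: hypothetical dict of volume name -> bytes per the catalog.
    volume_dir: directory holding the file volumes (assumed flat layout).
    Returns a list of (name, catalog_bytes, disk_bytes) for every volume whose
    size differs; disk_bytes is None when the file cannot be found or read.
    """
    mismatches = []
    for name, catalog_bytes in sorted(catalog_sizes.items()):
        path = os.path.join(volume_dir, name)
        try:
            disk_bytes = os.path.getsize(path)
        except OSError:
            # Missing or unreadable volume file.
            disk_bytes = None
        if disk_bytes != catalog_bytes:
            mismatches.append((name, catalog_bytes, disk_bytes))
    return mismatches
```

A mismatch reported here does not prove corruption on its own (a job that is
still writing will also change the file size), so it is best run while the SD
is idle and treated as a hint to look closer.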

Also, just as you are seeing, the previous job to use that volume terminated
OK, with no errors. So there was nothing in the Bacula logs to indicate what
actually caused the problem in the first place.

Then it got bad enough to cause a kernel panic or two along the way (never saw
THAT before!).

It was difficult to diagnose because the filesize mismatch issues and volumes
being marked in error were just a couple of random issues in a list of a few
other unrelated network problems that were all happening in the same time
frame.

OH... And also, we had bad memory on the server! That was one other problem I
almost forgot about... The SD kept crashing (dmesg would show stack errors or
something like that - I think I may have posted in here when that problem
first came up).

Once it was clear that all the other issues were fixed, including migrating
the Bacula install and DB to another server with good memory, there was an
occasion where the kernel failed to mount the 6TB RAID5 array, claiming
filesystem problems.

After running a filesystem check with tree-rebuild (this was reiserfs BTW), I
manually cleaned up the known-bad file volumes:

- Deleting bad file volumes from the db
- Deleting them from the filesystem
- Re-adding and relabeling

The system has been working without a problem since.

Again, this may not be your problem (and I see now that you had already checked
the filesystem on the SD), but I thought I would still mention it here since
Bacula was exhibiting semi-random strange problems which turned out to be
caused by plain, ordinary filesystem issues.

Maybe it will help someone else. :)

--
Bill Arlofski
Reverse Polarity, LLC
