Bacula-users

Re: [Bacula-users] job dying after it finishes

2011-10-05 03:45:24
Subject: Re: [Bacula-users] job dying after it finishes
From: Jeremy Maes <jma AT schaubroeck DOT be>
To: Mark Yarbrough <marky AT itechnv DOT com>
Date: Wed, 05 Oct 2011 09:42:59 +0200
On 4/10/2011 18:52, Mark Yarbrough wrote:

I am running Bacula with vchanger on Debian and backing up several machines successfully. However, the job for my largest file store (which takes a few days to run) crashes after it has written the entire library to disk/tape. I am not sure what to do with this error. It reports a problem with the database table and a network error. I have looked at the switch logs and there is no indication of a network failure; the same goes for the client and the server. I suspect that one of my partitions isn't big enough, or something like that. What I find most interesting is that the error report says nothing was backed up, yet the space used by the backups matches the amount of data that was supposed to be backed up. Any advice is truly appreciated.

 

--begin paste--

 

Drive sizes

Filesystem         Size  Used  Avail  Use%  Mounted on
/dev/cciss/c0d0p1  323M  146M  161M   48%   /
tmpfs              32G   0     32G    0%    /lib/init/rw
udev               32G   200K  32G    1%    /dev
tmpfs              32G   0     32G    0%    /dev/shm
/dev/cciss/c0d0p9  19G   179M  18G    1%    /home
/dev/cciss/c0d1p1  270G  210M  256G   1%    /srv
/dev/cciss/c0d0p8  368M  247M  103M   71%   /tmp
/dev/cciss/c0d0p5  8.3G  707M  7.2G   9%    /usr
/dev/cciss/c0d0p6  2.8G  462M  2.2G   18%   /var

 

...

02-Oct 15:33 backup-dir JobId 105: Fatal error: sql_get.c:373 sql_get.c:373 query SELECT VolumeName,MAX(VolIndex) FROM JobMedia,Media WHERE JobMedia.JobId=105 AND JobMedia.MediaId=Media.MediaId GROUP BY VolumeName ORDER BY 2 ASC failed:

Got error 28 from storage engine


This error means that there is no space left on the device.
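
The number in "Got error 28 from storage engine" is, as far as I know, the operating system's errno value passed through by MySQL. A minimal Python sketch for looking such a code up (the helper name describe_errno is made up for illustration):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, C library message) for an OS error number."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 28 is the code from the log line above.
    name, message = describe_errno(28)
    print(name, "-", message)
```

On Linux, code 28 maps to ENOSPC, which matches the "no space left" reading.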

Since this seems to be a huge job, your MySQL working directory (or database directory: /var/lib/mysql on RHEL, not sure about Debian) probably fills up after a while, either from lack of space for the Bacula database or from lack of space for the temporary table Bacula uses.
Given that Bacula thinks it only wrote 34.99GB, that is probably the point at which the issue happens.
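
To see which filesystem is the tight one, something like the following could be run on the director while the job is in progress. It is only a sketch: the directory list is an assumption (/var/lib/mysql is the RHEL location mentioned above, and /tmp is assumed to be where MySQL puts its temporary tables), so adjust it to whatever your MySQL configuration actually uses.

```python
import os
import shutil

# Assumed candidate locations; replace with the data and temp
# directories from your own MySQL configuration.
CANDIDATE_DIRS = ["/var/lib/mysql", "/tmp", "/var"]


def free_space_report(paths):
    """Return {path: (free_bytes, percent_used)} for each existing directory."""
    report = {}
    for path in paths:
        if not os.path.isdir(path):
            continue  # skip locations that do not exist on this host
        usage = shutil.disk_usage(path)
        percent_used = 100.0 * usage.used / usage.total if usage.total else 0.0
        report[path] = (usage.free, percent_used)
    return report


if __name__ == "__main__":
    for path, (free, pct) in free_space_report(CANDIDATE_DIRS).items():
        print(f"{path}: {free / 2**20:.0f} MiB free, {pct:.0f}% used")
```

Running it periodically during the backup (from cron, for example) would show whether one of these filesystems runs out of space around the time the catalog stops being updated.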

Bacula does keep writing and backing up data; it just doesn't know about it afterwards, because the database has no record of it.

Regards,
Jeremy
