Bacula-users

Re: [Bacula-users] (no subject)

From: Craig Shiroma <shiroma.craig.2 AT gmail DOT com>
To: Ana Emília M. Arruda <emiliaarruda AT gmail DOT com>
Date: Mon, 14 Sep 2015 22:24:06 -1000
Hi Ana,

Thanks again for the help!

Yes, the database is large.  The File table has 370+ million records and Bacula is pruning.  I'll see what we can do with mysqltuner.  I have a feeling I'm occasionally getting the timeout error because there are so many records.  If I can't find a cure for the situation, I'm thinking of splitting up our backups using two catalog servers -- one for production and one for test/dev.  Do you think this is wise?  Currently, our DBAs don't support Postgres so it may not be an option.  Would Oracle be an option instead?
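For context, the two InnoDB settings most directly tied to this error are the lock wait timeout and the buffer pool size. A minimal sketch of what tuning might touch in my.cnf -- the values below are illustrative assumptions only, not recommendations for any particular server:

```ini
# /etc/my.cnf -- illustrative values only; size these for your own hardware
[mysqld]
# How long a transaction waits on a row lock before failing with
# "Lock wait timeout exceeded" (the InnoDB default is 50 seconds)
innodb_lock_wait_timeout = 300

# Keep as much of the File table's indexes in memory as RAM allows
innodb_buffer_pool_size = 8G

# Trade a little durability for faster batch inserts (flush once per second)
innodb_flush_log_at_trx_commit = 2
```

A tool like mysqltuner will suggest concrete values for these based on observed load.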

I'll have to check on the "--enable-batch-insert" option.  Is there a Bacula command to see what options Bacula was built with?  I did not set up our installation and am new to Bacula.

Would you have any idea why a new duplicate job is starting when one was rescheduled?  I looked over the configs I know about, but could not find an option that starts a new job under any circumstances.  At least the job will have two more chances to run if I can prevent the new duplicate job from starting when the lock is detected.  I know that won't cure the database problem, but I have a feeling the rescheduled tries might complete successfully (the lock should be gone by then) until the DB can be tuned.

Warmest regards,
-craig

On Mon, Sep 14, 2015 at 4:54 PM, Ana Emília M. Arruda <emiliaarruda AT gmail DOT com> wrote:
Hello Craig,

You're welcome! I hope I can give you some tips here. It seems you have a database tuning issue, given the "Lock wait timeout exceeded; try restarting transaction" error. I'm not sure about your database size, but based on your JobIds, it seems large. You can try http://mysqltuner.com/ if you're using MySQL. If you have a really large database, you could consider migrating to PostgreSQL if tuning MySQL does not solve the problem.

Have you built your Bacula with the "--enable-batch-insert" option? This is a good idea when dealing with a large number of files.

Best regards,
Ana

On Mon, Sep 14, 2015 at 12:40 AM, Craig Shiroma <shiroma.craig.2 AT gmail DOT com> wrote:
Hi Ana,

I'm using 7.0.5.  Thanks for the help!

-craig

On Sat, Sep 12, 2015 at 2:10 PM, Ana Emília M. Arruda <emiliaarruda AT gmail DOT com> wrote:
Hello Craig,

Which Bacula version are you using?

Best regards,
Ana

On Fri, Sep 11, 2015 at 4:14 PM, Craig Shiroma <shiroma.craig.2 AT gmail DOT com> wrote:
My apologies.  I hit the send button before entering a subject.

On Fri, Sep 11, 2015 at 9:13 AM, Craig Shiroma <shiroma.craig.2 AT gmail DOT com> wrote:
Hello All,

I'm getting the following problem occasionally:
2015-09-10 23:47:24 bacula-dir JobId 140080: Fatal error: JobId 139901 already running. Duplicate job not allowed.

Due to this type of error:
2015-09-10 23:47:22 bacula-dir JobId 139901: Fatal error: sql_create.c:870 Fill File table Query failed: INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq) SELECT batch.FileIndex, batch.JobId, Path.PathId, Filename.FilenameId, batch.LStat, batch.MD5, batch.DeltaSeq FROM batch JOIN Path ON (batch.Path = Path.Path) JOIN Filename ON (batch.Name = Filename.Name): ERR=Lock wait timeout exceeded; try restarting transaction
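For anyone hitting the same error, the current timeout and the transactions holding locks at the time of the failure can be inspected from the MySQL prompt. These are standard InnoDB diagnostics, nothing Bacula-specific:

```sql
-- Current lock wait timeout in seconds (the InnoDB default is 50)
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';

-- Long-running or blocking transactions at this moment
SELECT trx_id, trx_state, trx_started, trx_query
  FROM information_schema.INNODB_TRX
 ORDER BY trx_started;
```

A transaction that has been open since before the batch insert started is the likely blocker.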

This happens when a database lock has timed out on a backup and the job is rescheduled.  For some reason, a new job seems to start as soon as the error is detected.  I posted about this issue earlier and someone mentioned it is happening because I configured Bacula to do that (or at least that's the impression I got from the reply).  Would anyone know which config setting would start a new job for the client backup when an error like a lock timeout is detected?  So far, I've only found settings for rescheduling, not restarting, such as the ones below:

    Reschedule Interval = 1 hour
    Reschedule Times = 3
    Cancel Lower Level Duplicates = yes
    Allow Duplicate Jobs = no

Obviously, the backup is getting canceled because of the last two settings above.  But what setting is causing a new job to be created when a lock timeout error is detected and the log says the job has been rescheduled to run an hour later?
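For anyone comparing notes, all of the reschedule and duplicate-control directives live in the Job resource of bacula-dir.conf. A minimal sketch showing how they fit together -- the job, client, and fileset names are placeholders, and required directives not relevant here (Pool, Storage, Messages, etc.) are omitted:

```
# bacula-dir.conf -- Job resource (names are placeholders)
Job {
  Name = "nightly-backup"
  Client = "client-fd"
  FileSet = "full-set"
  Type = Backup

  # Retry a failed job up to 3 times, one hour apart
  Reschedule On Error = yes
  Reschedule Interval = 1 hour
  Reschedule Times = 3

  # Refuse a second instance of this job while one is queued or running
  Allow Duplicate Jobs = no
  Cancel Lower Level Duplicates = yes
}
```

With Allow Duplicate Jobs = no, a rescheduled instance that overlaps a still-queued one is rejected with exactly the "Duplicate job not allowed" fatal error shown above.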

I realize it appears I may need to do some database fixing/tuning.  But my immediate question is why a new job is being created when one has already been rescheduled.

Regards,
-craig


------------------------------------------------------------------------------

_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users




