Subject: Re: [Bacula-users] Catalog backup while job running?
From: Stephen Thompson <stephen AT seismo.berkeley DOT edu>
To: Martin Simmons <martin AT lispworks DOT com>
Date: Tue, 03 Apr 2012 07:40:34 -0700

On 4/3/12 3:28 AM, Martin Simmons wrote:
>>>>>> On Mon, 02 Apr 2012 15:06:31 -0700, Stephen Thompson said:
>>
>>>> That aside, I'm seeing something unexpected.  I am now able to
>>>> successfully run jobs while I use mysqldump to dump the bacula Catalog,
>>>> except at the very end of the dump there is some sort of contention.  A
>>>> few of my jobs (3-4 out of 150) that are attempting to despool
>>>> attributes at the tail end of the dump yield this error:
>>>>
>>>> Fatal error: sql_create.c:860 Fill File table Query failed: INSERT INTO
>>>> File (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq) SELECT
>>>> batch.FileIndex, batch.JobId, Path.PathId,
>>>> Filename.FilenameId,batch.LStat, batch.MD5, batch.DeltaSeq FROM batch
>>>> JOIN Path ON (batch.Path = Path.Path) JOIN Filename ON (batch.Name =
>>>> Filename.Name): ERR=Lock wait timeout exceeded; try restarting transaction
>>>>
>>>> I have successful jobs before and after this 'end of the dump' timeframe.
>>>>
>>>> It looks like I might be able to "fix" this by increasing my
>>>> innodb_lock_wait_timeout, but I'd like to understand WHY I need to
>>>> increase it.  Anyone know what's happening at the end of a dump like
>>>> this that would cause the above error?
>>>>
>>>> mysqldump -f --opt --skip-lock-tables --single-transaction bacula \
>>>>     >> bacula.sql
>>>>
>>>> Is it the commit on this 'dump' transaction?
>>>
>>> --skip-lock-tables is referred to in the mysqldump documentation, but
>>> isn't actually a valid option.  This is actually an increasingly
>>> horrible problem with mysqldump.  It has been very poorly maintained,
>>> and has barely developed at all in ten or fifteen years.
>>>
>>
>> This has me confused.  I have jobs that can run, and insert records into
>> the File table, while I am dumping the Catalog.  It's only at the
>> tail-end that a few jobs get the error above.  Wouldn't a locked File
>> table cause all concurrent jobs to fail?
>
> Are you sure that jobs are inserting records into the File table whilst they
> are running?  With spooling, file records are not inserted until the end of
> the job.
>
> Likewise, in batch mode (as above), the File table is only updated once at the
> end.
>

Yes, I have jobs completing successfully both before and after the 
problem jobs.  The failing jobs aren't always the same ones, and don't 
fail at the same time of day; what they do correlate with is the end of 
the Catalog dump, which is effectively the end of the File table dump, 
since that table is 99% of the database.
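
For what it's worth, a quick sanity check that the File table really 
does dominate the dump (and hence its tail end) is something like:

    SELECT table_name,
           ROUND((data_length + index_length) / 1024 / 1024) AS size_mb
      FROM information_schema.TABLES
     WHERE table_schema = 'bacula'
     ORDER BY size_mb DESC;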

I can view the inserted records from jobs that complete while the 
Catalog dump is running.  And I am spooling, so jobs insert all of 
their attributes at the end of the job.  The jobs with the errors are 
clearly moving their records from the batch table to the File table at 
the conclusion of their run.
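
If it happens again, I'll try to catch the blocker in the act.  
Something like this should show who is waiting on whom; untested on my 
end, and it assumes MySQL 5.5 (or 5.1 with the InnoDB plugin), so that 
the InnoDB lock tables exist in information_schema:

    SELECT r.trx_mysql_thread_id AS waiting_thread,
           r.trx_query           AS waiting_query,
           b.trx_mysql_thread_id AS blocking_thread,
           b.trx_query           AS blocking_query
      FROM information_schema.INNODB_LOCK_WAITS w
      JOIN information_schema.INNODB_TRX b ON b.trx_id = w.blocking_trx_id
      JOIN information_schema.INNODB_TRX r ON r.trx_id = w.requesting_trx_id;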

I never saw this before moving to InnoDB, but of course I moved to 
InnoDB precisely to be able to run my Catalog dump concurrently with 
jobs (knowing I won't capture the records from jobs still running).  So 
at this point I'm not sure whether I'm getting the error because of 
something happening at the end of the dump, or whether it's merely a 
'collision' of jobs all wanting to insert batch records at the same 
time.  I know that the InnoDB engine has a lock wait timeout that 
defaults to 50s, but I'm not sure how this was handled with MyISAM, 
where I never saw this problem (then again, I also never ran jobs 
concurrently with the dump).
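
In the meantime, raising the timeout is easy enough to sketch; the 300s 
below is just a guess on my part, not a recommendation, and note that 
before MySQL 5.5 the variable isn't settable at runtime, so on older 
servers it has to go in my.cnf and takes effect on restart:

    -- check the current value (the default is 50)
    SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';

    -- give despooling jobs more headroom while the dump commits
    SET GLOBAL innodb_lock_wait_timeout = 300;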

Stephen




> __Martin

-- 
Stephen Thompson               Berkeley Seismological Laboratory
stephen AT seismo.berkeley DOT edu    215 McCone Hall # 4760
404.538.7077 (phone)           University of California, Berkeley
510.643.5811 (fax)             Berkeley, CA 94720-4760
