Bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-06 12:03:48
Subject: Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time
From: ebollengier <eric AT eb.homelinux DOT org>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 6 Apr 2010 09:01:13 -0700 (PDT)
Hello Graham,


Graham Keeling wrote:
> 
> Hello,
> I'm using bacula-5.0.1.
> I have a 2.33GHz CPU with 2G of RAM.
> I am using MySQL.
> I had a VirtualFull scheduled for my client.
> 
> My log says the following:
> 
> Apr  4 18:56:02  Start Virtual Backup JobId 56,
> Job=Linux:cvs.2010-04-04_18.56.00_03
> Apr  4 18:56:02  This Job is not an Accurate backup so is not equivalent
> to a Full backup.
> Apr  6 03:56:12  Bootstrap records written to
> /var/lib/bacula/backup.dev.equiinet.com-.restore.1.bsr
> Apr  6 10:23:20  Ready to read from volume "backup-0002" on device "Tower
> 2.0" (/write/mnt/Tower 2).
> Apr  6 10:23:20  Labeled new Volume "backup-0046" on device "Tower 2.1"
> (/write/mnt/Tower 2).
> ...and now, backup-0046 is being written to.
> 
> At about Apr 6 10:00, I was logged on to the director, and attempting to
> run
> a 'status, director' command on bconsole.
> bconsole printed a few bits of information, then paused. Presumably
> because JobId 56 had locked the database. After 10:23:20, the command
> printed
> the rest of its information and gave me the prompt back.
> 
> At about Apr 6 10:00, I logged into mysql, and ran 'show full
> processlist;',
> which gave the following.
> 
> | Id   | User | Host      | db     | Command | Time  | State     | Info
> | 3032 | root | localhost | bacula | Query   | 22464 | Locked    | UPDATE
> Job SE
> T JobStatus='R',Level='F',StartTime='2010-04-06
> 03:56:14',ClientId=2,JobTDate=12
> 70522574,PoolId=7,FileSetId=2 WHERE JobId=56                                  
>   
> | 3033 | root | localhost | bacula | Query   | 22464 | executing | SELECT
> Path.P
> ath, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM ( SELECT
> FileId,
>  Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.FilenameId AS
> Filena
> meId, LStat, MD5 FROM Job, File, ( SELECT MAX(JobTDate) AS JobTDate,
> PathId, Fil
> enameId FROM ( SELECT JobTDate, PathId, FilenameId FROM File JOIN Job
> USING (Job
> Id) WHERE File.JobId IN (22,23,31,34,42,48,52) UNION ALL SELECT JobTDate,
> PathId
> , FilenameId FROM BaseFiles JOIN File USING (FileId) JOIN Job  ON   
> (BaseJobId 
> = Job.JobId) WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52) ) AS tmp
> GROUP BY P
> athId, FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId
> FROM B
> aseFiles WHERE JobId IN (22,23,31,34,42,48,52)) OR Job.JobId IN
> (22,23,31,34,42,
> 48,52)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND
> T1.PathId 
> = File.PathId AND T1.FilenameId = File.FilenameId ) AS Temp JOIN Filename
> ON (Fi
> lename.FilenameId = Temp.FilenameId) JOIN Path ON (Path.PathId =
> Temp.PathId) WH
> ERE FileIndex > 0 ORDER BY Temp.JobId, FileIndex ASC | 
> 
> 'Time 22464' is equivalent to about 6 hours, so I presume that after
> 'Bootstrap
> records written', bacula has been waiting for this sql command, during
> which
> time it could not do anything else. Backups of other clients were failing
> because I had Max Wait Time set to something less than 6 hours.
> 
> The following is a listing of the number of files that each previous
> backup
> said it had backed up, which doesn't seem extraordinarily large to my
> mind.
> 
> Full: 810,400
> Incr:  52,487
> Incr:  52,485
> Incr:  52,485
> Incr: 332,649
> Incr:  52,492
> Incr:  52,593
> Incr: 332,677
> Incr:  52,497
> Incr:  52,497
> Incr:  52,497
> 
> 
> So, my questions are:
> 
> Does it really need to take so long?
> Is it expected that this Job blocks everything else for 6 hours?
> 

I think that this part of the code doesn't use a dedicated connection, it
would be nice to
improve this area. (other part of the code are using a specific connection
to do this work)


Graham Keeling wrote:
> 
> Is there anything I can do that would speed it up?
> Perhaps even more importantly, what was it doing for the 33 hour period
> between Apr 4 18:56:02 and Apr 6 03:56:12? Was it just writing the
> bootstrap
> records?
> 
> On the Path table, I have these:
>    PRIMARY KEY(PathId),
>    INDEX (Path(255))
> 
> On the File table, I have these:
>    PRIMARY KEY(FileId),
>    INDEX (JobId, PathId, FilenameId),
>    INDEX (PathId),
>    INDEX (FilenameId)
> 
> On the Job table, I have these:
>    PRIMARY KEY(JobId),
>    INDEX (Name(128)),
>    INDEX (ClientId)
> 

Did you run tests with the original schema? It looks like you removed
important index such as the one
on JobId and you added others, and i'm not sure that MySQL can play properly
with composed indexes (it doesn't looks like) in this treatment.

I've made tests with 400,000,000 files and 40,000,000 filenames over 600
jobs, and it doesn't run fast as PostgreSQL (factor 6 between both engines),
but it should handle the job rather quickly. (on my pc, for 2M files, it's
between 1min and 1min30 (15seconds with PostgreSQL))

Bye

-- 
View this message in context: 
http://old.nabble.com/VirtualFull-mysql-query-blocks-other-jobs-for-a-long-time-tp28149748p28153906.html
Sent from the Bacula - Users mailing list archive at Nabble.com.


------------------------------------------------------------------------------
Download Intel&#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users