Subject: Re: [Bacula-users] Bacula MySQL Catalog binlog restore
From: Joe Nyland <joe AT joenyland.co DOT uk>
To: Bacula Users <bacula-users AT lists.sourceforge DOT net>
Date: Fri, 13 Apr 2012 21:47:04 +0100
On 13 Apr 2012, at 18:14, Martin Simmons wrote:

>>>>>> On Wed, 11 Apr 2012 11:09:12 +0100, Joe Nyland said:
>> 
>> Ok, firstly: sorry for not following up sooner. There have been several
>> replies for this thread since my last reply, so thanks everyone for your
>> input.
>> 
>> Phil, you point out that the problem here is most likely to do with the fact
>> that other jobs are running whilst the full DB backup is being run and I
>> agree with this.
>> 
>> Now, to me, the simple resolution to this is to assign a backup window for
>> the catalog backup to run in, when no other backups are running. There are,
>> however, two issues I can foresee: 
>> - The MySQL backup script is being called by a Bacula job as a run-before
>> script, so there will always be a job 'running' whilst the catalog dump is
>> being performed. 
> 
> Should be OK because Bacula isn't in the middle of inserting File records for
> that job when the before script runs.
> 
> 
>> - One backup job in particular takes approx 24-48 hrs to complete, so this
>> will be constantly updating the catalog with file records, no matter when I
>> schedule the full catalog backup for. 
> 
> Yes, that will be a problem.
> 
> 
>> I'm therefore back to square one.
>> 
>> Whilst I understand the complications with temporary tables (and threads)
>> that Phil has pointed out, I do agree with what Martin was suggesting that
>> there may be a better way to use temporary tables. I don't, however, have
>> enough MySQL knowledge to provide any further suggestions to improve this in
>> later versions of Bacula.
>> 
>> Stephen, yes, I am running the backup whilst other backups are running, for
>> reasons stated above. You are right - if there were no jobs running, the
>> temp tables wouldn't be referred to in the first place, as they would not
>> exist.
>> 
>> Your suggestion of keeping a 'buffer' of bin logs is interesting and this
>> was the last thing I tried to resolve this problem with. I basically removed
>> the '--delete-master-logs' option from the mysqldump line, so that old
>> binary logs are not removed. (Side point: I'm not sure when the
>> '--delete-master-logs' actually deletes the binary logs - the mysqldump docs
>> state that it "[sends a] PURGE BINARY LOGS statement to the server __after__
>> performing the dump operation". This doesn't make sense to me, as anything
>> that's created between the dump being initiated and the dump completing will
>> be deleted from the logs and there's no guarantee it's included in the
>> dump). So I removed the '--delete-master-logs' option, in the hope that this
>> would somehow allow me to see the CREATE TEMPORARY TABLE statements earlier
>> in the binary logs, but it didn't.
>> 
>> What I didn't do that you have suggested is to restore older logs first
>> i.e. logs from before the dump, then let it work through them, skipping any
>> duplicate key errors. My issue with this workaround though, is that I fear
>> that it could corrupt the database in some way (unbeknown to me at the time
>> of restoring) if I am just 'skipping errors'.
> 
> Yes, it sounds like a risky approach.  I think you would have to filter the
> older logs to find the commands that are needed to make the newer logs work.
> 
> 
>> To me, this all does beg the question: "Why use temporary tables in the
>> first place?" Again, I may not have enough MySQL experience to realise this
>> for myself yet, but it's my view at this stage, after battling this for a
>> while now!
> 
> The main reason was probably for simplicity with multiple connections to the
> catalog (each connection automatically gets its own temporary tables).  I see
> that some of the accurate-mode code is using a different approach though, with
> a table named after the jobid to make it unique.
> 
> __Martin
> 

Martin and Phil, thanks for your responses.

I seem to have settled on the suggestion that maybe I shouldn't worry about 
these temporary tables too much. As I think Phil is trying to explain: even if 
there were a way to back up and restore these temporary tables somehow, the 
data they would bring to the database would be useless anyway - the job 
wouldn't suddenly start running again from where it left off. So for all 
intents and purposes, the information about files being backed up by the job 
running at the moment the database is dumped may as well not be in the 
database at all. After all, the temporary batch tables are only deleted later 
on in the binary logs, when the file records are actually moved to the File 
table, so to me it seems pointless to worry about the batch table references. 
(If I've misunderstood something here, please accept my apologies.)
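For anyone wanting to sanity-check their own logs, the number of batch-table statements in the decoded binlogs can be counted with `mysqlbinlog` piped through `grep`. This is only a sketch - on a real server the input would be your actual log files, so here a two-line stand-in for decoded binlog output shows the idea:

```shell
# On a real server this would be something like:
#   mysqlbinlog mysql-bin.0000* | grep -c 'TEMPORARY TABLE batch'
# (log file names are placeholders for whatever your log index lists).
# A stand-in for decoded binlog output demonstrates the grep stage:
printf 'CREATE TEMPORARY TABLE batch (x INT)\nINSERT INTO File VALUES (1)\n' \
  | grep -c 'TEMPORARY TABLE batch'   # prints 1
```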

So... I've been doing daily test restores, but with the addition of the '-f' 
flag to mysql, so that it will continue to read from the binary logs even if 
it comes across an issue.
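Concretely, the test restore boils down to something like the following. The dump and log file names are placeholders, and the last step - filtering out the expected batch-table complaints so only unexpected errors remain - is shown against a stand-in error log:

```shell
# The replay itself (run against a real server; names are placeholders):
#   mysql bacula < bacula_full.sql
#   mysqlbinlog mysql-bin.000043 mysql-bin.000044 \
#     | mysql -f bacula 2> restore_errors.log
# Afterwards, drop the expected batch-table noise; anything left over
# deserves a closer look. A one-line stand-in error log shows the idea:
printf "ERROR 1146 (42S02) at line 9: Table 'bacula.batch' doesn't exist\n" \
  > restore_errors.log
grep -v "batch' doesn't exist" restore_errors.log \
  || echo "no unexpected errors"
```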

Further to the above, all I've found it complains about is the fact that the 
bacula.batch table doesn't exist. If we apply Phil's theory from above, then 
the messages about the batch table can be 'disregarded' and the rest of the 
database restored regardless. Obviously, using the '-f' flag means that any 
other 'serious' errors will also be ignored, but they are still echoed to the 
console, so it's pretty easy to pick up on them, should they crop up during a 
restore. Like I say though, the only errors I've seen are references to the 
batch table not existing. If all else fails and I can't do a point-in-time 
restore, then it's not the end of the world - if I were restoring my Bacula 
catalog for real, I think I'd have a fair amount of other stuff on my mind at 
that point, and I'd be more worried about that than about getting a bang 
up-to-date catalog. Incremental catalog backups were more of a 
convenience/learning exercise than a strict requirement.

It would be nice to be able to do these kinds of catalog restores without 
having to force things through and risk corrupting the database, but I'm not 
sure how that could work without changing the way Bacula handles tables when 
inserting file records into the database - and I have no idea how to do that! 
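As a footnote to Martin's point about the accurate-mode code: an ordinary table named after the JobId would survive in a dump and in the binlogs, where a session temporary table does not. Purely illustrative - the real schema and naming in Bacula will differ:

```shell
# Illustrative only: build a per-job table name the way Martin describes
# the accurate-mode code doing. The column list is made up, not Bacula's.
jobid=1234
echo "CREATE TABLE batch_${jobid} (FileIndex INTEGER, Path TEXT, Name TEXT);"
```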

Again, if I've completely misunderstood anything above, please feel free to 
correct me.

Thanks,

Joe



_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users