Bacula-users

Re: [Bacula-users] Multiple full backups in same month

2015-06-25 15:04:53
Subject: Re: [Bacula-users] Multiple full backups in same month
From: Silver Salonen <silver.salonen AT gmail DOT com>
To: Mike Ruskai <thannyd AT earthlink DOT net>
Date: Thu, 25 Jun 2015 22:03:14 +0300
On Thu, Jun 25, 2015 at 8:57 PM, Mike Ruskai <thannyd AT earthlink DOT net> wrote:
On 6/25/2015 10:21 AM, Silver Salonen wrote:
On 06/25/2015 05:06 PM, Rodrigo Abrantes Antunes wrote:

Quoting Silver Salonen <silver.salonen AT gmail DOT com>:

On 06/25/2015 03:05 PM, Rodrigo Abrantes Antunes wrote:

Quoting Heitor Faria <heitor AT bacula.com DOT br>:

Hi, my Bacula server ran out of space on May 2, so the jobs had to wait in the queue. On May 3 Bacula should have run a full backup, but because of the space problem the job stayed in the queue and then failed. The following jobs started entering the queue too, ending up in a long queue.


I solved the space problem on May 23 and the queue started to run again. On May 23 it started an incremental job that had been scheduled to run on May 6. Since the full backup job had failed and left the queue on May 3, this incremental was upgraded to full. When that job finished, the next one in the queue started, an incremental that had been scheduled to run on May 7.

Now the problem:
This new incremental backup was upgraded to full too, even though the previous one was full. It appears that Bacula considered the date of the jobs in the queue, not the date of the last full job, when deciding whether to upgrade. As a result Bacula upgraded every incremental backup in the queue to full, up to May 23 (when I solved the space problem and the full backup was done), resulting in about 17 full backups in the same month.

Is this normal behaviour? Shouldn't Bacula look for the last full backup as of the date the job is running, rather than as of the date the job was scheduled?

Hello Rodrigo: yes, this seems accurate. Bacula can only perform incremental backups if a backup of the specific client, with the given FileSet, has already terminated successfully.
As for the duplicate jobs that stalled, you can avoid them with the Allow Duplicate Jobs = no directive.

OK, I understand that Bacula needs a full backup to perform incrementals; that's why it upgraded the incremental to full.

But why did it upgrade the other incrementals in the queue if the first incremental had already been upgraded to full?

How does it decide whether the next ones should be upgraded? Why didn't it look for the last full backup as of the current date, instead of as of the date the job was scheduled?


As I stated in my other e-mail, this is related to "Rerun Failed Levels = yes".

Here's what I assume happened:

  1. 03.May - Full backup is queued
  2. 04.May - Incremental backup is queued.
    The check finds that the previous Full backup (from 03.May) has not completed successfully, so the job is upgraded to Full.
  3. 05.May - Incremental backup is queued and upgraded to Full again.
  4. Full backup from 03.May is completed.
  5. Full backup (upgraded from Incr) from 04.May is started.
  6. Full backup (upgraded from Incr) from 04.May is completed.
  7. Full backup (upgraded from Incr) from 05.May is started.
  8. etc


Was it like that? :)

--
Silver

According to the logs this is what happened (all on the same server):

1. 03.May - Full backup is queued (Job14809 - backup_server1.2015-05-03_01.00.00_54)
2. 04.May - Incremental backup is queued. (Job14828 - backup_server1.2015-05-04_01.00.03_13) (In the logs there is no mention of it being upgraded to full here, I think it is upgraded only when it runs)
3. 05.May - Incremental backup is queued. (Job14847 - backup_server1.2015-05-05_01.00.00_32) (In the logs there is no mention of it being upgraded to full here, I think it is upgraded only when it runs)
4. 06.May - Incremental backup is queued. (Job14866 - backup_server1.2015-05-06_01.00.01_00) (In the logs there is no mention of it being upgraded to full here, I think it is upgraded only when it runs)
5. 07.May - Incremental backup is queued. (Job14885 - backup_server1.2015-05-07_01.00.00_31) (In the logs there is no mention of it being upgraded to full here, I think it is upgraded only when it runs)
6. 09.May - Job14809 exceeded max waiting time and failed
7. 22.May - Job14828 exceeded max waiting time and failed
8. 22.May - Job14847 exceeded max waiting time and failed
9. 23.May - Job14866 is upgraded to full and successfully done (Here there is mention of it being upgraded)
10. 23.May - Job14885 is upgraded to full and successfully done (Here there is mention of it being upgraded)

The question now is: does Bacula decide whether to upgrade a job when it queues the job or when it starts it? According to the logs above, I think it is when the job starts.
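To make the two possibilities concrete, here is a toy model in Python. It is only a sketch of the two hypotheses being discussed, not Bacula's actual code: the Job class, the run_queue function and the decide_at parameter are made up for illustration, and it assumes that no usable Full exists when the stalled queue starts draining and that every job that runs succeeds.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    level: str  # "Full" or "Incremental", as scheduled


def run_queue(jobs, decide_at):
    """Return (name, level actually run) for each job, in queue order.

    decide_at="queue": the level is fixed when the job is queued, i.e.
                       before any job in the stalled queue has run.
    decide_at="start": the level is checked again when the job starts.
    Hypothetical model: assumes no usable Full exists at the beginning.
    """
    if decide_at not in ("queue", "start"):
        raise ValueError("decide_at must be 'queue' or 'start'")

    have_full = False
    results = []
    for job in jobs:
        level = job.level
        if level == "Incremental":
            if decide_at == "queue":
                # Decided while everything was still waiting: no Full yet.
                level = "Full"
            elif not have_full:
                # Decided at start: upgrade only if still no Full exists.
                level = "Full"
        if level == "Full":
            have_full = True  # this run succeeds and becomes the new base
        results.append((job.name, level))
    return results


# The two jobs that finally ran on 23.May, both scheduled as Incremental.
queued = [Job("Job14866", "Incremental"), Job("Job14885", "Incremental")]
print(run_queue(queued, "queue"))  # both run as Full
print(run_queue(queued, "start"))  # only the first runs as Full
```

In this model only the queue-time decision turns every waiting incremental into a full. The logs above report the upgrade at start time and still show two fulls in a row, so the real check evidently does not behave like the simple "start" branch here.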


To my mind it's upgraded when it's queued... I hope I'm wrong :)


I don't think you are, but it's as stated above - you don't know it's been upgraded until the job starts running. 

I've had several cases, due to varying circumstances, where the only full backup was running while incrementals of the same job became queued.  In the director status, they show up as incremental.  Only when they start running do they get upgraded to full, even though by the time that happens, the full has successfully completed, and there's no reason to promote them.  So on the occasions where I need to start from scratch (usually due to storage changes that can't be done on the fly for one reason or another), I now know to watch for and cancel any incrementals that start while the full is still running.  Which is, of course, a hassle.  But running out of space because you have too many full backups is a worse hassle.
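The manual check described above (find incrementals that are waiting while a full of the same job is still running) could be sketched like this. The list-of-dicts layout, the key names and the job ids are assumptions made up for the example; the sketch does not parse real director status output, it only prints the console cancel commands one would then issue by hand.

```python
def incrementals_to_cancel(jobs):
    """Pick waiting Incrementals whose job also has a Full still running.

    `jobs` is a list of dicts with the hypothetical keys
    "jobid", "name", "level" and "status" ("running" or "waiting").
    """
    # Names of jobs that currently have a Full in progress.
    running_fulls = {
        j["name"] for j in jobs
        if j["level"] == "Full" and j["status"] == "running"
    }
    # Waiting Incrementals of those same jobs are the ones to cancel.
    return [
        j["jobid"] for j in jobs
        if j["level"] == "Incremental"
        and j["status"] == "waiting"
        and j["name"] in running_fulls
    ]


# Made-up example data, not taken from any real log.
status = [
    {"jobid": 101, "name": "backup_server1", "level": "Full", "status": "running"},
    {"jobid": 102, "name": "backup_server1", "level": "Incremental", "status": "waiting"},
    {"jobid": 103, "name": "backup_server2", "level": "Incremental", "status": "waiting"},
]
for jobid in incrementals_to_cancel(status):
    print(f"cancel jobid={jobid}")
```

Only the incremental of the job whose full is still running is selected; the waiting incremental of the other job is left alone.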

I'd be glad to see this algorithm flaw rectified.

Maybe that small change, together with the other duplicate job control directives, could solve such issues once and for all.

I wonder if anyone has requested such a change? I couldn't find any such request on bugs.bacula.org at the moment.

--
Silver