Re: [Bacula-users] Error with DR backups

2009-10-29 17:07:09
Subject: Re: [Bacula-users] Error with DR backups
From: DAve <dave.list AT pixelhammer DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 29 Oct 2009 17:02:35 -0400
DAve wrote:
> Ryan Novosielski wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> DAve wrote:
>>> DAve wrote:
>>>> Ryan Novosielski wrote:
>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>> Hash: SHA1
>>>>>
>>>>> DAve wrote:
>>>>>> DAve wrote:
>>>>>>> DAve wrote:
>>>>>>>> DAve wrote:
>>>>>>>>> Good afternoon.
>>>>>>>>>
>>>>>>>>> I am having a recurring issue with a backup that is configured for DR 
>>>>>>>>> purposes. The client purchased a fixed amount of space and wants to 
>>>>>>>>> overwrite the volumes each night. They have a local backup system in 
>>>>>>>>> place and we are using Bacula to get those backups offsite for the 
>>>>>>>>> evening only. I set up Bacula to use a number of volumes of fixed
>>>>>>>>> size, and the volumes are written over each night.
>>>>>>>>>
>>>>>>>>> Everything worked fine for a period and then began producing an
>>>>>>>>> error. There have been days when the error does not occur and I can
>>>>>>>>> see nothing different.
>>>>>>>>>
>>>>>>>>> I am putting the client's config below and the larger backup output
>>>>>>>>> and media list online at these URLs.
>>>>>>>>>
>>>>>>>>> Job Output
>>>>>>>>> http://pixelhammer.com/Backup-allied-ex3-fd%20Full.txt
>>>>>>>>>
>>>>>>>>> bconsole media list
>>>>>>>>> http://pixelhammer.com/allied-media.txt
>>>>>>>>>
>>>>>>>>> The error I am seeing,
>>>>>>>>> 05-Oct 08:38 director-dir: Allied-ex3.2009-10-05_01.00.02 Warning: Error updating job record. sql_update.c:194 Update problem: affected_rows=0
>>>>>>>>> 05-Oct 08:38 director-dir: Allied-ex3.2009-10-05_01.00.02 Warning: Error getting job record for stats: sql_get.c:293 No Job found for JobId 20126
>>>>>>>>> 05-Oct 08:38 director-dir: Allied-ex3.2009-10-05_01.00.02 Error: Bacula 2.0.3 (06Mar07): 05-Oct-2009 08:38:53
>>>>>>>>>
>>>>>>>>> The client config,
>>>>>>>>> Job {
>>>>>>>>>    Name = "Allied-ex3"
>>>>>>>>>    FileSet = "Allied-ex3"
>>>>>>>>>    Write Bootstrap = "/data/backups/Allied-ex3.bsr"
>>>>>>>>>    Type = Backup
>>>>>>>>>    Level = Full
>>>>>>>>>    Client = allied-ex3-fd
>>>>>>>>>    Schedule = "Allied-ex3"
>>>>>>>>>    Storage = storage2-allied-ex3
>>>>>>>>>    Messages = Allied
>>>>>>>>>    Pool = ex3-allied-Pool
>>>>>>>>>    Priority = 10
>>>>>>>>>    #Enabled = No
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>> FileSet {
>>>>>>>>>    Name = "Allied-ex3"
>>>>>>>>>    Enable VSS = no
>>>>>>>>>    Include {
>>>>>>>>>        Options {
>>>>>>>>>              #compression = gzip
>>>>>>>>>              IgnoreCase = yes
>>>>>>>>>                 }
>>>>>>>>>        File = "D:/archivesink/"
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>    Exclude {
>>>>>>>>>            }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Schedule {
>>>>>>>>>    Name = "Allied-ex3"
>>>>>>>>>    Run = Level=Full FullPool=ex3-allied-Pool mon-sun at 01:00
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>> Client {
>>>>>>>>>    Name = allied-ex3-fd
>>>>>>>>>    Address = xxx.xxx.105.12
>>>>>>>>>    FDPort = 49202
>>>>>>>>>    Catalog = DataVault
>>>>>>>>>    Password = "xx"
>>>>>>>>>    File Retention = 1 week
>>>>>>>>>    Job Retention = 1 week
>>>>>>>>>    AutoPrune = yes
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>> Storage {
>>>>>>>>>    Name = storage2-allied-ex3
>>>>>>>>>    Address = xxx.tls.net
>>>>>>>>>    SDPort = 49022
>>>>>>>>>    Password = "xx"
>>>>>>>>>    Device = FileStorage-allied-ex3
>>>>>>>>>    Media Type = File
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>> Pool {
>>>>>>>>>    Name = ex3-allied-Pool
>>>>>>>>>    Pool Type = Backup
>>>>>>>>>    LabelFormat = "ex3-allied-"
>>>>>>>>>    Recycle = yes
>>>>>>>>>    Recycle Oldest Volume = yes
>>>>>>>>>    Purge Oldest Volume = yes
>>>>>>>>>    Volume Retention = 12 hours
>>>>>>>>>    Maximum Volumes = 60
>>>>>>>>>    Maximum Volume Jobs = 0
>>>>>>>>>    Maximum Volume Bytes = 1G
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>> I am reasonably certain the problem is PEBKAC and my understanding of 
>>>>>>>>> pruning and retention. I cannot see where I have gone wrong.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> DAve
>>>>>>>> Hmmm, I have a second client configured in the same manner. The only
>>>>>>>> difference is that the second client has 240 1 GB volumes instead of
>>>>>>>> 60 1 GB volumes. The configs are otherwise identical and the larger
>>>>>>>> client has no issues. Both backup jobs start and finish within 10
>>>>>>>> minutes of each other, yet the smaller backup has its job purged and
>>>>>>>> the larger backup does not.
>>>>>>>>
>>>>>>>> Still digging.
>>>>>>>>
>>>>>>>> DAve
>>>>>>>>
>>>>>>> Changed the pool resource to not autoprune and the error was the same 
>>>>>>> last night.
>>>>>>>
>>>>>>> Pool {
>>>>>>>    Name = ex3-allied-Pool
>>>>>>>    Pool Type = Backup
>>>>>>>    LabelFormat = "ex3-allied-"
>>>>>>>    Recycle = yes
>>>>>>>    Recycle Oldest Volume = yes
>>>>>>>    Purge Oldest Volume = yes
>>>>>>>    AutoPrune = no
>>>>>>>    Volume Retention = 12 hours
>>>>>>>    Maximum Volumes = 60
>>>>>>>    Maximum Volume Jobs = 0
>>>>>>>    Maximum Volume Bytes = 1G
>>>>>>>    }
>>>>>>>
>>>>>>> The larger client mentioned above, again, no problems. If I have "Job 
>>>>>>> Retention = 1 week" then why is my current job not found in the catalog?
>>>>>>>
>>>>>>>  From the manual,
>>>>>>>
>>>>>>> "Job Retention = <time-period-specification> The Job Retention directive
>>>>>>> defines the length of time that Bacula will keep Job records in
>>>>>>> the Catalog database after the Job End time. When this time period
>>>>>>> expires, and if AutoPrune is set to yes Bacula will prune (remove)
>>>>>>> Job records that are older than the specified File Retention period.
>>>>>>> As with the other retention periods, this affects only records in the
>>>>>>> catalog and not data in your archive backup."
>>>>>>>
>>>>>>> And the error clearly states "No Job found for JobId 20126", when the 
>>>>>>> job is still running.
>>>>>>>
>>>>>>> The only mention I ever seem to find of this error is a recent post by
>>>>>>> Joshua J. Kugler, with no solution other than that his issue went away
>>>>>>> and he will keep an eye on it until it returns.
>>>>>>>
>>>>>>> DAve
>>>>>>>
>>>>>> I find nothing different between the two configs that would explain why 
>>>>>> one works and the other does not. So I created a new media pool called 
>>>>>> exch3-allied-Pool, changed the name of the pool resource to the new name 
>>>>>> in the director config, added volumes same as before, and now I have 
>>>>>> been running without error since the 7th.
>>>>>>
>>>>>> I am at a loss to understand why. I need to go into SQL and see what is 
>>>>>> different as doing "list media pool=exch3-allied-Pool" shows no 
>>>>>> difference between the pools.
>>>>> The "show" commands provide more information. Bear in mind, also, that
>>>>> volume property changes must be updated on the individual volumes. If
>>>>> you change them in the config file, they will affect new volume
>>>>> defaults, not current ones. I do not know if this applies to your
>>>>> situation or not, but "show" should tell you.
>>>>>
>>>> I will check into that. I have not removed the old pool.
>>>>
>>>> I do an update pool from resource any time I make a change to the
>>>> config. My understanding from the manual is that this is the proper
>>>> thing to do. Never had a problem before with updating pools.
>>>>
>>>> DAve
>>>>
>>> Odd, whenever I try to do a show pools, the director crashes. That is
>>> the only time it has ever done so, through at least three different
>>> versions of Bacula.
>>>
>>> But even more strange, after I created a new pool the problem went away,
>>> as I stated earlier. I created the new pool on Oct 7th and saw no further
>>> errors until Oct 26th, when the error returned.
>>>
>>> 29-Oct 07:09 director-dir: Allied-ex3.2009-10-29_01.00.02 Warning: Error updating job record. sql_update.c:194 Update problem: affected_rows=0
>>> 29-Oct 07:09 director-dir: Allied-ex3.2009-10-29_01.00.02 Warning: Error getting job record for stats: sql_get.c:293 No Job found for JobId 20580
>>> 29-Oct 07:09 director-dir: Allied-ex3.2009-10-29_01.00.02 Error: Bacula 2.0.3 (06Mar07):
>>>
>>> I am unable to understand what is going wrong. As each backup is a full
>>> backup, every night, and the volumes expire in less than 24 hours, why
>>> did this work for 18 days?
>>>
>>> Still, the backup of the second server for that client has always
>>> worked and is still working with no changes to its config.
>>>
>>> I am stumped.
>> Have you run a consistency check on your DB, either via mysqlcheck or
>> dbcheck or, preferably, both? I suspect something may be awry.
> 
> Yep,
> 
> [root@director /usr/local/etc]# mysqlcheck -uroot -p bacula
> Enter password:
> bacula.BaseFiles                                   OK
> bacula.CDImages                                    OK
> bacula.Client                                      OK
> bacula.Counters                                    OK
> bacula.Device                                      OK
> bacula.File                                        OK
> bacula.FileSet                                     OK
> bacula.Filename                                    OK
> bacula.Job                                         OK
> bacula.JobMedia                                    OK
> bacula.Location                                    OK
> bacula.LocationLog                                 OK
> bacula.Log                                         OK
> bacula.Media                                       OK
> bacula.MediaType                                   OK
> bacula.Path                                        OK
> bacula.Pool                                        OK
> bacula.Status                                      OK
> bacula.Storage                                     OK
> bacula.UnsavedFiles                                OK
> bacula.Version                                     OK
> 
> 
> Running dbcheck now.
> 
> DAve

Ran dbcheck. The first run found several paths to correct and also
orphaned files. Results from the second run are below; everything looks good.

Select function number: 16
Checking for Filenames with a trailing slash
Found 0 bad Filename records.
Checking for Paths without a trailing slash
Found 0 bad Path records.
Checking for duplicate Filename entries.
Found 0 duplicate Filename records.
Checking for duplicate Path entries.
Found 0 duplicate Path records.
Checking for orphaned JobMedia entries.
Checking for orphaned File entries. This may take some time!
Checking for orphaned Path entries. This may take some time!
Terminated

Reran mysqlcheck with the quick (-q) and repair (-r) options:

# mysqlcheck -q -r -uroot -p bacula
Enter password:
bacula.BaseFiles                                   OK
bacula.CDImages                                    OK
bacula.Client                                      OK
bacula.Counters                                    OK
bacula.Device                                      OK
bacula.File                                        OK
bacula.FileSet                                     OK
bacula.Filename                                    OK
bacula.Job                                         OK
bacula.JobMedia                                    OK
bacula.Location                                    OK
bacula.LocationLog                                 OK
bacula.Log                                         OK
bacula.Media                                       OK
bacula.MediaType                                   OK
bacula.Path                                        OK
bacula.Pool                                        OK
bacula.Status                                      OK
bacula.Storage                                     OK
bacula.UnsavedFiles                                OK
bacula.Version                                     OK

I will see what happens tonight.

DAve
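
For anyone tracking how often this recurs, here is a minimal sketch (not part of Bacula) that pulls the affected JobIds out of saved job output. It assumes the log text looks like the warnings quoted in this thread, without the ">" quote prefixes; the function name find_missing_jobs is hypothetical, and the pattern is only known to match the lines shown above.

```python
import re

# Pattern modelled on the warning quoted in this thread; other Bacula
# versions may word the message differently (assumption).
NO_JOB_RE = re.compile(
    r"(\d{2}-[A-Za-z]{3} \d{2}:\d{2}) (\S+): (\S+) "
    r"Warning: Error getting job record for stats: "
    r"\S+ No Job found for JobId (\d+)"
)


def find_missing_jobs(log_text):
    """Return (timestamp, job name, JobId) for each 'No Job found' warning."""
    # Mail clients wrap long log lines, so collapse all whitespace runs
    # (including newlines) to single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        (m.group(1), m.group(3), int(m.group(4)))
        for m in NO_JOB_RE.finditer(flat)
    ]


if __name__ == "__main__":
    import sys

    # Usage: python find_missing_jobs.py < job-output.txt
    for stamp, job, jobid in find_missing_jobs(sys.stdin.read()):
        print(stamp, job, jobid)
```

Comparing the reported JobIds and dates against the media list for the pool may help show whether the failures line up with volumes being recycled during the run.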


-- 
"Posterity, you will know how much it cost the present generation to
preserve your freedom.  I hope you will make good use of it.  If you
do not, I shall repent in heaven that ever I took half the pains to
preserve it." John Quincy Adams

http://appleseedinfo.org

