Bacula-users

Re: [Bacula-users] question about how retention works

2013-12-20 11:48:38
Subject: Re: [Bacula-users] question about how retention works
From: Dan Langille <dan AT langille DOT org>
To: bacula-users AT lists.sourceforge DOT net
Date: Fri, 20 Dec 2013 11:45:31 -0500
On 2013-12-19 04:18 PM, Greg Woods wrote:
> On Thu, 2013-12-19 at 12:38 -0500, Dan Langille wrote:
> 
>> I set my FILE and JOB retentions high.  3 years.  Then I set my VOLUME
>> retention lower.  Whichever retention period expires first, that's the
>> one which counts. <== I will refer to that as 'first one counts' later
>> in this email.
>> 
>> Therefore, in your archive pool, set your VOLUME retention to 3 years,
>> for example.
>> 
>> And in your main pool, set it to 5 weeks, for example.
>> 
>> Does that help?
> 
> Yes. This is more-or-less what I have now. I have File Retention = Job
> Retention = 365 days. In the main pool, Volume Retention = 30 days, and
> in the archive pool, Volume Retention = 365 days. I'm just trying to
> make sure that this is really going to accomplish what I want.
> 
> The one big question remaining is how Copy jobs affect this. If I use a
> Copy job to make a copy of one or more jobs (usually all the jobs for a
> given client) from the main pool to the archive pool, what happens to
> the File records? Are there now two File records for every file on the
> backup?

Yes, I believe so.

Let's confirm this.  Let's look at this job:

154613  Incr          1    6.416 G  OK       20-Dec-13 10:20 BackupCatalog

I found that through: status storage=OverlandTapeLibrary


I know I back up just one file in that job, and here it is:

bacula=> select pathid, filenameid from file where jobid = 154613  ;
  pathid  | filenameid
---------+------------
  1320335 |    7936395
(1 row)

bacula=>

What file is that?

bacula=> select F.pathid, F.filenameid, FN.name, P.path from file F, 
filename FN, path P where F.filenameid = FN.filenameid AND P.pathid = 
F.pathid AND F.jobid = 154613  ;
  pathid  | filenameid |      name      |        path
---------+------------+----------------+--------------------
  1320335 |    7936395 | MyCatalog.dump | /usr/local/bacula/
(1 row)

bacula=>


bacula=> select count(*) from job where name = 'BackupCatalog';
  count
-------
   1956
(1 row)

bacula=>

I have 1956 jobs for that.

919 of those jobs are copy jobs:

bacula=> select count(*) from job where name = 'BackupCatalog' and 
priorjobid is not null and priorjobid != 0;
  count
-------
    919
(1 row)

How many references to that filename do we have in the file table?

bacula=> select count(*) from file where filenameid = 7936395;
  count
-------
    185
(1 row)

So that's just 185 File records which reference MyCatalog.dump... but not 
all of those belong to the job in question.  (This is an aside.)


bacula=> select count(*) from file F, job J where F.filenameid = 7936395 
and F.jobid = J.jobid and J.name = 'BackupCatalog';
  count
-------
     93
(1 row)

There... 93 jobs for BackupCatalog which still reference MyCatalog.dump.

I have a BackupCatalog job queued for Copy. Let's run that copy now.


[long time interval removed]

bacula=> select count(*) from file F, job J where F.filenameid = 7936395 
and F.jobid = J.jobid and J.name = 'BackupCatalog';
  count
-------
     94
(1 row)


So I think the answer to your question is: yes.
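
The before-and-after count above can be reproduced on a toy catalog. What follows is only a minimal sketch in Python with an in-memory SQLite database. It assumes a heavily simplified schema that keeps just the columns used in the queries above (the real Bacula catalog has many more), the JobId 154614 for the copy is made up, and run_copy is a hypothetical helper that mimics the observed result rather than what the Director does internally.

```python
import sqlite3

def make_catalog():
    """Toy catalog: only the columns used in the queries above (assumption)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE job  (jobid INTEGER PRIMARY KEY, name TEXT, priorjobid INTEGER);
        CREATE TABLE file (fileid INTEGER PRIMARY KEY, jobid INTEGER, filenameid INTEGER);
    """)
    return db

def run_backup(db, jobid, name, filenameid):
    """Record an ordinary backup job with a single File record."""
    db.execute("INSERT INTO job (jobid, name, priorjobid) VALUES (?, ?, 0)",
               (jobid, name))
    db.execute("INSERT INTO file (jobid, filenameid) VALUES (?, ?)",
               (jobid, filenameid))

def run_copy(db, new_jobid, source_jobid):
    """Hypothetical copy: a new job row that points back at the source via
    priorjobid, plus a duplicate of every File record of the source job."""
    (name,) = db.execute("SELECT name FROM job WHERE jobid = ?",
                         (source_jobid,)).fetchone()
    db.execute("INSERT INTO job (jobid, name, priorjobid) VALUES (?, ?, ?)",
               (new_jobid, name, source_jobid))
    db.execute("INSERT INTO file (jobid, filenameid) "
               "SELECT ?, filenameid FROM file WHERE jobid = ?",
               (new_jobid, source_jobid))

def count_file_records(db, name, filenameid):
    """Same join as the psql query above, with parameters."""
    (n,) = db.execute(
        "SELECT count(*) FROM file F, job J "
        "WHERE F.filenameid = ? AND F.jobid = J.jobid AND J.name = ?",
        (filenameid, name)).fetchone()
    return n

db = make_catalog()
run_backup(db, 154613, "BackupCatalog", 7936395)
before = count_file_records(db, "BackupCatalog", 7936395)
run_copy(db, 154614, 154613)  # 154614 is an invented JobId
after = count_file_records(db, "BackupCatalog", 7936395)
copies = db.execute(
    "SELECT count(*) FROM job WHERE priorjobid IS NOT NULL AND priorjobid != 0"
).fetchone()[0]
print(before, after, copies)
```

In this toy model the count goes up by exactly one per copied file, which matches the 93 to 94 change seen on the real catalog.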




> For me that's fine, but in an enterprise setup with thousands of
> clients, effectively doubling the size of the database would be a pretty
> big deal, so it wouldn't surprise me to learn that there are some
> efficiency improvements there. That's what leads me to wonder what
> really happens when a Volume in the main pool is recycled. The Recycle
> Algorithm document suggests that the File records associated with that
> Volume are pruned, so does that or does that not mean that the
> corresponding jobs in the archive pool no longer have File records? I
> may have to poke around in the database a little to see if I can answer
> that question for myself.
> 
> 
>> > I notice that the paragraph on Automatic Pruning says that you can't
>> > restore files once the File records are gone, but I know that's not true
>> > because I have done it. I suspect the documentation may be a few
>> > versions old and I can't necessarily trust everything it says.
>> 
>> You've done it a different way.  Not the 'usual way'.
> 
> I guess that depends on how you define "the usual way", so I'll clarify.
> This is what I see, running "restore" out of "bconsole". I first select
> "most recent backup for a client", and go through the usual process of
> selecting the client and the File Set, and then I see:
> 
> You have selected the following JobIds:
> 56,3603,3609,3616,3622,3710,3716,3722,3832,3839
> 
> Building directory tree for JobId(s)
> 56,3603,3609,3616,3622,3710,3716,3722,3832,3839 ...
> ++++++++++++++++++++++
> 
> For one or more of the JobIds selected, no files were found,
> so file selection is not possible.

Ahh, look there: 'one or more'

Perhaps you are able to restore by regex because you're using the jobs 
which are not past the File retention period?

> Most likely your retention policy pruned the files.
> 
> Do you want to restore all the files? (yes|no): no
> 
> Regexp matching files to restore? (empty to abort):
> 
> At this point, I can type in a regex. Then it shows the usual parameters
> followed by "yes/no/mod", which I can then modify in the usual way, and
> then, although it takes quite a while, it will restore the files that
> matched the regex.
> 
>> 
>> Does what you've observed make sense when you consider my claim of
>> 'first one counts'?
> 
> I'm not sure I can say it makes sense to me until I find some time to
> poke around in the database to see what is really going on.
> 
> The one thing I am afraid of is having a Volume in the main pool getting
> recycled causing the File records for files in the archive pool getting
> pruned.

I think that is why I keep File and Job retention to 3 years.  To avoid 
this situation.  I let Volume retention dictate how long I keep File and 
Job records.  i.e. up to 3 years.
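
To make the 'first one counts' rule concrete, here is a minimal sketch in Python. It only models the rule of thumb as I have stated it (the shortest of the three retention periods wins); it is not Bacula's pruning code, and effective_retention is a hypothetical helper name.

```python
from datetime import timedelta

def effective_retention(file_ret, job_ret, volume_ret):
    """'First one counts': whichever retention period expires first decides
    how long the catalog records are kept (rule of thumb, not Bacula code)."""
    return min(file_ret, job_ret, volume_ret)

YEAR = timedelta(days=365)

# File and Job retention set high (3 years); Volume retention varies per pool.
main_pool = effective_retention(3 * YEAR, 3 * YEAR, timedelta(weeks=5))
archive_pool = effective_retention(3 * YEAR, 3 * YEAR, 3 * YEAR)

# Greg's settings from the quoted message: File = Job = 365 days, main pool 30 days.
greg_main = effective_retention(YEAR, YEAR, timedelta(days=30))

print(main_pool.days, archive_pool.days, greg_main.days)
```

With File and Job retention set high, the Volume retention of each pool is always the smallest value, so it alone decides how long the records survive.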

-- 
Dan Langille - http://langille.org/
