Subject: Re: [Bacula-users] Job auto-pruning and automatic volume recycling with disabled pool auto-pruning
From: Kern Sibbald <kern AT sibbald DOT com>
To: Heiko Wundram <modelnine AT modelnine DOT org>, bacula-users AT lists.sourceforge DOT net
Date: Sat, 07 Feb 2015 11:50:57 +0100
Hello,

On 06.02.2015 13:52, Heiko Wundram wrote:
> Hey,
>
> On 06.02.2015 13:43, Heitor Faria wrote:
>> Since backups are generally stored in byte sequence, there is no
>> practical advantage in having different retention times for jobs within
>> the same volume, since it would require a huge computational effort to
>> reclaim the space from a single recycled volume.
> That's not what I meant: I don't want to reclaim parts of a volume, but
> simply for Bacula to keep the (complete) volume until no more jobs are
> stored on it (i.e. no more jobs in the catalog reference the volume),
> and then (and only then) to recycle it.

The above (i.e. what you are asking for) is exactly how Bacula works. 
The data on a Volume can be destroyed or overwritten in one of two ways:

1. The Volume is recycled, and Bacula wants to use it again, so it
truncates the Volume.

2. You explicitly cause the Volume to be truncated, by using the
"truncate" command in Bacula version 7.0, or by more complicated
procedures in prior versions.
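
For illustration, here is a minimal sketch of the Pool settings and the
console command involved; the resource names and values are only examples
(not taken from your configuration), and the exact "truncate" arguments may
differ slightly between versions:

  # bacula-dir.conf: a Pool whose Volumes are reused (recycled) only after
  # every Job written to them has expired and been pruned from the catalog.
  Pool {
    Name = File-Daily
    Pool Type = Backup
    Recycle = yes                  # allow purged Volumes to be reused
    AutoPrune = yes                # apply retention automatically when a Volume is needed
    Volume Retention = 7 days      # how long after last use before the Volume may be recycled
    Action On Purge = Truncate     # allow the Volume to be truncated once purged
    Maximum Volume Bytes = 500 MB  # small disk "tapes", as in your setup
  }

  # bconsole, Bacula >= 7.0: explicitly truncate Volumes that are already
  # marked Purged in the catalog.
  * truncate volstatus=Purged pool=File-Daily storage=File

Until one of those two things happens, the complete Volume (and every Job
on it) remains intact and readable.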

Best regards,
Kern

> I know that this "wastes" space, 
> but generally, I do think behaviour like that is useful when you have 
> rather small(ish) tapes (as the backup is disk-based, my simulated 
> "tapes" are 500MB each, so this kind of usage is perfectly possible 
> without lots of wasted space). And, from what I gather from the catalog 
> state, it wouldn't be too difficult to implement logic like this, as 
> volume pruning needs to resolve the jobs/files to remove anyway, so the 
> inverse is also possible when requiring a new volume.
>
>> I don't think you need to create one pool per different network
>> uplink, but each storage device should have at least one pool for
>> sure.
> Yes, I do, as the storage is reachable via different IPs depending on 
> the uplink used. Think of internal vs. external network, so if I have 
> three different retention levels like you proposed, I'd need nine 
> storages. When grouping the retention times I currently have in use, I 
> get five different groups, so basically, I'd need to define fifteen 
> storage pools. Ugh.
>
>> Usually I have a daily pool (approx. 7 days retention); a weekly pool
>> (approx. 30 days); and a monthly pool (approx. 1 year retention); I try
>> to fit the backup retention needs into one of those levels (GFS; a pool
>> sketch follows below, after the quoted text).
>> E.g.: if I need to retain specific data for 2 weeks, I would schedule
>> this job to run on a daily and weekly basis.
> Thanks for the clarification, and I'll need to see what to do now to 
> implement the logic...
>
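
As an aside, the GFS levels quoted above could be expressed roughly as
follows; this is only a sketch, and the pool names, retention periods and
schedule times are illustrative rather than anything from this thread:

  Pool {
    Name = Daily
    Pool Type = Backup
    Volume Retention = 7 days
    Recycle = yes
    AutoPrune = yes
  }
  Pool {
    Name = Weekly
    Pool Type = Backup
    Volume Retention = 30 days
    Recycle = yes
    AutoPrune = yes
  }
  Pool {
    Name = Monthly
    Pool Type = Backup
    Volume Retention = 1 year
    Recycle = yes
    AutoPrune = yes
  }

  # A Schedule then steers each run into the pool with the right retention:
  Schedule {
    Name = GFS-Cycle
    Run = Level=Full Pool=Monthly 1st sun at 23:05
    Run = Level=Differential Pool=Weekly 2nd-5th sun at 23:05
    Run = Level=Incremental Pool=Daily mon-sat at 23:05
  }

Data that must survive two weeks would then simply be run by both the Daily
and the Weekly schedule lines, as suggested above.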

