Hi Dan,
Thanks a lot for your help, things have become much clearer to me.
I still have a few questions left:
On 17/10/2012 21:55, Dan Langille wrote:
On Oct 17, 2012, at 11:30 AM, Florent Krieg wrote:
Hi there!
We have been using Bacula for a while (3 or 4 years maybe) to back up many
servers (VMs as well as physical machines) on a storage server (volumes are
not tapes, but basically only labelled files of 1GB size).
We don't have a complicated architecture, even though we back up around 50
servers every night. The only thing we need is to be able to restore one or
more files; having different versions of the same file is not important
to us.
So here is what we used to have (until we realized we were wrong):
- 2 full backups a month (1st and 3rd weeks of the month, on Saturday night)
- incremental backups otherwise
This means for a month, for instance:
FIIIIII-IIIIIII-FIIIIII-IIIIII
Everything seemed to be OK until I tried to restore something yesterday.
The problem I found is that when we initially set up Bacula, storage was a
huge constraint (we didn't have any storage server actually) and every
retention parameter was set to 7 days:
- Client: file/job/volume retention = 7 days
- Pool (the same pool is used for full and incr backups, which is OK in our
  case, am I right?): volume retention = 7 days
Well, it's OK, but the problem is, any backup more than 7 days old will not
be listed in the Catalog. It may well exist on disk, but without the Catalog
entries, there is no *EASY* way to restore that backup. And ease-of-restore
is exactly what the Catalog is there for.
OK, so it's quite obvious we had a huge problem there. To my mind, we should
have had no problem restoring anything in the week following the full backup,
but there were still two blackout weeks in the month where we couldn't
restore in a proper way (we could restore on a best-effort basis though,
which is what I actually did).
So, to rethink our Bacula configuration, I have since tried to read the
Bacula (5.0.0) manual and forums on the Internet, without being able to
clearly understand the purpose of each retention time.
Logically I would set the retentions to 14 days everywhere, and this should
solve the problem, but I am not sure of that and, although I could
experiment, I'd prefer to understand what I'm doing.
Retention is all about the Catalog, not necessarily the backup.
Retention is: how long do you want to record this backup in the Catalog.
I recommend keeping all retentions the same: Volume, Job, File.
Ok.
Could somebody point me to a manual section that would explain to me (without
going into the deep details of Bacula) how to set either:
- client: file/job/volume retention parameters?
Volume retention is not a Client attribute. Job and File are.
Yes, sorry, this is how we defined them in the config files.
'show job=SERVER_job' in bconsole led me in the wrong direction, showing me
output such as:
--> Client: name=SUPER-VIL-1_fd address=super-vil-1.mgt.sewan.fr FDport=9102 MaxJobs=1
      JobRetention=7 days FileRetention=7 days AutoPrune=1
--> Pool: name=SUPER-VIL-1_pool PoolType=Backup
      use_cat=1 use_once=0 cat_files=1
      max_vols=0 auto_prune=1 VolRetention=7 days
      VolUse=0 secs recycle=1 LabelFormat=SUPER-VIL-1-
      CleaningPrefix=*None* LabelType=0
      RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
      MaxVolJobs=10 MaxVolFiles=0 MaxVolBytes=1073741824
      MigTime=0 secs MigHiBytes=0 MigLoBytes=0
      JobRetention=0 secs FileRetention=4447866 years 9 months 27 days 6 hours 46 mins
Where do these values come from?
I would search for all mentions of retention here
http://www.bacula.org/5.2.x-manuals/en/main/main/Configuring_Director.html#SECTION0022130000000000000000
NOTE: this is the same file as the next URL, just a different section.
- pool: which retention parameters are available, and whether they are
  redundant with the others above?
Volume retention is specified in the Pool resource. I would search for all
mentions of retention here:
http://www.bacula.org/5.2.x-manuals/en/main/main/Configuring_Director.html#SECTION0022150000000000000000
Thanks for the link, I have read both sections carefully and I
think I got the point.
Also, I know the scenario is really simple compared to most of yours, but if
somebody has already achieved something similar (1 or 2 fulls a month and
then incrementals), I'd be very grateful if they could explain to me how
to do it.
Decide how far back you want to be able to restore a file, then go from there.
Two months? Two years? Decide that, then set your retention values. You'll
need to update your resources, then update your Pools based on that, using the
update command in bconsole. But that you can ask about later.
Read all that, then get back to the list with any questions.
Let me know if I am wrong:
Let's consider that once a Full backup is done, I no longer care about
previous backups.
A full backup is a reliable reference to me (basically, backups are
only used in case of emergency in our environment, so if a full
backup of a working server is available, that's fine).
Our schedule policy is:
Schedule {
  Name = "NightlySave"
  Run = Level=Full 1st,3rd sat at 06:05
  Run = Level=Incremental mon-fri,sun at 04:05
  Run = Level=Incremental 2nd,4th,5th sat at 04:05
}
Thus, retention times equal to the time between two full backups should
definitely fit our needs, right? Say 7 + 7 + 1 (error margin) = 15 days?
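To sanity-check that 15-day figure, here is a minimal Python sketch. It assumes Fulls run exactly on the 1st and 3rd Saturday of each month, as in the Schedule above, and it ignores the time of day. The function names are made up for illustration; nothing here comes from Bacula itself.

```python
import calendar
from datetime import date


def full_backup_dates(year, month):
    """Return the 1st and 3rd Saturday of the given month (assumed Full days)."""
    saturdays = [
        d
        for d in calendar.Calendar().itermonthdates(year, month)
        # itermonthdates also yields padding days from adjacent months
        if d.month == month and d.weekday() == calendar.SATURDAY
    ]
    return [saturdays[0], saturdays[2]]


def full_backup_gaps(year):
    """Gaps, in days, between consecutive Full backups within one year."""
    fulls = [d for m in range(1, 13) for d in full_backup_dates(year, m)]
    return [(later - earlier).days for earlier, later in zip(fulls, fulls[1:])]


gaps = full_backup_gaps(2012)
print(sorted(set(gaps)))  # distinct gap lengths between two Fulls
print(max(gaps) + 1)      # longest gap plus a one-day margin
```

Under these assumptions, the gap from the 1st to the 3rd Saturday is always 14 days, but the gap from the 3rd Saturday to the next month's 1st Saturday grows to 21 days whenever the month has a 5th Saturday. So 15 days may be too short for those months, and the longest gap plus a margin looks like the safer value.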
Two questions that I have now are:
1. We have all AutoPrune/AutoPurge/Recycle/... parameters set to
Yes. In that case, the retention times directly determine
how long we keep the backup Volumes, right?
Moreover, we don't set any max volume files for the pools, so
if a new volume is needed for a backup, a new one will
be created, right?
2. With 7-day retention times, if I list the jobs of a server:
*list job=HSS-VIL-1_job
+--------+---------------+---------------------+------+-------+-----------+-----------------+-----------+
| JobId  | Name          | StartTime           | Type | Level | JobFiles  | JobBytes        | JobStatus |
+--------+---------------+---------------------+------+-------+-----------+-----------------+-----------+
| 40,840 | HSS-VIL-1_job | 2012-10-06 22:03:57 | B    | F     | 1,681,621 | 156,228,704,503 | T         |
| 40,893 | HSS-VIL-1_job | 2012-10-07 10:58:43 | B    | I     |     3,066 |     325,259,836 | T         |
| 40,946 | HSS-VIL-1_job | 2012-10-08 04:12:06 | B    | I     |     2,150 |     119,735,689 | T         |
| 40,999 | HSS-VIL-1_job | 2012-10-09 04:14:28 | B    | I     |     8,345 |     466,134,754 | T         |
| 41,052 | HSS-VIL-1_job | 2012-10-10 04:15:33 | B    | I     |    11,125 |     946,012,469 | T         |
| 41,110 | HSS-VIL-1_job | 2012-10-11 04:15:22 | B    | I     |     8,898 |     604,906,727 | T         |
| 41,163 | HSS-VIL-1_job | 2012-10-12 04:15:27 | B    | I     |    10,610 |     613,376,814 | T         |
| 41,216 | HSS-VIL-1_job | 2012-10-13 04:15:58 | B    | I     |     7,604 |     530,114,112 | T         |
| 41,269 | HSS-VIL-1_job | 2012-10-14 04:16:40 | B    | I     |     4,086 |     554,999,485 | T         |
| 41,322 | HSS-VIL-1_job | 2012-10-15 04:15:49 | B    | I     |     3,275 |     198,191,930 | T         |
| 41,375 | HSS-VIL-1_job | 2012-10-16 04:14:26 | B    | I     |    11,713 |     601,523,941 | T         |
| 41,429 | HSS-VIL-1_job | 2012-10-17 04:15:53 | B    | I     |    18,210 |     838,659,657 | T         |
| 41,483 | HSS-VIL-1_job | 2012-10-18 04:15:46 | B    | I     |    11,810 |     575,001,874 | T         |
+--------+---------------+---------------------+------+-------+-----------+-----------------+-----------+
Why do I have information about the last 12 days?
Isn't the purpose of JobRetention to clean it?
It seems to work on another host though:
*list job=SUPER-VIL-1_job
+--------+-----------------+---------------------+------+-------+----------+-------------+-----------+
| JobId  | Name            | StartTime           | Type | Level | JobFiles | JobBytes    | JobStatus |
+--------+-----------------+---------------------+------+-------+----------+-------------+-----------+
| 41,179 | SUPER-VIL-1_job | 2012-10-12 04:36:44 | B    | I     |       40 |   4,575,403 | T         |
| 41,232 | SUPER-VIL-1_job | 2012-10-13 04:35:19 | B    | I     |       67 |   4,709,002 | T         |
| 41,285 | SUPER-VIL-1_job | 2012-10-14 04:27:37 | B    | I     |       53 |   4,424,132 | T         |
| 41,338 | SUPER-VIL-1_job | 2012-10-15 04:30:22 | B    | I     |       49 |   4,526,834 | T         |
| 41,391 | SUPER-VIL-1_job | 2012-10-16 04:42:28 | B    | I     |      105 |  45,807,762 | T         |
| 41,445 | SUPER-VIL-1_job | 2012-10-17 04:39:48 | B    | F     |   38,183 | 995,992,505 | T         |
| 41,501 | SUPER-VIL-1_job | 2012-10-18 04:47:52 | B    | I     |       27 |   4,474,961 | T         |
+--------+-----------------+---------------------+------+-------+----------+-------------+-----------+
A full backup was done two days ago because the full backup of the
first week was pruned, I reckon.
Thanks again in advance!
Florent
Thanks in advance everybody, and sorry if it feels like I am looking for
information in the wrong place.