Bacula-users

Re: [Bacula-users] How precisely does bacula decide what to archive

2013-02-22 11:25:16
Subject: Re: [Bacula-users] How precisely does bacula decide what to archive
From: Josh Fisher <jfisher AT pvct DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Fri, 22 Feb 2013 11:22:33 -0500
On 2/21/2013 7:39 AM, Durand Toto wrote:
> Hi all,
>
> First of all, thanks once more Dan.
>
>     > Hi,
>     >
>     > I'm trying to use bacula but for space reasons I can only do
>     > incremental backups so I need to make sure that I archive
>     > everything.
>     >
>     > I see two alternatives:
>     >     * Bacula in normal mode ensuring that all files are considered
>     >       new (setting ctimes)
>     >     * Bacula in accurate mode
>     >
>     > However, I'd like to make sure that my interpretation of the
>     > manual is correct before proceeding.
>     >
>     > In normal mode, the way I understand it, bacula archives files that
>     > have a ctime (or mtime with mtime-only) that is newer than the time
>     > at which the job is run. If I am right, is this job time:
>     >     1) the time at which the job was scheduled
>     >     2) the time at which the job actually started
>     >     3) the time at which the files list was created
>     >     4) the time at which the current file is archived?
>     > Also, what if the file has the same ctime as the job? Will it be
>     > archived this time, skipped, or never archived altogether? (In other
>     > words, does "files newer" mean ctime > job time or ctime >= job time?)
>
>     Incrementals (and differentials) are relative to another job.  They
>     back up everything that has changed since *THAT* job.
>
>     And it's mtime, not ctime.
>
>  I just thought it was ctime and mtime by default unless one selects 
> mtimeonly.

Keep in mind that ctime can change independently of, and more often
than, mtime. A write changes mtime, meaning the data content of the file
has been modified. ctime is the time the inode last changed for any
reason, not just when attributes change. Several things will update
ctime without affecting mtime, including the chown(), chmod(), and
rename() system calls used by, for example, the chown, chmod, and mv
shell commands.
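
A quick way to see this on a POSIX system is something like the Python
sketch below. It is only my own illustration and has nothing to do with
Bacula's code; the helper name chmod_timestamp_effect is made up, and it
assumes st_ctime is the inode change time (which is not the case on
Windows).

```python
import os
import tempfile
import time


def chmod_timestamp_effect(delay=0.05):
    """Create a temp file, chmod it, and report how mtime/ctime moved.

    Assumes a POSIX system, where st_ctime is the inode change time.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"some data\n")
        os.close(fd)
        before = os.stat(path)
        time.sleep(delay)        # let the clock move past the last update
        os.chmod(path, 0o640)    # metadata-only change, data untouched
        after = os.stat(path)
        return {
            "mtime_changed": after.st_mtime_ns != before.st_mtime_ns,
            "ctime_delta_ns": after.st_ctime_ns - before.st_ctime_ns,
        }
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(chmod_timestamp_effect())
```

On a typical Linux filesystem I would expect mtime_changed to be False
and ctime_delta_ns to be positive, but timestamp granularity varies by
filesystem, so treat the exact numbers as indicative only.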

I am not aware of any way to set ctime in a POSIX filesystem other than 
by changing the system time and doing a mv, so I don't think that is an 
option.

Bacula handles everything intuitively in normal mode, except that it
does not account for deleted files. Without accurate mode, an
incremental or differential backup will not record that a file was
deleted, so a subsequent restore will bring back any files deleted since
the last full backup. Also, if an old file is restored after the last
full backup, it may be missed by incremental and differential jobs,
since its timestamps can predate the prior job. Of course, a full backup
is not affected by any of this and always backs up everything.
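
To make the two blind spots concrete, here is a toy model in Python. It
is a sketch under my own assumptions, not Bacula's implementation: the
function names, the integer timestamps, and the /data paths are all
invented, and the comparison simply follows the "on or after" wording of
the manual.

```python
def select_incremental(files, since):
    """Pick paths whose mtime or ctime is on or after `since`.

    `files` maps path -> (mtime, ctime). Toy model only, not Bacula code.
    """
    return sorted(p for p, (mtime, ctime) in files.items()
                  if mtime >= since or ctime >= since)


def deleted_since(previous_paths, files):
    """What an accurate-mode style comparison adds: paths seen in the
    earlier backup that no longer exist on disk."""
    return sorted(set(previous_paths) - set(files))


if __name__ == "__main__":
    prior_job_start = 1000
    previous = ["/data/a", "/data/b", "/data/c"]   # backed up earlier
    now = {
        "/data/a": (900, 900),     # untouched, so skipped
        "/data/b": (1000, 1000),   # changed exactly at job start, selected
        "/data/d": (1200, 1200),   # new file, selected
        "/data/e": (500, 500),     # appeared later with old timestamps, missed
    }                              # "/data/c" was deleted in the meantime
    print(select_incremental(now, prior_job_start))
    print(deleted_since(previous, now))
```

The timestamp test alone never mentions /data/c or /data/e; only a
comparison against the list of previously saved files notices them.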

The bottom line is that if you can live with the fact that deleted files
will be restored, then normal mode uses fewer resources, is faster, and
still keeps everything backed up. If it is important not to restore
deleted files and/or restores of old files are a frequent occurrence,
then accurate mode is required.

It would be better to somehow acquire the hardware to have at least two 
sets of fulls so that full backups could be performed periodically. The 
single full backup is also a single point of failure. Also, I would 
imagine a restore would be painfully slow if there were hundreds of 
incremental jobs involved.


>
>
>     From
>     http://www.bacula.org/5.2.x-manuals/en/main/main/Configuring_Director.html#SECTION001430000000000000000
>
>     "The File daemon (Client) decides which files to backup for an
>     Incremental backup by comparing start time
>     of the prior Job (Full, Differential, or Incremental) against the time
>     each file was last "modified" (st_mtime)
>     and the time its attributes were last "changed"(st_ctime). If the file
>     was modified or its attributes changed
>     on or after this start time, it will then be backed up."
>
>     All clear now?
>
> "on or after"
> that's exactly the kind of info I was after.
>
>
>     > IN ACCURATE MODE, I cannot keep a DB of all the files ever archived
>     > as the DB would grow to a TB (10^5 to 10^6 new files/day).
>
>     No, that doesn't happen. Not every revision of every file is kept.
>     Forever.
>
> I understand that. However, for most files, I don't do revisions: they
> are created and then archived, period. Despite that, I have a great
> many brand new files every day. Thus even with one entry per file
> (without any revision), I will have billions of entries.
>
>
>     > I am thus trying to understand what is compared internally. However,
>     > I don't know how to interpret the following sentence of the manual:
>     >     _"the Director will send a list of ALL previous files backed
>     > up, and the File daemon will use that list to determine if any new
>     > files have been added or moved and if any files have been
>     > deleted."_
>     >
>     > Does all files mean:
>     >     1) all files ARCHIVED in the PREVIOUS (potentially incremental)
>     >        backup?
>     >     2) all files PRESENT on the drive during the PREVIOUS
>     >        (potentially incremental) backup?
>     >     3) all files that bacula KNOWS OF i.e. all files PRESENT IN
>     >        THE CATALOG at that time (i.e. minus purged old files?)
>     >     4) all files EVER ARCHIVED?
>
>     I'll leave that to someone else.
>
> If "someone else" is around, that'd be much appreciated :).
>
> Thanks in advance,
>
> Best,
>
> Gnewbee
>
>
>     --
>     Dan Langille - http://langille.org/


_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users