[Networker] determining what files to backup based on modification times.

2003-06-25 13:57:29
Subject: [Networker] determining what files to backup based on modification times.
From: Craig Ruefenacht <Craig.Ruefenacht AT US.USANA DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Wed, 25 Jun 2003 11:55:33 -0600
Hi,

We discovered a perplexing problem recently (and have since implemented a
work-around solution), but I don't really like our work-around.  I'd like to
hear what others have to suggest.

The problem centered around not understanding how Networker determines what
files are to be backed up during the current backup session.

To demonstrate the problem we were having, here is a simple example.

Let's say that on Sunday morning at 6:00am, a full backup is performed of
filesystem ABC.  On Monday morning at 6:00am, an incremental backup is
performed of the same filesystem.  During the middle of the day on Monday, a
file, let's call it XYZ, is placed on the ABC filesystem with a modification
timestamp of Sunday 8:30pm.

On Tuesday morning at 6:00am, an incremental backup is performed of
filesystem ABC.  We noticed that file XYZ was not backed up on Tuesday, even
though it didn't exist when Monday's incremental ran and, hence, is a new
file on the filesystem.  Shouldn't the incremental backup on Tuesday back up
this new file?

The logical answer is yes, but if you look at the definition of an
incremental backup, it saves everything that has changed since the last
backup ran.  The keyword here is "changed".  How does Networker know that a
file has "changed" since the last backup?  By looking at the file's
modification timestamp.  File XYZ had a timestamp that predated Monday's
incremental backup, so when Tuesday's incremental ran, Networker assumed
that Monday's backup had already saved the file and skipped it.  Networker
has no way to know that the file didn't even exist when Monday's incremental
ran - it assumes it did because the modification time says it did.
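
To make the timestamp logic concrete, here is a minimal Python sketch of a
purely modification-time-based selection rule.  It is only my model of the
behavior described above, not Networker's actual code; the function name and
the dates (taken from the week this was written) are illustrative
assumptions.

```python
from datetime import datetime

def incremental_selects(file_mtime, last_backup_time):
    """Return True if a purely mtime-based incremental would save the file.

    Assumed rule: a file counts as "changed" only if its modification
    time is later than the start of the previous backup.
    """
    return file_mtime > last_backup_time

# Illustrative dates matching the example: Sunday is 2003-06-22.
monday_incremental = datetime(2003, 6, 23, 6, 0)   # Monday 6:00am
xyz_mtime = datetime(2003, 6, 22, 20, 30)          # Sunday 8:30pm
edited_monday_noon = datetime(2003, 6, 23, 12, 0)  # a file really edited Monday

# Tuesday's incremental compares against Monday's backup time.
xyz_saved_tuesday = incremental_selects(xyz_mtime, monday_incremental)
other_saved_tuesday = incremental_selects(edited_monday_noon, monday_incremental)
```

Under this rule `xyz_saved_tuesday` is False even though XYZ only arrived on
Monday afternoon, while a file actually edited at noon on Monday is selected.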

We have this kind of situation happen every day, because we have an EMC
Symmetrix and have BCV volumes of our production Oracle database which we
sync up once a day.  A couple of hours after the BCV sync, we back up the
BCV volumes via Networker.  After a BCV sync has occurred (we do an
establish, let the volumes sync up, and then do a split), new files get
written to the production filesystems (and existing ones get modified).
Some of these files have modification times that fall after the BCV sync
but before the time that Networker backs up the BCVs (by the time Networker
backs up the BCVs, the snapshot they contain is a couple of hours old).
When the next BCV sync occurs the next morning, these files will be on the
BCV volumes, but their modification times will predate the Networker backup
performed the previous morning.  So the next Networker incremental backup
will not back up these files.
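
The window in which changes get lost can be sketched as follows.  This is a
rough Python model under two assumptions of mine: the backup selects files
strictly by modification time, and the split and backup times shown (4:00am
and 6:00am) are made-up values standing in for "a couple of hours" apart.

```python
from datetime import datetime, timedelta

def is_missed(mtime, split_time, backup_time):
    """True if a change on production falls in the lost window.

    The change happened after the BCV split (so it is absent from the
    snapshot being backed up) but not after the backup started (so the
    next mtime-based incremental considers it already saved).
    """
    return split_time < mtime <= backup_time

def minimum_rewind(split_time, backup_time):
    """Smallest amount the cutoff must move back to cover the window."""
    return backup_time - split_time

# Hypothetical schedule for one morning.
split = datetime(2003, 6, 23, 4, 0)    # BCV split finishes
backup = datetime(2003, 6, 23, 6, 0)   # Networker backs up the BCV

lost = is_missed(datetime(2003, 6, 23, 5, 0), split, backup)
safe = is_missed(datetime(2003, 6, 23, 7, 0), split, backup)
margin = minimum_rewind(split, backup)
```

Here a file modified at 5:00am is lost, one modified at 7:00am is not, and
the margin works out to the two hours between the split and the backup.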

As a work-around, I wrote my own save script (Networker calls my save
script, not the Networker-supplied save command).  My script takes the
command-line arguments that Networker passes to it and moves the time given
in the "-t" option back a few hours.  It then passes the original
command-line arguments, with the -t option modified, to Networker's save
command.  This in effect tells the save command to back up all files that
have changed since a few hours before the last backup.
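
The argument rewriting can be sketched in Python like this.  It is not my
actual script, only an outline of the idea, and it rests on an assumption:
that the value following "-t" arrives as a separate argument holding integer
seconds since the epoch.  If it arrives in some other date format, this
sketch leaves it untouched.  The function name and the sample arguments are
hypothetical, and the step of handing the result to Networker's save command
is omitted.

```python
def rewind_t_option(argv, hours):
    """Return a copy of argv with the value after "-t" moved back by `hours`.

    Assumption: "-t" is followed by a separate argument containing
    integer epoch seconds. Any other form is passed through unchanged.
    """
    out = list(argv)
    for i, arg in enumerate(out[:-1]):
        if arg == "-t" and out[i + 1].isdigit():
            # Move the cutoff back, but never below the epoch.
            out[i + 1] = str(max(0, int(out[i + 1]) - hours * 3600))
            break
    return out

# Hypothetical arguments as the wrapper might receive them.
original = ["-t", "1056434400", "/ABC"]
modified = rewind_t_option(original, 3)
```

With these sample values, `modified` carries a cutoff 10800 seconds (three
hours) earlier than `original`, and all other arguments pass through as they
were.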

I know that there are other ways of dealing with this problem, including
just doing a full backup each day, or doing various "level" backups instead
of doing an incremental backup, or forking out money to use the Oracle
module for Networker itself.  But even doing various "level" backups each
day can exhibit the same problem.  For example, say you do a level 3 backup
on Monday, a level 5 backup on Tuesday, and a level 4 backup on Wednesday.
Wednesday's level 4 backup should contain all files changed since Monday's
level 3 backup, but what if a file was placed on the filesystem on Tuesday
with a modification time that predates the level 3 backup on Monday?  You
wouldn't put that file there manually, but if the filesystem was part of a
BCV, it could happen if the BCV is simply a snapshot of a filesystem at a
given time.  If the time between when the snapshot was taken and when the
Networker backup was done is long enough, there could be files that appear
on the next snapshot which have a modification time that predates the
Networker backup.
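
The level example can be modeled the same way.  The sketch below assumes
the usual convention that a level backup saves files changed since the most
recent backup of a lower level, and that selection is by modification time
only; the function name and dates are illustrative, not anything Networker
provides.

```python
from datetime import datetime

def reference_backup(history, level):
    """Return the time of the most recent backup with a lower level.

    `history` is a chronological list of (time, level) tuples. Returns
    None if no lower-level backup exists.
    """
    for when, lvl in reversed(history):
        if lvl < level:
            return when
    return None

monday = datetime(2003, 6, 23, 6, 0)    # level 3
tuesday = datetime(2003, 6, 24, 6, 0)   # level 5
history = [(monday, 3), (tuesday, 5)]

# Wednesday's level 4 is measured against Monday's level 3.
cutoff = reference_backup(history, 4)

# A file that arrived on Tuesday carrying a Sunday 8:30pm timestamp.
late_arrival_mtime = datetime(2003, 6, 22, 20, 30)
saved_wednesday = late_arrival_mtime > cutoff
```

Here `cutoff` is Monday's backup time and `saved_wednesday` is False, so the
file that arrived on Tuesday is skipped by Wednesday's level 4 as well.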

With all of this in mind, is there any kind of methodology to deal
with such a situation other than manually modifying the command-line
arguments to the Networker save command to decrement the -t option so that
the Networker save command will catch these files?

This scenario would apply to anyone who uses BCV volumes to mirror a
filesystem and then backs up the BCV volume at some later time.  I only used
Oracle as an example because it's the application we use BCVs to back up.

Craig Ruefenacht
UNIX Systems Administrator
USANA Health Sciences, Inc.
(801) 954-7559



