Amanda-Users

Re: Overlapping backups: should I expect problems?

2008-09-16 08:12:22
From: Chris Hoogendyk <hoogendyk AT bio.umass DOT edu>
To: amanda-users AT amanda DOT org
Date: Tue, 16 Sep 2008 08:03:40 -0400


John Morris wrote:
> Dustin J. Mitchell wrote:
>> On Mon, Sep 15, 2008 at 11:36 PM, John Morris <jman AT ablesky DOT com> wrote:
>>> My question is, is this actually a good idea? Will there be any problems I haven't anticipated, such as the two configs conflicting? For example, I notice that there is only one /etc/amandates file that is presumably shared by both configurations. I don't know whether the date to begin incrementals from is from this file or from /var/lib/amanda/gnutar-lists.
>> No, it isn't a good idea, because there's a better solution that
>> doesn't involve this sort of gymnastics: RAIT.  You can set up a RAIT
>> between the two tape devices, and then even if a drive is down, your
>> recoveries will work fine.
>>
>> Dustin
> The configuration I describe solves some problems that RAIT won't. See if there's any error in my reasoning here.
>
> The two configurations run on alternating days. One configuration runs more often (3 days/week) on a smaller disk, so these backups only go back a month. The other configuration runs less often (2 days/week) on a larger disk, so these backups go back two months. With RAIT, you get one configuration that stores identically replicated data, so there's no opportunity to tweak one copy to go back further in time (is this true?).
>
> Another relevant point is, the second disk was added for extra capacity. With this extra capacity, we can run backups every single day instead of only every two days. Thus, there is more granularity in the backups with this configuration. If a user deletes a file accidentally today, then we are guaranteed to have a copy less than 24 hours old. For backups between 1 and 2 months ago, we don't care as much about the granularity.
>
> Whether or not the effort from this config's required gymnastics is offset by these extra advantages, are there any other problems I can anticipate? I'm no Amanda expert, so I fear not hearing the loud alarms that would be set off in a more experienced user's mind when he hears about this odd configuration.

Just a couple of comments. I've had a couple of occasions in the last few weeks where I've needed to recover people's mail on the server, because they had errors in how they configured their desktops and lost historical messages that were important to them. I was able to pinpoint the day it was sure to be on the server and pull it back. In one case I had to go back to 3 different dates including an archive from June. Having each and every day on backups proved important.

Having to figure out not just the date, but then which configuration to recover from, based on which day of the week that date was, complicates it some. But also, if you lose one of your drives (the reason you've configured it this way), you also lose the dates that were backed up only to that drive. That's an actual loss of data. With RAIT, if you lose a drive, you haven't lost any data. Also, with RAIT, the actual backup and replication of the data is more efficient. You are only accessing the clients to back up the data once and getting the RAIT copying without an additional transfer across the network. So your data redundancy is both more complete and more efficient.
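
To make the lookup problem concrete, here is a rough Python sketch of the bookkeeping the split setup would require. It is only an illustration: the config names (daily-small, daily-large) and the Mon/Wed/Fri versus Tue/Thu split are assumptions, since the thread only says 3 days/week and 2 days/week.

```python
from datetime import date, timedelta

# ASSUMPTION: hypothetical config names and weekday split (0 = Monday).
# The thread only states "3 days/week" and "2 days/week".
SCHEDULE = {
    0: "daily-small",  # Monday
    2: "daily-small",  # Wednesday
    4: "daily-small",  # Friday
    1: "daily-large",  # Tuesday
    3: "daily-large",  # Thursday
}


def config_for(day):
    """Return the config that ran on `day`, or None if no backup ran."""
    return SCHEDULE.get(day.weekday())


def dates_lost(failed_config, start, end):
    """List dates in [start, end] whose only backup sat on the failed drive."""
    lost = []
    day = start
    while day <= end:
        if config_for(day) == failed_config:
            lost.append(day)
        day += timedelta(days=1)
    return lost
```

For example, config_for(date(2008, 9, 16)) gives "daily-large" under this assumed schedule, and dates_lost("daily-large", date(2008, 9, 15), date(2008, 9, 21)) lists the Tuesday and Thursday of that week as unrecoverable. With RAIT neither function would be needed, because every run lands on both drives.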

If your drive capacities are different based on purchasing history, and RAIT seems not quite right because of that, I would just make sure I have suitable drives for the design. Drives are relatively inexpensive these days.


--
---------------

Chris Hoogendyk

-
  O__  ---- Systems Administrator
 c/ /'_ --- Biology & Geology Departments
(*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst
<hoogendyk AT bio.umass DOT edu>

---------------
Erdös 4