Amanda-Users

Re: simple query

2002-09-07 07:43:24
Subject: Re: simple query
From: "Brandon D. Valentine" <bandix AT structbio.vanderbilt DOT edu>
To: Galen Johnson <gjohnson AT trantor DOT org>
Date: Sat, 7 Sep 2002 06:30:28 -0500
On Fri, 6 Sep 2002, Galen Johnson wrote:

>OK...I'm still reading as much documentation as I can find before I
>really decide I'm ready to tackle Amanda...One thing that keeps cropping
>up is that if Amanda hits the end of tape during a dump, it jumps to the
>next tape and starts over.  This brings up the situation where the dump
>itself is greater than an empty tape.  Let's say it crops up (worst
>case) at the beginning of a run...wouldn't this seem to be antithetical
>to running a backup?  If followed to its logical conclusion, you could
>potentially run through your entire tapeset without backing anything up,
>even though the other backups following the "runaway backup" might
>happily fit on the tapes.  Or am I missing something?  I realize this
>may not be a critical problem (just readjust your disklists the next
>day) but does seem possible (not to mention a nuisance).

Amanda will not use more tapes in a single run than the number you have
set runtapes to, so there's no way she's going to overwrite your whole
tapecycle unless you're foolish enough to sit there and keep feeding her
tapes.  If you do end up with a filesystem which can no longer fit on
tape, your best bet is to split it up into smaller component
directories, each backed up via tar as its own disklist entry.  When you
do this Amanda will no longer care about the original large directory,
and you can safely remove it from your holding disk as soon as
subsequent amdump runs get level 0s of the new subdirectories you have
defined.

If you're backing up user data and home directories, you can also just
enforce quotas on the directories at the outset and announce to your
users the maximum ceiling you are able to back up.  This is what we do
and it has many side benefits as well.  We provide each user with a
separate scratch directory they are free to fill to their heart's
content, which sits on a RAID5 but doesn't get backed up.  They all know
that this scratch space is relatively safe but is not guaranteed, and it
is certainly vulnerable to user error.  They are all encouraged to keep
only data which cannot be regenerated in their home directories and to
regularly copy intermediate results of the analyses they run back to
their home directories.  Plus, we encourage archival runs of their data
at the end of a project, which allows them to remove it from their home
directory altogether.  As they finish a project they have the option of
sitting down at a machine we have made available which supports all
manner of storage media from DDS to AIT, from CD-R(W) to zip disk, each
according to his capacity, and moving the data offline.  ;-)
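
As a rough illustration of the splitting advice above, here is a minimal
Python sketch that takes estimated sizes for the subdirectories of an
oversized filesystem, separates the ones that fit on a single tape from
the ones that still need splitting, and prints candidate disklist lines
in the usual "host disk dumptype" form.  The function names, the host
name, and the dumptype name "user-tar" are assumptions made for the
example; substitute whatever tar-based dumptype your amanda.conf
defines.  The sizes are raw byte counts, so compression and tape
overhead are ignored.

```python
def plan_split(sizes, tape_bytes):
    """Partition {directory: size_in_bytes} by whether each fits on one tape.

    Returns (fits, too_big), both sorted lists of directory names.
    """
    fits = sorted(d for d, s in sizes.items() if s <= tape_bytes)
    too_big = sorted(d for d, s in sizes.items() if s > tape_bytes)
    return fits, too_big


def disklist_lines(host, dirs, dumptype):
    """Format one 'host disk dumptype' disklist line per directory."""
    return ["%s %s %s" % (host, d, dumptype) for d in dirs]


def report(host, sizes, tape_bytes, dumptype="user-tar"):
    """Build a text report: disklist lines first, then warnings as comments."""
    fits, too_big = plan_split(sizes, tape_bytes)
    lines = disklist_lines(host, fits, dumptype)
    # Anything still larger than a tape has to be split one level deeper.
    lines += ["# still too big for one tape: %s" % d for d in too_big]
    return "\n".join(lines)
```

Calling report("client.example.org", {"/home/a": 10, "/home/b": 50}, 30)
would list /home/a as a disklist entry and flag /home/b as needing a
further split.  The size estimates themselves would have to come from
something like du on the client.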

This is where it gets interesting for me.  We are looking to replace
this last step with some sort of large tape library and a hierarchical
storage management (HSM) solution.  There do not currently appear
to be any good open source software solutions for management of an HSM
scheme.  DMF from SGI/Cray is the only decent commercial package that
does it properly and it's prohibitively expensive.  I've begun to give a
lot of thought to what modifications would be necessary to build such an
open source application derived from Amanda.

-- 
Brandon D. Valentine <bandix AT structbio.vanderbilt DOT edu>
Computer Geek, Center for Structural Biology

"This isn't rocket science -- but it _is_ computer science."
        - Terry Lambert on current AT FreeBSD DOT org.

