Amanda-Users

Re: wasted action of taper

2003-05-16 12:00:14
Subject: Re: wasted action of taper
From: Mitch Collinsworth <mitch AT ccmr.cornell DOT edu>
To: Jon LaBadie <jon AT jgcomp DOT com>
Date: Fri, 16 May 2003 11:57:17 -0400 (EDT)

Hi Jon,

On Fri, 16 May 2003, Jon LaBadie wrote:

> On Fri, May 16, 2003 at 07:10:20AM -0400, Mitch Collinsworth wrote:

> > Taper chunking opens the door to backing up DLEs that are larger than
> > a single tape, something which no amount of algo-shuffling is going
> > to accomplish.  It also opens the door to allowing taping and dumping
> > of a single DLE to proceed in parallel, which can be a huge time-savings
> > for large DLEs.  These are both features that a lot of people would like
> > to have.  They've also been common deal-breakers for folks looking at
> > amanda in the past.
>
> Mitch,
> clarify for me your notion of "taper chunking" and the mentioned parallelism.
>
> In your view, if chunking is implemented, it would allow a DLE to be taped
> in chunks before the dump had completed.  This would certainly reduce the
> maximum size required for some holding disks.  Yet it could only be used if
> a holding disk were provided.

Yes, it could reduce the *minimum* size required for your holding disk,
though I don't personally advocate small holding disks.  Disk is too
cheap these days to be the piece to skimp on.  And yes, tape chunking
would not be available to a configuration without a holding disk.


> Do you anticipate the chunks of any specific DLE to be taped sequentially?
> I.e. once started, taping only continues for that DLE until completed?
> Or do you anticipate interleaving the chunks of multiple DLE's?  I can
> foresee difficulty with both approaches compared to only taping when a
> dump of a DLE is completed, with chunks for tape spanning of course.

I say "opens the door" because yes, interleaving would then be a
possibility.  That seems like something worth having a configuration
switch to enable or disable as the user prefers.  It could also be
something that isn't implemented in the first pass but is added later
after non-interleaved chunking is fully working.  I agree there are
challenges to this but I'm not thinking of any that aren't solvable.


> BTW in practice, doesn't the existing parallelism of amanda pretty much
> eliminate the time-saving benefits that these proposals might realize?
> Taping of completed DLE's goes on while my large DLE's complete and other
> DLE's dump while my large DLE's tape.

Not that I can see.  Depending on your config, amanda as it works now
may actually be taking _more_ time than a script-based approach.
Suppose you have a 100 GB capacity tape drive, or even one of the spiffy
new 200 GB capacity drives.  [Drool]   Suppose you have a config with a single
DLE that takes 90% of a tape to do a level 0.  If you have to dump this
to a holding disk and then wait until the dump completes before you begin
taping, you've wasted a LOT of time.
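
To put rough numbers on that, here is a minimal sketch of the two
schedules.  The dump rate, tape rate and chunk size are made-up
assumptions for illustration, not measurements, and the function names
are hypothetical; nothing here is how amanda works today.

```python
def serial_hours(size_gb, dump_gbph, tape_gbph):
    """Dump the whole DLE to holding disk, then tape it: the phases add up."""
    return size_gb / dump_gbph + size_gb / tape_gbph


def overlapped_hours(size_gb, dump_gbph, tape_gbph, chunk_gb):
    """Tape each chunk as soon as it is on holding disk (simple pipeline model)."""
    dumped = 0.0      # GB written to holding disk so far
    dump_done = 0.0   # time at which the latest chunk finished dumping
    tape_done = 0.0   # time at which the latest chunk finished taping
    while dumped < size_gb:
        chunk = min(chunk_gb, size_gb - dumped)
        dumped += chunk
        dump_done += chunk / dump_gbph
        # The taper cannot start a chunk until it is fully dumped
        # and the previous chunk has left the drive.
        tape_done = max(dump_done, tape_done) + chunk / tape_gbph
    return tape_done


# Assumed figures: a 90 GB level 0, with dumper and taper both at 30 GB/hour.
print(serial_hours(90, 30, 30))
print(overlapped_hours(90, 30, 30, chunk_gb=1))
```

With those assumed rates the wait-then-tape schedule comes to 6 hours,
while the overlapped one finishes a couple of minutes after the 3 hour
dump does.  If the tape drive is the slower of the two, the saving
shrinks to roughly the length of the dump.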

If amanda implements tape chunking so that sites with huge RAID arrays
can do multi-tape level 0's, a single DLE could potentially take several
tapes to dump.  Obviously they could save time by splitting the DLE but
maybe they don't want to.  Or maybe it's some strange datatype that just
doesn't split easily.  Now the penalty for waiting to get it all onto
holding disk before starting to tape is even more painful.  And the
folks who decided against amanda because it couldn't span tapes will
have a new excuse.  "It spans tapes but it takes too long."
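
For the spanning side of it, a rough sketch of how fixed-size chunks
could be laid out across tapes.  This is only an illustration under my
own assumptions (whole chunks never straddle a tape, sizes in GB,
hypothetical function name), not a description of any existing taper
code.

```python
def plan_tapes(dle_gb, chunk_gb, tape_gb):
    """Assign the chunks of one DLE to tapes in order.

    Returns a list of tapes, each a list of the chunk sizes written to it.
    """
    if chunk_gb <= 0 or chunk_gb > tape_gb:
        raise ValueError("chunk size must be positive and fit on one tape")
    tapes = [[]]
    free = tape_gb          # space left on the current tape
    remaining = dle_gb      # data still to be taped
    while remaining > 0:
        chunk = min(chunk_gb, remaining)
        if chunk > free:
            # Current tape is full: start a fresh one for this chunk.
            tapes.append([])
            free = tape_gb
        tapes[-1].append(chunk)
        free -= chunk
        remaining -= chunk
    return tapes


# Assumed figures: a 250 GB DLE, 10 GB chunks, 100 GB tapes.
for number, chunks in enumerate(plan_tapes(250, 10, 100), start=1):
    print(number, len(chunks), sum(chunks))
```

With those assumed sizes the DLE lands on three tapes, the last one
half full.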

-Mitch
