Amanda-Users

Re: Is tape spanning documented anywhere?

2006-06-13 11:18:31
Subject: Re: Is tape spanning documented anywhere?
From: Toralf Lund <toralf AT procaptura DOT com>
To: Joshua Baker-LePain <jlb17 AT duke DOT edu>
Date: Tue, 13 Jun 2006 17:10:53 +0200

To throw my $.02 in here, the situations would be very different. If one is "forced" to have all DLEs "tapeable" in one amdump run, then (theoretically), nothing will be left on the holding disk to lose should said disk die.

But we're talking about a situation where the DLEs are not "tapeable". The

With tape spanning as implemented, any DLE is tapeable if runtapes is big enough. :)

What I'm slightly worried about is the "unbalanced" setup that a low number of large DLEs combined with a large runtapes value will give me. It implies that several tapes will have to be written on some nights, while nothing at all is taped on others - at least if we disregard incrementals for the moment. This could mean that the write operation continues long into the following day, when we want to use the server's capacity for other purposes, or (even worse) isn't finished when the next dump is supposed to start. Maybe there won't be any serious issues associated with this, but I'd feel more comfortable if I could spread the work more evenly and/or use the idle hours of every night. A different "flush" operation would help me achieve at least part of that, even though the actual dump would still be pretty unbalanced.
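As a rough illustration of the imbalance described above, here is a small Python sketch. It is not how Amanda's planner actually schedules anything; it simply assigns one full dump per night in turn and ignores incrementals. The function name, the 400 GB tape size, the 7-night cycle and the three 1000 GB DLEs are all made-up values for the example.

```python
import math

def tapes_per_night(dle_sizes_gb, tape_size_gb, dumpcycle_nights):
    """Count the tapes each night would need (hypothetical toy model).

    Full dumps are handed out one per night, round robin, and
    incrementals are ignored. This is NOT Amanda's planner.
    """
    nightly_total = [0] * dumpcycle_nights
    for i, size in enumerate(dle_sizes_gb):
        nightly_total[i % dumpcycle_nights] += size
    # A night's data is spanned over as many tapes as it takes.
    return [math.ceil(total / tape_size_gb) for total in nightly_total]

# 3 TB split into three DLEs, assumed 400 GB tapes, 7-night cycle.
print(tapes_per_night([1000, 1000, 1000], 400, 7))
# [3, 3, 3, 0, 0, 0, 0]
```

With these numbers, three nights need three tapes each and the remaining four nights tape nothing, which is the uneven load in question.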

Some of my colleagues have just nearly convinced me that I worry too much, though ;-/


Maybe it's just me being curmudgeonly (it wouldn't be the first time -- hell, I haven't found a WM I like more than fvwm2) and slavishly adhering to the KISS method. But I think backups *should* adhere to the KISS method.

Normally I would agree, but I have to back up 3TB of data organised as one single volume. The only "simple" option would be to have one 3TB tape as well, but such a thing isn't available (to me at least.) Also, I think the whole tape splitting concept is inherently complex, and what I suggest here doesn't change the complexity level. The complexity was introduced already, I'm just talking about a *simple* implementation adjustment...

I agree that it doesn't change the complexity level.  But it does change
the safety level.  Suddenly you're making yourself far more vulnerable to
losing parts of a backup image.

On a practical level, I'm pretty sure that the setup you're proposing would require you to have a 3TB holding disk (or at least 3TB-tapelength) to hold your level 0.

It's not quite as bad as that, fortunately. While there is one 3TB volume, I can actually split it into more than one DLE quite easily. Splitting it into (much) more than tapes-per-cycle entries (which seems to be a requirement if you want a "balanced" setup) is however going to be very hard. But you are right, holding disk space is also going to be a bit of an issue.

I also have one other scenario in mind, though - which is one I've actually come across a number of times: What if a certain DLE due for backup is estimated to be slightly smaller than <runtapes>*<tape size>, and thus dumped to holding disk, but then turns out to be slightly larger? With the current setup, amanda will obviously run out of tape-space during the original dump and also if you try amflush. And if auto-flush is enabled, the next dump will hit end-of-tape before any of the new dumps have been written, and the next one after that, and so on; this holding disk image will effectively block the tape operation of all the following backups, and eventually, the holding disk will be full, too, so amdump won't be able to do anything at all.

If we were to introduce "partial tape write" as discussed here, but leave the scheduling algorithm unchanged, we would actually increase the safety in this area - an oversized dump would also be flushed eventually, and not "lock up" the system. We would not compromise the safety in other ways, as Amanda would still try to schedule only <runtapes>*<tape size>'s worth of data (so nothing would be left on the holding disk if everything went according to plan.)
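The difference between the two behaviours can be sketched with a toy Python model of the holding disk. This is only an illustration under stated assumptions, not Amanda's code: the function name `simulate`, its parameters and all sizes are hypothetical, images are taped oldest first, and "partial tape write" is assumed to mean that the untaped remainder of an image is continued in the next run.

```python
def simulate(queue, run_capacity, holding_capacity, nightly_dump,
             nights, partial_write):
    """Toy holding-disk model; sizes in GB, all values hypothetical.

    queue lists the images waiting on the holding disk, oldest first.
    run_capacity stands for runtapes * tape size.
    Returns the holding-disk usage after each night.
    """
    queue = list(queue)
    usage = []
    for _ in range(nights):
        # A new dump is only accepted while the holding disk has room.
        if sum(queue) + nightly_dump <= holding_capacity:
            queue.append(nightly_dump)
        tape_left = run_capacity
        while queue and tape_left > 0:
            head = queue[0]
            if head <= tape_left:
                tape_left -= head
                queue.pop(0)
            elif partial_write:
                # Assumed behaviour: write what fits, continue next run.
                queue[0] = head - tape_left
                tape_left = 0
            else:
                # All-or-nothing: the head image blocks everything behind it.
                break
        usage.append(sum(queue))
    return usage

# An 850 GB image was estimated to fit into 800 GB of tape per run.
print(simulate([850], 800, 1000, 100, 4, partial_write=False))
# [950, 950, 950, 950]
print(simulate([850], 800, 1000, 100, 4, partial_write=True))
# [150, 0, 0, 0]
```

In the first run the oversized image is never taped, the holding disk stays nearly full and new dumps are refused after the first night; in the second the image drains over two runs and normal operation resumes.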

- Toralf