Subject: Re: RAIT in 2.4.3b4
From: "Marc W. Mengel" <mengel AT fnal DOT gov>
To: Scott Mcdermott <smcdermott AT questra DOT com>
Date: Thu, 30 Jan 2003 17:40:29 -0600 (CST)
On Mon, 27 Jan 2003, Scott Mcdermott wrote:

> I'm not talking about multiple writes to the same tape happening at
> once.  With a single parity tape, a write to any drive in the set
> blocks behind writes from the other drives, since they all wait on
> the same drive head to record the parity information.  This limits
> the combined write speed of all the drives in the set to that of the
> single parity drive.

That would be true, except that *every* drive out there
1) buffers writes, and
2) writes really slowly relative to the bus speed.
So the fact that you are writing round-robin rather than "in parallel"
(which you can really only do if you have distinct SCSI bus/fiber
paths to your various drives anyway) makes little difference in
practice.
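
For illustration, the write loop has roughly this shape in C (a
simplified sketch, not the actual amanda rait_write(); the name
rait_write_sketch and the fd layout are assumptions made here):

    #include <stdlib.h>
    #include <unistd.h>

    /* Round-robin striped write with one dedicated parity drive.
     * fds[0..n_data-1] are data drives, fds[n_data] is parity;
     * len must divide evenly by n_data. */
    ssize_t rait_write_sketch(int *fds, int n_data,
                              const char *buf, size_t len)
    {
        size_t slice = len / n_data;
        char *parity = calloc(1, slice);
        if (parity == NULL)
            return -1;

        for (int d = 0; d < n_data; d++) {
            const char *chunk = buf + d * slice;

            /* write() returns once the drive has buffered the data,
             * long before it reaches tape, so this sequential loop
             * still keeps every drive's buffer fed. */
            if (write(fds[d], chunk, slice) != (ssize_t)slice) {
                free(parity);
                return -1;
            }
            for (size_t i = 0; i < slice; i++)
                parity[i] ^= chunk[i];      /* accumulate XOR parity */
        }

        /* One parity write per stripe: the parity drive carries the
         * same data rate as each data drive, not the sum of them. */
        ssize_t rc = write(fds[n_data], parity, slice);
        free(parity);
        return rc == (ssize_t)slice ? (ssize_t)len : -1;
    }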

> Read operations can still go in parallel with RAIT4, but the write case
> is of course what needs optimizing for a tape backup system.

The RAIT code also does round-robin reads; once again drive buffering
makes it work quite well in practice.
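
The parity laid down on writes is also what lets a read survive a lost
member: XOR the parity block with the surviving data slices and the
missing slice falls out.  A minimal sketch of that recovery (a
hypothetical helper, not amanda's actual read path):

    #include <stddef.h>

    /* Rebuild the lost data slice from the survivors plus parity.
     * slices[0..n_data-1] are data, slices[n_data] is parity, and
     * slices[lost] is the missing member.  Works because x ^ x == 0:
     * XORing parity with every surviving slice leaves the lost one. */
    void rait_rebuild_slice(char *out, char *const slices[],
                            int n_data, int lost, size_t slice_len)
    {
        for (size_t i = 0; i < slice_len; i++) {
            char acc = slices[n_data][i];   /* start from parity */
            for (int d = 0; d < n_data; d++)
                if (d != lost)
                    acc ^= slices[d][i];
            out[i] = acc;
        }
    }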

> ok, I just read the RAIT code.  A few things, which I'm sure you are
> already aware of (and now your comments make a lot of sense):
>
>    - it doesn't have any parallelism at all, so it gains no concurrency
>      advantages; it simply loops among the data set members, does one
>      write, blocks, does the next, and so on, then writes the parity tape.

The parallelism you get comes from the fact that the drives buffer
before writing, and write much more slowly than the buffers can be
filled.

>    - it's limited to RAIT4, and additionally:
>
>    - it has the divisor limitation you mention due to this code:
>
>        data_fds = pr->nfds - 1;
>        if (0 != len % data_fds) {
>            errno = EDOM;
>            rait_debug((stderr, "rait_write:returning %d: %s\n",
>                                -1,
>                                strerror(errno)));
>            return -1;
>        }
>        /* each slice gets an even portion */
>        len = len / data_fds;
>
>      (and the rest of the write implementation)

The reason for that was to avoid the overhead of adding a descriptor
and pad bytes to the end of the frames; this is particularly important
when you want it to work with fixed-1k-block drives and the default
amanda blocksizes...
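
Concretely, taking the default amanda blocksize to be 32 KiB (an
assumption for this example), the check above plays out like this:

    #include <stdio.h>

    int main(void)
    {
        const int len = 32 * 1024;  /* assumed default amanda blocksize */

        /* Mirror the len % data_fds test from rait_write above. */
        for (int data_fds = 2; data_fds <= 5; data_fds++) {
            if (len % data_fds != 0)
                printf("%d data drives: EDOM, write rejected\n", data_fds);
            else
                printf("%d data drives: ok, %d-byte slices\n",
                       data_fds, len / data_fds);
        }
        return 0;
    }

With 2 or 4 data drives the slices are still multiples of 1 KiB, which
is presumably why the fixed-1k-block case keeps working; 3 or 5 data
drives trip the EDOM check instead.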

>    - the #ifdef NO_AMANDA stuff is pretty much exact duplication.
>      Has anyone considered just moving this out into a library instead
>      of duplicating whole routines in rait-*.c?

The RAIT code was initially written to be part of a different package
(the Fermi Tape Tools package) and was later migrated to amanda; the
idea was to be able to move it back the other way some day.

> I guess there isn't any way, then, to employ multiple write heads at
> once, without having different sets of backups that use different tape
> devices.  This sounds like it would be really hard to set up with a
> changer.

That's not true at all, which you'll see if you try it...

> Effectively enough, though.  If no one speaks up about RAIT, it means no
> one is using it.  And if they are using it in isolation, they may as
> well not be using it.

Well, it may just mean that no one who is using it happens to be
reading the list just this minute.  I would be interested to know how
many folks are actually using it, though.

> I'm just saying, when considering it for a production deployment,
> measuring list response to questions about it is a valid measure of its
> maturity, if not an exact one.

Ah, but what attribute of list response do you measure?  Initial response
time? Quality of answers? Quantity of answers?

