"Prather, Wanda" wrote:
> 1. IT DEPENDS. (You KNEW I was going to say that, didn't you?!?)
Yes.
>
>
> 2. There is no "right" answer (so you will probably be getting LOTS of
> opinions back about this question ;>).
Understood.
> 3. If you only have a few clients to back up, it just doesn't matter.
That's me; however, I was thinking in terms of tuning my repository for the
quickest throughput.
> 4. The purpose of your disk pool is to act as a "buffer", so that you can
> have many more clients backing up concurrently than you have tape drives.
> As long as your disk pool is large enough so that your clients can back up
> without waiting for a tape, you have an "adequate" configuration.
Understood; however, that brings me back to my sizing question: more smaller
volumes, or fewer bigger ones? But you said it doesn't matter, which probably
answers my question.
> 5. The rule of thumb is that you want to have enough space in your disk
> pool to hold at least one day's backups. That way if there is a tape
> problem (more common than disk problems), you will have time to fix it when
> you arrive in the morning without any backups failing. Now you have a
> "better" configuration.
Understood.
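The one-day rule of thumb above is easy to turn into a back-of-the-envelope
calculation. A minimal sketch, with made-up numbers (the client count, nightly
change rate, and headroom factor are assumptions for illustration, not figures
from this thread):

```python
# Size the disk pool to hold at least one day's backups, per the
# rule of thumb above, plus some headroom for growth and the odd
# oversized incremental.

def pool_size_gb(clients, avg_daily_backup_gb, headroom=1.25):
    """Disk pool capacity (GB) for one night's backups plus headroom."""
    return clients * avg_daily_backup_gb * headroom

# e.g. 2 clients averaging 150 GB of changed data per night:
print(pool_size_gb(2, 150))  # 375.0
```

The headroom factor is the judgment call; the point is simply that the pool
should comfortably absorb a full night even when a tape problem stalls
migration until morning.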
> 6. After that, what you do with your disk pool is work on increasing
> throughput/performance, ie. the "best" combination of function and
> throughput. And it depends a lot on your client mix and a lot on your
> hardware.
Assume very few clients (1-2) with many many filesystems.
> IF you have "n" clients backing up concurrently, and you have at least "n"
> disk pool volumes, TSM will start "n" I/O's in parallel to the different
> disk pool volumes. So to take advantage of MANY volumes, you have to have
> MANY concurrent backups in progress. But, if you are writing to 1 spindle
> of disk with a zillion tiny diskpool volumes, you will get more thrashing of
> the disk than throughput.
This is important information I could not confirm. Thank you for this.
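The parallelism point above can be sketched conceptually: with n concurrent
sessions and at least n volumes, each session gets its own volume; with fewer
volumes than sessions, writers end up sharing volumes. This is only an
illustration of the queuing behavior, not TSM internals (the session count,
volume count, and round-robin assignment are assumptions for the sketch):

```python
# Conceptual sketch: 6 concurrent backup sessions writing into a pool
# of 3 volumes. Each volume takes one writer at a time, so sessions
# that land on the same volume serialize behind each other.
from concurrent.futures import ThreadPoolExecutor
import threading

volumes = [threading.Lock() for _ in range(3)]   # 3 disk pool volumes
written = [0] * len(volumes)                     # MB landed per volume

def backup_session(session_id, data_mb=100):
    vol = session_id % len(volumes)              # round-robin volume pick
    with volumes[vol]:                           # one writer per volume
        written[vol] += data_mb

with ThreadPoolExecutor(max_workers=6) as pool:  # 6 concurrent sessions
    for s in range(6):
        pool.submit(backup_session, s)

print(written)  # [200, 200, 200]
```

With 6 sessions and only 3 volumes, each volume serves two sessions back to
back; which is fine on separate spindles, but on one spindle the "zillion tiny
volumes" case just adds head movement, as noted above.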
> So, if writing to a raw (not RAID) spindle, you probably want 2-3 diskpool
> volumes per spindle.
No, it's just JBOD spindles now, maybe RAID in the future.
> On the other hand, lots of people use some type of RAID these days. The
> disk pool I/O is sequential; some people have reported good results with
> striping across multiple physical spindles. And if you are writing to a
> Shark with gobs of cache memory in front of it, that buffers the effect of
> the disk head movement and you may not be able to come up with many
> configurations where you can measure the difference between a "few" and
> "many" disk pool volumes.
Good information, thank you.
> 7. NONE of that is really going to affect your throughput to LTO.
>
> - no matter how many (or few) diskpool volumes you have, if they are full
> of lots of itty bitty files (actually aggregates of files), remember that
> TSM has to update the data base each time it moves one. It's hard to stream
> enough data to the tape during this process to get great throughput numbers.
Good information.
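The small-file penalty above amortizes out with a simple model: each object
moved to tape costs a database update, so tiny objects pay that cost over and
over. A rough sketch, where the drive speed and per-update time are assumed
placeholders, not measured TSM numbers:

```python
# Rough model of why many small objects throttle disk-to-tape
# migration: each object moved costs a fixed database update.

def effective_mb_per_sec(tape_mb_per_sec, avg_object_mb, db_update_sec):
    """Tape throughput once per-object DB-update time is amortized in."""
    transfer_sec = avg_object_mb / tape_mb_per_sec
    return avg_object_mb / (transfer_sec + db_update_sec)

# Assume a 30 MB/s drive and 20 ms per DB update:
print(round(effective_mb_per_sec(30, 1024, 0.02), 1))  # 30.0 (big objects)
print(round(effective_mb_per_sec(30, 1, 0.02), 1))     # 18.8 (small objects)
```

With big objects the update time vanishes in the transfer time; with 1 MB
objects the same drive loses roughly a third of its speed, which is the "hard
to stream enough data" effect described above.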
> - no matter how many (or few) diskpool volumes you have, if they contain a
> few BIG files, TSM has a better chance of pushing the data to LTO fast,
> because it doesn't have to make data base updates as it goes.
>
> - many people report better throughput on BIG files when writing direct
> from the client to tape, not the disk pool.
>
> This should give people LOTS of targets to respond to!!
Thanks for your detailed response. From my standpoint, I guess I should
allocate large volumes in a count that slightly exceeds the number of client
streams I anticipate. I am also running multiple streams from the same client
at the same time. But you answered my overall question about many-small versus
few-big, and I think I have enough information to cobble together a nice
architecture here. Thank you again.
Mitch
>
>
> Wanda Prather
> Johns Hopkins University Applied Physics Laboratory
> 443-778-8769
>