ADSM-L

Re: Sizing for a virtual tape library

2004-09-01 11:03:06
Subject: Re: Sizing for a virtual tape library
From: Richard Rhodes <rrhodes AT FIRSTENERGYCORP DOT COM>
Date: Wed, 1 Sep 2004 11:02:16 -0400
I've been playing around with the same thing although in a completely
different context.

Although a better person should answer this, I believe the occupancy is
what the TSM server sees - the byte stream into and out of the TSM server.
If you are running client compression, then the server only sees the
compressed data stream coming in.  If compression is done at the tape
drive, then TSM only sees the data stream it sends to the drive, not what
the drive does with it.

But . . . this gets more complicated.

Remember that the CDL emulates tapes - in all their details.  This includes
the need for expiration and reclamation.  In other words, how full are your
tapes?  If you are running reclamation at, say, 60%, then you have up to
40% of many tapes unusable until reclamation is run.  The CDL will work
exactly the same way on its virtual tape volumes.  My understanding is
that when a new virtual tape is created, the CDL grows the disk space
allocated to it in chunks (I thought I heard 5 GB as a number once), so
new tapes only consume the space they have actually used.  But full tapes
with expired data still occupy the full tape's worth of disk.

So, your calculation might have to be something like . . . . .(thinking out
loud . . . high possibility of error) . . . ( total-occupancy +
total-expired-but-not-reclaimed-space )  /
virtual-tape-drive-compression-ratio + some huge fudge factor.
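
Thinking out loud in code form, the estimate above might be sketched like
this.  It is only a rough sketch: the function name, the 10 TB of expired
data, the compression ratio of 1.0 and the 20% fudge factor are all
made-up illustration values, and the fudge factor is applied here as a
percentage pad, which is just one way to read "some huge fudge factor".
Only the ~62 TB occupancy comes from Kevin's query.

```python
def estimate_cdl_disk_tb(occupancy_tb, expired_unreclaimed_tb,
                         compression_ratio=1.0, fudge_factor=0.20):
    """Rough CDL disk estimate in TB (hypothetical helper).

    (occupancy + expired-but-not-reclaimed) / compression ratio,
    then padded by a fudge factor (0.20 means 20% extra).
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    raw_tb = (occupancy_tb + expired_unreclaimed_tb) / compression_ratio
    return raw_tb * (1.0 + fudge_factor)


# Assumed inputs: 62 TB occupancy (from the query), 10 TB of expired data
# still sitting on tapes, no CDL compression, 20% headroom.
estimate = estimate_cdl_disk_tb(62.0, 10.0,
                                compression_ratio=1.0, fudge_factor=0.20)
print(f"Estimated CDL disk: {estimate:.1f} TB")
```

The expired-but-not-reclaimed figure is the hard one to pin down; it
depends on the reclamation threshold and how often reclamation runs.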

Because the CDL grows new tapes in chunks, it is very possible to
overcommit it.  You could end up in a situation with many TSM tapes that
are FILLING while the CDL is out of disk space.  The CDL does have the
option to fully allocate the disk space for a virtual tape when the tape
is created.
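
To make the overcommit risk concrete, here is a small sketch that compares
the disk allocated today with the disk needed if every existing virtual
tape grew to full size.  The 5 GB chunk size is the number I thought I
heard, not a confirmed CDL value, and the tape sizes, capacity and
function names are hypothetical.

```python
import math


def allocated_gb(used_gb, tape_size_gb, chunk_gb=5.0):
    """Disk allocated to one virtual tape if space grows in whole chunks."""
    chunks = math.ceil(used_gb / chunk_gb)
    return min(tape_size_gb, chunks * chunk_gb)


def current_disk_gb(tapes_used_gb, tape_size_gb, chunk_gb=5.0):
    """Disk allocated right now across all virtual tapes."""
    return sum(allocated_gb(u, tape_size_gb, chunk_gb) for u in tapes_used_gb)


def worst_case_disk_gb(tapes_used_gb, tape_size_gb):
    """Disk needed if every existing virtual tape fills completely."""
    return len(tapes_used_gb) * tape_size_gb


def is_overcommitted(tapes_used_gb, tape_size_gb, cdl_capacity_gb):
    """True if the existing tapes could outgrow the CDL's disk."""
    return worst_case_disk_gb(tapes_used_gb, tape_size_gb) > cdl_capacity_gb


# Assumed example: four 100 GB virtual tapes, one full and three FILLING,
# on a CDL with 300 GB of disk.
tapes = [100.0, 35.0, 5.0, 60.0]
print("allocated now (GB):", current_disk_gb(tapes, 100.0))
print("worst case (GB):   ", worst_case_disk_gb(tapes, 100.0))
print("overcommitted:     ", is_overcommitted(tapes, 100.0, 300.0))
```

Fully allocating each tape at creation time avoids this surprise, at the
cost of tying up disk that partially filled tapes are not using.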

Another thing to think about . . . . co-location.  I've been thinking about
this a lot.  If we used a CDL, would I still want to co-locate our primary
tape pools?  Virtual tapes should mount very quickly.  Seeking to the data
on a virtual tape should be fast, although I haven't heard how the CDL
implements seeks.  So, can I do without co-location on the primary tape
pool?  I don't know . . .needs more thinking and probably testing (if we
ever do this).

Another thing to think about  . . . . which library to emulate and how
many.  The list of libraries you can pick from to emulate is small.  Since
a storage pool has to live within one library, the number of virtual tapes
in the library has to have the capacity for the storage pool (again, don't
forget expired data).  The CDL has a limit on the number of virtual
volumes, libraries and drives.  According to the data sheet from EMC's web
site, the CDL "Configures up to 32 tape libraries, 256 tape
drives, and 2048 cartridges with a single disk library system . . ".   You
are going to have to juggle your pool sizes, the library size and the
number of virtual tapes.
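
The juggling can be sketched as a quick check of cartridge count against
the 2048-cartridge figure from the data sheet quote.  The pool size and
virtual tape sizes below are assumptions for illustration, and the helper
names are hypothetical; it also assumes every tape is completely full,
which real pools with FILLING and partly expired tapes will not be.

```python
import math

MAX_CARTRIDGES = 2048  # limit quoted from the EMC data sheet


def cartridges_needed(pool_gb, virtual_tape_gb):
    """Virtual tapes needed to hold the pool, assuming full tapes."""
    return math.ceil(pool_gb / virtual_tape_gb)


def min_virtual_tape_gb(pool_gb, max_cartridges=MAX_CARTRIDGES):
    """Smallest virtual tape size that keeps the pool within the limit."""
    return pool_gb / max_cartridges


# Assumed example: a 72,000 GB pool (occupancy plus expired data).
pool_gb = 72000.0
for tape_gb in (20.0, 40.0):
    needed = cartridges_needed(pool_gb, tape_gb)
    verdict = "fits" if needed <= MAX_CARTRIDGES else "too many"
    print(f"{tape_gb:.0f} GB tapes: {needed} cartridges ({verdict})")
print(f"minimum tape size: {min_virtual_tape_gb(pool_gb):.2f} GB")
```

Whatever tape size comes out of this still has to be one the emulated
library and drive type actually offer.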

Much to think about. . . . . . .

Rick

From:     "Thach, Kevin G"
Sent by:  "ADSM: Dist Stor Manager"
Date:     09/01/2004 10:04 AM
To:       ADSM-L AT VM.MARIST DOT EDU
cc:
Subject:  Sizing for a virtual tape library
Please respond to "ADSM: Dist Stor Manager"

Greetings-

We are looking at purchasing an EMC CDL (virtual tape library), and I'm
trying to figure out exactly how much disk I'm going to need to meet my
requirements.

select sum(physical_mb) from occupancy where stgpool_name='<tapepool>'

Gives me ~62TB.  Is that number the compressed value, or the actual
value?  In other words, assuming I do no compression with the new setup,
would I be able to get by with ~62TB of disk?  Or would I need more?

I've read that compression is transparent to TSM since I'm doing
compression on my tape drives, so that number should represent what was
sent to the drives, correct?  It should therefore be the actual size of
the data before compression, right?

I did a search and found some past threads about this, but they confused
me even more!  =)

If someone could set me straight, I'd appreciate it.

Thanks,
Kevin




-----------------------------------------
The information contained in this message is intended only for the personal
and confidential use of the recipient(s) named above. If the reader of this
message is not the intended recipient or an agent responsible for
delivering it to the intended recipient, you are hereby notified that you
have received this document in error and that any review, dissemination,
distribution, or copying of this message is strictly prohibited. If you
have received this communication in error, please notify us immediately,
and delete the original message.