ADSM-L

Re: Sizing an AIX platform and tape libraries

2004-03-23 21:14:37
Subject: Re: Sizing an AIX platform and tape libraries
From: Dan Foster <dsf AT GLOBALCROSSING DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 24 Mar 2004 02:14:04 +0000
Hot Diggety! Nancy R. Brizuela was rumored to have written:
>
> 1)  Right now we are backing up about 150 GB/ night, but we need to add
> Exchange and a new student information system (Banner) to this.  We are
> estimating that we will soon grow to at least 500 -600 GB/night.
>
> 2)  Workload consists of 95 clients, consisting of from small 20-30gb
> Unix and Windows systems, to DB (Oracle) and file servers with half to
> one terabyte of storage.  This will grow to 120 or so clients soon.

Sounds like you know your current needs and have some idea of growth --
that's great! Helps to size potential candidate setups so much better
and more accurately.

> 3)  We are currently storing 10.7 TB total data in two libraries, one
> local tape library and a second copy of everything at a remote tape
> library (3494 ATLs w/3590 E drives).  Looks like this storage could
> triple, given how much we will be backing up each night (15 TB in each
> location).

My suggestion might be to consider something like the IBM 3584 LTO
library with LTO-2 tape drives. These drives are pretty darned fast at
up to 70 MB/sec (with hardware compression enabled), don't suffer from
the same terrible stop/start issue that LTO-1 had, and hold 200 GB per
tape natively (no compression). For *our* library, we're seeing 2.24:1
compression with LTO-1 (and I see no reason to expect our LTO-2 setup
to behave differently), which suggests perhaps 400-450 GB per tape with
hardware compression enabled, depending on data mix.
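
To make the per-tape estimate concrete, here's a quick sketch of the
arithmetic. The 200 GB native capacity and the 2.24:1 ratio are the
figures above; the function name and the other ratios in the loop are
just mine for illustration, and your own ratio will depend on your data.

```python
LTO2_NATIVE_GB = 200  # native (uncompressed) LTO-2 cartridge capacity


def effective_tape_gb(native_gb: float, compression_ratio: float) -> float:
    """Estimated data stored per cartridge at a given compression ratio."""
    return native_gb * compression_ratio


# 1.0 = no compression, 2.24 = the ratio we observe on our LTO-1 library.
for ratio in (1.0, 2.0, 2.24):
    gb = effective_tape_gb(LTO2_NATIVE_GB, ratio)
    print(f"{ratio:.2f}:1 -> about {gb:.0f} GB per tape")
```

At 2.24:1 that works out to about 448 GB, the top of the 400-450 GB
range; data that compresses less well lands lower.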

The 3584 can go up to *16* library frames, with up to 12 drives per
frame (i.e., 192 tape drives total) and up to 6,881 tapes in a 16 frame
setup. That's a lot of future expansion potential, but the nice thing is
that you can buy a frame or two now, and add more later as your needs
grow.

A two frame setup with 12 drives (our 3584 setup) has 610 tape
slots... which, in an LTO-2 setup, yields 122 TB of uncompressed data.
A single base frame 3584 LTO library has about 250 tape slots if you
have a 30 slot I/O station... so, 250 * 200 GB = "only 50 TB". If you
say you expect to eventually triple... 15 TB in each location * 3 = 45
TB, and you'd still have room for as much future expansion as you want.
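
If you want to play with the slot math yourself, a minimal sketch
follows. The slot counts, frame and drive limits, and the 45 TB target
are the numbers above; the helper names are made up for this example,
and everything is native (uncompressed) capacity in decimal units.

```python
import math

TAPE_GB = 200           # native LTO-2 capacity per cartridge
MAX_FRAMES = 16         # 3584 frame limit
DRIVES_PER_FRAME = 12   # per-frame drive limit


def library_capacity_tb(slots: int, tape_gb: int = TAPE_GB) -> float:
    """Native capacity of a library with every slot filled, in TB."""
    return slots * tape_gb / 1000


def tapes_needed(data_tb: float, tape_gb: int = TAPE_GB) -> int:
    """Cartridges needed to hold data_tb of uncompressed data."""
    return math.ceil(data_tb * 1000 / tape_gb)


print("max drives:", MAX_FRAMES * DRIVES_PER_FRAME)           # 192
print("two frames, 610 slots:", library_capacity_tb(610), "TB")
print("base frame, 250 slots:", library_capacity_tb(250), "TB")
print("tapes for 45 TB:", tapes_needed(45))
```

This ignores scratch tapes, partially filled volumes, and reclamation
overhead, so treat the tape count as a floor rather than a purchase
quantity.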

The 3584 has been a very nice ATL library for us. Footprint isn't bad;
the two frame setup is about 5'x5'.

> 4)  Network is Gigabit Ethernet.

No problem. Get as many IBM FC5700 (Gig-E/fiber) or 5701 (Gig-E/RJ-45)
adapters as you want/need, plug them in, and decide how you want to
handle the networking (EtherChannel trunking, a separate subnet for each
adapter, or whatever).
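
As a rough sanity check on how many adapters the nightly load needs,
here's a sketch. The 600 GB/night figure is from your estimate; the
backup window lengths and the 80 MB/sec "usable" rate per Gig-E link are
purely my assumptions for illustration, so plug in your own.

```python
import math


def required_mb_per_sec(nightly_gb: float, window_hours: float) -> float:
    """Average sustained rate needed to move nightly_gb within the window."""
    return nightly_gb * 1000 / (window_hours * 3600)


def links_needed(nightly_gb: float, window_hours: float,
                 usable_mb_per_link: float) -> int:
    """Gig-E links needed at an assumed usable rate per link."""
    rate = required_mb_per_sec(nightly_gb, window_hours)
    return math.ceil(rate / usable_mb_per_link)


NIGHTLY_GB = 600        # upper end of the projected nightly volume
USABLE_MB_PER_LINK = 80  # assumption: realistic Gig-E payload rate

for window in (8, 4, 2):  # assumed backup windows, in hours
    rate = required_mb_per_sec(NIGHTLY_GB, window)
    links = links_needed(NIGHTLY_GB, window, USABLE_MB_PER_LINK)
    print(f"{window} h window: {rate:.1f} MB/sec average, {links} link(s)")
```

These are averages; a few big clients all starting at once will peak
well above them, which is one more reason to have spare slots for extra
adapters.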

> 5)  We would like to use one large server vs. multiple servers.

Not a problem. We use the pSeries 660 model 6H1 server, which has been a
perfect fit -- it's modularized with the CPU and I/O units separate so
it hasn't been a physical hassle to rack mount it, and these 6H1s have
more slots than you'd believe. Up to 2 RIO drawers per system;
each drawer has *14* PCI slots! So we don't have slot pressure -- we can
throw in as many adapters as we want! FC cards, SCSI HBAs, Gig-E, SSA
adapters, whatever you want, throw it in there. It also has multiple
buses to spread out the bus traffic, as well as a great internal memory
and CPU interconnect, so performance is not an issue at all.

Ours has 2 GB of RAM but can easily go to much larger configurations
as our load grows further. At today's memory prices and system
configurations, I probably wouldn't think twice about 4 GB of RAM... but
if you're stretching every dollar, 2 GB would be an adequate config for
your current needs (you mentioned number of clients, size of library,
etc) and short term needs.

Ours also has 4 processors in it due to the high I/O load -- CPUs are
kept busy pulling data off I/O adapters, processing the data, then
offloading it to yet more I/O adapters (network, disk, or tape). 4's
a healthy number of CPUs that has worked out really well. Load is
nearly nonexistent! (We could probably do fine with 2 CPUs, but decided
it was easier to get 4 and overengineer slightly, to grow into it over
time, than to go back later and beg for expansion in a bad situation
with loading issues. Besides, it'd have meant downtime to add more CPUs
in the future amongst other logistical issues...)

The 6H1 also has dynamic hot-plug slots (you have to run a command to
turn slot power off, do whatever, turn slot power on, cfgmgr, done), a
service processor (good for powering a machine on/off remotely as well
as forcing a halt even if the OS is unresponsive... really rare but have
used that once), monitoring/management features, and many other things.

IBM isn't selling the 6H1 any more; they're selling its successors which
are even better -- the POWER4 based systems. (The 6H1 is basically a
rebadged and modularized H80 server, and one of the last machines
released before the POWER4 systems came out in force.)

So my point is simply that you want to look at a mid-level server that
has remote management, reliability and availability features, *plenty*
of slots (which also permit seamless future expansion), and features
that support uptime (such as hot-plug slots), since a backup server is
expected to be essentially the most highly available machine in an
enterprise for obvious reasons, amongst other factors.

As long as you choose a system with a healthy number of slots, you
can't really go wrong... because entry level systems will have smaller
configurations, fewer RAS features, fewer management/availability
features, smaller proc/mem configs, far fewer slots, less expansion
potential, etc... so if it's got a lot of slots, it's usually a
mid-range or high end system.

The performance and manageability of even the "mid-range" systems are
eye-popping -- you simply *cannot* go wrong, period.

I also wanted to note that you will probably also want to get an
external disk array to hook up to the backup server. This would serve as
the diskpool / intermediate staging area and allow your client backups
to complete much sooner while relieving mount pressures... and you
really do want external disks for this because you get to spread the I/O
across many more spindles than a mirrored two drive boot disk setup
(which already hosts the paging space...).

You can get an FC-capable ATA disk array (sixteen 250 GB drives in a
drawer) in a 3U rack mountable form factor for about $12K these days; we
just bought a bunch from AC&NC (makers of Jetstor III) although we
eval'd their competitor (Nexsan ATAboy2) and found it to be equivalent
in feature set but about 25% higher on price than the Jetstors. Or, for
about $1-2K less, you can get SCSI attachment if you don't like FC.
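
To put that array in terms of your nightly load, here's a small sketch.
The drive count, drive size, and $12K price are the figures above, and
600 GB/night is the top of your estimate; the usable_fraction knob (to
account for RAID and filesystem overhead) and its 0.8 value are my own
assumptions, not anything from a spec sheet.

```python
DRIVES = 16
DRIVE_GB = 250
ARRAY_PRICE_USD = 12_000

raw_gb = DRIVES * DRIVE_GB             # raw capacity of one drawer
usd_per_gb = ARRAY_PRICE_USD / raw_gb  # price per raw GB


def nights_staged(capacity_gb: float, nightly_gb: float,
                  usable_fraction: float = 1.0) -> float:
    """How many nights of backups fit in the diskpool before migration."""
    return capacity_gb * usable_fraction / nightly_gb


print(f"raw capacity: {raw_gb} GB at ${usd_per_gb:.2f}/GB")
print(f"nights at 600 GB/night (raw): {nights_staged(raw_gb, 600):.1f}")
# Assumption: roughly 80% of raw capacity is usable after RAID overhead.
print(f"nights at 600 GB/night (80%): {nights_staged(raw_gb, 600, 0.8):.1f}")
```

Either way a single drawer holds several nights of backups, so the
diskpool can absorb a night's work and migrate to tape at leisure.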

Or you can get some brand name tier 1 vendor's SCSI or SSA array or
whatever. But you really do want an external disk drawer for a diskpool.
It doesn't have to be an elaborate or expensive thing -- just a bunch of
disks strung together. ;)

3584 support page -- has manuals, product info, interoperability matrix,
tips/hints, FAQs, FC HBA support, device drivers, library/drive
firmware images to download, anything you could possibly want to find or
read, in one well organized spot:

http://www-1.ibm.com/servers/storage/support/lto/3584.html

More technical information about the 3584 library itself:

http://www-306.ibm.com/common/ssi/fcgi-bin/ssialias?infotype=dd&subtype=sm&appname=ShopIBM&htmlfid=897/ENUS3584-L32

I think the closest thing to the 6H1 would be the pSeries 655:

http://www-132.ibm.com/content/home/store_IBMPublicUSA/en_US/eServer/pSeries/mid_range/655.html

To see other midrange machines:

http://www-132.ibm.com/content/home/store_IBMPublicUSA/en_US/eServer/pSeries/mid_range/pSeries_midrange.html

The p650 looks too small on the slot angle; it looks fine on CPU/mem,
but having only 7 slots looks like a big killer... not to mention that
it looks huge and *HEAVY* -- it appears to be about the size of the
infamous H50/H70, which was nearly impossible to rack due to the sheer
weight! That was the only machine that made even our strong (data
center) movers unhappy :-) I also don't like really heavy/bulky machines
since they make the logistics of doing hardware maintenance so much
less pleasant. The 6H1, in comparison to the p650, has been *perfect*,
period.

Other ATLs available also make use of things such as Sony's AIT-3 as
well as SDLT. I just don't have personal experience with these libs,
although I know other members of this list use them, and probably just
as happily.

Sony just announced AIT-4 at CeBIT last week:

http://theregister.co.uk/content/63/36406.html

...but hasn't begun selling drives or libraries, and nobody here has any
firsthand experience with them just yet, so it's a little premature to
plan around a brand new product that hasn't been shaken out yet. (LTO-2
has been around about 1 1/2 to 2 years now and seems to be doing well.)

The 3584 is a real enterprise tape library and has all sorts of features
that we're starting to use more of now with new security requirements,
and we're very appreciative that we don't have to pay anyone another
dime to make use of these enterprise class features whenever we have a
need to use a particular configuration or feature. It has paid for
itself quite a few times over by now! Darned right I'm pleased with it
as well as the overall server, disk, and tape library sizing.

The most expensive part of the 3584 is the tape drives themselves (not
to mention the initial investment for tapes). Was quoted a list price of
$19,000 per LTO-2 tape drive although I think it's common to get a
20-35% discount right off the bat, so figure $11-12K per tape drive.
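
For budgeting, the discount arithmetic is easy to sketch. The $19,000
list price, the 20-35% range, and the $11-12K estimate are the numbers
above; the function names are just for this example.

```python
LIST_PRICE_USD = 19_000  # quoted list price per LTO-2 drive


def discounted(list_price: float, discount_pct: float) -> float:
    """Price after taking discount_pct percent off list."""
    return list_price * (100 - discount_pct) / 100


def implied_discount_pct(list_price: float, street_price: float) -> float:
    """Discount (in percent) that a given street price corresponds to."""
    return (1 - street_price / list_price) * 100


for pct in (20, 35):
    print(f"{pct}% off list: ${discounted(LIST_PRICE_USD, pct):,.0f}")

for price in (12_000, 11_000):
    pct = implied_discount_pct(LIST_PRICE_USD, price)
    print(f"${price:,} per drive implies about {pct:.1f}% off list")
```

A straight 20-35% off list works out to $12,350-$15,200 per drive, so
the $11-12K figure corresponds to a discount closer to 37-42%. Worth
pinning down with your vendor before multiplying by the number of
drives.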

Then there's tape media cost... $70 'on the street', list of $150 (from
IBM or an IBM VAR), discounted price from IBM might be closer to
$100-110.

Still, if you get, say, 800 tapes... at $110, that's about $88,000. Even
at $70, $56,000 total. But once you get past the one-time initial
investment costs, you generally don't need to pump money into it... not
even administrator time, because with TSM, AIX's cron, and about 4 dozen
small scripts, the whole setup basically manages itself daily.
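
The media numbers are simple enough to tabulate for any tape count. The
three per-tape prices and the 800 tape example are the ones above; the
labels and function name are mine, for illustration only.

```python
# Per-cartridge LTO-2 prices in USD.
PRICES_USD = {
    "street": 70,
    "IBM discounted": 110,  # upper end of the $100-110 estimate
    "list": 150,
}


def media_cost(tapes: int, price_per_tape: int) -> int:
    """Total media cost for a given number of cartridges."""
    return tapes * price_per_tape


TAPES = 800
for label, price in PRICES_USD.items():
    print(f"{TAPES} tapes at ${price} ({label}): ${media_cost(TAPES, price):,}")
```

Change TAPES to whatever your slot count and offsite rotation call for;
the gap between street and list pricing adds up quickly at this volume.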

-Dan

P.S. disclaimer: I am a technical person not affiliated with (nor paid
by) any of the systems, disk, or tape vendors. :-)