Subject: Almost direct to tape - long message
From: Tab Trepagnier <Tab.Trepagnier AT LAITRAM DOT COM>
Date: Thu, 14 Mar 2002 17:37:35 -0600
I've mentioned in the forum my experiment with implementing direct-to-tape
backups.  The reconfiguration of my TSM system has succeeded, and I
thought I'd share with the forum what I did.

TSM 4.1.5.0 on AIX 4.3.3 PL 8;  7025 F50;  2 x 332 MHz CPUs; 1024 MB RAM;
100 Mb/s network
Approx 135 clients: mostly Windows, some AIX/Irix, 3 TDP/Domino
4 TB of data on online tape; 11 TB of data total on tape

We have four primary data paths through our system:
1) large client backups - terminate on 3583-L72 (LTO) w/ 4 drives,
collocated
2) small client backups - terminate on 3575-L18 (3570XL) w/ 4 drives,
non-coll.
3) temporary-retention archives - terminate on 3575-L12 (3570XL) w/ 4
drives, non-coll.
4) permanent-retention archives - terminate on HP 4/40 (DLT) w/ 4 drives,
non-coll.
Also, data from paths 1 and 2 is sent to our copypool, which is also
located on the HP 4/40.

For over five years, since ADSM 2, we have force-migrated disk pools to
zero utilization, and performed all restores (2-5 per week) from tape.
Both of those techniques are contrary to TSM best practices, but have not
caused any operational problems in the entire period.
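
(For anyone wondering how we force the pools empty: nothing exotic, just
dropping both migration thresholds to zero, along the lines of

    update stgpool backuppool highmig=0 lowmig=0

where "backuppool" stands in for the real pool name.)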

We shrank the upstream disk pools of paths 2 and 3 to 1 GB each,
implemented as 20 volumes of 50 MB each.  These are the
"almost-direct-to-tape" paths.  The disk volumes serve as mount point
multipliers, and the migration thresholds are set to zero.  Migration to
tape occurs as soon as a client commits its data.
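
For anyone wanting to duplicate this, the definitions look roughly like
the following.  Pool, volume, and file names are made up for
illustration, and each 50 MB volume file has to be formatted first with
the dsmfmt utility:

    define stgpool smallpool disk nextstgpool=tapepool highmig=0 lowmig=0
    define volume smallpool /tsm/disk/small01.dsm
    define volume smallpool /tsm/disk/small02.dsm
    (...and so on through small20.dsm)

With both thresholds at zero, migration fires as soon as data lands in
the pool, which is what produces the "almost" in almost-direct-to-tape.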

We eliminated the upstream disk pool of path 1 and send that data
direct-to-tape.  Only 13 clients send data to the LTO during the night.
Four drives provide more than enough mount points, and a single drive
can absorb the full throughput of the NIC.
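
Going direct-to-tape was just a matter of pointing the backup copy group
at the tape pool and reactivating the policy set.  The domain, set, and
class names below are hypothetical:

    update copygroup lto_dom lto_set lto_mc standard type=backup destination=ltopool
    validate policyset lto_dom lto_set
    activate policyset lto_dom lto_set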

Only path 4 retains a large disk pool upstream (28 GB, as 28 x 1000 MB
volumes).  This pool is used to funnel data to one tape at a time, so
that we avoid sending expensive DLTs to the vault that are only 5-10% full.
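
The funnel effect comes from running a single migration process, which
mounts a single output volume; one process is the default, as I
understand it, but it can be set explicitly (pool name hypothetical):

    update stgpool dltstage migprocess=1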

We had several reasons for configuring our TSM system that way:

1.  The network is the pinch point.  Each library, despite being connected
to just one SCSI adapter, can accept 16 MB/s, while the network tops out at
about 11 MB/s.  Note that for incoming backup data, network overhead is
only 5%, so 95% of what passes through the NIC ends up on tape.
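
The arithmetic: 100 Mb/s is 12.5 MB/s raw, the NIC delivers about 11
MB/s of that in practice, and 95% of 11 MB/s is roughly 10.5 MB/s of
real backup data - well under what even one drive can absorb.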

2. Once we had four drives in each library, we no longer had drive
contention from insufficient mount points.  We added drives to the two
3575s; the DLT and LTO libraries were bought with four drives.

3. Our TSM server is five years old and every I/O slot is filled.  To
double-connect the libraries, I need a new server with more slots.  A quote
for a new RS/6000 with lots of RAM, enough slots, and enough SSA disk space
to implement TSM best practice was over $150,000!  When SCSI disks were
substituted, the price dropped to $90,000, still a lot of money to our
little company.  Reducing the required disk space shaves thousands of
dollars from the server price, and makes it an easier sell to management.

I've watched the operation of the system in this configuration for hours
and have seen no problems other than the one caused by the node MaxNumMP
parameter, and that has been fixed.
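
For the record, the MaxNumMP fix was just raising the per-node mount
point limit; the node name and value here are illustrative:

    update node bignode maxnummp=2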

Tab Trepagnier
TSM Administrator
Laitram Corporation