'Tapeless' TSM Server on Solaris ???

runwfo

I have been supporting several TSM 5.x servers on AIX 5.3 connected to various IBM and STK tape libraries. I was just given a Sun x4270 running Solaris 10 with 30TB SAN disk. The assignment is to design a 'tapeless' TSM environment.

Is anyone doing this today? I would like to get some ideas on replacing tape pools with a large disk pool. I have not used 'file' stgpools in the past.

Thanks!!!
 
If you currently have a DISK type storagepool, would there be any benefit of moving things to a FILE type storagepool at all?

You might be better off having a massive DISK pool and just leave everything in that... No migrations, no reclaims required...

If it will all be on disk anyhow, and those disks are always present on the system, do you really need to shuffle data around?
 
If you currently have a DISK type storagepool, would there be any benefit of moving things to a FILE type storagepool at all?

You might be better off having a massive DISK pool and just leave everything in that... No migrations, no reclaims required...

If it will all be on disk anyhow, and those disks are always present on the system, do you really need to shuffle data around?

<NOTE: I'm a newb and have only been working with TSM for 2 months now>
I am attempting to build a strategy much like what you have been tasked with.
One thing to consider:

  • You need to have some storage that is not attached to your TSM server itself, i.e., a NAS share, another server, whatever. The point is: you need somewhere to keep backups of your important TSM files (i.e., the database backup) in case your TSM server dies. I have done this with a simple NAS mount, which I define as a devclass=FILE; a rough sketch follows below. I also plan to keep multiple copies of VOLHIST, etc. on that NAS share.
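To make that concrete, a minimal sketch of the arrangement from the admin command line might look like the following; the mount point and device class name (/nas/tsmprotect, NASFILE) are placeholders invented for the example, not my exact paths:

    /* Hypothetical FILE device class on a NAS mount, used only for DB protection */
    DEFINE DEVCLASS NASFILE DEVTYPE=FILE DIRECTORY=/nas/tsmprotect MAXCAPACITY=50G MOUNTLIMIT=2
    /* Full database backup to that device class */
    BACKUP DB DEVCLASS=NASFILE TYPE=FULL
    /* Extra copies of the recovery files on the same share */
    BACKUP VOLHISTORY FILENAMES=/nas/tsmprotect/volhist.out
    BACKUP DEVCONFIG FILENAMES=/nas/tsmprotect/devconfig.out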
Now - on to my questions/opinions regarding this topic

  • Wouldn't it be a good thing to have a separate pool? You would want to do migrations and reclamations, as this is where the deduplication takes place, I thought? Also, if you had tiered storage, wouldn't you want your BACKUPPOOL to be tier 1 and the other pools to be tier 2?
  • Does VTL count as a "tapeless" strategy from your management's viewpoint?
  • Tape is "timeless" in a sense, compared to most storage; i.e., tape technology is typically not refreshed every 3 years with a migration to a new array. I'm not sure how big a disk pool you are talking about, but the task of migrating significant amounts of data every few years when you replace an array seems daunting.
Great topic, by the way. Most of the places where I am consulting have regulatory reasons to continue using tape, and personally I couldn't care less either way. I've seen some impressive throughput numbers for both tape and disk.
 
We have been running Solaris 8/9/10 for years now, but rather than use devcl=file we use raw volumes presented from our SAN. Great performance. You might also want to look at DataDomain as your diskpool storage and take advantage of dedupe, which is what I would do if I could get funding and eliminate onsite tape.
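For anyone curious what that looks like, defining a raw LUN straight into a random-access DISK pool is roughly the following; the device path and pool name are made up for the example, and raw volumes don't need dsmfmt formatting first:

    /* Hypothetical raw SAN volume added to a random-access DISK pool */
    DEFINE VOLUME BACKUPPOOL /dev/rdsk/c4t60060E8005B0A3d0s6
    QUERY VOLUME /dev/rdsk/c4t60060E8005B0A3d0s6 FORMAT=DETAILED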
 
I am working on a similar configuration: a Sun x4550 with 48 x 1 TB drives running Solaris 10 x64 and using a zfs pool.

I set up everything using device class file so I could work with TSM 6.1 deduplication. So far my results have been mixed at best. Backup performance is fine, as is restore performance, but stability is an issue.
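For reference, the general shape of that setup (FILE device class over the zfs filesystem plus a deduplicated sequential pool) was something like the following; the directory paths, names, and sizes here are rounded or invented rather than my exact values:

    /* FILE device class with volumes spread over zfs directories */
    DEFINE DEVCLASS ZFSFILE DEVTYPE=FILE DIRECTORY=/tsmpool/fs1,/tsmpool/fs2 MAXCAPACITY=20G MOUNTLIMIT=32
    /* Sequential pool with server-side deduplication (TSM 6.1) */
    DEFINE STGPOOL FILEPOOL ZFSFILE MAXSCRATCH=1000 DEDUPLICATE=YES
    /* Duplicate identification runs as a server process */
    IDENTIFY DUPLICATES FILEPOOL NUMPROCESS=2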

I have upgraded to TSM 6.1.3.1 and still have problems. The problem I am having is that TSM just stops responding. It doesn't crash (at least ps -ef | grep dsms still shows the process), but I cannot connect to the instance. The only activity I saw in the actlog prior to the hang was a long series of "ANR0538I A resource waiter has been aborted." messages.

I checked the log, archive log, and db file systems and this isn't an out of space problem but I am at a loss as to what to do about it. I was going to open a case, but I don't really have any data to show a problem.
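In case someone hits the same thing, the generic checks I mean are along these lines (nothing here is specific to my instance):

    /* Database utilization */
    QUERY DB FORMAT=DETAILED
    /* Pull the ANR0538I messages from the last day of the activity log */
    QUERY ACTLOG BEGINDATE=TODAY-1 SEARCH=ANR0538I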
 
Open a case; an IBM tech should assist you with doing some tracing within TSM to obtain more data.
 
If you currently have a DISK type storagepool, would there be any benefit of moving things to a FILE type storagepool at all?

You might be better off having a massive DISK pool and just leave everything in that... No migrations, no reclaims required...

If it will all be on disk anyhow, and those disks are always present on the system, do you really need to shuffle data around?

Disk is not good for long-term storage. Over time the data will become fragmented and performance will suffer. It's better with dev=file; then the data gets reordered by reclamation. Also, you need dev=file for dedup.
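To put the reclaim point in command form, it is just a threshold on the sequential pool; the pool name and value below are only illustrative:

    /* Ongoing reclamation threshold on a FILE pool */
    UPDATE STGPOOL FILEPOOL RECLAIM=60
    /* Or drive a single pass by hand */
    RECLAIM STGPOOL FILEPOOL THRESHOLD=60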

Also: http://publib.boulder.ibm.com/infoc...p?topic=/com.ibm.itsmaixn.doc/anragd55261.htm
"Do not use raw partitions with a device class type of FILE"

\Masonit
 
What masonit said.

If you want really big pools used for tapeless storage, you are better off with file class: better performance, much faster TSM server startup, and much faster stgpool backups.

If you have a lot of tiny files, you might need to use format=nonblock for that pool (e.g., a directory pool), otherwise each file or aggregate will take up a minimum of 256 KB of disk space.
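If that "format=nonblock" refers to the DATAFORMAT parameter on the storage pool definition (my assumption; I may be thinking of a different setting), it would look roughly like this, with the names invented for the example:

    /* Assumption: "format=nonblock" meaning DATAFORMAT=NONBLOCK on the pool definition */
    DEFINE STGPOOL DIRPOOL ZFSFILE MAXSCRATCH=1000 DATAFORMAT=NONBLOCK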
 
I have tried to set up TSM Server 6.1.3.1 on Solaris 10 x86 and discovered that I can cause TSM to lock up by migrating data between disk pools that are on zfs. I have a case open with Tivoli, who tell me that TSM is in a wait state waiting for kernel I/O to be serviced by Solaris. Sun tells me that the wait state is caused by a user-space application thread in TSM.

Now I am playing man in the middle passing information back and forth between IBM and Sun. Has anyone else run into problems with TSM Server locking up when using zfs for disk pools?

I will have to abandon this project if I can't get resolution to this soon, which would be unfortunate. This tapeless "TSM in a box" model is something we had hoped to use for a few remote sites.

-Rowl
 
I'll be honest. As a previous user stated, once your client data backs up to disk, there is no reason to migrate it. Just make the backuppool/diskpool the first and last destination. I would recommend you audit it from time to time. Currently I have a 60 TB diskpool that migrates to tape after the data is 14 days old. In addition, I run the DRM job against it. I only do this because there is not enough SAN space available to keep it longer. What do you hope to gain by migrating the data from DISK to DISK? Now, if you hope to dedup, that's a different story.
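As a rough illustration of the two approaches (disk as the only destination versus disk with delayed migration to tape), with every name below invented for the example:

    /* Tapeless: client data lands in the disk pool and stays there */
    UPDATE COPYGROUP STANDARD STANDARD STANDARD TYPE=BACKUP DESTINATION=DISKPOOL
    ACTIVATE POLICYSET STANDARD STANDARD
    /* Disk-plus-tape variant: migrate only data older than 14 days */
    UPDATE STGPOOL DISKPOOL NEXTSTGPOOL=TAPEPOOL MIGDELAY=14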
 
Well, regardless of why I am migrating from disk to disk, it is discouraging that TSM hangs. I should not be able to hang the application by something as normal as migrating data from pool to pool.
 
I can almost promise you, TSM is NOT the issue. I made this assumption myself with my disk-based SAN with StorNext. It turned out one of the controllers on my SAN chassis would randomly fail over, causing the disk volume to hang. TSM does not work at the "block level" on the storage unless you are using RAW; therefore all requests are passed to the kernel.
TSM is just simply passing I/O requests to the Solaris kernel. Check all your logs on the storage devices, the fibre switch, and naturally Solaris; I assume you have already searched the Solaris logs. If there is a delay in I/O, this will naturally cause TSM to hang.
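On the Solaris side, the generic places I would look (nothing here is specific to your box) are along these lines:

    # Generic Solaris 10 checks for I/O stalls under a zfs-backed pool
    tail -100 /var/adm/messages     # kernel and driver complaints
    iostat -En                      # per-device error counters
    fmdump -e                       # fault manager error events
    zpool status -v                 # zfs pool and device health
    fcinfo hba-port                 # HBA and link state on the SAN side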
 
I fear this is going to be one of those finger pointing exercises between vendors.

I should add that I have been able to reproduce this issue running reclamation as well. This is with deduplication enabled.
 
Any chance you might be able to duplicate this outside of TSM? Perhaps use the dd command to create a giant file, then dump it to /dev/null. You could create a script to loop this until it hangs, or until you see something in the log. I still can't believe there is nothing in the Solaris logs.
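Something like this loop, for example; the path and sizes are arbitrary, just point it at the same zfs filesystem your pool volumes live on:

    #!/bin/sh
    # Crude I/O stress loop outside of TSM: write a large file on the
    # suspect zfs filesystem, read it back to /dev/null, repeat.
    i=1
    while :
    do
        dd if=/dev/zero of=/tsmpool/stress.$i bs=1024k count=4096 || break
        dd if=/tsmpool/stress.$i of=/dev/null bs=1024k || break
        rm -f /tsmpool/stress.$i
        i=`expr $i + 1`
    done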
 
It now appears that there were multiple outstanding DB2 requests from TSM that were in a hung state. I'm waiting for something new from support on this one. I hear TSM 6.2 should be available for download on 3/19, so maybe I will just start over from there.
 