Tape Library Full

jayp2200

I apologize in advance if this seems a bit scatterbrained; I'm just a bit confused. I am wondering if someone can help me understand what is going on in our environment when it comes to our tape library.

So we have a TS3310 tape library with 4 drives and 126 tape slots (and 1 cleaning tape slot). For quite some time now, we've been limping along with this tape library running at very near full capacity, and we are constantly seeing errors about it not having enough scratch volumes. The tapes we use are all LTO5. We don't have the budget to expand the tape library, and quite frankly I'm not sure at this point that we really need to. I'd just like to find out here if we are missing something in our normal processes that would ultimately be affecting how the tapes are used.

We have a DRM plan in place and admin schedules that run various scripts to handle DB backups to disk and tape, copypool backups, expiration, migration and reclamation processes, etc. We have a report that gets generated daily which lists which tapes need to be removed from the library and placed in the vault (Mountable), and which tapes need to be brought back from the vault and checked in (VaultRetrieve). We have a person who reads the report and performs these duties on a daily basis (Mon-Fri).

On our 3 primary tape storage pools, we currently have the reclamation threshold set at 60, and the 2 copypools we have are set at 70.
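To illustrate how a reclamation threshold selects tapes, here is a minimal Python sketch. It assumes the threshold is compared against the percentage of reclaimable space on each volume. The volume names and percentages are made up, and the exact boundary behavior (at or above the threshold) is an assumption, so the sample data avoids the boundary values.

```python
def reclaimable_volumes(volumes, threshold):
    """Return the names of volumes whose reclaimable space meets the threshold.

    volumes: iterable of (volume_name, percent_reclaimable) pairs.
    threshold: reclamation threshold in percent.
    Assumption: a volume qualifies at or above the threshold.
    """
    return [name for name, pct in volumes if pct >= threshold]


# Hypothetical volumes: (name, percent of reclaimable space)
volumes = [
    ("A00001", 72),
    ("A00002", 55),
    ("A00003", 61),
    ("A00004", 48),
    ("A00005", 52),
]

at_60 = reclaimable_volumes(volumes, 60)
at_50 = reclaimable_volumes(volumes, 50)

print("Eligible at 60:", at_60)
print("Eligible at 50:", at_50)
```

With this sample data, a lower threshold makes more tapes eligible, at the cost of more drive time spent on reclamation.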

We also have been seeing an issue for quite some time now where tapes that were checked out and moved to the vault are marked in the volhist file as STGDELETE at some point and are essentially orphaned. They are never marked in the system as vaultretrieve to be used as scratch again. For this, we have just been manually comparing the volhist file with the physical tapes in the vault, to see which ones we can reuse as scratch. However, with the library being near capacity, we can never catch up to what the system is needing.
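The manual comparison described above could be scripted roughly like this. This is only a sketch: it assumes the volume history has already been reduced to simple (volume, entry type) pairs in chronological order, which is not the actual volhist file format, and all volume names are hypothetical. Any tape it lists should still be verified on the server before being reused.

```python
def reuse_candidates(vault_tapes, volhist_records):
    """List vault tapes whose most recent volume history entry is STGDELETE.

    vault_tapes: volume names physically found in the vault.
    volhist_records: (volume, entry_type) pairs, assumed chronological.
    Tapes with no history entry at all are NOT reported, since their
    status is unknown.
    """
    last_type = {}
    for volume, entry_type in volhist_records:
        last_type[volume] = entry_type  # later entries overwrite earlier ones
    return sorted(v for v in set(vault_tapes) if last_type.get(v) == "STGDELETE")


# Hypothetical inventory of tapes sitting in the vault
vault_tapes = ["A00010", "A00011", "A00012", "A00013", "A00015"]

# Hypothetical, simplified volume history entries
volhist_records = [
    ("A00010", "STGNEW"),
    ("A00011", "STGNEW"),
    ("A00012", "DBBACKUP"),
    ("A00013", "STGNEW"),
    ("A00010", "STGDELETE"),
    ("A00014", "STGNEW"),
    ("A00014", "STGDELETE"),  # deleted, but not in the vault
    ("A00013", "STGDELETE"),
]

candidates = reuse_candidates(vault_tapes, volhist_records)
print(candidates)
```

Only tapes that are both in the vault and last recorded as STGDELETE are reported; A00015 has no history at all, so it is left for a manual check.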

So with all that said, is there any more specific info that you might need to look at in order to help me determine what is going on?
 
Hi,
Some points that may help you out.

If you have spare drive resources, you could lower the reclamation threshold to 50%.

Look out for tapes that go read-only with low occupancy. Move that data off them and get rid of the bad tapes.

Make sure you back up only what you need. Exclude *:\...\tmp\ *:\...\temp and so on.

Check filespaces for duplicates. This happens in cluster configs, where both nodes back up the same data instead of having a cluster node hold the data.

Instead of expanding the library, maybe replace the LTO5 drives with LTO7 (or LTO6) drives, and the tapes of course.

Can you keep data on disk for a longer period? One TB of disk may release two tapes to scratch.
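The rule of thumb above is simple arithmetic, sketched here in Python. The 0.5 TB average of data per tape is just the figure implied by "one TB of disk, two tapes"; it is an assumption, not a measurement, so substitute the real average occupancy of your own volumes.

```python
import math


def tapes_freed(extra_disk_tb, avg_data_per_tape_tb):
    """Estimate how many tapes return to scratch if data stays on disk.

    extra_disk_tb: additional data (in TB) kept on disk instead of tape.
    avg_data_per_tape_tb: assumed average amount of data per tape (in TB).
    """
    if avg_data_per_tape_tb <= 0:
        raise ValueError("average data per tape must be positive")
    return math.floor(extra_disk_tb / avg_data_per_tape_tb)


one_tb = tapes_freed(1.0, 0.5)
four_tb = tapes_freed(4.0, 0.5)

print(one_tb)
print(four_tb)
```

If the tapes are on average fuller than assumed, the same amount of disk frees fewer of them.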


-= Trident =-
 
We also once had the problem that expired tapes no longer appeared in vault retrieve. I cleared the volumehistory and since then it works again.
 
Thanks for the responses guys! I'm just now getting back to this as I've been tied up with other things. You know how that goes... Anyway, these are some good points.

@waelti - I will definitely look into clearing the volhist as I'm not sure that has ever been done... lol How often would you say that it needs to be done?

@Trident - Do you have any sort of general suggestions on how long to keep tapes in the rotation? Of course I know it has much to do with how often they are used, but are there any other factors to look at as well?

We are already excluding any type of temp files from any node backups.

We don't have a cluster config.

We definitely do have a decent amount of disk space right now, so we could possibly keep data on disk for longer. We also have 2 nodes with several TB of data each, which only back up directly to tape. So we may just end up moving at least one of the nodes over to disk storage instead.
 
One other thing I'd like to add... From what I can tell, we do have plenty of scratch tapes that can be used, but no physical space to hold the tapes the system needs to keep in the library. This is the main problem here.
 
@Trident - Do you have any sort of general suggestions on how long to keep tapes in the rotation? Of course I know it has much to do with how often they are used, but are there any other factors to look at as well?

Tape rotation is a dynamic monster. Every bad tape goes out. Read-only tapes are left for a few days, and then the remaining data is moved to fresh tapes or disk. Quite often the tapes go bad because of worn-out tape drives.

If you suspect a bad drive, then send a dump to the vendor and ask for an analysis of the drive usage.

-= Trident =-
 
We also once had the problem that expired tapes no longer appeared in vault retrieve. I cleared the volumehistory and since then it works again.


According to IBM's page here:

https://www.ibm.com/support/knowledgecenter/SSGSG7_7.1.3/srv.reference/r_cmd_volhistory_delete.html

For users of DRM (like us), the SET DRMDBBACKUPEXPIREDAYS command needs to be used instead of the DELETE VOLHISTORY command.

However, if I run the Q DRMSTATUS command, it shows: DB Backup Series Expiration Days: 6 Day(s)

So my question is... If it is only set at 6 days, then why do the volumes still never show up as scratch? I could be looking at this wrong, but it seems to me that since this is not working as desired, the DELETE VOLHISTORY command could be used anyway, and then we'd just go through and find which tapes that are in the vault are no longer in the system.

By the way, could you give me an example of the command you use? I'm a little confused on some of the available options...
 