Dropped in my lap - please help! All jobs failed.

Willum
Active Newcomer - joined Sep 26, 2013 - Albany, Oregon
Hello. I am very new to Tivoli - actually it was dropped in my lap July 1 when the admin left! The system has been in place and working for several years.
The documentation he left covers how to remove a bad tape from the library, how to clean the drive, how to run the report that shows which tapes are needed for DR, how to add a scratch tape, and how to move a tape to the offsite storage. He did not show me the Health Monitor, the activity log, or where to see if the backup jobs actually worked!
I was told to check whether there was a tape in the I/O slot of the Dell PowerVault TL4000 tape library and, if so, to follow the procedure to remove it. I was also supposed to add scratch tapes when needed.
I got concerned when the weekly report stopped changing which tapes are needed for DR. I was also concerned that it did not take any new scratch tapes. I was told that it sometimes takes weeks to add one.

Problem - all backup jobs fail. The TSM data volume (4.58 TB) is full. My guess is that it is not sending the data from the disk to the tapes.
I found how to check how many scratch tapes are available and used, and actually added a tape to one of the storage pools. After that it said it was full.

Looking at the Activity Log of the server I see -
ANR1405W Scratch volume mount request denied - no scratch volume available.
ANR1086E Space reclamation is ended for volume 000112L5. There is insufficient space in storage pool. (PROCESS: 883)
ANR1163W Offsite volume 000028L5 still contains files which could not be moved. (PROCESS: 883) - that tape is actually over in the safe! Why is it looking at that?

Sorry for the length. TSM version 6.2 on a Windows Server 2008 R2 Standard edition. Like I mentioned this had been working just fine (as far as I know) for years. These problems only started recently.

If I just deleted the 4.5 TB of files, somehow told it to inventory or just let the jobs run, would I be OK? Thank you. //Bill
 
First of all, sorry for this being dropped in your lap.

Secondly, DON'T just delete the 4.5TB of files. We need more info to help you.

Post the output of 'q stgpool f=d', and 'q libvol'
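
If the output is too long to paste from the admin console, you can also capture it to a file from a command prompt on the TSM server with something like the lines below (substitute your own admin ID and password - the ones here are just placeholders):

dsmadmc -id=admin -password=xxxxx -outfile=stgpool.txt "q stgpool f=d"
dsmadmc -id=admin -password=xxxxx -outfile=libvol.txt "q libvol"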
 
Thanks moon-buddy! Looks like there is a post size limit, so here is the q libvol first.

Library Name Volume Name Status Owner Last Use Home Element Device Type
LB4.1.0.2 0%9073L5 Private 4,098
LB4.1.0.2 00$058L5 Private 4,100
LB4.1.0.2 00$066L5 Private 4,124
LB4.1.0.2 000003L5 Private 4,127
LB4.1.0.2 000007L5 Private Data 4,112
LB4.1.0.2 000010L5 Private Data 4,122
LB4.1.0.2 000011L5 Private Data 4,135
LB4.1.0.2 000015L5 Private Data 4,109
LB4.1.0.2 000018L5 Private 4,134
LB4.1.0.2 000022L5 Private Data 4,125
LB4.1.0.2 000026L5 Private Data 4,129
LB4.1.0.2 000027L5 Private Data 4,130
LB4.1.0.2 000036L5 Private Data 4,115
LB4.1.0.2 000040L5 Private Data 4,138
LB4.1.0.2 000043L5 Private 4,103
LB4.1.0.2 000046L5 Private Data 4,096
LB4.1.0.2 000052L5 Private Data 4,107
LB4.1.0.2 000053L5 Private Data 4,132
LB4.1.0.2 000054L5 Private Data 4,111
LB4.1.0.2 000056L5 Private Data 4,108
LB4.1.0.2 000061L5 Private Data 4,126
LB4.1.0.2 000062L5 Private Data 4,110
LB4.1.0.2 000071L5 Private Data 4,136
LB4.1.0.2 000075L5 Private Data 4,119
LB4.1.0.2 000076L5 Private Data 4,139
LB4.1.0.2 000082L5 Private Data 4,097
LB4.1.0.2 000094L5 Private Data 4,101
LB4.1.0.2 000100L5 Private 4,106
LB4.1.0.2 000101L5 Private 4,105
LB4.1.0.2 000102L5 Private 4,114
LB4.1.0.2 000103L5 Private 4,118
LB4.1.0.2 000104L5 Private 4,120
LB4.1.0.2 000106L5 Private Data 4,128
LB4.1.0.2 000108L5 Private 4,140
LB4.1.0.2 000112L5 Private Data 4,131
LB4.1.0.2 000116L5 Private Data 4,104
LB4.1.0.2 000117L5 Private 4,102
LB4.1.0.2 000118L5 Private Data 4,117
LB4.1.0.2 000120L5 Private 4,113
LB4.1.0.2 000121L5 Private 4,116
LB4.1.0.2 000124L5 Private Data 4,121
LB4.1.0.2 000125L5 Private Data 4,123
LB4.1.0.2 000127L5 Private 4,137
LB4.1.0.2 000128L5 Private 4,099
LB4.1.0.2 000129L5 Private 4,133

Thanks again. //Bill
 
Here is the q stgpool f=d (part 1 - the output is too big for one post)

Storage Pool Name : ARCHIVEPOOL
Storage Pool Type : Primary
Device Class Name : DISK
Estimated Capacity : 0 M
Space Trigger Util : 0
Pct Util : 0
Pct Migr : 0
Pct Logical : 100
High Mig Pct : 90
Low Mig Pct : 70
Migration Delay : 0
Migration Continue : Yes
Migration Processes : 1
Reclamation Processes :
Next Storage Pool :
Reclaim Storage Pool :
Maximum Size Threshold : No Limit
Access : Read/Write
Description :
Overflow Location :
Cache Migrated Files? : No
Collocate? :
Reclamation Threshold :
Offsite Reclamation Limit :
Maximum Scratch Volumes Allowed :
Number of Scratch Volumes Used :
Delay Period for Volume Reuse :
Migration in Progress? : No
Amount Migrated (MB) : 0
Elapsed Migration Time (seconds) : 0
Reclamation in Progress? :
Last Update by (administrator) : SERVER_CONSOLE
Last Update Date/Time : 1/21/11 8:11:25 PM PST
Storage Pool Data Format : Native
Copy Storage Pool(s) :
Active Data Pool(s) :
Continue Copy on Error? : Yes
CRC Data : No
Reclamation Type :
Overwrite Data when Deleted :
Deduplicate Data? : No
Processes For Identifying Duplicates :
Duplicate Data Not Stored :
Auto-copy Mode : Client
Contains Data Deduplicated by Client? : No

Storage Pool Name : BACKUPPOOL
Storage Pool Type : Primary
Device Class Name : DISK
Estimated Capacity : 0 M
Space Trigger Util : 0
Pct Util : 0
Pct Migr : 0
Pct Logical : 100
High Mig Pct : 90
Low Mig Pct : 70
Migration Delay : 0
Migration Continue : Yes
Migration Processes : 1
Reclamation Processes :
Next Storage Pool :
Reclaim Storage Pool :
Maximum Size Threshold : No Limit
Access : Read/Write
Description :
Overflow Location :
Cache Migrated Files? : No
Collocate? :
Reclamation Threshold :
Offsite Reclamation Limit :
Maximum Scratch Volumes Allowed :
Number of Scratch Volumes Used :
Delay Period for Volume Reuse :
Migration in Progress? : No
Amount Migrated (MB) : 0
Elapsed Migration Time (seconds) : 0
Reclamation in Progress? :
Last Update by (administrator) : SERVER_CONSOLE
Last Update Date/Time : 1/21/11 8:11:25 PM PST
Storage Pool Data Format : Native
Copy Storage Pool(s) :
Active Data Pool(s) :
Continue Copy on Error? : Yes
CRC Data : No
Reclamation Type :
Overwrite Data when Deleted :
Deduplicate Data? : No
Processes For Identifying Duplicates :
Duplicate Data Not Stored :
Auto-copy Mode : Client
Contains Data Deduplicated by Client? : No
 
q stgpool f=d (part 2)

Storage Pool Name : LTO_BACKUP_POOL
Storage Pool Type : Primary
Device Class Name : LTO5
Estimated Capacity : 123,971 G
Space Trigger Util :
Pct Util : 4.1
Pct Migr : 62.5
Pct Logical : 99.3
High Mig Pct : 90
Low Mig Pct : 70
Migration Delay : 0
Migration Continue : Yes
Migration Processes : 1
Reclamation Processes : 1
Next Storage Pool :
Reclaim Storage Pool :
Maximum Size Threshold : No Limit
Access : Read/Write
Description : LTO_BACKUP_POOL
Overflow Location :
Cache Migrated Files? :
Collocate? : No
Reclamation Threshold : 95
Offsite Reclamation Limit :
Maximum Scratch Volumes Allowed : 48
Number of Scratch Volumes Used : 30
Delay Period for Volume Reuse : 0 Day(s)
Migration in Progress? : No
Amount Migrated (MB) : 0
Elapsed Migration Time (seconds) : 0
Reclamation in Progress? : No
Last Update by (administrator) : ROOT
Last Update Date/Time : 9/24/13 8:22:46 AM PDT
Storage Pool Data Format : Native
Copy Storage Pool(s) :
Active Data Pool(s) :
Continue Copy on Error? : Yes
CRC Data : No
Reclamation Type : Threshold
Overwrite Data when Deleted :
Deduplicate Data? : No
Processes For Identifying Duplicates :
Duplicate Data Not Stored :
Auto-copy Mode : Client
Contains Data Deduplicated by Client? : No

Storage Pool Name : LTO_COPYSTORAGE
Storage Pool Type : Copy
Device Class Name : LTO5
Estimated Capacity : 92,319 G
Space Trigger Util :
Pct Util : 5.3
Pct Migr :
Pct Logical : 99.5
High Mig Pct :
Low Mig Pct :
Migration Delay :
Migration Continue :
Migration Processes :
Reclamation Processes : 1
Next Storage Pool :
Reclaim Storage Pool :
Maximum Size Threshold :
Access : Read/Write
Description :
Overflow Location :
Cache Migrated Files? :
Collocate? : No
Reclamation Threshold : 50
Offsite Reclamation Limit : No Limit
Maximum Scratch Volumes Allowed : 45
Number of Scratch Volumes Used : 13
Delay Period for Volume Reuse : 10 Day(s)
Migration in Progress? :
Amount Migrated (MB) :
Elapsed Migration Time (seconds) :
Reclamation in Progress? : No
Last Update by (administrator) : ROOT
Last Update Date/Time : 7/23/12 11:23:44 AM PDT
Storage Pool Data Format : Native
Copy Storage Pool(s) :
Active Data Pool(s) :
Continue Copy on Error? :
CRC Data : No
Reclamation Type : Threshold
Overwrite Data when Deleted :
Deduplicate Data? : No
Processes For Identifying Duplicates :
Duplicate Data Not Stored :
Auto-copy Mode :
Contains Data Deduplicated by Client? : No

Storage Pool Name : PRIMARY_BACKUP_POOL
Storage Pool Type : Primary
Device Class Name : DISK
Estimated Capacity : 51 G
Space Trigger Util : 100
Pct Util : 100
Pct Migr : 100
Pct Logical : 100
High Mig Pct : 90
Low Mig Pct : 10
Migration Delay : 0
Migration Continue : Yes
Migration Processes : 10
Reclamation Processes :
Next Storage Pool : SECONDARY_BACKUP_POOL
Reclaim Storage Pool :
Maximum Size Threshold : No Limit
Access : Read/Write
Description :
Overflow Location :
Cache Migrated Files? : No
Collocate? :
Reclamation Threshold :
Offsite Reclamation Limit :
Maximum Scratch Volumes Allowed :
Number of Scratch Volumes Used :
Delay Period for Volume Reuse :
Migration in Progress? : No
Amount Migrated (MB) : 0
Elapsed Migration Time (seconds) : 1
Reclamation in Progress? :
Last Update by (administrator) : ROOT
Last Update Date/Time : 11/1/11 1:31:10 PM PDT
Storage Pool Data Format : Native
Copy Storage Pool(s) :
Active Data Pool(s) :
Continue Copy on Error? : Yes
CRC Data : No
Reclamation Type :
Overwrite Data when Deleted :
Deduplicate Data? : No
Processes For Identifying Duplicates :
Duplicate Data Not Stored :
Auto-copy Mode : Client
Contains Data Deduplicated by Client? : No

Storage Pool Name : SECONDARY_BACKUP_POOL
Storage Pool Type : Primary
Device Class Name : FILE
Estimated Capacity : 4,328 G
Space Trigger Util : 89.8
Pct Util : 100
Pct Migr : 100
Pct Logical : 99.9
High Mig Pct : 90
Low Mig Pct : 70
Migration Delay : 0
Migration Continue : Yes
Migration Processes : 2
Reclamation Processes : 1
Next Storage Pool : LTO_BACKUP_POOL
Reclaim Storage Pool :
Maximum Size Threshold : No Limit
Access : Read/Write
Description :
Overflow Location :
Cache Migrated Files? :
Collocate? : No
Reclamation Threshold : 50
Offsite Reclamation Limit :
Maximum Scratch Volumes Allowed : 200
Number of Scratch Volumes Used : 189
Delay Period for Volume Reuse : 0 Day(s)
Migration in Progress? : No
Amount Migrated (MB) : 0
Elapsed Migration Time (seconds) : 70
Reclamation in Progress? : No
Last Update by (administrator) : ROOT
Last Update Date/Time : 9/20/13 3:05:43 PM PDT
Storage Pool Data Format : Native
Copy Storage Pool(s) :
Active Data Pool(s) :
Continue Copy on Error? : Yes
CRC Data : No
Reclamation Type : Threshold
Overwrite Data when Deleted :
Deduplicate Data? : No
Processes For Identifying Duplicates :
Duplicate Data Not Stored :
Auto-copy Mode : Client
Contains Data Deduplicated by Client? : No
 
q stgpool f=d (part 3)

Storage Pool Name : SPACEMGPOOL
Storage Pool Type : Primary
Device Class Name : DISK
Estimated Capacity : 0 M
Space Trigger Util : 0
Pct Util : 0
Pct Migr : 0
Pct Logical : 100
High Mig Pct : 90
Low Mig Pct : 70
Migration Delay : 0
Migration Continue : Yes
Migration Processes : 1
Reclamation Processes :
Next Storage Pool :
Reclaim Storage Pool :
Maximum Size Threshold : No Limit
Access : Read/Write
Description :
Overflow Location :
Cache Migrated Files? : No
Collocate? :
Reclamation Threshold :
Offsite Reclamation Limit :
Maximum Scratch Volumes Allowed :
Number of Scratch Volumes Used :
Delay Period for Volume Reuse :
Migration in Progress? : No
Amount Migrated (MB) : 0
Elapsed Migration Time (seconds) : 0
Reclamation in Progress? :
Last Update by (administrator) : SERVER_CONSOLE
Last Update Date/Time : 1/21/11 8:11:25 PM PST
Storage Pool Data Format : Native
Copy Storage Pool(s) :
Active Data Pool(s) :
Continue Copy on Error? : Yes
CRC Data : No
Reclamation Type :
Overwrite Data when Deleted :
Deduplicate Data? : No
Processes For Identifying Duplicates :
Duplicate Data Not Stored :
Auto-copy Mode : Client
Contains Data Deduplicated by Client? : No
 
OK, looks like your offsite copy tape pool is: Storage Pool Name : LTO_COPYSTORAGE

You have one strange library name: LB4.1.0.2

Based on what you mentioned, it looks like you don't have any scratch tapes. Do you have any scratch tapes to check in? If so, check in as many as you can by loading the tapes into the I/O and issuing this command AFTER loading the tapes. Repeat the process until you reach the desired number of scratch tapes:

checkin libvol LB4.1.0.2 search=bulk checkl=barcode status-scratch waitt=0

You should see the library 'eat' the tapes, i.e., move them from the I/O to the slots.
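
To confirm they went in, you can run the query below again afterwards - the newly checked-in tapes should show a Status of Scratch:

q libvol LB4.1.0.2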

When you have checked in the desired number of tapes, issue the following command to move the backed-up data from the disk pool to the online tape pool:

migrate stgp SECONDARY_BACKUP_POOL lo=0

This will move the data from storage pool SECONDARY_BACKUP_POOL to the primary tape pool, LTO_BACKUP_POOL. Note that this may run for a while. However, the nodes will start backing up again once the primary storage frees up.
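
You can keep an eye on the migration with the two queries below - 'q proc' shows the running process, and the Pct Util on the disk pool should start dropping as data lands on tape:

q proc
q stgpool SECONDARY_BACKUP_POOL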

When all is done, you can do a copy of LTO_BACKUP_POOL to LTO_COPYSTORAGE by issuing the command:

backup stgpool LTO_BACKUP_POOL LTO_COPYSTORAGE

You may need to add scratch tapes before running the command above. Again, the copy cycle may run for a long time.
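
A quick way to see how many scratch tapes the library has at any point is the select below (assuming I'm remembering the column values right - if it returns nothing, just eyeball the Status column of 'q libvol' instead):

select count(*) from libvolumes where status='Scratch'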
 
There are actually 2 tapes sitting in the I/O, and 4 of the 45 slots are empty. Not sure how many need to stay empty either. I will issue the commands and get back to you. Thank you very much. //Bill
 
From the command line I typed (actually copied from your post)
checkin libvol LB4.1.0.2 search=bulk checkl=barcode status-scratch waitt=0

This is what came back.
ANR2004E Missing value for keyword parameter - status-scratch.

OH - clicking on Storage Devices, then the server name: LB4.1.0.2 is listed with a Library Name Status of Good. From there I clicked on Volumes and it shows 45 tapes. Thanks again.
 
I looked at the line and changed it a bit - I changed the - to an = so it reads status=scratch. It is running now. Thank you. //Bill
checkin libvol LB4.1.0.2 search=bulk checkl=barcode status=scratch waitt=0
 
Here is part of the activity log. When I started the migrate stgp command it was session 9355 and process 934. From the log:
2013-09-26-14.40.25 ANR1405W
Scratch volume mount request denied - no scratch volume available. (SESSION: 9355, PROCESS: 934)

2013-09-26-14.40.25 ANR1126E
Migration is ended for volume Y:\TSM\SECONDARY_DISK_POOL\000258EE.BFS. There is insufficient space in subordinate storage pools. (SESSION: 9355, PROCESS: 934)

2013-09-26-14.40.25 ANR1025W
Migration process 934 terminated for storage pool SECONDARY_BACKUP_POOL - insufficient space in subordinate storage pool. (SESSION: 9355, PROCESS: 934)

Before that, it told me the library was full and did not take in either of the scratch tapes. Maybe it needs the 4 empty slots? Thanks //Bill
 
Yes. Thank you.

q drm

Volume Name State Last Update Date/Time Automated LibName
000028L5 Vault 3/2/12 1:11:26 PM PST
000029L5 Vault 7/2/13 3:30:46 AM PDT
000032L5 Vault 2/6/12 10:52:28 AM PST
000042L5 Vault 1/22/13 8:52:13 AM PST
000050L5 Vault 7/18/13 3:50:23 PM PDT
000070L5 Vault 12/5/11 12:24:00 PM PST
000079L5 Vault 7/25/13 3:19:37 AM PDT
000083L5 Vault 1/22/13 8:52:13 AM PST
000098L5 Vault 6/28/13 1:47:02 PM PDT
000107L5 Vault 6/28/13 1:47:02 PM PDT
000114L5 Vault 6/20/13 7:10:50 PM PDT
000115L5 Vault 6/20/13 7:10:50 PM PDT
000001L5 Vault retrieve 7/23/13 3:42:36 AM PDT
000055L5 Vault retrieve 7/20/13 3:41:28 AM PDT
000097L5 Vault retrieve 7/18/13 3:50:47 PM PDT
000078L5 Vault retrieve 7/16/13 4:54:27 AM PDT
000030L5 Vault retrieve 7/12/13 3:41:03 AM PDT
000065L5 Vault retrieve 7/10/13 3:41:36 AM PDT
000002L5 Vault retrieve 7/9/13 5:33:15 AM PDT
000059L5 Vault retrieve 7/6/13 3:41:23 AM PDT
000044L5 Vault retrieve 7/4/13 3:39:53 AM PDT
000019L5 Vault retrieve 7/3/13 3:43:22 AM PDT
000038L5 Vault retrieve 7/2/13 3:32:01 AM PDT
000048L5 Vault retrieve 7/1/13 7:25:04 AM PDT
000099L5 Vault retrieve 7/1/13 7:25:04 AM PDT
000005L5 Vault retrieve 7/1/13 7:24:07 AM PDT
000060L5 Vault retrieve 6/25/13 8:41:10 AM PDT
000113L5 Vault retrieve 6/20/13 7:10:50 PM PDT
000105L5 Vault retrieve 6/13/13 3:00:22 AM PDT
000095L5 Vault retrieve 6/13/13 3:00:22 AM PDT
000014L5 Vault retrieve 6/13/13 3:00:26 AM PDT
000045L5 Vault retrieve 6/21/13 1:04:03 PM PDT
000013L5 Vault retrieve 6/21/13 1:04:03 PM PDT
000025L5 Vault retrieve 6/21/13 1:04:03 PM PDT
 
Do you see any free slots in the library? I was hoping to see some tapes in mountable status when you issued the 'q drm' command.

If you see free slots, check in some more scratch tapes. The ones that say 'vaultretrieve' are scratch tapes
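
One thing to watch out for: DRM may still have those 'Vault retrieve' volumes tracked as copy pool volumes, in which case the server will refuse to check them in as scratch. If that happens, something along these lines (off the top of my head - check the state names with 'q drmedia' first) should release the empty ones so the checkin works:

move drmedia * wherestate=vaultretrieve tostate=onsiteretrieve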
 
The tape library shows 4 empty slots. I got tapes 1L5, 2L5 and 5L5 from the safe and put them in the I/O slots of the library.

Then from Storage Devices, I added the volumes, selecting 'All of the volumes are labeled', 'Just check them in', and 'Do not prompt to insert tape volumes'.

It said it was running Process 1486.

I opened the activity log, found Process 1486 - Not good news I'm afraid.

2013-09-27-10.52.08 ANR8443E
CHECKIN LIBVOLUME: Volume 000001L5 in library LB4.1.0.2 cannot be assigned a status of SCRATCH. (SESSION: 13142, PROCESS: 1486)
2013-09-27-10.52.08 ANR8443E
CHECKIN LIBVOLUME: Volume 000002L5 in library LB4.1.0.2 cannot be assigned a status of SCRATCH. (SESSION: 13142, PROCESS: 1486)
2013-09-27-10.52.08 ANR8443E
CHECKIN LIBVOLUME: Volume 000005L5 in library LB4.1.0.2 cannot be assigned a status of SCRATCH. (SESSION: 13142, PROCESS: 1486)

The help said to mark them as Private?
I think the system is truly in trouble! Thank you for helping. //Bill
 
Just saw something new in the activity log, and this is the error - ANR2968E - and under the user response - I'm sorry, but I don't understand. This was all working for years. Nothing has been added, and the other admin, the guy that just left, did not know much more than what he gave to me. Oh sigh . . .

If the message indicates DB2 sqlcode 2033, then the problem is probably the Tivoli Storage Manager API configuration. The DB2 instance uses the Tivoli Storage Manager API to copy the Tivoli Storage Manager database-backup image to the Tivoli Storage Manager server-attached storage devices. Common sqlerrmc codes include:
1. 50 - To determine whether an error created the API timeout condition, look for any Tivoli Storage Manager server messages that occurred during the database-backup process and that were issued before ANR2968E. Ensure that the Tivoli Storage Manager API options file has the correct TCPSERVERADDR and TCPPORT options specified for the Tivoli Storage Manager server being backed up. If the option settings are wrong, correct them for the DB2 instance.
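
I gather the two options it is talking about would look something like the lines below in an options file (these are just example values - I do not know the right address or port for this server, or where that file lives on this box):

TCPSERVERADDR 127.0.0.1
TCPPORT 1500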
 