Rman backup fails with "An invalid option keyword was found during option parsing."

newguy2489

We have an older Oracle instance (9.2.08) running on AIX 6.1 that we are backing up to TSM. The B/A client and its API have been updated to 7.1.1.4, but the Oracle TDP, matched to the DB version, is at 5.4.1.0, so we are no longer supported by Tivoli.

We have implemented a new TSM infrastructure with replication of our 2 primary TSM servers to 2 remote TSM servers, and now I am starting to see some odd backup failures related to dsm.sys file options. When we started doing the replication, we noticed lots of errors in the client error logs, all relating to an nrtable database. After some research we found that we needed to add new parms to the dsm.sys files to let each backup (B/A and Oracle TDP in this case) write its own copy of this database. The client also seems to update the dsm.sys file with replication-related info during or after each backup. We fixed these errors, but now we are getting sporadic backup failures.

On several occasions in the past few weeks, we have seen backup failures for this node, some right out of the gate, some near the time the backup normally completes, but all have made reference to invalid options in the dsm.sys file. In one case I found several replication-related parms that looked OK but were in the wrong spot in the stanza. I deleted them, and the next backup completed successfully. In another case I couldn't find anything wrong, but restoring the dsm.sys file from the previous night's backup solved the issue.

Anyone else seeing anything like this? Any thoughts or ideas are greatly appreciated.


Thanks!
 
Why did you have to update the BA client to 7.1.1.4?

Without looking further on the problem, I am guessing that the big version difference of the 7.1.1.4 API and the TDP version at 5.4.1.0 is the cause of the errors.

I would assume you had no issues before you updated to 7.1.1.4?
 
Actually, we were having some sporadic backup issues with the older clients. The upgrade to 7.1.1.4 was done many months ago and seemed to reduce the frequency of the old issues. The current issues started after we pointed each stanza to its own nrtablepath and gave the oracle account the permissions it needed in the TSM paths.
 
What "invalid options" are pointed out? Can you post?

I am attaching 3 dsm.sys files in the uploaded zip. They all have 5 stanzas, but we are only using the first 2: #1 for the B/A client and the tdp_ora stanza for RMAN. One is the current file (restored from the backup of 11-1-15), and 2 have "failure" in the name. The one dated 10-29-15 has 4 lines that appear to have correct syntax but are in the wrong place in the file (starting on line 9...).

Here is a snippet from that dsm.sys...

*** end of automatically updated options
MYREPLICATIONServer TSM03
MYPRIMARYServername TSM01
MYREPLICATIONServer TSM03
MYPRIMARYServername TSM01

I deleted those 4 lines and things were back to normal for the next night's backup. Here is the RMAN error from that failed backup (it occurred at 8 PM on 10/29/15, close to the end of the backup, which starts at 4 PM):

RMAN-03009: failure of backup command on t13 channel at 10/29/2015 20:00:47
ORA-27191: sbtinfo2 returned error
Additional information: 2
ORA-19511: Error received from media manager layer, error text:
ANS0260E (RC410) An invalid option keyword was found during option parsing.

Recovery Manager complete.
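
For anyone wanting to catch this before the backup window, here is a rough sketch of a check for the kind of repeated replication lines shown in the 10-29-15 snippet. It is only an illustration, not an IBM tool: the keyword list is taken from the snippet, the stanza names in SAMPLE are made up, and it assumes stanzas start with a SErvername line and that lines beginning with * are comments.

```python
# Hypothetical helper: flag replication keywords that appear more than once
# inside the same dsm.sys stanza. Keyword list comes from the snippet above.
REPLICATION_KEYWORDS = ("myreplicationserver", "myprimaryservername")

# Made-up sample in the shape of the failing file (stanza names are invented).
SAMPLE = """SErvername TSM01_BA
   COMMMethod TCPip
*** end of automatically updated options
MYREPLICATIONServer TSM03
MYPRIMARYServername TSM01
MYREPLICATIONServer TSM03
MYPRIMARYServername TSM01
SErvername tdp_ora
   MYPRIMARYServername TSM01
"""


def find_repeated_replication_options(text):
    """Return (stanza, keyword, line_numbers) for keywords repeated in a stanza."""
    stanza = None
    seen = {}  # (stanza, keyword) -> list of 1-based line numbers
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comment lines (assumed to start with '*').
        if not line or line.startswith("*"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        if keyword == "servername":
            # A new stanza begins; remember its name.
            stanza = parts[1].strip() if len(parts) > 1 else ""
            continue
        if keyword in REPLICATION_KEYWORDS:
            seen.setdefault((stanza, keyword), []).append(lineno)
    return [(s, k, nums) for (s, k), nums in seen.items() if len(nums) > 1]


if __name__ == "__main__":
    for stanza, keyword, lines in find_repeated_replication_options(SAMPLE):
        print(f"stanza {stanza}: {keyword} repeated on lines {lines}")
```

This only reports repeats. It does not know where the client expects these options to sit, so a clean result does not prove the file is valid.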


... we were good until 11-2, then another failure, this one right out of the gate. Here are the RMAN errors:

RMAN-03009: failure of backup command on t15 channel at 11/02/2015 16:00:12
ORA-19506: failed to create sequential file, name="database_backup_894729611_22112_1", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
ANS1217E (RC409) Server name not found in System Options File

Recovery Manager complete.

When I looked at the dsm.sys (dsm.sys.failure-11-2-15), I couldn't find anything wrong, so I made a copy of it and then restored the dsm.sys from the previous night's B/A client backup. The backup was restarted and ran successfully.
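
Since eyeballing the failed file turned up nothing, a line-by-line diff against the restored copy may show what was rewritten. This is a minimal sketch using Python's difflib; the function name and the two file labels in the diff header are made up for illustration, and it works on the file contents as text.

```python
import difflib


def diff_options(known_good_text, current_text):
    """Return unified-diff lines between a known-good dsm.sys and the current one."""
    return list(
        difflib.unified_diff(
            known_good_text.splitlines(),
            current_text.splitlines(),
            fromfile="dsm.sys.known_good",  # label only, not a real path
            tofile="dsm.sys",               # label only, not a real path
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Invented contents, just to show the output shape.
    good = "SErvername tdp_ora\n   COMMMethod TCPip\n"
    bad = good + "MYREPLICATIONServer TSM03\n"
    for line in diff_options(good, bad):
        print(line)
```

In practice the two strings would be read from the saved failure copy and the restored file. An empty list means the files match line for line, which would point away from dsm.sys contents as the cause.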
 

Attachment: dsm.sys.zip (3 KB)

For the above errors, have you looked into the value of "maxnummp" set for the node? The number might not be enough for the channel allocations. As an example, I set node maxnummp=10 for channel allocation of 6.

I am still trying to get my bearings for the other errors. Back later ....
 
maxnummp is set to 10 for this node, so it shouldn't be running out of mount points...

Well, the 4 PM backup died again, right out of the gate again:

RMAN-03009: failure of backup command on t7 channel at 11/03/2015 16:00:11
ORA-19506: failed to create sequential file, name="database_backup_894816009_22140_1", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
ANS1217E (RC409) Server name not found in System Options File

Recovery Manager complete.

Restoring the dsm.sys file from last night (11-2) again fixed it!

Here are the 2 dsm.sys files...
 

Attachment: dsm.zip (2 KB)
*** end of automatically updated options
MYREPLICATIONServer TSM03
MYPRIMARYServername TSM01
MYREPLICATIONServer TSM03
MYPRIMARYServername TSM01

What is the full version of the TSM Server that you are running? It cannot be TSM Server 5.x.
The above lines are for node replication, and they are populated automatically in the dsm.sys.

FYI: The TDP is not dependent on the version of the TSM Server.
It's the compatibility of the TSM API version with the TSM Server that you should be concerned with.

http://www-01.ibm.com/support/docview.wss?uid=swg21053218

What are the messages in the TSM Server activity log during the backup via the TDP?

restoring the dsm.sys file from last night (11-2) again fixed it!

Was the dsm.sys file restored from a backup, or copied from another location?
Or did you recreate the dsm.sys file from the original dsm.sys file?

When the dsm.sys file was restored, did you stop and restart the scheduler daemon?

Are there any messages in the following logs for the time frame when the backup failed?

dsmerror.log
dsierror.log
tdpoerror.log

Good Luck,
Sias
 
Hi all...

TSM is 7.1.1.300 running on Windows server 2012.

I have attached a snippet of the actlog for last night's failure. It started and failed immediately (16:00).
Most of the messages are session starts and ends, plus some Oracle cleanup attempts (ANU0599 ANU2602E).

The dsm.sys file was a TSM restore of the last dsm.sys backed up on the system (from the previous night). I did NOT restart dsmcad after the restore, but the RMAN backup completed successfully around 8:11 PM. When it finished, it rewrote the dsm.sys file (the timestamp shows it was changed at 8:10 PM on 11-3-15) and messed up the 1st stanza, so the 22:00 B/A backup failed (it actually never started). That really messed-up dsm.sys is also attached.

I also included an RTF file with the info from the 3 logs you asked about. There is no dsierror.log, and nothing new in tdpoerror.log. The dsmerror.log info in that document is from the BA stanza; I attached the dsmerror_ora.log from the tdp_ora stanza. It basically started with
11/03/15 16:00:09 ANS1217E Server name not found in System Options File
repeated 3 times
11/03/15 16:00:11 ANS1303E Client ended transaction
repeated numerous times
then the Oracle cleanup attempts
11/03/15 16:00:12 ANS4994S TDP Oracle AIX ANU0599 ANU2602E The object /default//database_backup_894816008_22134_1 was not found on the TSM Server
11/03/15 16:00:12 ANS4994S TDP Oracle AIX ANU0599 ANU2602E The object /default//database_backup_894816008_22135_1 was not found on the TSM Server
then finally errors early this morning when the node tried to run 2 small backups, all relating to invalid keywords in dsm.sys, so the 2nd stanza must be messed up too.
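
With messages repeating "numerous times", a per-code tally can make an error log easier to read. The sketch below is only an illustration and assumes every line of interest looks like the ones quoted above (date, time, then an ANS code); the function name and regex are not from any IBM tool.

```python
import re
from collections import Counter

# Assumed layout, based on the quoted lines: "MM/DD/YY HH:MM:SS ANSnnnnX message"
LOG_LINE = re.compile(r"^(\d{2}/\d{2}/\d{2}) (\d{2}:\d{2}:\d{2}) (ANS\d{4}[A-Z])\b")


def count_message_codes(log_text):
    """Count how often each ANS message code appears at the start of a log line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            counts[match.group(3)] += 1
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "11/03/15 16:00:09 ANS1217E Server name not found in System Options File",
        "11/03/15 16:00:11 ANS1303E Client ended transaction",
    ])
    for code, n in count_message_codes(sample).most_common():
        print(code, n)
```

Lines that do not match the assumed layout, such as continuation lines, are skipped silently, so the totals are a lower bound.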
 

Attachment: failures-11-3-15.zip (6.5 KB)
newguy2489:

Do you have OPT files for the TDP for Oracle setup?

dsm.opt? Yes, there are 2 separate dsm.opt files pointing to the tdp_ora stanza, one each in the api/bin64 and oracle/bin64 directories. Each directory also has a link to the dsm.sys file in the ba/bin64 path.
 
Followup:

After several days of failing or missing backups, I had to do something, but I really didn't have any good ideas to try. Then I realized that the main issue seems to be that the dsm.sys file is getting rewritten incorrectly, and if I can't get a fix for the clients to rewrite it correctly, maybe I should not let them rewrite it at all. So I removed the Oracle account's permission to write to the dsm.sys file by chmod'ing it from 777 to 644, and all my backups since have worked fine. But now I get hundreds of the following errors in the RMAN stanza's error log:

11/06/15 07:01:14 ANS4058I A write failure occurred while attempting to save node replication failover values to the options file.

Certainly not ideal, but I guess it is easier to clean up the error logs every week or so than to have backups failing on a daily basis.
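
Because the workaround depends on the file staying at 644, a small check that the mode has not drifted back could be run before the backup window. This is a sketch only; the function name is hypothetical and the modes tested are the two mentioned above.

```python
import stat


def is_group_or_world_writable(mode):
    """Return True if the permission bits allow group or other users to write."""
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


if __name__ == "__main__":
    # 0o777 was the old mode, 0o644 is the workaround mode.
    for mode in (0o777, 0o644):
        print(oct(mode), is_group_or_world_writable(mode))
```

For a real file, the mode would come from os.stat(path).st_mode. Note that 644 still lets the file's owner write, so this only helps if the Oracle account is not the owner.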

Thanks for everyone's input
 
This looks like a bug that may need an APAR.

Did you talk with IBM about this?

Your workaround looks good but, as you said, not ideal. Have you tried later versions?
 
Unfortunately, we are out of support because the Oracle TDP is at 5.4.x due to the old Oracle version.
We have heard rumors of the system retiring, and 2 of its QA regions have been torn down, but we don't retire prod systems here very quickly, so it will probably still be here in a year.
I thought of trying other versions, but I figure the odds are pretty good that I would break it entirely, and it is a backup that gets pretty high visibility when it fails. So for now I think I will stay with my less-than-ideal workaround.

Thanks again.
 