RMAN backups move no data

sandragon

ADSM.ORG Member
#1
OS: RHEL 6.9
TSM client: 8.1.0
TDP client: 8.1.0
TSM Server: 8.1.1

A recently built and configured Oracle instance does not complete backups. It will sometimes push data, but otherwise behaves very similarly to vilius.m's issue here: https://adsm.org/forum/index.php?threads/rman-archive-logs-stuck.31401/

Code:
tail -f level1_201707283707.log
Starting backup at 28-JUL-2017 11:37:12
using channel ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: starting incremental level 1 datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
input datafile file number=00048 name=/data01/DEV/PRODdtat_01.dbf
input datafile file number=00040 name=/data01/DEV/dd812t_02.dbf
input datafile file number=00026 name=/data01/DEV/histdtat_01.dbf
input datafile file number=00033 name=/data01/DEV/syapp1i_01.dbf
input datafile file number=00020 name=/data01/DEV/arcdta_01.dbf
channel ORA_SBT_TAPE_1: starting piece 1 at 28-JUL-2017 11:37:12
TDPO logs only show the following:
Code:
 tail dsmerror.log
07/27/2017 04:06:01 ANS4992W TDPO Linux86-64 ANU0599 TDP for Oracle: (12048): =>() ANU2604W The object /adsmorc/ /LVL1_2osabifc_1_1 was not found on the IBM Spectrum Protect Server
07/27/2017 04:06:07 ANS1909E The scheduled command failed.
07/27/2017 04:06:07 ANS1512E Scheduled event 'ORACLE_*******_INCR' failed.  Return code = 1.
07/28/2017 00:20:35 ANS0361I DIAG: sessSendVerb: Error sending Verb, rc: -50
07/28/2017 00:20:35 ANS1017E Session rejected: TCP/IP connection failure.
07/28/2017 00:20:35 ANS1017E Session rejected: TCP/IP connection failure.
07/28/2017 00:20:35 ANS0361I DIAG: sessSendVerb: Error sending Verb, rc: -50
07/28/2017 00:20:41 ANS4992W TDPO Linux86-64 ANU0599 TDP for Oracle: (12111): =>() ANU2604W The object /adsmorc/ /LVL1_4asae6ri_1_1 was not found on the IBM Spectrum Protect Server
07/28/2017 00:20:51 ANS1909E The scheduled command failed.
07/28/2017 00:20:51 ANS1512E Scheduled event 'ORACLE_*******_INCR' failed.  Return code = 1.
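When the log is long, a quick tally of message IDs per day can make it easier to see whether the ANS1017E connection failures line up with the failed schedule events. Here is a minimal Python sketch, assuming each line starts with an MM/DD/YYYY HH:MM:SS timestamp followed by an ANS message ID, as in the excerpt above. The function name `tally_messages` is made up for illustration.

```python
import re
from collections import Counter

# Timestamp, then the first ANS message ID with its severity suffix (I/W/E).
# Assumption: this layout matches the dsmerror.log excerpt in this thread.
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4}) \d{2}:\d{2}:\d{2} (ANS\d{4}[IWE])")


def tally_messages(lines):
    """Count (date, message ID) pairs in dsmerror.log-style lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Sample lines copied from the log excerpt above.
sample = [
    "07/28/2017 00:20:35 ANS0361I DIAG: sessSendVerb: Error sending Verb, rc: -50",
    "07/28/2017 00:20:35 ANS1017E Session rejected: TCP/IP connection failure.",
    "07/28/2017 00:20:35 ANS1017E Session rejected: TCP/IP connection failure.",
    "07/28/2017 00:20:51 ANS1909E The scheduled command failed.",
]

for (date, msg_id), count in sorted(tally_messages(sample).items()):
    print(date, msg_id, count)
```

To run it against a real log, pass an open file object instead of `sample`.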
The TSM Server shows the following in the act log:
Code:
188,498 Tcp/Ip RecvW 50.817 M 1.268 K 2.001 M Node TDPO Linux86-64 *******_ORACLE
The wait time is at 50 minutes and it has pushed 2 MB.
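For anyone who wants to watch such a session over time, the line above can be split into fields. This is a rough Python sketch, assuming the columns are session number, comm method, state, wait time, bytes sent, bytes received, session type, platform and client name, and that every numeric value carries a unit letter. The 1024-based byte scaling and the name `parse_session_line` are assumptions of mine, not something taken from the server documentation.

```python
# Unit multipliers. Assumption: S/M/H for wait time, 1024-based K/M/G for bytes.
WAIT_SECONDS = {"S": 1, "M": 60, "H": 3600}
BYTE_SCALE = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_session_line(line):
    """Split one session line into named fields (column layout is assumed)."""
    t = line.split()
    return {
        "session": int(t[0].replace(",", "")),
        "method": t[1],
        "state": t[2],
        "wait_seconds": float(t[3]) * WAIT_SECONDS[t[4]],
        "bytes_sent": float(t[5]) * BYTE_SCALE[t[6]],
        "bytes_received": float(t[7]) * BYTE_SCALE[t[8]],
        "type": t[9],
        # The platform contains a space, so take everything up to the last token.
        "platform": " ".join(t[10:-1]),
        "client": t[-1],
    }


line = "188,498 Tcp/Ip RecvW 50.817 M 1.268 K 2.001 M Node TDPO Linux86-64 *******_ORACLE"
s = parse_session_line(line)
print(f"{s['wait_seconds'] / 60:.1f} min in {s['state']}, "
      f"{s['bytes_received'] / 1024 ** 2:.3f} MiB received")
```

A session that stays in RecvW with a growing wait time and a flat byte count matches the behavior described here: the server is waiting for the client to send data.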

Failure occurs anywhere between 20 minutes and 4 hours in.

Filesystem backups on this system work fine. Archive logs from Oracle work fine. The full works fine, but runs very slowly.

The DBAs tell me that fulls and incrementals to a disk target both complete very quickly; it's only when the TDP client is engaged that this behavior happens. Any thoughts? I'm going to get a PMR going Monday.
 

sandragon

ADSM.ORG Member
#3
Hi,
Are you using a container stgpool in this environment as the target?

http://www-01.ibm.com/support/docview.wss?crawler=1&uid=swg1IT20858

If you do a tcpdump on either server or client, do you see tcpzerowindow issues when this error occurs?

Each time I have tried to leave a tcpdump running when the backup job fails, tcpdump terminates before the failure. However, this is one of two identical systems. We set up a VM and cloned it. One clone works, the other does not (they have different node names). Our TSM server has a VLAN interface on this subnet, so traffic stays on layer 2 and no routing is involved either.

We are not using containers in this environment, they are going to standard disk pools.

I've got a PMR open with IBM, but they are fixating on the TCP/IP errors and networking, even though this only affects level 1 backups, not level 0. In theory, a networking error would occur across all backup types.
 

Trident

TSM noob with 10 years expirience
ADSM.ORG Moderator
#4
OK. Good luck. It would be interesting to know what root cause you find.

Could you post the dsm.sys/opt files used by RMAN? Maybe we can tune some things there for you.
 

sandragon

ADSM.ORG Member
#5
After a long delay, it appears that the issue is a timeout. The DBs are Oracle Standard Edition, so there is no block change tracking and the backup is single-threaded. On low-change systems, this means very long waits while the DB is scanned for changes.
It never seems to take as long when it's being done to disk, only when using TDPO. Here's our opt file, sanitized of system identifiers:

Code:
************************************************************************
* Tivoli Storage Manager                                               *
************************************************************************

SErvername              TSM01
   COMMMethod           TCPip
   TCPPort              1500
   TCPServeraddress     [address]
   COMPRESSION          no
   Largecommbuffers     yes
   TCPB                 256
   TCPNODelay           yes
   TCPWindowsize        640
   TXNBytelimit         25600
   PASSWORDACCESS       generate
   RESOURCEUTILIZATION  5
   inclexcl             /opt/tivoli/tsm/client/ba/bin/inclexcl.list
   schedlogname         /opt/tivoli/tsm/client/ba/bin/dsmsched.log
   errorlogname         /opt/tivoli/tsm/client/ba/bin/dsmerror.log
   managedservices      schedule webclient
   schedmode            polling
   schedlogret          14 D
   nodename             [node]

SErvername              tdposched
   NODENAME             [node]_oracle
   COMMMethod           TCPip
   TCPServeraddress     [address]
   PASSWORDAccess       generate
   PASSWORDDIR          /opt/tivoli/tsm/client/oracle/bin64
   managedservices      schedule
   schedmode            prompted
   schedlogret          14 D
   schedlogname         /opt/tivoli/tsm/client/ba/bin/tdpo/dsmsched.log
   errorlogname         /opt/tivoli/tsm/client/ba/bin/tdpo/dsmerror.log
   inclexcl             /opt/tivoli/tsm/client/ba/bin/inclexcl.oracle
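One thing that is easy to check mechanically in a file like this is which options appear in one server stanza but not in another. Below is a minimal Python sketch under several assumptions: stanzas start with a `SErvername` line, comment lines start with `*`, and option names are compared exactly as spelled, so abbreviations such as `TCPB` are not expanded. The helper name `parse_dsm_sys` is hypothetical, and a difference between stanzas is only something to review, not a diagnosis.

```python
def parse_dsm_sys(text):
    """Return {server_name: {option_lowercase: value}} for each stanza."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue  # skip blank lines and comment/banner lines
        parts = line.split(None, 1)
        name = parts[0].lower()
        value = parts[1].strip() if len(parts) > 1 else ""
        if name == "servername":
            current = stanzas.setdefault(value, {})
        elif current is not None:
            current[name] = value
    return stanzas


# Shortened sample taken from the file above.
sample = """
SErvername              TSM01
   COMMMethod           TCPip
   TCPB                 256
   TCPNODelay           yes
   TCPWindowsize        640

SErvername              tdposched
   COMMMethod           TCPip
   PASSWORDAccess       generate
"""

stanzas = parse_dsm_sys(sample)
only_in_first = sorted(set(stanzas["TSM01"]) - set(stanzas["tdposched"]))
print("Options set for TSM01 but not for tdposched:", only_in_first)
```

Run against the full file, this lists the TCP tuning options that are present in the first stanza and absent from the second.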
 

Trident

TSM noob with 10 years expirience
ADSM.ORG Moderator
#6
Hi,

Some performance tips

TCPWindowsize 0

Providing your OS has TCP window scaling enabled
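On Linux, whether window scaling is enabled can be read from `/proc/sys/net/ipv4/tcp_window_scaling` (1 means enabled). A small Python sketch of that check follows. The function name `window_scaling_enabled` is made up for illustration, and the path applies to Linux only.

```python
from pathlib import Path


def window_scaling_enabled(path="/proc/sys/net/ipv4/tcp_window_scaling"):
    """Return True/False from the Linux sysctl file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        # Not Linux, or the file is not readable in this environment.
        return None


print("TCP window scaling enabled:", window_scaling_enabled())
```

On the RHEL client in this thread, `sysctl net.ipv4.tcp_window_scaling` should report the same value. As far as I know, TCPWindowsize 0 leaves the window size to the operating system, which is why the scaling setting matters here.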
 
