RMAN backups move no data

Discussion in 'Oracle' started by sandragon, Jul 28, 2017.

  1. sandragon

    sandragon ADSM.ORG Member

    Joined:
    Aug 26, 2014
    Messages:
    52
    Likes Received:
    0
    OS: RHEL 6.9
    TSM client: 8.1.0
    TDP client: 8.1.0
    TSM Server: 8.1.1

    Recently built and configured Oracle instance does not complete backups. It will sometimes push data, but otherwise behaves very similarly to vilius.m's issue here: https://adsm.org/forum/index.php?threads/rman-archive-logs-stuck.31401/

    Code:
    tail -f level1_201707283707.log
    Starting backup at 28-JUL-2017 11:37:12
    using channel ORA_SBT_TAPE_1
    channel ORA_SBT_TAPE_1: starting incremental level 1 datafile backup set
    channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
    input datafile file number=00048 name=/data01/DEV/PRODdtat_01.dbf
    input datafile file number=00040 name=/data01/DEV/dd812t_02.dbf
    input datafile file number=00026 name=/data01/DEV/histdtat_01.dbf
    input datafile file number=00033 name=/data01/DEV/syapp1i_01.dbf
    input datafile file number=00020 name=/data01/DEV/arcdta_01.dbf
    channel ORA_SBT_TAPE_1: starting piece 1 at 28-JUL-2017 11:37:12
    TDPO logs only show the following:
    Code:
     tail dsmerror.log
    07/27/2017 04:06:01 ANS4992W TDPO Linux86-64 ANU0599 TDP for Oracle: (12048): =>() ANU2604W The object /adsmorc/ /LVL1_2osabifc_1_1 was not found on the IBM Spectrum Protect Server
    07/27/2017 04:06:07 ANS1909E The scheduled command failed.
    07/27/2017 04:06:07 ANS1512E Scheduled event 'ORACLE_*******_INCR' failed.  Return code = 1.
    07/28/2017 00:20:35 ANS0361I DIAG: sessSendVerb: Error sending Verb, rc: -50
    07/28/2017 00:20:35 ANS1017E Session rejected: TCP/IP connection failure.
    07/28/2017 00:20:35 ANS1017E Session rejected: TCP/IP connection failure.
    07/28/2017 00:20:35 ANS0361I DIAG: sessSendVerb: Error sending Verb, rc: -50
    07/28/2017 00:20:41 ANS4992W TDPO Linux86-64 ANU0599 TDP for Oracle: (12111): =>() ANU2604W The object /adsmorc/ /LVL1_4asae6ri_1_1 was not found on the IBM Spectrum Protect Server
    07/28/2017 00:20:51 ANS1909E The scheduled command failed.
    07/28/2017 00:20:51 ANS1512E Scheduled event 'ORACLE_*******_INCR' failed.  Return code = 1.
    The TSM Server shows the following in the act log:
    Code:
    188,498 Tcp/Ip RecvW 50.817 M 1.268 K 2.001 M Node TDPO Linux86-64 *******_ORACLE
    The wait time is at 50 minutes and the session has pushed only 2 MB.

    The failure occurs anywhere between 20 minutes and 4 hours into the job.

    Filesystem backups on this system work fine. Archive log backups from Oracle work fine. The full backup works, but runs very slowly.

    The DBAs tell me that both fulls and incrementals to a disk target complete very quickly; it's only when the TDP client is engaged that this behavior happens. Any thoughts? I'm going to get a PMR going Monday.
     
  3. Trident

    Trident TSM noob with 10 years expirience ADSM.ORG Moderator

    Joined:
    Apr 2, 2007
    Messages:
    356
    Likes Received:
    34
    Occupation:
    IT operations
    Location:
    Oslo, Norway
  4. sandragon

    Each time I have tried to leave tcpdump running until the backup job fails, tcpdump has terminated before the failure. However, this is one of two identical systems: we set up a VM and cloned it. One clone works, the other does not (they have different node names). Our TSM server has a VLAN interface on this subnet, so the traffic stays on layer 2 and no routing is involved either.

    We are not using containers in this environment; the backups go to standard disk pools.

    I've got a PMR open with IBM, but they are fixating on the TCP/IP errors and networking, even though this only affects level 1 backups, not level 0. A networking error would, in theory, occur across all backup types.
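
    One option for a capture that has to survive until the failure is to have tcpdump write into a ring of fixed-size files, so it can run for hours without filling the disk. The following is only a sketch: the interface name eth0, the output directory /var/tmp, the file size and count, and the placeholder server address are all assumptions, not values from this thread. Port 1500 is the TCPPort from the client options file. The thread does not say why tcpdump stopped, so this may or may not address that.

```shell
#!/bin/sh
# Build a tcpdump command that captures TSM client/server traffic into a
# ring buffer of capture files.
# Assumptions (not from the thread): interface name, output directory,
# file size, file count. Port 1500 matches TCPPort in the options file.
build_capture_cmd() {
    iface="$1"     # capture interface, e.g. eth0 (hypothetical)
    server="$2"    # TSM server address
    outdir="$3"    # directory for the capture files
    # -s 128 : keep only the first 128 bytes of each packet (headers)
    # -C 100 : start a new file after roughly 100 million bytes
    # -W 20  : keep at most 20 files, then overwrite the oldest
    printf 'tcpdump -i %s -s 128 -C 100 -W 20 -w %s/tdpo.pcap host %s and port 1500\n' \
        "$iface" "$outdir" "$server"
}

# Print the command for review; it has to be run as root to capture.
build_capture_cmd eth0 "${TSM_SERVER:-192.0.2.10}" /var/tmp
```

    Printing the command rather than running it keeps the sketch harmless; once the values are right, the printed line can be started under nohup before the level 1 backup begins.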
     
  5. Trident

    OK. Good luck. It would be interesting to know what root cause you find.

    Could you post the dsm.sys/opt files used by RMAN? Maybe we can tune some things there for you.
     
  6. sandragon

    After a long delay, it appears that the issue is a timeout. The DBs are Oracle Standard Edition, so there is no block change tracking and backups are single-threaded. On low-change systems, this means very long waits while the DB is scanned for changes.
    It never seems to take as long when the backup goes to disk, only when using TDPO. Here's our opt file, sanitized of system identifiers:

    Code:
    ************************************************************************
    * Tivoli Storage Manager                                               *
    ************************************************************************
    
    SErvername              TSM01
       COMMMethod           TCPip
       TCPPort              1500
       TCPServeraddress     [address]
       COMPRESSION          no
       Largecommbuffers     yes
       TCPB                 256
       TCPNODelay           yes
       TCPWindowsize        640
       TXNBytelimit         25600
       PASSWORDACCESS       generate
       RESOURCEUTILIZATION  5
       inclexcl             /opt/tivoli/tsm/client/ba/bin/inclexcl.list
       schedlogname         /opt/tivoli/tsm/client/ba/bin/dsmsched.log
       errorlogname         /opt/tivoli/tsm/client/ba/bin/dsmerror.log
       managedservices      schedule webclient
       schedmode            polling
       schedlogret          14 D
       nodename             [node]
    
    SErvername              tdposched
       NODENAME             [node]_oracle
       COMMMethod           TCPip
       TCPServeraddress     [address]
       PASSWORDAccess       generate
       PASSWORDDIR          /opt/tivoli/tsm/client/oracle/bin64
       managedservices      schedule
       schedmode            prompted
       schedlogret          14 D
       schedlogname         /opt/tivoli/tsm/client/ba/bin/tdpo/dsmsched.log
       errorlogname         /opt/tivoli/tsm/client/ba/bin/tdpo/dsmerror.log
       inclexcl             /opt/tivoli/tsm/client/ba/bin/inclexcl.oracle
    
     
  7. Trident

    Hi,

    A performance tip:

    TCPWindowsize 0

    provided your OS has TCP window scaling enabled.
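
    With TCPWindowsize 0 the client leaves the window size to the operating system, so it is worth confirming that window scaling is actually on. A minimal sketch for a Linux client such as the RHEL system in this thread; the function name is made up for illustration, and the optional file argument exists only so the check can be pointed at a different file.

```shell
#!/bin/sh
# Report whether TCP window scaling is enabled on a Linux host.
# /proc/sys/net/ipv4/tcp_window_scaling holds 1 when it is enabled.
# window_scaling_state is a hypothetical helper name.
window_scaling_state() {
    f="${1:-/proc/sys/net/ipv4/tcp_window_scaling}"
    if [ ! -r "$f" ]; then
        echo "unknown"
        return 2
    fi
    if [ "$(cat "$f")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
        return 1
    fi
}

# Print the state for this host; ignore the exit status here.
window_scaling_state || true
```

    The same value can be read with sysctl net.ipv4.tcp_window_scaling.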
     
