NDMP Issues

Discussion in 'Backup / Archive Discussion' started by cwilloug, Aug 12, 2009.

  1. cwilloug

    cwilloug Senior Member

    Joined:
    Sep 13, 2006
    Messages:
    387
    Likes Received:
    11
    Occupation:
    Storage / Backup / VMware / HyperV admin
    Location:
    North Dakota
    Several months ago I installed an IBM N3600 NAS and set up NDMP to back up that system. Over the weekend my TSM server crashed (AIX, TSM 5.5, TS3500 library with 8 3592 E05 drives). Currently I have 3 drives mapped to each of the N3600 "nodes" sharing 2 drives each.

    Here's what I think happened: I have several Oracle databases that the DBA likes to run full backups of on weekends. When those backups started, several space reclamations were still running and occupying the drives. The NDMP backups then started, the drives were not available, TSM didn't like that, and it crashed.

    Has anyone else had this issue, and what did you do to fix it? I am adding one more drive to each node and changing the times the NDMP backups start, to hopefully avoid this problem again.

    Thanks
     
  3. picay

    picay New Member

    Joined:
    Oct 22, 2004
    Messages:
    212
    Likes Received:
    0
    What do you have in the actlog when the crash occurs?
     
  4. cwilloug

    cwilloug Senior Member

    Joined:
    Sep 13, 2006
    Messages:
    387
    Likes Received:
    11
    Occupation:
    Storage / Backup / VMware / HyperV admin
    Location:
    North Dakota
    08/08/2009 19:03:51 ANR0406I Session 158392 started for node LIBSERVxx (TDP
    Oracle SUN) (Tcp/Ip libservxx.xxxx.xxxx.xxxx(45962)).
    (SESSION: 158392)
    08/08/2009 19:03:51 ANR0403I Session 158392 ended for node LIBSERVxx (TDP
    Oracle SUN). (SESSION: 158392)
    08/08/2009 19:03:51 ANE4991I (Session: 158367, Node: LIBSERVxx) TDP Oracle
    SUN ANU0599 TDP for Oracle: (6432): =>(LIBSERVxx)
    ANU2535I File /libservxx//kjkm6oia_1_1ODN19TSTdf
    = 257163264 bytes sent (SESSION: 158367)
    08/08/2009 19:03:51 ANR0403I Session 158367 ended for node LIBSERVxx (TDP
    Oracle SUN). (SESSION: 158367)
    08/08/2009 19:03:53 ANR0406I Session 158393 started for node LIBSERVxx (TDP
    Oracle SUN) (Tcp/Ip libservxx.xxxx.xxxx.xxxx(45966)).
    (SESSION: 158393)
    08/08/2009 19:03:54 ANR0406I Session 158394 started for node LIBSERVxx (TDP
    Oracle SUN) (Tcp/Ip libservxx.xxxx.xxxx.xxxxx(45967)).
    (SESSION: 158394)
    08/08/2009 19:03:54 ANR8337I NAS volume SP1088 mounted in drive DRIVE08
    (/dev/rmt13). (SESSION: 158118, PROCESS: 1080)

    And that's it... crash. Nobody noticed; I was checking stuff from home and saw that things weren't working. The restart shows:

    08/09/2009 16:26:43 ANR4726I The NAS-NDMP support module has been loaded.
    08/09/2009 16:26:43 ANR2102I Activity log pruning started: removing entries
    prior to 07/23/09 00:00:00.
    08/09/2009 16:26:43 ANR1305I Disk volume /tsmsandata/data11.dsm varied online.
    08/09/2009 16:26:45 ANR1305I Disk volume /tsmsandata/data09.dsm varied online.
    08/09/2009 16:26:45 ANR1305I Disk volume /tsmsandata/data10.dsm varied online.
    08/09/2009 16:26:45 ANR1305I Disk volume /tsmsandata/data02.dsm varied online.
     
  5. picay

    picay New Member

    Joined:
    Oct 22, 2004
    Messages:
    212
    Likes Received:
    0
    In the TSM server directory, do you have a dsmserv.err file and, maybe, a core dump?
     
  6. cwilloug

    cwilloug Senior Member

    Joined:
    Sep 13, 2006
    Messages:
    387
    Likes Received:
    11
    Occupation:
    Storage / Backup / VMware / HyperV admin
    Location:
    North Dakota
    [tsm][/]>errpt
    IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
    A7AB4C8F 0811071009 I H rmt16 TAPE SIM/MIM RECORD
    A7AB4C8F 0811062509 I H rmt16 TAPE SIM/MIM RECORD
    E507DCF9 0811062509 I H rmt16 TAPE DRIVE NEEDS CLEANING
    C69F5C9B 0808190409 P S SYSPROC SOFTWARE PROGRAM ABNORMALLY TERMINATED
    A7AB4C8F 0808095109 I H rmt13 TAPE SIM/MIM RECORD
    A7AB4C8F 0808094709 I H rmt13 TAPE SIM/MIM RECORD

    Nothing in the dsmserv.err file or my other system log file worth noting.
     
  7. Eldoraan

    Eldoraan Senior Member

    Joined:
    Feb 19, 2003
    Messages:
    288
    Likes Received:
    10
    Occupation:
    Data Protection
    Location:
    Charlotte, NC
    Which version of TSM Server do you have? There's an APAR, IC59290, where the server can crash if an NDMP session aborts. You may have had a backup node process time out on a drive mount, and when it aborted that process... crash! Supposedly fixed in 5.5.3.0 Maint.
     
  8. picay

    picay New Member

    Joined:
    Oct 22, 2004
    Messages:
    212
    Likes Received:
    0
    errpt -a -j C69F5C9B will give you more information about the error logged by AIX
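
For anyone matching the errpt summary posted earlier in this thread against the activity log: the TIMESTAMP column is in MMDDhhmmYY form. Below is a minimal Python sketch that decodes it and picks out entries by identifier. The function name and the dictionary keys are made up for illustration, the sample text is just the lines pasted earlier, and the two-digit year is assumed to mean 20YY.

```python
from datetime import datetime

# Summary lines copied from the errpt output posted earlier in this thread.
ERRPT_SUMMARY = """\
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
A7AB4C8F 0811071009 I H rmt16 TAPE SIM/MIM RECORD
A7AB4C8F 0811062509 I H rmt16 TAPE SIM/MIM RECORD
E507DCF9 0811062509 I H rmt16 TAPE DRIVE NEEDS CLEANING
C69F5C9B 0808190409 P S SYSPROC SOFTWARE PROGRAM ABNORMALLY TERMINATED
A7AB4C8F 0808095109 I H rmt13 TAPE SIM/MIM RECORD
A7AB4C8F 0808094709 I H rmt13 TAPE SIM/MIM RECORD
"""


def parse_errpt_summary(text):
    """Parse errpt summary lines into dictionaries (hypothetical helper)."""
    entries = []
    for line in text.splitlines():
        # Six columns; the description is the remainder of the line.
        parts = line.split(None, 5)
        if len(parts) < 6 or parts[0] == "IDENTIFIER":
            continue  # skip the header row and anything malformed
        ident, stamp, etype, eclass, resource, desc = parts
        # TIMESTAMP is MMDDhhmmYY; the century is an assumption (20YY).
        when = datetime(
            2000 + int(stamp[8:10]),  # year
            int(stamp[0:2]),          # month
            int(stamp[2:4]),          # day
            int(stamp[4:6]),          # hour
            int(stamp[6:8]),          # minute
        )
        entries.append({
            "id": ident,
            "time": when,
            "type": etype,
            "class": eclass,
            "resource": resource,
            "description": desc.strip(),
        })
    return entries


entries = parse_errpt_summary(ERRPT_SUMMARY)
for entry in entries:
    if entry["id"] == "C69F5C9B":
        print(entry["time"].isoformat(), entry["resource"], entry["description"])
```

Decoded this way, the C69F5C9B entry falls within a minute of the last activity log message before the crash (19:03:54 on 08/08/2009), which suggests it is the dsmserv termination.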
     
  9. picay

    picay New Member

    Joined:
    Oct 22, 2004
    Messages:
    212
    Likes Received:
    0
    Also check error ID A7AB4C8F for timeouts or tape problems on the drives.
     
  10. cwilloug

    cwilloug Senior Member

    Joined:
    Sep 13, 2006
    Messages:
    387
    Likes Received:
    11
    Occupation:
    Storage / Backup / VMware / HyperV admin
    Location:
    North Dakota
    Sweet, Thanks Eldoraan - I'll install that patch

    Thanks for the -a -j picay - I didn't know about that one (AIX is not my strong point)
     
Tags: n3600, ndmp, ts3500