NDMP Issues

cwilloug

Several months ago I installed an IBM N3600 NAS and set up NDMP to back up that system. Over the weekend my TSM Server crashed (AIX, TSM 5.5, TS3500 library with 8 3592 E05 drives). Currently I have 3 drives mapped to each of the N3600 "nodes" sharing 2 drives each.

Here's what I think happened: I have several Oracle databases that the DBA likes to run full backups of on weekends. When those backups started, several space reclamations were still running and occupying the drives. The NDMP backups then started, no drives were available, TSM didn't like that, and it crashed.

Has anyone else had this issue, and what did you do to fix it? I am adding one more drive to each node and changing the times the NDMP backups start, to hopefully avoid this problem in future.

Thanks
 
What do you have in the actlog when the crash occurs?
 
08/08/2009 19:03:51 ANR0406I Session 158392 started for node LIBSERVxx (TDP
Oracle SUN) (Tcp/Ip libservxx.xxxx.xxxx.xxxx(45962)).
(SESSION: 158392)
08/08/2009 19:03:51 ANR0403I Session 158392 ended for node LIBSERVxx (TDP
Oracle SUN). (SESSION: 158392)
08/08/2009 19:03:51 ANE4991I (Session: 158367, Node: LIBSERVxx) TDP Oracle
SUN ANU0599 TDP for Oracle: (6432): =>(LIBSERVxx)
ANU2535I File /libservxx//kjkm6oia_1_1ODN19TSTdf
= 257163264 bytes sent (SESSION: 158367)
08/08/2009 19:03:51 ANR0403I Session 158367 ended for node LIBSERVxx (TDP
Oracle SUN). (SESSION: 158367)
08/08/2009 19:03:53 ANR0406I Session 158393 started for node LIBSERVxx (TDP
Oracle SUN) (Tcp/Ip libservxx.xxxx.xxxx.xxxx(45966)).
(SESSION: 158393)
08/08/2009 19:03:54 ANR0406I Session 158394 started for node LIBSERVxx (TDP
Oracle SUN) (Tcp/Ip libservxx.xxxx.xxxx.xxxxx(45967)).
(SESSION: 158394)
08/08/2009 19:03:54 ANR8337I NAS volume SP1088 mounted in drive DRIVE08
(/dev/rmt13). (SESSION: 158118, PROCESS: 1080)

And that's it... crash. The restart shows the entries below. Nobody noticed; I was checking things from home and saw that things weren't working.

08/09/2009 16:26:43 ANR4726I The NAS-NDMP support module has been loaded.
08/09/2009 16:26:43 ANR2102I Activity log pruning started: removing entries
prior to 07/23/09 00:00:00.
08/09/2009 16:26:43 ANR1305I Disk volume /tsmsandata/data11.dsm varied online.
08/09/2009 16:26:45 ANR1305I Disk volume /tsmsandata/data09.dsm varied online.
08/09/2009 16:26:45 ANR1305I Disk volume /tsmsandata/data10.dsm varied online.
08/09/2009 16:26:45 ANR1305I Disk volume /tsmsandata/data02.dsm varied online.
 
In the TSM server directory, do you have a dsmserv.err file and, maybe, a core dump?
 
[tsm][/]>errpt
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
A7AB4C8F   0811071009 I H rmt16          TAPE SIM/MIM RECORD
A7AB4C8F   0811062509 I H rmt16          TAPE SIM/MIM RECORD
E507DCF9   0811062509 I H rmt16          TAPE DRIVE NEEDS CLEANING
C69F5C9B   0808190409 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
A7AB4C8F   0808095109 I H rmt13          TAPE SIM/MIM RECORD
A7AB4C8F   0808094709 I H rmt13          TAPE SIM/MIM RECORD

Nothing in the dsmserv.err file or my other system log file worth noting.
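
For anyone lining these entries up against the activity log: the errpt TIMESTAMP column is in MMDDhhmmYY form, so 0808190409 reads as 08/08/2009 19:04. That puts the C69F5C9B "abnormally terminated" entry in the minute right after the last actlog line (19:03:54), and the two rmt13 records from that morning are on the same device (/dev/rmt13) as the final DRIVE08 mount. A minimal Python sketch of that comparison follows; the function names are made up for illustration, and the rows are simply retyped from the output above.

```python
from datetime import datetime


def parse_errpt_timestamp(stamp: str) -> datetime:
    """Parse an AIX errpt TIMESTAMP value (MMDDhhmmYY, minute resolution)."""
    return datetime.strptime(stamp, "%m%d%H%M%y")


# Rows retyped from the errpt output above: (identifier, timestamp, resource).
ERRPT_ROWS = [
    ("A7AB4C8F", "0811071009", "rmt16"),
    ("A7AB4C8F", "0811062509", "rmt16"),
    ("E507DCF9", "0811062509", "rmt16"),
    ("C69F5C9B", "0808190409", "SYSPROC"),
    ("A7AB4C8F", "0808095109", "rmt13"),
    ("A7AB4C8F", "0808094709", "rmt13"),
]

# Last activity log entry before the crash (ANR8337I mount in DRIVE08).
LAST_ACTLOG = datetime(2009, 8, 8, 19, 3, 54)


def seconds_after_actlog(stamp: str) -> float:
    """Seconds between the last actlog entry and an errpt entry (negative = earlier)."""
    return (parse_errpt_timestamp(stamp) - LAST_ACTLOG).total_seconds()


for ident, stamp, resource in ERRPT_ROWS:
    when = parse_errpt_timestamp(stamp)
    offset_min = seconds_after_actlog(stamp) / 60
    print(f"{ident} {resource:8} {when:%m/%d/%Y %H:%M} ({offset_min:+.1f} min vs last actlog)")
```

Since errpt only records to the minute, an offset of under a minute is as close a match as this data can give.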
 
Which version of TSM Server do you have? There's an APAR, IC59290, where the server can crash if an NDMP session aborts. You may have had a backup node process time out on a drive mount, and when it aborted that process... crash! Supposedly fixed in the 5.5.3.0 maintenance level.
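
If you want to script the "am I at or above 5.5.3.0" check across several servers, here is a rough Python sketch. The helper names and the sample levels are hypothetical; only the 5.5.3.0 level comes from the post above. It compares the level as a tuple of integers rather than as a string.

```python
from typing import Tuple


def parse_level(level: str) -> Tuple[int, ...]:
    """Turn a dotted level such as '5.5.3.0' into a tuple of ints for comparison."""
    return tuple(int(part) for part in level.split("."))


# Level named in the post above as (supposedly) containing the fix.
FIXED_IN = parse_level("5.5.3.0")


def has_fix(installed: str) -> bool:
    """True if the installed level is at or above the level named for the fix."""
    return parse_level(installed) >= FIXED_IN


# Hypothetical installed levels, for illustration only.
for level in ("5.5.0.0", "5.5.2.1", "5.5.3.0", "5.5.10.0"):
    print(level, "at or above fix level" if has_fix(level) else "below fix level")
```

The tuple comparison matters because a plain string comparison would sort "5.5.10.0" below "5.5.3.0".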
 
errpt -a -j C69F5C9B will give you more information about the error logged by AIX.
 
Also check error ID A7AB4C8F for timeouts or tape problems on the drives.
 
Sweet, Thanks Eldoraan - I'll install that patch

Thanks for the -a -j picay - I didn't know about that one (AIX is not my strong point)
 