ADSM-L

" Intermittant TSM Server - Database stops"

2003-12-16 01:51:22
Subject: " Intermittant TSM Server - Database stops"
From: "Pole, Stephen" <Stephen.Pole AT HEALTH.WA.GOV DOT AU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 16 Dec 2003 14:50:45 +0800
TSM'ers

As this is our first time attempting such a process, here is the plan:

1. Perform database backup
2. Copy to another location (other machine)
        a) dsmserv.opt
        b) dsmserv.dsk
        c) volhist (volume history file)
        d) devconfig (device config file)

3. Set the server to start in quiet mode. Modify the server dsmserv.opt:
        a) Change the client port to 15000 (stops clients from logging on).
        b) Set EXPINTERVAL to 0, as this prevents inventory expiration from
starting immediately after a server startup.
        c) Add NOMIGRRECL to prevent TSM from starting space reclamation or
migration.
        d) Set DISABLESCHEDS to YES (this prevents any TSM schedules from
running).
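
The option changes in step 3 could be applied with a small script rather than by hand. This is only a sketch: the helper name quiet_mode_opt is made up for illustration, and TCPPORT is my assumption about which option "client port" refers to. It only recognises options written out in full, not abbreviated forms.

```python
# Hypothetical helper for step 3. Option names are taken from the plan;
# TCPPORT is an assumption about which option sets the client port.
QUIET_OPTIONS = {
    "TCPPORT": "15000",
    "EXPINTERVAL": "0",
    "NOMIGRRECL": "",
    "DISABLESCHEDS": "YES",
}


def quiet_mode_opt(original: str) -> str:
    """Return options-file text with the quiet-mode overrides applied."""
    kept = []
    for line in original.splitlines():
        words = line.split()
        # Drop any existing setting that one of the overrides replaces.
        if words and words[0].upper() in QUIET_OPTIONS:
            continue
        kept.append(line)
    # Append the overrides; options without a value are written bare.
    for name, value in QUIET_OPTIONS.items():
        kept.append(f"{name} {value}".rstrip())
    return "\n".join(kept) + "\n"
```

Keep the untouched copy of dsmserv.opt from step 2 so it can be put back in step 14.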

4. Edit the devconfig file to contain a file device class on which to store
the unloaded database:
        define devclass fileclass devtype=file mountlimit=5 maxcap=5G dir=/tsmtemp

5. From the server directory, run DSMSERV UNLOADDB DEVclass=fileclass

After a successful UNLOADDB we will perform the following steps. We must
first ensure that the files for the database and recovery log volumes do not
already exist on the system, otherwise the LOADFORMAT process will fail.

6. Create a file called /usr/temp/LOGVOL.TXT. Edit the file to contain the
following:-
"/var/tsm/tsmlog/log01.dsm" 512
"/var/tsm/tsmlog/log02.dsm" 512
"/var/tsm/tsmlog/log03.dsm" 512
"/var/tsm/tsmlog/log04.dsm" 512

7. Create a file called /usr/temp/DBVOL.TXT, edit the file to contain the
following:-
"/var/tsm/tsmdb1/db01.dsm" 5000
"/var/tsm/tsmdb1/db02.dsm" 5000
"/var/tsm/tsmdb1/db03.dsm" 5000
"/var/tsm/tsmdb1/db04.dsm" 5000

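The two list files in steps 6 and 7, and the check that none of the volumes already exist, could be sketched as below. The function names are made up for illustration, and I am assuming the log volumes use an absolute path under /var/tsm/tsmlog in the same way as the database volumes.

```python
import os


def volume_lines(directory, prefix, count, size_mb):
    """Build '"<path>" <size>' lines in the format used in steps 6 and 7."""
    return [
        f'"{directory}/{prefix}{i:02d}.dsm" {size_mb}'
        for i in range(1, count + 1)
    ]


def existing_volumes(lines):
    """Return the volume paths from the given lines that already exist."""
    paths = [line.split('"')[1] for line in lines]
    return [p for p in paths if os.path.exists(p)]


# Assumed layout: four 512 MB log volumes and four 5000 MB db volumes.
log_lines = volume_lines("/var/tsm/tsmlog", "log", 4, 512)
db_lines = volume_lines("/var/tsm/tsmdb1", "db", 4, 5000)

if __name__ == "__main__":
    for path in existing_volumes(log_lines + db_lines):
        print("already exists, LOADFORMAT would fail:", path)
```

Writing log_lines to /usr/temp/LOGVOL.TXT and db_lines to /usr/temp/DBVOL.TXT, one entry per line, gives the files that step 8 reads.
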
8. From the server directory, issue
DSMSERV LOADFORMAT 4 FILE:/usr/temp/LOGVOL.TXT 4 FILE:/usr/temp/DBVOL.TXT

9. DSMSERV LOADDB DEVclass=fileclass
VOLumenames="/tsmtemp/<volname1>","/tsmtemp/<volname2>",
"/tsmtemp/<volname3>","/tsmtemp/<volname4>"

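Since the volume names are only known once UNLOADDB has finished, the LOADDB command line in step 9 could be assembled from whatever names it produced. A rough sketch follows; the function name and the example file names are hypothetical, not real UNLOADDB output.

```python
def loaddb_command(devclass, volumes):
    """Assemble the step 9 command from a device class and volume paths."""
    # Quote each volume path and join with commas, as in step 9.
    names = ",".join(f'"{v}"' for v in volumes)
    return f"DSMSERV LOADDB DEVclass={devclass} VOLumenames={names}"


# Hypothetical volume names, for illustration only.
example = loaddb_command("fileclass", ["/tsmtemp/vol1", "/tsmtemp/vol2"])
```

The volumes should be listed in the order in which UNLOADDB wrote them.
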
If LOADDB throws up any errors, then run AUDITDB,
e.g. DSMSERV AUDITDB FIX=YES DETAIL=YES FILE=/usr/temp/AUDITDB.TXT

10. After finishing the above, restore the original devconfig.out file.

11. Start the TSM server from the command line.

12. Perform a full DB backup.
13. HALT the TSM server.
14. Restore the original DSMSERV.OPT file.
15. Define mirror volumes (db and log volumes, formatted using the dsmfmt
command).

Define the group of mirrored volumes.

16. Restart the TSM server.

Any comments or gotchas would be greatly appreciated.

Thanks in advance


Stephen


----- Original Message -----
From: "Pole, Stephen" <Stephen.Pole AT HEALTH.WA.GOV DOT AU>
To: <ADSM-L AT VM.MARIST DOT EDU>
Sent: Tuesday, December 16, 2003 11:36 AM
Subject: Re: Intermittant TSM Server - Database stops


> Hi all,
>
> Sorry to trouble all of you.
>
> Here is an event that happened last week during normal operations, while a
> large query was being run on the TSM database.
>
> 12/11/03 01:08:13     ANR2561I Schedule prompter contacting ICMC02RPA (session
>                        4174) to start a scheduled operation.
>
> 12/11/03 01:08:32     ANR2017I Administrator OPS issued command: FETCH NEXT 50
> 12/11/03 01:08:37     ANR2958E SQL temporary table storage has been exhausted.
>
> The message seems pretty self-explanatory.
>
> The server was restarted and all seemed OK for about 12 hours.
>
> We have extended the database by adding another database
> volume and refrained from performing any major SQL queries etc.
>
> Since then, the TSM server stops at random times (no messages in the actlog).
> The server just falls over and has to be restarted.
>
> The only clue that something has happened is in the errpt -a output:
>
> LABEL:          CORE_DUMP
> IDENTIFIER:     1F0B7B49
>
> Date/Time:       Mon Dec 15 23:33:11 WAUS
> Sequence Number: 40038
> Machine Id:      0000D62A4C00
> Node Id:         rsfm014
> Class:           S
> Type:            PERM
> Resource Name:   SYSPROC
>
> Description
> SOFTWARE PROGRAM ABNORMALLY TERMINATED
>
> Probable Causes
> SOFTWARE PROGRAM
>
> User Causes
> USER GENERATED SIGNAL
>
>         Recommended Actions
>         CORRECT THEN RETRY
>
> Failure Causes
> SOFTWARE PROGRAM
>
>         Recommended Actions
>         RERUN THE APPLICATION PROGRAM
>         IF PROBLEM PERSISTS THEN DO THE FOLLOWING
>         CONTACT APPROPRIATE SERVICE REPRESENTATIVE
>
> Detail Data
> SIGNAL NUMBER
>            6
> USER'S PROCESS ID:
>        44974
> FILE SYSTEM SERIAL NUMBER
>            2
> INODE NUMBER
>       215060
> PROCESSOR ID
>            0
> PROGRAM NAME
> dsmserv
> ADDITIONAL INFORMATION
> pthread_k A8
> ??
> _p_raise 64
> raise 34
> abort B8
> AbortServ 80
> TrapHandl 13C
> ??
> ??
>
> Symptom Data
> REPORTABLE
> 1
> INTERNAL ERROR
> 0
> SYMPTOM CODE
> PCSS/SPI2 FLDS/dsmserv SIG/6 FLDS/AbortServ VALU/80
>
> The end result is that TSM has gone very "flakey" and falls over at odd times.
>
> Has anyone encountered this before? If so, what are my options?
>
> We are looking at doing an unloaddb then reload, etc. (if I can get my
> head around the procedure).
>
> Any help would be greatly appreciated
>
> Thanks in advance TSM'ers!
>
> Cheers
>
>
>
> Stephen Pole
> WA Dept of Health
>
> stephen.pole AT health.wa.gov DOT au

-----Original Message-----
From: Sony Priyambodo [mailto:sony.priyambodo AT METRODATA.CO DOT ID]
Sent: 16 December, 2003 12:44 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: Intermittant TSM Server - Database stops


SQL queries use the TSM log. This happened to us, and we solved it by
extending the TSM log size. Before returning to normal operation, please run
the TSM server in console mode; this is the best way to troubleshoot after a
TSM server crash.

SN
