SP 8.1.6.100 Nothing Happened last night....

shcart

PREDATAR Control23

A most unusual issue. This morning I arrived to find over 700 missed schedules. We had 3 or 4 manual backups still running from the previous day, with no TCP/IP messages in the actlog or client error logs, so it doesn't look like a network issue.

AIX shows no errpt messages, no dropped packets, and no NIC issues.

The only notification we see is that schedule X for node Y missed at xx:xx:xx. There is no message that we are attempting to connect to node Y at IP address zz.zz.zz.zz. It is just like TSM decided to put its feet up and have an easy night by not bothering to talk to anyone else.
 
PREDATAR Control23

Are the backup schedules active? Or, are these schedules even there?

What happens if you restart a client backup schedule service? Do you see the dsmsched.log change after the restart and pick up the schedule?
 
PREDATAR Control23

Schedules are active and show events for both last night and the future. Restarting the client service worked fine and showed the new day's schedule in the sched log. Client services that have not been restarted still show yesterday's schedule in the sched log. DEFINE CLIENTACTION works fine. It's like the background process that actions the schedules had just hung or silently stopped running.

Just to be on the safe side we have restarted the AIX server and the SP instance.
 
PREDATAR Control23

Does this have anything to do with your other post about DB2 being pinned with replication? I recall a time we were doing replication from a Power6 to a Power4 and the P4 just couldn't keep up. Not sure what TSM version, I think 6.2 or so. Eventually it crashed and took down both frames. I think that was the case; it has been a few years and I wasn't on the Tiv team then.

I recall some other times recently with 7.1.x and 8.1.x where nearly all my clients would miss. Generally it was because bad things were afoot with the application: usually running processes, or sometimes even stuck/hung client sessions.

Anything in the dsmffdc.log that offers a hint?
Sadly, I skipped 8.1.6 and went from 8.1.5.100 to 8.1.8 recently.
 
PREDATAR Control23

Reboot the host server after shutting down TSM. I believe there is a disconnect between the TSM server and the host system.

I have seen something similar before BUT NOT schedules not running.
 
PREDATAR Control23

What is the schedule mode?
And did you notice anything in dsmerror.log or dsmwebcl.log?
Was anything logged there during the backup window?
 
PREDATAR Control23

Sched mode is Prompted. No abnormal messages in client sched or error log. No messages indicating the schedules even attempted connecting on the TSM server.

Since we restarted the instance, everything is running as designed.

Definitely no connection with the other thread I opened; they are on different servers. We have 5 TSM 8.1.6.100 primary servers and 5 replication target servers. (We are migrating from 5 TSM v7 servers with VTL and container pool back ends.)
 
PREDATAR Control23

If running in prompted mode, the client would not attempt to connect to the server, it waits to be contacted by the server to start the schedule.

In general, for a schedule to be missed, one or more of the following must occur. I know you verified some already, I'm just giving you all possible scenarios for a missed schedule:
- client scheduler is not running
- in the case of polling, a communication problem preventing the client from contacting the server
- in the case of prompted, a communication problem preventing the server from contacting the clients
- the central scheduler on the server stopped running; it's rare, but could happen if the server is extremely busy
- TCP/IP threads on the Spectrum Protect server (at the application level) stopped; again it's rare, but could happen if the server is extremely busy

The first two would be logged in dsmsched.log and dsmerror.log on the client; all the others would be logged in the server activity log, if that's the case.
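
As a rough sketch, the server-side causes in the list above could be checked from an administrative command line (dsmadmc) with something like the following. These are standard server commands, but NODE_Y is a placeholder node name and the parameters are written from memory, so verify them against the command reference for your server level. The comments use the macro comment syntax.

```
/* List only the events that missed or failed since yesterday */
query event * * begindate=today-1 exceptionsonly=yes

/* Check that the "Central Scheduler" field reports Active */
query status

/* Look for anything the server logged about the node since yesterday */
query actlog begindate=today-1 search=NODE_Y

/* Look for long-running processes or stuck/hung sessions */
query process
query session

/* Show the TCP/IP address the server has on record for the node */
query node NODE_Y format=detailed
```

If query status shows the central scheduler as active and the activity log has nothing except the "missed" messages, that at least narrows things down to the server not attempting the prompts at all.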
 
PREDATAR Control23

If you are using prompted mode, make sure to define:
TCPCLIENTAddress <your client address>
in your client options file (dsm.opt or dsm.sys, depending on your operating system),
just to prevent TSM from using the cached TCP/IP address of the client.
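
For illustration only, a dsm.sys server stanza with this option might look like the sketch below. The server name, host name, and addresses are made-up placeholders (192.0.2.10 is a documentation address), and 1501 is simply the default client port; substitute your own values.

```
* Hypothetical stanza; all names and addresses are placeholders
SErvername          tsmserver1
   COMMMethod          TCPip
   TCPServeraddress    tsm1.example.com
   SCHEDMODe           PRompted
   TCPCLIENTAddress    192.0.2.10
   TCPCLIENTPort       1501
```

TCPCLIENTAddress and TCPCLIENTPort only come into play when SCHEDMODe is PRompted, since that is the mode in which the server initiates contact with the client.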
 
PREDATAR Control23

We use TCPCLIENTAddress on all clients with multiple NICs and also on those using NAT'd addresses through firewalls. Almost everything else relies on cached addresses and works every day, except for that one day.

We prefer automatically collected cached addresses as our environment is, shall we say, "dynamic".
 