ADSM-L

SQL Backtrack and ADSM - hanging problem

1999-04-19 09:07:00
Subject: SQL Backtrack and ADSM - hanging problem
From: van Roosmalen, Naomi <naomi.vanroosmalen AT NBTEL.NB DOT CA>
To: ADSM-L AT VM.MARIST DOT EDU <ADSM-L AT VM.MARIST DOT EDU>
Date: Monday, April 19, 1999 09:07
>Hi everyone,
>I have a very strange problem that has been going on for far too long now,
>and am hoping someone may have a solution for me.
>
>Configuration:
>
>Server:
>ADSM server 3.1.1.5, running on AIX 4.2.1
>
>Client:
>Sun Solaris 2.5.1
>ADSM client 2.1.0.? (Not sure if it was 2.1.0.7)
>Sybase database version 11.0.3
>SQL BackTrack 3.1.1
>
>There is a dedicated 100 Mbps ethernet connection between the two.
>
>Sequence of events:
>1. For year 2000 compliance, and to keep up to date on our software, it was
>decided it was time to upgrade the ADSM client to version 3, and SQL BT to
>version 4.0.50.
>
>2. The ADSM client is upgraded to version 3.1.0.6.
>
>3. The backup of the databases fails, but it turns out that the sys admin by
>accident did not use the original dsm.sys file, which meant ADSM was using
>the wrong ethernet interface.
>
>4. Original dsm.sys file is restored.
>
>5. Backup seems to run ok, but then starts to not complete on some days.
>After the backup should have long been done, SQL BT processes are still
>showing up in the ps -ef output. There are no error codes in any log files,
>no messages indicating why the backup hung. Sybase logs, ADSM logs, SQL BT
>logs are checked, but nothing unusual is showing up. dtwatch does not
>provide any insight either; it just shows a current status, which does not
>change.
>
>6. A week and a half later the sys admin notifies me that SQL BT is hogging
>semaphores, and that the system has run out. The number of semaphores is
>doubled, and the system is rebooted. The first few backups work, but then
>backups start hanging again.
>
>7. Because this did not help, SQL BT is upgraded from version 3.1.1 to
>version 4.0.50. This does not help either; backups continue to hang.
>
>I don't remember if point 6 happened before point 7, those two could be
>reversed.
>
>8. Last week I downgraded the ADSM client from v 3.1.0.6 to 3.1.0.5. This
>did not change anything.
>
>The difficulty with these hangs is that there are no messages in any log
>files indicating what the problem is. All reporting simply stops. I have run
>the backup with the -debug option and the -query option, but this did not
>provide BMC support with any insight.
>
>The backups will partially complete, but then stop. There does not seem to
>be a pattern in where the backup would stop. Some days it would complete one
>database, other days it would do 3 databases, and some days it would
>complete everything.
>
>I have tried everything I can think of and checked everything I can think
>of: network traffic, Sybase logs, SQL BT logs, ADSM server logs (no entries
>show there either), ADSM client logs. I have asked various people who work
>with this system if anything had changed in its environment. No changes have
>been reported.
>
>It somehow looks like it is semaphore related, but increasing the number of
>semaphores did not help. Eventually all of them end up in use anyway (we
>know this because there is another application that is dependent on
>available semaphores, and this application stops working when there are none
>left).
>
>Has anyone had a similar situation and what did you do to resolve it?
>
>Thanks,
>Naomi van Roosmalen
>NBTel Inc.
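
Since the hangs leave nothing in the logs, one way to narrow down the semaphore exhaustion described above might be to snapshot `ipcs -s` output before and after each backup and compare which owner's semaphore sets keep growing. The following is only a minimal sketch in Python, not something from the original post. It assumes the Solaris-style `ipcs -s` layout (columns T, ID, KEY, MODE, OWNER, GROUP, with semaphore rows starting with `s`); other platforms print different columns. The sample text, the owner names, and the function names are hypothetical.

```python
from collections import Counter


def count_semaphore_sets(ipcs_output: str) -> Counter:
    """Count semaphore sets per owner from captured `ipcs -s` text.

    Assumes rows of the form: s <id> <key> <mode> <owner> <group>.
    Header and banner lines are skipped because they do not start with "s".
    """
    counts = Counter()
    for line in ipcs_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0] == "s":
            counts[fields[4]] += 1  # fields[4] is the OWNER column
    return counts


def growth(before: Counter, after: Counter) -> Counter:
    """Return per-owner increases between two snapshots (decreases are dropped)."""
    return after - before


# Hypothetical sample in the assumed layout, not real output from this system.
SAMPLE = """\
IPC status from <running system> as of Mon Apr 19 09:00:00 1999
T         ID      KEY        MODE        OWNER    GROUP
Semaphores:
s          0   0x000187cf --ra-ra-ra-     root     root
s          1   0x00000000 --ra-------   sybase   sybase
s          2   0x00000000 --ra-------   sybase   sybase
"""

if __name__ == "__main__":
    for owner, n in count_semaphore_sets(SAMPLE).most_common():
        print(owner, n)
```

If the count for the account that runs the backups rises after every hung run and never drops, that would suggest the semaphore sets are being left behind by processes that did not exit cleanly, rather than the configured limit simply being too low.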