ADSM-L

Linux client hangs

2000-01-03 10:45:51
Subject: Linux client hangs
From: Thomas Denier <Thomas.Denier AT MAIL.TJU DOT EDU>
Date: Mon, 3 Jan 2000 10:45:51 -0500
My site is using the unsupported Linux client to back up a news server. A
QUERY NODE command reports that the client OS level is 2.2.12-20smp and that
the ADSM client level is 3.1.0.1. We have recently had the client software
hang twice.

The first time, the problem was discovered when the ADSM server generated
messages stating that the schedule prompter was unable to contact the client. I
was able to ping the client's IP address. When I logged on to the client I
found that the 'dsmc sched' process was still there; it had just stopped
responding to requests from the central scheduler. When I killed and restarted
the process the backup started up almost immediately. Neither client nor
server logs showed any indication of trouble beyond the messages mentioned
above.

The second hang condition occurred several days later, shortly after a
scheduled backup started. The client log showed that the client sent a few
files, which were to go directly to tape. As usual, the files failed, the
client waited for the tape to be mounted, and the client retried the files.
The client schedule log simply stopped at that point. The client error log had
a revision date about 24 hours earlier, indicating that it contained no
messages related to the error. The server log contained no unusual messages.
The problem was detected when someone noticed that the wait time shown by
QUERY SESSION was over ten minutes. I logged on and found that a ps command
showed two 'dsmc sched' processes (this seems to be the usual result for a
Unix system with a scheduled backup in progress). I killed both processes and
restarted the client scheduler process. The system completed the scheduled
backup successfully.

I am already getting tired of being paged to deal with these conditions, and I
suspect that I will get even more tired of telling managers that I haven't got
a clue how to fix the underlying problem. Is there any known cure for the
behavior described above?