[Networker] AW: [Networker] Inactivity Timeout on particular Save Set

2003-01-17 05:54:12
Subject: [Networker] AW: [Networker] Inactivity Timeout on particular Save Set
From: "Gottwald, Stephan" <Stephan.Gottwald AT ITZ-DUESSELDORF DOT DE>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Fri, 17 Jan 2003 11:47:57 +0100
Hi

We just experienced this problem on a W2k server.

It was a file server with a RAID array (no cluster).
We couldn't do a full backup of the data drive (70 GB) any more.
Sometimes a manual backup would complete successfully, though not often.
save.exe hung and the task could not be ended; commands that use the
\Pipe mechanism (e.g. remote shutdown) no longer worked, and logging on at the
console or through a terminal server session was not possible.

Only a hard reboot (power cycle) would restart the server.

I ran a chkdsk, and it reported no errors.
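
A plain chkdsk (without /R) mainly checks the file system structures and does not read all of the file data, so a full read pass over the drive is one further way to test the disks. Below is a minimal sketch in Python, not something we ran at the time. The function name read_test, the 1 MB chunk size and the 30-second "slow" threshold are illustrative assumptions and have nothing to do with Networker itself. It logs each path before reading it, so if the read hangs the way save.exe did, the last line in the log points at the file involved.

```python
import os
import sys
import time

CHUNK = 1024 * 1024  # read in 1 MB pieces (arbitrary choice)


def read_test(root, log=sys.stdout, slow_seconds=30.0):
    """Read every file under root.

    Returns (files_read, bytes_read, problems), where problems is a
    list of (path, description) for unreadable or slow files.
    """
    files_read = 0
    bytes_read = 0
    problems = []

    def on_walk_error(exc):
        # Directories that cannot be listed are recorded, not skipped silently.
        problems.append((exc.filename, "error: %s" % exc))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Log BEFORE reading: if the read never returns, this is
            # the last line written.
            log.write("reading %s\n" % ascii(path))
            log.flush()
            start = time.monotonic()
            try:
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(CHUNK)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
                files_read += 1
            except OSError as exc:
                problems.append((path, "error: %s" % exc))
                continue
            elapsed = time.monotonic() - start
            if elapsed > slow_seconds:
                problems.append((path, "slow: %.1f s" % elapsed))
    return files_read, bytes_read, problems
```

Called as read_test("E:\\") with the log redirected to a file on a different drive, this would read the whole data drive once. Files that are open exclusively by another program will show up as errors too, so the result is only a hint, not proof of a disk fault.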

We tried everything, up to a complete reinstall of the Networker software on the
server and all clients.
I even discarded all existing indexes, the media database, etc.

Some weeks later we suddenly had a total corruption of the NTFS file system.
There were a lot of files we couldn't open any more, or even copy to another server.
W2k gave a message that it could not access the drive.

The only solution we had was to format the drive and recover the data from backup,
together with the data we had successfully copied to another server.

I don't know whether it is a problem with W2k or with the RAID array.
We had no errors in the internal log of the array controller up to the day the
total corruption occurred.
On that day, one drive did not respond at boot time, though we could bring it
online with the management software and had no problems with this drive
afterwards.

Right now we are closely monitoring the RAID array to see if anything happens...

Hope this helps.

Greetings



Stephan Gottwald


> -----Original Message-----
> From: Ingo Roschmann [mailto:ingo AT VISIONET DOT DE]
> Sent: Friday, 17 January 2003 11:16
> To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
> Subject: [Networker] Inactivity Timeout on particular Save Set
> 
> 
> Hi all,
> 
> I know, there's been a lot of discussion on inactivity 
> timeouts, but I haven't found a hint that covers our problem:
> 
> From time to time, one particular save set fails due to 
> inactivity timeout; in addition, the save.exe process hangs 
> and we can only kill it by rebooting the machine.
> 
> Our environment is networker 6.1.1, server is solaris 8, 
> client is a w2k MSCS cluster with open file manager 8.0 running.
> 
> Symptoms are:
> - It is always the same save set that fails and the save set 
> is located on its own group of hard disks
> - Increasing the inactivity timeout is useless; a manual save 
> on the client shows that the process seems to stop working 
> after a few minutes
> - Another save set on the same cluster node at the same time works fine
> - The backup level is irrelevant; the error occurred with 
> incremental backups as well as with level 5 backups
> - While the abandoned save.exe process still hangs, every 
> attempt to do another save will fail; if the hanging process 
> is killed (after a reboot), backup may work for a few days 
> until the error occurs again
> - The last time the error occurred a scandisk on the volume 
> reported errors
> 
> Now I have 3 questions:
> 
> - What do you think of the idea that the hard disks the save set 
> is located on are the problem, and are there any suggestions 
> for testing this hypothesis?
> - Are there any suggestions for how we could kill the hanging 
> save.exe processes on the client? We can't kill them with task 
> manager or with tools like pskill, and stopping the 
> networker services doesn't help either (and we don't want to 
> reboot the server every time the problem occurs)
> - Does anyone have any other ideas?
> 
> Thanks in advance,
> Ingo
> 
> --
> Note: To sign off this list, send a "signoff networker" 
> command via email to listserv AT listmail.temple DOT edu or visit 
> the list's Web site at 
> http://listmail.temple.edu/archives/networker.html where you 
> can also view and post messages to the list. 
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
> 

