[Networker] Advanced file system device during a network outage

2007-11-03 10:14:49
Subject: [Networker] Advanced file system device during a network outage
From: Stan Horwitz <stan AT TEMPLE DOT EDU>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Sat, 3 Nov 2007 10:12:43 -0400
I am very new to staging and the use of advanced file system devices with NetWorker 7.4 on a Sun T2000 server. On Tuesday, an outside consultant set up a new storage node with NetWorker 7.4 on a Sun X4500. It was working well until last night. We are staging to an ADVFS device on the X4500. The device is on a 10TB ZFS volume that's internal to the X4500. The T2000 and X4500 share a Sony PetaSite tape library, using dynamic drive sharing for four of its 14 tape drives. I have seen data stream to the disk device as fast as 93MB/s, so I am thrilled with it, especially since it was handling only six sessions at the time!

Unfortunately, last night around 7:00, we had a network outage that appears to have lasted at least half an hour; I am not sure of the exact duration because I was out shopping when it began. DNS service was down on both of our DNS servers. Of course, lots of backups that were in progress failed.

I would expect backups to fail for some clients during a network outage, but what surprised me was that the volume labeled for the disk device was unmounted. It had well under 10TB on it, and none of the savesets on it show an abort. What I don't understand is why the disk volume became unmounted. The volume is set up for staging with a low water mark of 30% and a high water mark of 70%. It has yet to reach 70% because it's only been in use since Tuesday night and I haven't yet set up all the clients I intend to use it with.
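
To put numbers on those water marks, here is a rough sketch of the arithmetic. It assumes the percentages apply to the nominal 10TB capacity and that a terabyte is 10^12 bytes; neither is confirmed above, and the function names are made up for illustration rather than taken from NetWorker.

```python
# Hypothetical helper names; this is not NetWorker code or its API.
TB = 10**12  # assumption: decimal terabytes


def staging_thresholds(capacity_bytes, low_pct, high_pct):
    """Return (low, high) water marks in bytes for percentage settings."""
    if not 0 <= low_pct < high_pct <= 100:
        raise ValueError("expected 0 <= low_pct < high_pct <= 100")
    low = capacity_bytes * low_pct // 100
    high = capacity_bytes * high_pct // 100
    return low, high


def should_start_staging(used_bytes, capacity_bytes, high_pct):
    """True once usage has reached the high water mark."""
    return used_bytes * 100 >= capacity_bytes * high_pct


low, high = staging_thresholds(10 * TB, 30, 70)
print(low / TB, high / TB)  # 3.0 7.0
```

Under those assumptions, staging would not begin until roughly 7TB is in use and would then move data off until usage drops to roughly 3TB, so a volume that has not yet reached 70% should not have been touched by the staging policy.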

At any rate, I used nsrmm to mount the disk volume again. It mounted fine, and some non-NDMP backups that had been sending data to it resumed. What also surprised me was that four NDMP DSA backups that were writing to this disk volume failed over to tape, and they are still in progress. This stunned me because I didn't realize that was possible. Unfortunately, the NDMP backups did not fail over from where they left off; they started over again from scratch.

Does anyone have any idea why the disk device volume became unmounted? Is it safe to assume the network outage caused this to happen, or do you think it was just a coincidence?