Re: [Networker] nsrmmdbd exiting on signal 11

2007-06-14 19:00:53
Subject: Re: [Networker] nsrmmdbd exiting on signal 11
From: "Wood, R A (Bob)" <WoodR AT CHEVRON DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Thu, 14 Jun 2007 23:56:33 +0100
Hi,
        First thing is to stop it getting worse. Set the affected device
to read-only to prevent any more data being written to it.

Now to assess exactly what you have on the volume and what you are going
to do with it. 

You say that doing nsrstage -S moves some of the data before hanging.
This sounds like there is some sort of corruption somewhere. So the
thing to do is move what you can, then look at the remainder.

To move what you can, run mminfo on the volume and send the output to a
file (so you can compare later). Use the file as input to nsrstage (use
the -v option so it logs which savesets have moved). Let the stage run
until it fails. Run another mminfo to a different file and compare. The
odds are that it is the top saveset of the second list that is corrupt
(you may get some hint in the daemon log). Make a note of it and remove
it from the second list. Repeat the nsrstage with the second list.
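
The compare step above could be scripted along these lines. This is only
a sketch: it assumes each mminfo output file has been reduced to one ssid
per line, and the function names read_ssids and compare_runs are made up
for illustration. It also assumes the stage run works through the list in
order, which is why the top of the second list is treated as the suspect.

```python
def read_ssids(path):
    """Return the ssids listed in a file, one per line, in file order."""
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]


def compare_runs(before, after):
    """Compare the ssid lists taken before and after a stage run.

    Returns (moved, remaining, suspect). The suspect is only a guess:
    it is the top entry of the second list, on the assumption that the
    stage run stopped at the first saveset it could not read.
    """
    still_there = set(after)
    moved = [s for s in before if s not in still_there]
    remaining = list(after)
    suspect = remaining[0] if remaining else None
    return moved, remaining, suspect
```

The list for the next nsrstage run would then be remaining[1:], i.e. the
second list with the suspect taken out.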

Repeat this process until you are left with a list of ssids that you
suspect will not stage. Try staging them individually (hopefully this
will trim the list down further).
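
As a rough sketch of the whole loop, assuming you wrap the actual staging
in a function of your own: stage here is a hypothetical callable, not a
NetWorker API, that takes a list of ssids, tries to move them in order
and returns the ones that moved before it failed. In practice you would
probably also have to restart the NetWorker daemons between runs, which
this does not show.

```python
def isolate_bad_savesets(ssids, stage):
    """Stage in bulk, setting aside the saveset each failed run stopped on.

    `stage` is a hypothetical callable supplied by the caller: it takes
    a list of ssids and returns the ssids that were actually moved.
    """
    pending = list(ssids)
    suspects = []
    while pending:
        moved = set(stage(pending))
        pending = [s for s in pending if s not in moved]
        if pending:
            # The run stopped early: treat the top of what is left as the
            # suspect and carry on with the rest of the list.
            suspects.append(pending.pop(0))
    # Try each suspect on its own; keep only those that still will not move.
    return [s for s in suspects if not stage([s])]
```

Each pass either empties the list or sets one saveset aside, so the loop
always finishes. What comes back is the list of savesets that would not
move even when staged individually.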

By now you are down to the savesets that definitely will not move. Now
it is getting towards crunch time: you may lose some or all of the data
in these savesets, so make a note of what they contain (even if it is
just to list the data that was lost when you come to write up the
incident). You
may, at this stage, decide that the remaining data is lost and call it a
day at that. Depending on the amount of data left I'd start trying to
see what sort of corruption you have. Is it just that Networker can't
understand what's there or does the OS barf when it tries to read it? If
it is the latter, does the OS offer any self-heal utilities (Windows has
its disk check, etc.)? Try the web for inspiration.

If all else fails you could start enquiring round those companies that
offer a data recovery service.

The situation might not be as bad as it seems.

Good luck
Bob

Bob Wood
Senior Technical Analyst

ITC - EMEAE London eHub
Chevron Limited
1 Westferry Circus, Canary Wharf
London, E14 4HA
Tel +44(0)20 7719 3885
Fax +44(0)20 7719 5101
woodr AT chevron DOT com

Chevron Limited. Registered in England and Wales (145197). Registered
office 1 Westferry Circus, Canary Wharf, London, England, E14 4HA


 

>-----Original Message-----
>From: EMC NetWorker discussion 
>[mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU] On Behalf Of Uwe Weber
>Sent: 14 June 2007 12:21
>To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
>Subject: [Networker] nsrmmdbd exiting on signal 11
>
>Hi all!
>
>I have the following problem with an advanced filetype device 
>and staging on a customer's site.
>
>They had a SAN outage some weeks ago and staging from one of 
>their 4 filetype devices does not work correctly anymore.
>All devices reside on the same RAID array and share one filesystem.
>The Networker server is on Solaris 10. 
>
>Whenever they try to stage from one of the devices,  nsrmmdbd 
>exits on signal 11 and the other NW processes hang and have to 
>be killed off.  If you start the staging from the NMC this 
>happens immediately; if you use nsrstage -S it might stage a 
>few (< 5) savesets before dying. Staging from all other devices works.
>
>nsrim -X did not help. 
>
>Unfortunately, it took them some time to notice this, so the 
>ft device has grown to about 2.1 TB. 
>
>Powerlink searches came up empty.
>
>Did anybody ever see this behaviour and has some advice to share ?
>
>TIA and regards,
>uwe
>
>To sign off this list, send email to 
>listserv AT listserv.temple DOT edu and type "signoff networker" in 
>the body of the email. Please write to 
>networker-request AT listserv.temple DOT edu if you have any problems 
>with this list. You can access the archives at 
>http://listserv.temple.edu/archives/networker.html or via RSS 
>at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
>