Networker

Re: [Networker] "Operation would block" error causing failures

2009-10-02 16:14:53
From: jee <jee AT ERESMAS DOT NET>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Fri, 2 Oct 2009 16:38:43 +0100
Michael,

"Operation would block" means that NetWorker is trying to back up a file that
is in use by some other process.

If the AV is definitely not the problem (can you stop it completely and try
just once as a test?), then maybe some other process (user, application) is
accessing one of the files.

It may also be that a stale handle is still open and needs to be closed.

(Check with Sysinternals "handle", "Process Explorer" or the like.)
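
If you want to script that check, something along these lines might do as a rough sketch. It assumes Sysinternals handle.exe is on the PATH and is run from an elevated prompt, and that its output lines look roughly like "name.exe  pid: 1234  type: File  1A4: H:\path" (the exact layout varies between versions, so treat the regex as a guess to adjust). The function names and the sample path in the usage note are made up for illustration.

```python
import re
import subprocess

# Assumed layout of one handle.exe result line, e.g.
#   WINWORD.EXE   pid: 5024   type: File   1A4: H:\PHAEBS\users\report.doc
# Older versions omit the "type:" column, so it is optional here.
HANDLE_LINE = re.compile(
    r"^(?P<proc>.+?)\s+pid:\s*(?P<pid>\d+)\s+"
    r"(?:type:\s*(?P<type>\S+)\s+)?"
    r"(?P<handle>[0-9A-Fa-f]+):\s+(?P<path>.+?)\s*$"
)


def parse_handle_output(text):
    """Return one dict per line that looks like an open-handle entry."""
    holders = []
    for line in text.splitlines():
        match = HANDLE_LINE.match(line.strip())
        if match:
            holders.append({
                "process": match.group("proc"),
                "pid": int(match.group("pid")),
                "path": match.group("path"),
            })
    return holders


def holders_of(path_fragment, handle_exe="handle.exe"):
    """Run handle.exe with a name fragment and parse what it prints."""
    result = subprocess.run(
        [handle_exe, "-accepteula", path_fragment],
        capture_output=True,
        text=True,
    )
    return parse_handle_output(result.stdout)
```

Calling holders_of(r"H:\PHAEBS") just before the save starts should list any process that still has something open under that folder; an empty list suggests the problem lies elsewhere.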

Regards,
Jee



On Friday 02 October 2009 3:16:32 pm Michael Leone wrote:
> I'm experiencing problems with a save job on a dedicated storage node
> (AFTD device). This is a Win2003 cluster. The job has been running fine
> for years, and no parameters have changed. However, it's been failing
> regularly now with errors like:
>
> * nt_san1:H:\PHAEBS 7164:save.: shutdown failed
>      * <ERROR> :  Error: nsr_end failed: Operation would block
>
> ("nt_san1" being the name of the cluster resource being backed up)
>
> Additionally, I see errors like:
>
> NetWorker media: (emergency) Cannot write to
> X:\DBO\18\63\9acd2fea-00000006-12c5b6f0-4ac5b6f0-03ef0000-0a407d46 -
> errno=22
>
> ("X:\DBO" being the path the AFTD device writes to. BTW, couldn't EMC have
> at least included a client name with this error? Sheesh ...)
>
> The weird thing is that it's always the same 2 folders that error out (we
> enumerate about a dozen folders, for speed reasons, rather than just
> saying ALL). Both are very large (one is like 260G, the other about 900G).
>
>
> At first we thought we had mis-set the anti-virus scan, which was recently
> re-installed. We did see errors - errors showed up in the system log -
> that indicated that a scan was taking too long, and was being terminated.
> So we increased the scanning time, and also told McAfee 8.5 *not* to scan
> files opened for backup. However, the job is still failing, but now
> there is no error in the system event log (so we don't think that it's
> McAfee blocking access to files). And backups of other folders/shares on
> this same server go off without a hitch, no AV errors at all.
>
> I turned on "verbose" on a test job that did only those 2 folders, but all
> it shows is the "Operation would block" message. The NSR log shows:
>
> ----------------
> 32496 10/2/2009 3:54:06 AM  2 0 0 4028 5212 0 admnman004 savegrp job
> (3264918) host: nt_san1 savepoint: H:\PHAEBS had ERROR indication(s) at
> completion.
> Unable to render the following message: savegrp:RESTART FAILED JOBS *
> nt_san1:H:\PHAEBS  See the file D:\Program Files\Legato\nsr\tmp\sg\RESTART
> FAILED JOBS\sso.000005 for output of save command.
>
> 7341 10/2/2009 3:54:07 AM  2 0 0 4028 5212 0 admnman004 savegrp
> nt_san1:H:\PHAEBS failed.
> -----------------
>
> The referenced file shows each file being backed up, and ends with
> "7164:save: shutdown failed".  Not a whole lot of use, IMO ....
>
> The job is so large (the 900G folder is my main user home directories)
> that I can't just run it at a whim, as it takes multiple hours, especially
> since it has to write to disk first, then clone to tape.
>
> I've been searching PowerLink, which has been less than helpful. Web
> searches indicate turning off AV, but the saves of other folders on this
> server show no problems.
>
> I'm about to get EMC on the line (severity 2, as I'm not down, actually,
> but I am impacted in a major way - tonight is EOM backup ...)
>
> Thoughts? Next steps?

To sign off this list, send email to listserv AT listserv.temple DOT edu and 
type "signoff networker" in the body of the email. Please write to 
networker-request AT listserv.temple DOT edu if you have any problems with this 
list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
