[Veritas-bu] Hung AVRD [SUMMARY]

2002-10-25 10:09:32
Subject: [Veritas-bu] Hung AVRD [SUMMARY]
From: mcavoy76 AT hotmail DOT com (Chris McAvoy)
Date: Fri, 25 Oct 2002 09:09:32 -0500
> To get AVRD back you typically need to power reset or IPL the tape drives

We've power-cycled all our drives, and there's still no change.  We've
"rebooted" the port on the switch.  The only things we haven't done are
rebooting the switch and rebooting the server.

At this point, we're convinced that it's nearly impossible to un-hang a hung
AVRD.  We're trying to adopt the Veritas-recommended (and David Chapa notarized)
best practice:  if a drive is going to be IPL'd, or removed, or touched, or
looked at, stop AVRD on the machines that touch that drive.

We're not very happy with the solution, and would much prefer an AVRD
capable of handling change.
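
For anyone who wants to check which drive paths are unresponsive without
risking another hung process, here is a minimal Python sketch. It is not
part of NetBackup and has not been run against our drives: the function
name `probe_device` and the example path `/dev/rmt/0` are assumptions for
illustration, and it assumes that simply opening a dead drive's device
node stalls the same way AVRD's I/O does. The open happens in a child
process, and the parent gives up after a deadline instead of waiting
forever.

```python
import subprocess
import sys
import time

# One-line child program: open the path read-only, then close it.
_CHILD_CODE = (
    "import os, sys; "
    "fd = os.open(sys.argv[1], os.O_RDONLY); "
    "os.close(fd)"
)


def probe_device(path, timeout_s=10.0):
    """Return 'ok', 'error', or 'hung' for a device path (hypothetical helper)."""
    child = subprocess.Popen(
        [sys.executable, "-c", _CHILD_CODE, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        rc = child.poll()  # non-blocking check; never wait() on the child
        if rc is not None:
            return "ok" if rc == 0 else "error"
        time.sleep(0.1)
    # Best effort only: a child blocked inside the kernel may ignore the
    # signal, so we deliberately do not wait for it to exit.
    child.kill()
    return "hung"


if __name__ == "__main__":
    # Example path is an assumption; substitute your own device nodes.
    for dev in sys.argv[1:] or ["/dev/rmt/0"]:
        print(dev, probe_device(dev, timeout_s=5.0))
```

A "hung" result only says the open did not return in time. The stuck
child may stay around until the drive comes back or the machine is
rebooted, just as AVRD does, but the script itself keeps going.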

Thanks to everyone who responded.

Chris


----- Original Message -----
From: "Marelas, Peter" <MarelP AT AUSTRALIA.Stortek DOT com>
To: "'Chris McAvoy'" <mcavoy76 AT hotmail DOT com>;
<veritas-bu AT mailman.eng.auburn DOT edu>
Sent: Friday, October 25, 2002 1:58 AM
Subject: RE: [Veritas-bu] Hung AVRD


> As Richard pointed out, this is typically due to a "hung" tape drive.
> AVRD blocks on I/O in the kernel (sg) and I don't believe it ever times out.
> The FC HBA typically masks FC connectivity failures, which is probably
> not a good thing because programs like AVRD don't cope.
>
> To get AVRD back you typically need to power reset or IPL the tape drives
> to find the culprit. I have found AVRD will not return until the tape
> drive it is blocked on comes to life again.
>
> Regards
> Peter Marelas
>
> -----Original Message-----
> From: Chris McAvoy [mailto:mcavoy76 AT hotmail DOT com]
> Sent: Thursday, 24 October 2002 1:57 AM
> To: veritas-bu AT mailman.eng.auburn DOT edu
> Subject: [Veritas-bu] Hung AVRD
>
>
> We've been fighting with hung AVRD processes for a while now.  We've tried
> patching, rebuilding sg devices, and reloading the st driver.  This problem
> only happens on Sun machines.  We're running 3.4 in a SAN / SSO environment.
>
> We don't know what causes them, or how to kill them.  Has anyone had similar
> problems, or can anyone shed some light on what's causing the process to hang?
> Although we're sure you can't kill it without a reboot, we'd like to know if
> there's some way we can prevent it from hanging.
>
> Thanks,
> Chris McAvoy
> _______________________________________________
> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>
