Hi,
13.11.2008 12:05, Ronald Buder wrote:
> Hi,
>
> we have noticed a blocker which may be resolved in later versions of the
> file daemon; if not, I will file it as a bug. If, for whatever reason, a
> network share that is (implicitly) included in the fileset becomes
> unavailable, the job will stall.
This is normal NFS behaviour - if an NFS server doesn't respond, the
processes accessing it wait in an uninterruptible state. They are also
not notified of the problem by a signal.
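That said, newer NFS client implementations allow changing that
behaviour. Under Linux, the NFS mount options "soft" and "intr" can
be used: with "soft", a request that keeps timing out eventually fails
with an I/O error instead of being retried forever, and "intr" allows
signals to interrupt a pending NFS operation.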
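To see which NFS shares on a Linux client are still mounted without "soft", something like the following might help. It is only a sketch: the function name hard_nfs_mounts is made up for illustration, and it assumes the mount table is in the /proc/mounts format (device, mount point, file system type, comma-separated options).

```python
def hard_nfs_mounts(mount_table: str) -> list[str]:
    """Return mount points of NFS file systems mounted without the 'soft' option.

    mount_table is text in the /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    hard = []
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        options = fields[3].split(",")
        # Without 'soft', the client retries indefinitely (a "hard" mount).
        if fs_type in ("nfs", "nfs4") and "soft" not in options:
            hard.append(mount_point)
    return hard
```

On a Linux client you would pass it the contents of /proc/mounts, for example hard_nfs_mounts(open("/proc/mounts").read()). Every mount point it returns is one where a dead server can leave processes hanging.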
> At this very moment I am waiting for four backup
> jobs. I have tried to cancel them without any success. The jobs have
> been running for some 8 hours now; the cancellation attempt was about 3
> hours ago. As the rest of the system is still up and running and doing
> backups and migration I do not want to restart the director.
You will have to either restart the clients that mount the NFS shares,
or make the NFS server responsive again.
> Running Jobs:
> Console connected at 13-Nov-08 10:16
> JobId Level   Name                                    Status
> ======================================================================
> 41637 Increme PLATON-W0001_System.2008-11-13_04.00.21 has been canceled
> 41641 Increme PLATON-W0003_System.2008-11-13_04.00.25 has been canceled
> 41643 Increme PLATON-W0004_System.2008-11-13_04.00.27 has been canceled
> 41645 Increme PLATON-W0005_System.2008-11-13_04.00.29 has been canceled
>
> Due to a server failure the nfs shares are not available anymore. I
> would like to see some sort of a timeout at least if that is at all
> possible.
That's not possible inside Bacula - the FD simply can't terminate file
system accesses that are stalled due to NFS problems.
The best thing to do is often a restart of the NFS server.
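What you can do outside Bacula is check whether a share still answers before the job touches it, for example from a script run on the client ahead of the backup. The following is a rough sketch, not anything Bacula provides: the names command_finishes and path_responds and the 10 second default are my own assumptions. The file system access happens in a child process, so if the share is dead only the child gets stuck and the caller can give up after the timeout.

```python
import subprocess
import sys


def command_finishes(cmd: list[str], timeout: float) -> bool:
    """Run cmd in a child process.

    Return True only if it exits with status 0 within timeout seconds.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return proc.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Send SIGKILL but do not wait for the child afterwards: a child
        # stuck in an uninterruptible NFS access may never exit.
        proc.kill()
        return False


def path_responds(path: str, timeout: float = 10.0) -> bool:
    """Return True if a stat() of path completes within timeout seconds."""
    # The stat() call runs in the child, so only the child can hang.
    stat_cmd = [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path]
    return command_finishes(stat_cmd, timeout)
```

If path_responds() returns False for a mount point in the fileset, the script could exit with a non-zero status so the job fails early instead of stalling. It does not rescue a job that is already hanging, and a stuck child process may stay around until the server comes back.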
Arno
> The reason why I did not file the bug right away is because it may have
> been resolved with a later client version already, I will try to
> reproduce the steps with a more current version of the file daemon and
> post the news here. Any experiences on that matter are of course welcome...
>
> Client: Sparc Solaris 10 (SunOS 5.9 Generic_122300-15 sun4u sparc
> SUNW,Sun-Fire-V490), FD-Version: 2.2.8
> Server: Debian Etch, Dir-Version: 2.4.3
>
> Best regards,
>
> Ronald
>
--
Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
www.its-lehmann.de
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users