[Bacula-users] Unable to cancel jobs

2010-12-09 10:16:22
Subject: [Bacula-users] Unable to cancel jobs
From: Alan Gerber <agerber AT ncsu DOT edu>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 9 Dec 2010 10:13:15 -0500
All,

I have a Bacula 5.0.3 installation with a peculiar problem that I need
help resolving:

Once a job starts (as in, gets accepted by the SD as a job to run), I
am unable to cancel it.  Using the "cancel" bconsole command, I
receive the following output:

*cancel jobid=205
2001 Job fileserver2_Backup.2010-12-09_09.39.25_04 marked to be canceled.
3000 Job fileserver2_Backup.2010-12-09_09.39.25_04 marked to be canceled.
You have messages.
*messages
09-Dec 09:45 director.f.q.d.n-dir JobId 205: Fatal error: Network error with FD during Backup: ERR=Interrupted system call
09-Dec 09:45 sd.f.q.d.n-sd JobId 205: JobId=205 Job="fileserver2_Backup.2010-12-09_09.39.25_04" marked to be canceled.
09-Dec 09:45 sd.f.q.d.n-sd JobId 205: JobId=205 Job="fileserver2_Backup.2010-12-09_09.39.25_04" marked to be canceled.
09-Dec 09:45 director.f.q.d.n-dir JobId 205: Fatal error: No Job status returned from FD.
09-Dec 09:45 director.f.q.d.n-dir JobId 205: Bacula director.f.q.d.n-dir 5.0.3 (04Aug10): 09-Dec-2010 09:45:16
  Build OS:               i686-pc-linux-gnu redhat Enterprise release
  JobId:                  205
  Job:                    fileserver2_Backup.2010-12-09_09.39.25_04
  Backup Level:           Full
  Client:                 "fileserver2.f.q.d.n-fd" 5.0.3 (04Aug10) Linux,Cross-compile,Win32
  FileSet:                "fileserver2 Set" 2010-11-08 23:05:00
  Pool:                   "Default" (From Job resource)
  Catalog:                "Catalog" (From Client resource)
  Storage:                "tapelibrary" (From Pool resource)
  Scheduled time:         09-Dec-2010 09:39:25
  Start time:             09-Dec-2010 09:39:28
  End time:               09-Dec-2010 09:45:16
  Elapsed time:           5 mins 48 secs
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       0 (0 B)
  Rate:                   0.0 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               yes
  Volume name(s):
  Volume Session Id:      1
  Volume Session Time:    1291904997
  Last Volume Bytes:      187,697,986,560 (187.6 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  Error
  Termination:            Backup Canceled

The director then treats this job as canceled, but both the FD and the
SD treat it as if it were still running.  Canceling a job before it
becomes runnable (say, if the SD has exceeded the Max Storage Jobs
directive) works perfectly fine.  As a side effect of this problem,
the SD keeps the backup device open indefinitely.  I have to kill
both the FD and the SD in order to reset things back to a working
state following a canceled job.
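
One way to confirm the mismatch described above is to ask each daemon
for its own view of the running jobs right after the cancel.  The
following is only a minimal sketch, not a tested procedure: the helper
name status_commands is hypothetical, and the Storage and Client names
passed to it are assumed to match the resource names shown in the job
report.

```shell
#!/bin/sh
# Hypothetical helper: print the bconsole commands that ask the director,
# one storage daemon and one file daemon for their status.
# $1 = Storage resource name, $2 = Client resource name (both assumed).
status_commands() {
    printf 'status dir\n'
    printf 'status storage=%s\n' "$1"
    printf 'status client=%s\n' "$2"
    printf 'quit\n'
}
```

Assuming bconsole was installed under the sbindir given to configure,
the output can be piped straight into it:

  status_commands tapelibrary fileserver2.f.q.d.n-fd | \
    /scratch/bacula/bin/bconsole -c /scratch/bacula/etc/bconsole.conf

If the director no longer lists JobId 205 as running while the SD and
FD still do, that matches the behavior described here.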

I'd appreciate some guidance on how to go about diagnosing and
resolving this problem.
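
For anyone trying to reproduce this, one possible starting point is to
raise the debug level on the SD and FD before issuing the cancel, using
the bconsole setdebug command.  This is a sketch under assumptions
rather than a known fix: the helper name debug_commands is
hypothetical, and the level of 200 used in the example below is an
arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: print bconsole commands that set the debug level
# on one storage daemon and one file daemon.
# $1 = debug level, $2 = Storage resource name, $3 = Client resource name.
debug_commands() {
    printf 'setdebug level=%s storage=%s\n' "$1" "$2"
    printf 'setdebug level=%s client=%s\n' "$1" "$3"
    printf 'quit\n'
}
```

For example, "debug_commands 200 tapelibrary fileserver2.f.q.d.n-fd"
prints three lines that can be piped into bconsole; calling it again
with a level of 0 turns the extra output back off.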

Thanks!

--
Alan Gerber



Some configuration notes:
I'd be happy to provide additional configuration details (such as the
contents of the bacula-*.conf files) if anyone thinks that would be
helpful.

The director and SD are running on the same machine.  The example
output above was generated by an FD running on a different machine from
the director and SD, but I can replicate the problem with an FD running
locally on the same machine as the director and SD, as well as with any
other FD in my installation.

uname -a outputs "Linux director.f.q.d.n 2.6.18-194.8.1.el5 #1 SMP Wed
Jun 23 10:58:38 EDT 2010 i686 i686 i386 GNU/Linux"

Bacula was built from source, using the following configure:

./configure \
 --sbindir=/scratch/bacula/bin \
 --sysconfdir=/scratch/bacula/etc \
 --enable-smartalloc \
 --enable-batch-insert \
 --enable-largefile \
 --with-mysql \
 --with-openssl \
 --enable-conio \
 --with-working-dir=/scratch/bacula/var

...which resulted in the following summary output:


Configuration on Thu Sep  9 11:25:22 EDT 2010:

   Host:                    i686-pc-linux-gnu -- redhat Enterprise release
   Bacula version:          Bacula 5.0.3 (04 August 2010)
   Source code location:    .
   Install binaries:        /scratch/bacula/bin
   Install libraries:       /usr/lib
   Install config files:    /scratch/bacula/etc
   Scripts directory:       /scratch/bacula/etc
   Archive directory:       /tmp
   Working directory:       /scratch/bacula/var
   PID directory:           /var/run
   Subsys directory:        /var/lock/subsys
   Man directory:           ${datarootdir}/man
   Data directory:          /usr/share
   Plugin directory:        /usr/lib
   C Compiler:              gcc 4.1.2
   C++ Compiler:            /usr/bin/g++ 4.1.2
   Compiler flags:           -g -O2 -Wall -fno-strict-aliasing -fno-exceptions -fno-rtti
   Linker flags:
   Libraries:               -lpthread -ldl
   Statically Linked Tools: no
   Statically Linked FD:    no
   Statically Linked SD:    no
   Statically Linked DIR:   no
   Statically Linked CONS:  no
   Database type:           MySQL
   Database port:
   Database lib:            -L/usr/lib/mysql -lmysqlclient_r -lz
   Database name:           bacula
   Database user:           bacula

   Job Output Email:        root@localhost
   Traceback Email:         root@localhost
   SMTP Host Address:       localhost

   Director Port:           9101
   File daemon Port:        9102
   Storage daemon Port:     9103

   Director User:
   Director Group:
   Storage Daemon User:
   Storage DaemonGroup:
   File Daemon User:
   File Daemon Group:

   SQL binaries Directory   /usr/bin

   Large file support:      yes
   Bacula conio support:    yes -lncurses
   readline support:        no
   TCP Wrappers support:    no
   TLS support:             yes
   Encryption support:      yes
   ZLIB support:            yes
   enable-smartalloc:       yes
   enable-lockmgr:          no
   bat support:             no
   enable-gnome:            no
   enable-bwx-console:      no
   enable-tray-monitor:     no
   client-only:             no
   build-dird:              yes
   build-stored:            yes
   Plugin support:          yes
   AFS support:             no
   ACL support:             no
   XATTR support:           yes
   Python support:          no
   Batch insert enabled:    yes
