All,
I have a Bacula 5.0.3 installation with a peculiar problem that I need
help resolving:
Once a job starts (as in, gets accepted by the SD as a job to run), I
am unable to cancel it. Using the "cancel" bconsole command, I
receive the following output:
*cancel jobid=205
2001 Job fileserver2_Backup.2010-12-09_09.39.25_04 marked to be canceled.
3000 Job fileserver2_Backup.2010-12-09_09.39.25_04 marked to be canceled.
You have messages.
*messages
09-Dec 09:45 director.f.q.d.n-dir JobId 205: Fatal error: Network
error with FD during Backup: ERR=Interrupted system call
09-Dec 09:45 sd.f.q.d.n-sd JobId 205: JobId=205
Job="fileserver2_Backup.2010-12-09_09.39.25_04" marked to be canceled.
09-Dec 09:45 sd.f.q.d.n-sd JobId 205: JobId=205
Job="fileserver2_Backup.2010-12-09_09.39.25_04" marked to be canceled.
09-Dec 09:45 director.f.q.d.n-dir JobId 205: Fatal error: No Job
status returned from FD.
09-Dec 09:45 director.f.q.d.n-dir JobId 205: Bacula
director.f.q.d.n-dir 5.0.3 (04Aug10): 09-Dec-2010 09:45:16
Build OS: i686-pc-linux-gnu redhat Enterprise release
JobId: 205
Job: fileserver2_Backup.2010-12-09_09.39.25_04
Backup Level: Full
Client: "fileserver2.f.q.d.n-fd" 5.0.3 (04Aug10)
Linux,Cross-compile,Win32
FileSet: "fileserver2 Set" 2010-11-08 23:05:00
Pool: "Default" (From Job resource)
Catalog: "Catalog" (From Client resource)
Storage: "tapelibrary" (From Pool resource)
Scheduled time: 09-Dec-2010 09:39:25
Start time: 09-Dec-2010 09:39:28
End time: 09-Dec-2010 09:45:16
Elapsed time: 5 mins 48 secs
Priority: 10
FD Files Written: 0
SD Files Written: 0
FD Bytes Written: 0 (0 B)
SD Bytes Written: 0 (0 B)
Rate: 0.0 KB/s
Software Compression: None
VSS: no
Encryption: no
Accurate: yes
Volume name(s):
Volume Session Id: 1
Volume Session Time: 1291904997
Last Volume Bytes: 187,697,986,560 (187.6 GB)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: Error
SD termination status: Error
Termination: Backup Canceled
The director then treats this job as canceled, but both the FD and the
SD treat it as if it were still running. Canceling a job before it
becomes runnable (say, if the SD has exceeded the Max Storage Jobs
directive) works perfectly fine. As a side effect of this problem,
the SD keeps the backup device open indefinitely. I have to kill
both the FD and the SD in order to reset things back to a working
state following a canceled job.
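A minimal sketch of how each daemon's view of the job can be compared
after the cancel. It only prints the bconsole "status" commands for the
director, the FD and the SD; the client and storage names are copied
from the job report above, and the function name status_commands is
just an illustrative helper, not part of Bacula.

```shell
#!/bin/sh
# Print the bconsole commands that show each daemon's own view of
# running jobs. Resource names are taken from the job report above;
# adjust them for another client or storage resource.
CLIENT="fileserver2.f.q.d.n-fd"
STORAGE="tapelibrary"

status_commands() {
    # $1 = client resource name, $2 = storage resource name
    printf 'status dir\n'
    printf 'status client=%s\n' "$1"
    printf 'status storage=%s\n' "$2"
}

status_commands "$CLIENT" "$STORAGE"
```

The output is meant to be piped into bconsole, for example
"sh check-status.sh | /scratch/bacula/bin/bconsole". The script name is
hypothetical and the bconsole path is an assumption based on the sbindir
setting listed below. If the director no longer lists the job while the
FD and SD still do, that matches the behavior described above.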
I'd appreciate some guidance on how to go about diagnosing and
resolving this problem.
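One diagnostic step that might help narrow this down is raising the
debug level on all three daemons before repeating the cancel. The
sketch below prints the bconsole "setdebug" commands for that; the
level of 200 is an arbitrary choice, the resource names are copied from
the job report above, and debug_commands is an illustrative helper
name only.

```shell
#!/bin/sh
# Print bconsole commands that raise the debug level on the director,
# the FD and the SD, with trace output enabled (trace=1).
debug_commands() {
    # $1 = debug level, $2 = client resource name, $3 = storage resource name
    printf 'setdebug level=%s trace=1 dir\n' "$1"
    printf 'setdebug level=%s trace=1 client=%s\n' "$1" "$2"
    printf 'setdebug level=%s trace=1 storage=%s\n' "$1" "$3"
}

# Level 200 is an assumption; lower it if the trace output is too noisy.
debug_commands 200 "fileserver2.f.q.d.n-fd" "tapelibrary"
```

As with any bconsole input, the output can be piped into bconsole. With
trace=1 each daemon should write its debug output to a trace file, which
I would expect to find in that daemon's working directory. Running the
same commands with level=0 turns debugging back off.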
Thanks!
--
Alan Gerber
Some configuration notes:
I'd be happy to provide additional configuration details (such as the
contents of the bacula-*.conf files) if anyone thinks that would be
helpful.
The director and SD are running on the same machine. The example
output above was generated by an FD running on a different machine from
the director and SD, but I can reproduce the problem with an FD running
locally on the same machine as the director and SD, as well as with any
other FD in my installation.
uname -a outputs "Linux director.f.q.d.n 2.6.18-194.8.1.el5 #1 SMP Wed
Jun 23 10:58:38 EDT 2010 i686 i686 i386 GNU/Linux"
Bacula was built from source, using the following configure:
./configure \
--sbindir=/scratch/bacula/bin \
--sysconfdir=/scratch/bacula/etc \
--enable-smartalloc \
--enable-batch-insert \
--enable-largefile \
--with-mysql \
--with-openssl \
--enable-conio \
--with-working-dir=/scratch/bacula/var
...which resulted in the following summary output:
Configuration on Thu Sep 9 11:25:22 EDT 2010:
Host: i686-pc-linux-gnu -- redhat Enterprise release
Bacula version: Bacula 5.0.3 (04 August 2010)
Source code location: .
Install binaries: /scratch/bacula/bin
Install libraries: /usr/lib
Install config files: /scratch/bacula/etc
Scripts directory: /scratch/bacula/etc
Archive directory: /tmp
Working directory: /scratch/bacula/var
PID directory: /var/run
Subsys directory: /var/lock/subsys
Man directory: ${datarootdir}/man
Data directory: /usr/share
Plugin directory: /usr/lib
C Compiler: gcc 4.1.2
C++ Compiler: /usr/bin/g++ 4.1.2
Compiler flags: -g -O2 -Wall -fno-strict-aliasing
-fno-exceptions -fno-rtti
Linker flags:
Libraries: -lpthread -ldl
Statically Linked Tools: no
Statically Linked FD: no
Statically Linked SD: no
Statically Linked DIR: no
Statically Linked CONS: no
Database type: MySQL
Database port:
Database lib: -L/usr/lib/mysql -lmysqlclient_r -lz
Database name: bacula
Database user: bacula
Job Output Email: root@localhost
Traceback Email: root@localhost
SMTP Host Address: localhost
Director Port: 9101
File daemon Port: 9102
Storage daemon Port: 9103
Director User:
Director Group:
Storage Daemon User:
Storage DaemonGroup:
File Daemon User:
File Daemon Group:
SQL binaries Directory /usr/bin
Large file support: yes
Bacula conio support: yes -lncurses
readline support: no
TCP Wrappers support: no
TLS support: yes
Encryption support: yes
ZLIB support: yes
enable-smartalloc: yes
enable-lockmgr: no
bat support: no
enable-gnome: no
enable-bwx-console: no
enable-tray-monitor: no
client-only: no
build-dird: yes
build-stored: yes
Plugin support: yes
AFS support: no
ACL support: no
XATTR support: yes
Python support: no
Batch insert enabled: yes
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users