Re: [Bacula-users] bacula watchdog killing tape-loading
2008-08-07 08:52:51
user "bacula" is member of "disk" and "bacula":
cat /etc/group|grep bacula
disk:x:6:root,bacula
bacula:x:101:bacula,apache
the autochanger is:
ll /dev/sg3
crw-rw---- 1 root disk 21, 3 7. Aug 12:16 /dev/sg3
Thanks, Nils
Jason A. Kates schrieb:
What groups is the bacula user in and what are the perms on /dev/sg3?
Thanks -Jason
On Thu, 2008-08-07 at 14:02 +0200, Nils Blanck-Wehde wrote:
Hi Terry,
sorry, I forgot to mention: OS is CentOS 5.2, bacula-version is 2.4.2.
Problems started all out of a sudden with 2.4.1 after running fine for
weeks.
Calling mtx and mtx-changer both as root as well as user bacula works
flawlessly (slot 1 is currently loaded):
[root@company-Backupserver ~]# mtx -f /dev/sg3 load 1 0
Drive 0 Full (Storage Element 1 loaded)
[root@company-Backupserver ~]# su - bacula
-bash-3.2$ /usr/sbin/mtx -f /dev/sg3 load 1 0
Drive 0 Full (Storage Element 1 loaded)
[root@company-Backupserver ~]# /usr/lib/bacula/mtx-changer /dev/sg3
loaded 1 /dev/nst0 0
1
[root@company-Backupserver ~]# su - bacula
-bash-3.2$ /usr/lib/bacula/mtx-changer /dev/sg3 loaded 1 /dev/nst0 0
1
-bash-3.2$
I can see from the webadmin of the autochanger (Quantum Superloader 3
DLT) that autochanger commands are being executed correctly, still
bacula reports "ERR=Child died from signal 15: Termination".
Nils
T. Horsnell schrieb:
Assuming your O/S is a Unix/Linux of some sort, have you tried the
basic mtx command on it?
Something like
mtx -f /dev/sg3 load 1 0
(See 'man mtx')
Cheers,
Terry
Hi all,
bacula won't work with our autochanger anymore. I can't find the
source of the problems.
Here is the output of the autochanger-test of: "btape
-c /etc/bacula/bacula-sd.conf /dev/nst0":
=== Autochanger test ===
3301 Issuing autochanger "loaded" command.
Slot 1 loaded. I am going to unload it.
3302 Issuing autochanger "unload 1 0" command.
unload status=Bad 134217743
3992 Bad autochanger command: /usr/lib/bacula/mtx-changer /dev/sg3
unload 1 /dev/nst0 0
3992 result="Unloading drive 0 into Storage Element 1...done
Program killed by Bacula watchdog (timeout)
": ERR=Child died from signal 15: Termination
3303 Issuing autochanger "load 1 0" command.
3993 Bad autochanger command: /usr/lib/bacula/mtx-changer /dev/sg3
load 1 /dev/nst0 0
3993 result="Loading media from Storage Element 1 into drive
0...done
Program killed by Bacula watchdog (timeout)
": ERR=Child died from signal 15: Termination
You must correct this error or the Autochanger will not work.
This is the storage-definition:
Autochanger {
Name = QS3DLT
Device = DLT-Drive-1
Changer Command = "/usr/lib/bacula/mtx-changer %c %o %S %a %d"
Changer Device = /dev/sg3
}
Device {
Name = DLT-Drive-1 #
Drive Index = 0
Media Type = DLT-VS1
Archive Device = /dev/nst0
AutomaticMount = yes; # when device opened, read
it
AlwaysOpen = no;
RemovableMedia = yes;
RandomAccess = no;
AutoChanger = yes
Maximum Changer Wait = 10
Maximum Rewind Wait = 10
Maximum Open Wait = 10
}
When I look at the webadmin of the autochanger I see the
autochanger and the drive perform exactly the requested operations
at the usual speed (~1:30min for an unload operation, ~3:50 for an
unload/load operation).
Still I get lots of killing / timeout problems.
I start to wonder if the autochanger is somewhat defective...
If any of you guys can help I would greatly appreciate it.
All the best, Nils
Nils Blanck-Wehde schrieb:
Hi John,
thanks for your help. I am quite new to bacula and it seems to
take some time to fully understand it :-)
I am not sure whether it really is a timeout problem.
I increased all timeout values to 10 minutes and the killing
still occurs (after 2:20min):
07-Aug 12:20 company_bacula-dir JobId 217: Start Backup JobId
217,
Job=Fileserver_Lexware_Exchange_to_Tape.2008-08-07_12.20.03
07-Aug 12:20 company_bacula-dir JobId 217: Using Device
"DLT-Drive-1"
07-Aug 12:20 company_bacula-sd JobId 217: 3301 Issuing
autochanger "loaded? drive 0" command.
07-Aug 12:20 company_bacula-sd JobId 217: 3302 Autochanger
"loaded? drive 0", result: nothing loaded.
07-Aug 12:20 company_bacula-sd JobId 217: 3304 Issuing
autochanger "load slot 8, drive 0" command.
*messages
07-Aug 12:22 company_bacula-sd JobId 217: Fatal error: 3992 Bad
autochanger "load slot 8, drive 0": ERR=Child died from signal
15: Termination.
Results=Loading media from Storage Element 8 into drive
0...done
Program killed by Bacula watchdog (timeout)
07-Aug 12:20 company-appsrv-fd JobId 217: Fatal
error: ../../filed/job.c:1817 Bad response to Append Data
command. Wanted 3000 OK data, got 3903 Error append data
I doublechecked the time needed by the autochanger for an
unload/load operation: I did a couple of operations and they all
terminated in less than four minutes.
I think there is a general problem with the autochanger because
btape test throws the following error when testing the
autochanger:
3301 Issuing autochanger "loaded" command.
3991 Bad autochanger
command: /usr/lib/bacula/mtx-changer /dev/sg3 loaded 1 /dev/nst0
0
3991 result="": ERR=Child died from signal 15: Termination
You must correct this error or the Autochanger will not work.
When I run this command "/usr/lib/bacula/mtx-changer /dev/sg3
loaded 1 /dev/nst0 0" manually, both as root and as user bacula,
it returns "1":
[root@company-Backupserver ~]# su - bacula
-bash-3.2$ /usr/lib/bacula/mtx-changer /dev/sg3 loaded
1 /dev/nst0 0
1
-bash-3.2$
Could it be a permission problem? What Uid Gid should
mtx-changer have? Could some automatic mechanism on my CentOS
installation have changed the permissions of the autochanger
device to a lower level?
I am a little stuck here.
Thanks for all help!
Nils
John Drescher schrieb:
On Wed, Aug 6, 2008 at 1:42 PM, Nils Blanck-Wehde
<nils.blanck-wehde AT backofficeservice DOT biz>
<mailto:nils.blanck-wehde AT backofficeservice DOT biz> wrote:
Hello list,
I just encountered this issue:
06-Aug 19:31 company_bacula-sd JobId 210: Fatal error: 3992
Bad
autochanger "load slot 8, drive 0": ERR=Child died from
signal 15:
Termination.
Results=Loading media from Storage Element 8 into drive
0...done
Program killed by Bacula watchdog (timeout)
Earlier today I got this message:
06-Aug 17:10 company_bacula-dir JobId 208: Using Device
"DLT-Drive-1"
06-Aug 17:10 company_bacula-sd JobId 208: 3301 Issuing
autochanger "loaded? drive 0" command.
06-Aug 17:10 company_bacula-sd JobId 208: 3302 Autochanger
"loaded? drive 0", result: nothing loaded.
06-Aug 17:10 company_bacula-sd JobId 208: 3304 Issuing
autochanger "load slot 1, drive 0" command.
06-Aug 17:13 company-appsrv-fd JobId 208: Fatal
error: ../../filed/job.c:1817 Bad response to Append Data
command. Wanted 3000 OK data
, got 3903 Error append data
06-Aug 17:15 company_bacula-sd JobId 208: Fatal error: 3992
Bad autochanger "load slot 1, drive 0": ERR=Child died from
signal 15: Termination.
Results=Loading media from Storage Element 1 into drive
0...done
Program killed by Bacula watchdog (timeout)
The default maximum changer wait is 5 minutes. If the changer
does not
complete in 5 minutes bacula will kill the mtx-changer
script.
See Maximum Changer Wait in
http://bacula.org/en/rel-manual/Storage_Daemon_Configuratio.html
John
--
*B**ack**O**ffice**S**ervice* - Beratung und Service für Ihre IT
-
*Anschrift: *
Niederkastenholzer Str. 40
53881 Euskirchen
*Telefon: *+49 2255 953204* *
*Fax: *+49 2255 953208
*Mobil: *+49 177 3397547
*Bankverbindung: *
Raiffeisenbank Rheinbach Voreifel eG
Kto. Nr. 340286014 (BLZ 370 696 27)
*Online: *
info AT backofficeservice DOT biz <mailto:bos AT blanck-wehde DOT de>
www.backofficeservice.biz <http://www.backofficeservice.biz>
-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move
Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win
great prizes
Grand prize is a trip for two to an Open Source event anywhere
in the world
http://moblin-contest.org/redirect.php?banner_id=100&url="">
<http://moblin-contest.org/redirect.php?banner_id=100&url="">
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
<mailto:Bacula-users AT lists.sourceforge DOT net>
https://lists.sourceforge.net/lists/listinfo/bacula-users
!DSPAM:489ad34a16761048915462!
--
*B**ack**O**ffice**S**ervice* - Beratung und Service für Ihre IT
-
*Anschrift: *
Niederkastenholzer Str. 40
53881 Euskirchen
*Telefon: *+49 2255 953204* *
*Fax: *+49 2255 953208
*Mobil: *+49 177 3397547
*Bankverbindung: *
Raiffeisenbank Rheinbach Voreifel eG
Kto. Nr. 340286014 (BLZ 370 696 27)
*Online: *
info AT backofficeservice DOT biz <mailto:bos AT blanck-wehde DOT de>
www.backofficeservice.biz <http://www.backofficeservice.biz>
------------------------------------------------------------------------
-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's
challenge
Build the coolest Linux based applications with Moblin SDK & win
great prizes
Grand prize is a trip for two to an Open Source event anywhere in
the world
http://moblin-contest.org/redirect.php?banner_id=100&url="">
------------------------------------------------------------------------
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
!DSPAM:489add52223816216930368!
--
BackOfficeService- Beratung und Service für Ihre IT -
Anschrift:
Niederkastenholzer Str. 40
53881 Euskirchen
Telefon: +49
2255 953204
Fax: +49
2255 953208
Mobil: +49
177 3397547
Bankverbindung:
Raiffeisenbank
Rheinbach
Voreifel eG
Kto. Nr.
340286014 (BLZ
370 696 27)
Online:
info AT backofficeservice DOT biz
www.backofficeservice.biz
-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url="">
_______________________________________________ Bacula-users mailing list Bacula-users AT lists.sourceforge DOT net https://lists.sourceforge.net/lists/listinfo/bacula-users
--
BackOfficeService
-
Beratung und Service für Ihre IT -
Anschrift:
Niederkastenholzer
Str. 40
53881
Euskirchen
|
Telefon:
+49
2255 953204
Fax:
+49
2255 953208
Mobil: +49
177 3397547
|
Bankverbindung:
Raiffeisenbank
Rheinbach Voreifel eG
Kto.
Nr. 340286014 (BLZ 370 696 27)
|
Online:
info@backofficeservice.biz
www.backofficeservice.biz
|
|
-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/ _______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
|
|
|