[Bacula-users] Back error only in one machine

2015-07-13 06:11:29
Subject: [Bacula-users] Back error only in one machine
From: Iban Cabrillo <cabrillo AT ifca.unican DOT es>
To: Bacula Users <Bacula-users AT lists.sourceforge DOT net>
Date: Mon, 13 Jul 2015 12:06:24 +0200
Hi,
  I do not really understand why this backup is failing (it is the only backup that never works). I have other machines with the same configuration (Bacula is on the internal network and runs the backup through a firewall port forward), and many others on the same internal network, and all of them work.

ii  bacula                              5.2.6+dfsg-5ubuntu1                    all          network backup service - metapackage
ii  bacula-client                       5.2.6+dfsg-5ubuntu1                    all          network backup service - client metapackage
ii  bacula-common                       5.2.6+dfsg-5ubuntu1                    amd64        network backup service - common support files
ii  bacula-common-mysql                 5.2.6+dfsg-5ubuntu1                    amd64        network backup service - MySQL common files
ii  bacula-console                      5.2.6+dfsg-5ubuntu1                    amd64        network backup service - text console
ii  bacula-director-common              5.2.6+dfsg-5ubuntu1                    amd64        network backup service - Director common files
ii  bacula-director-mysql               5.2.6+dfsg-5ubuntu1                    amd64        network backup service - MySQL storage for Director
ii  bacula-fd                           5.2.6+dfsg-5ubuntu1                    amd64        network backup service - file daemon
ii  bacula-sd                           5.2.6+dfsg-5ubuntu1                    amd64        network backup service - storage daemon
ii  bacula-sd-mysql                     5.2.6+dfsg-5ubuntu1                    amd64        network backup service - MySQL SD tools
ii  bacula-server                       5.2.6+dfsg-5ubuntu1                    all          network backup service - server metapackage
ii  bacula-traymonitor                  5.2.6+dfsg-5ubuntu1                    amd64        network backup service - tray monitor

  The Bacula backup process starts fine, and when I ask for the client's status everything seems to be OK:
 
*status client=ibergrid-voms-fd
Connecting to Client ibergrid-voms-fd at ibergrid-voms.ifca.es:9102

ibergrid-voms-fd Version: 5.0.0 (26 January 2010)  x86_64-redhat-linux-gnu redhat (Carbon)
Daemon started 10-Jul-15 13:32, 9 Jobs run since started.
 Heap: heap=135,168 smbytes=91,924 max_bytes=794,880 bufs=97 max_bufs=502
 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0

Running Jobs:
Director connected at: 13-Jul-15 10:32
No Jobs running.
====

Terminated Jobs:
 JobId  Level    Files      Bytes   Status   Finished        Name
======================================================================
 16498  Full    137,046    1.624 G  OK       10-Jul-15 05:48 BackIbergrid-voms
 16516  Full    137,045    1.231 G  OK       10-Jul-15 13:42 BackIbergrid-voms
 16516  Full    137,045    1.231 G  OK       10-Jul-15 16:43 BackIbergrid-voms
 16524  Full    137,045    1.231 G  OK       10-Jul-15 19:06 BackIbergrid-voms
 16524  Full    137,045    1.231 G  OK       10-Jul-15 22:06 BackIbergrid-voms
 16546  Full    137,056    1.231 G  OK       11-Jul-15 18:30 BackIbergrid-voms
 16546  Full    137,056    1.231 G  OK       11-Jul-15 21:32 BackIbergrid-voms
 16566  Full    137,067    1.231 G  OK       12-Jul-15 16:37 BackIbergrid-voms
 16566  Full    137,067    1.225 G  OK       12-Jul-15 19:38 BackIbergrid-voms
 16580  Full    137,078    1.226 G  OK       13-Jul-15 10:11 BackIbergrid-voms


I have spooldata = yes set:

13-Jul 10:11 bacula-sd JobId 16580: Job write elapsed time = 00:10:41, Transfer rate = 1.942 M Bytes/second
13-Jul 10:11 bacula-sd JobId 16580: Committing spooled data to Volume "L30006L3". Despooling 1,249,398,751 bytes ...
13-Jul 10:12 bacula-sd JobId 16580: Despooling elapsed time = 00:00:30, Transfer rate = 41.64 M Bytes/second
13-Jul 10:12 bacula-sd JobId 16580: Sending spooled attrs to the Director. Despooling 31,611,415 bytes ...



While the client says that the job has ended correctly, the Director still sees it as running:

 Running Jobs:
Console connected at 13-Jul-15 10:18
 JobId Level   Name                       Status
======================================================================
 16580 Full    BackIbergrid-voms.2015-07-13_10.01.02_03 is running
........

The storage daemon sees the job as ended OK:

====
*status storage=TSM3500-LTO3
Connecting to Storage daemon TSM3500-LTO3 at bacula.ifca.es:9103

bacula-sd Version: 5.2.6 (21 February 2012) x86_64-pc-linux-gnu ubuntu 12.10
Daemon started 07-May-15 13:54. Jobs: run=1611, running=0.
 Heap: heap=270,336 smbytes=1,093,786 max_bytes=2,185,198 bufs=144 max_bufs=191
 Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0

Running Jobs:
No Jobs running.
====

Jobs waiting to reserve a drive:
====

Terminated Jobs:
 JobId  Level    Files      Bytes   Status   Finished        Name
===================================================================
.......
 16580  Full    137,078    1.245 G  OK       13-Jul-15 10:12 BackIbergrid-voms
====

Device status:
Autochanger "TSM3500" with devices:
   "ULT3580-TD3" (/dev/nst1)
   "ULT3580-TD5" (/dev/nst0)
Device "FileStorage" (/data/Bacula/Default) is not open.
Device "ULT3580-TD3" (/dev/nst1) is mounted with:
    Volume:      L30006L3
    Pool:        Full
    Media type:  LTO-3
    Slot 7 is loaded in drive 1.
    Total Bytes=393,608,082,432 Blocks=1,501,523 Bytes/block=262,139
    Positioned at File=425 Block=0

Device "ULT3580-TD5" (/dev/nst0) is not open.
    Drive 0 is not loaded.
====

Used Volume status:
L30006L3 on device "ULT3580-TD3" (/dev/nst1)
    Reader=0 writers=0 devres=0 volinuse=0
L50014L5 on device "ULT3580-TD5" (/dev/nst0)
    Reader=0 writers=0 devres=0 volinuse=0
====

Data spooling: 0 active jobs, 0 bytes; 1302 total jobs, 19,998,308,040 max bytes/job.
Attr spooling: 0 active jobs, 9,230,696,467 bytes; 1302 total jobs, 9,230,696,467 max bytes.
====



The spool file existed while the job was running, but it is no longer in the spool directory:

 ls -la /data/spool/bacula-sd.data.16580.BackIbergrid-voms.2015-07-13_10.01.02_03.ULT3580-TD3.spool
-rw-r----- 1 bacula tape 237513329 Jul 13 10:03 /data/spool/bacula-sd.data.16580.BackIbergrid-voms.2015-07-13_10.01.02_03.ULT3580-TD3.spool

 ls -la /data/spool/bacula-sd.data.16580.BackIbergrid-voms.2015-07-13_10.01.02_03.ULT3580-TD3.spool
ls: cannot access /data/spool/bacula-sd.data.16580.BackIbergrid-voms.2015-07-13_10.01.02_03.ULT3580-TD3.spool: No such file or directory

For a long time (1-2 h), list jobs shows the job as running:

16,580 | BackIbergrid-voms      | 2015-07-13 10:01:05 | B    | F     |         0 |                 0 | R   

*messages
13-Jul 12:01 bacula-dir JobId 16580: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
13-Jul 12:01 bacula-dir JobId 16580: Fatal error: No Job status returned from FD.
13-Jul 12:01 bacula-dir JobId 16580: Error: Bacula bacula-dir 5.2.6 (21Feb12):
  Build OS:               x86_64-pc-linux-gnu ubuntu 12.10
  JobId:                  16580
  Job:                    BackIbergrid-voms.2015-07-13_10.01.02_03
  Backup Level:           Full
  Client:                 "ibergrid-voms-fd" 5.0.0 (26Jan10) x86_64-redhat-linux-gnu,redhat,(Carbon)
  FileSet:                "FullServicesNoHome" 2013-09-02 23:05:00
  Pool:                   "Full" (From Job FullPool override)
  Catalog:                "MyCatalog" (From Client resource)
  Storage:                "TSM3500-LTO3" (From Job resource)
  Scheduled time:         13-Jul-2015 10:00:59
  Start time:             13-Jul-2015 10:01:05
  End time:               13-Jul-2015 12:01:06
  Elapsed time:           2 hours 1 sec
  Priority:               10
  FD Files Written:       0
  SD Files Written:       137,078
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       1,245,362,413 (1.245 GB)
  Rate:                   0.0 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               no
  Volume name(s):         L30006L3
  Volume Session Id:      1611
  Volume Session Time:    1430999647
  Last Volume Bytes:      393,608,082,432 (393.6 GB)
  Non-fatal FD errors:    1
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  OK
  Termination:            *** Backup Error ***

| 16,580 | BackIbergrid-voms      | 2015-07-13 10:01:05 | B    | F     |         0 |                 0 | f        
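
As a cross-check of the timestamps above, here is a minimal sketch in Python (not part of Bacula) that computes how long the job lasted and how long the Director kept waiting after the storage daemon had already finished. The variable names are my own, and the SD finish time is assumed to be 10:12:00 because the status output only has minute resolution.

```python
from datetime import datetime

# Timestamps copied from the Director's job report above.
start = datetime(2015, 7, 13, 10, 1, 5)    # Start time
end = datetime(2015, 7, 13, 12, 1, 6)      # End time (when the error was reported)

# From "status storage": job 16580 finished at 10:12.
# Seconds are not shown there, so :00 is an assumption.
sd_finished = datetime(2015, 7, 13, 10, 12, 0)

job_seconds = (end - start).total_seconds()
stalled_seconds = (end - sd_finished).total_seconds()

print(f"job elapsed: {job_seconds:.0f} s")
print(f"Director waited after SD finished: about {stalled_seconds / 60:.0f} min")
```

So the data transfer itself took about 11 minutes, and the rest of the two hours was spent with the Director waiting on the FD before the connection was reset.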

Any ideas?

Cheers, I
 

--
############################################################################
Iban Cabrillo Bartolome
Instituto de Fisica de Cantabria (IFCA)
Santander, Spain
Tel: +34942200969
PGP PUBLIC KEY: http://pgp.mit.edu/pks/lookup?op=get&search=0xD9DF0B3D6C8C08AC
############################################################################
Bertrand Russell:
"The problem with the world is that the stupid are sure of everything and the intelligent are full of doubts"