Bacula-users

Re: [Bacula-users] Catastrophic error. Cannot write overflow block to device "LTO4"

2011-07-11 17:03:43
Subject: Re: [Bacula-users] Catastrophic error. Cannot write overflow block to device "LTO4"
From: Steve Costaras <stevecs AT chaven DOT com>
To: Martin Simmons <martin AT lispworks DOT com>
Date: Mon, 11 Jul 2011 16:00:15 -0500

On 2011-07-11 06:13, Martin Simmons wrote:
>>>>>> On Sun, 10 Jul 2011 12:17:55 +0000, Steve Costaras said:
>> Importance: Normal
>> Sensitivity: Normal
>>
>> I am trying a full backup/multi-job to a single client and all was going 
>> well until this morning when I received the error below.   All other jobs 
>> were also canceled.
>>
>> My question is two fold:
>>
>> 1) What the heck is this error?  I can unmount the drive, issue a rawfill to
>> the tape w/ btape and no problems?
>> ...
>> 3000 OK label. VolBytes=1024 DVD=0 Volume="FA0016" Device="LTO4" (/dev/nst0)
>> Requesting to mount LTO4 ...
>> 3905 Bizarre wait state 7
>> Do not forget to mount the drive!!!
>> 2011-07-10 03SD-loki JobId 6: Wrote label to prelabeled Volume "FA0016" on 
>> device "LTO4" (/dev/nst0)
>> 2011-07-10 03SD-loki JobId 6: New volume "FA0016" mounted on device "LTO4" 
>> (/dev/nst0) at 10-Jul-2011 03:51.
>> 2011-07-10 03SD-loki JobId 6: Fatal error: block.c:439 Attempt to write on 
>> read-only Volume. dev="LTO4" (/dev/nst0)
>> 2011-07-10 03SD-loki JobId 6: End of medium on Volume "FA0016" Bytes=1,024 
>> Blocks=0 at 10-Jul-2011 03:51.
>> 2011-07-10 03SD-loki JobId 6: Fatal error: Job 6 canceled.
>> 2011-07-10 03SD-loki JobId 6: Fatal error: device.c:192 Catastrophic error. 
>> Cannot write overflow block to device "LTO4" (/dev/nst0). ERR=Input/output 
>> error
> Do you regularly see the "3905 Bizarre wait state 7" message?  It could be an
> indication of problems (and everything after that could be a consequence of
> it).
>
> What are the messages that lead up to that point?
Nothing, really, this was the 17th tape in a row on a ~3day (so far) 
backup.    No messages in /var/log/messages.   Previous messages from 
bacula are below as you can see it just blows chunks right after FA0016 
is mounted, all concurrent jobs are killed.    And I've tested that tape 
before the backup ran and again right after this failure with btape.   
no problems.



-----------
*label storage=LTO4 pool=BackupSetFA volume=FA0015
Connecting to Storage daemon LTO4 at loki:9103 ...
Sending label command for Volume "FA0015" Slot 14 ...
3000 OK label. VolBytes=1024 DVD=0 Volume="FA0015" Device="LTO4" (/dev/nst0)
Requesting to mount LTO4 ...
3001 Device "LTO4" (/dev/nst0) is mounted with Volume "FA0015"
*
2011-07-10 00SD-loki JobId 3: Wrote label to prelabeled Volume "FA0015" 
on device "LTO4" (/dev/nst0)
2011-07-10 00SD-loki JobId 3: New volume "FA0015" mounted on device 
"LTO4" (/dev/nst0) at 10-Jul-2011 00:48.
*
2011-07-10 00SD-loki JobId 3: Despooling elapsed time = 01:21:56, 
Transfer rate = 70.98 M Bytes/second
2011-07-10 00SD-loki JobId 3: Alert: smartctl version 5.38 
[x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
2011-07-10 00SD-loki JobId 3: Alert: Home page is 
http://smartmontools.sourceforge.net/
2011-07-10 00SD-loki JobId 3: Alert:
2011-07-10 00SD-loki JobId 3: Alert: TapeAlert: OK
2011-07-10 00SD-loki JobId 3: Alert:
2011-07-10 00SD-loki JobId 3: Alert: Error counter log:
2011-07-10 00SD-loki JobId 3: Alert:            Errors Corrected 
by           Total   Correction     Gigabytes    Total
2011-07-10 00SD-loki JobId 3: Alert:                ECC          
rereads/    errors   algorithm      processed    uncorrected
2011-07-10 00SD-loki JobId 3: Alert:            fast | delayed   
rewrites  corrected  invocations   [10^9 bytes]  errors
2011-07-10 00SD-loki JobId 3: Alert: read:          0        0         
0         0          0          0.000           0
2011-07-10 00SD-loki JobId 3: Alert: write:      3010        0      
3010      3010       3010          0.000           0
2011-07-10 00SD-loki JobId 3: Sending spooled attrs to the Director. 
Despooling 65,784,417 bytes ...
2011-07-10 00DIR-loki JobId 3: Bacula DIR-loki 5.0.3 (04Aug10): 
10-Jul-2011 00:58:04
   Build OS:               x86_64-unknown-linux-gnu ubuntu 10.04
   JobId:                  3
   Job:                    JOB-loki_var_ftp_.2011-07-07_17.45.00_05
   Backup Level:           Full
   Client:                 "FD-loki" 5.0.3 (04Aug10) 
x86_64-unknown-linux-gnu,ubuntu,10.04
   FileSet:                "FS-loki_var_ftp_" 2011-07-06 18:00:00
   Pool:                   "BackupSetFA" (From Run FullPool override)
   Catalog:                "MyCatalog" (From Client resource)
   Storage:                "LTO4" (From Pool resource)
   Scheduled time:         07-Jul-2011 17:45:00
   Start time:             07-Jul-2011 17:50:30
   End time:               10-Jul-2011 00:58:04
   Elapsed time:           2 days 7 hours 7 mins 34 secs
   Priority:               50
   FD Files Written:       186,287
   SD Files Written:       186,287
   FD Bytes Written:       2,925,298,735,317 (2.925 TB)
   SD Bytes Written:       2,925,332,067,132 (2.925 TB)
   Rate:                   14740.4 KB/s
   Software Compression:   None
   VSS:                    no
   Encryption:             no
   Accurate:               yes
   Volume name(s):         
FA0001|FA0002|FA0005|FA0006|FA0010|FA0011|FA0014|FA0015
   Volume Session Id:      4
   Volume Session Time:    1310078212
   Last Volume Bytes:      35,227,823,104 (35.22 GB)
   Non-fatal FD errors:    0
   SD Errors:              0
   FD termination status:  OK
   SD termination status:  OK
   Termination:            Backup OK

2011-07-10 00DIR-loki JobId 3: Begin pruning Jobs older than 1 year 25 
days .
2011-07-10 00DIR-loki JobId 3: No Jobs found to prune.
2011-07-10 00DIR-loki JobId 3: Begin pruning Jobs.
2011-07-10 00DIR-loki JobId 3: No Files found to prune.
2011-07-10 00DIR-loki JobId 3: End auto prune.

2011-07-10 03SD-loki JobId 6: End of Volume "FA0015" at 98:1604 on 
device "LTO4" (/dev/nst0). Write of 2097152 bytes got -1.
2011-07-10 03SD-loki JobId 6: Re-read of last block succeeded.
2011-07-10 03SD-loki JobId 6: End of medium on Volume "FA0015" 
Bytes=844,973,861,888 Blocks=402,915 at 10-Jul-2011 03:30.
2011-07-10 03SD-loki JobId 6: Please mount Volume "FA0016" or label a 
new one for:
     Job:          
JOB-loki_var_ftp_pub_Multimedia_DVD.2011-07-07_17.45.01_08
     Storage:      "LTO4" (/dev/nst0)
     Pool:         BackupSetFA
     Media type:   LTO4
*umount storage=LTO4
3001 Device "LTO4" (/dev/nst0) unmounted.
*label storage=LTO4 pool=BackupSetFA volume=FA0016
Connecting to Storage daemon LTO4 at loki:9103 ...
Sending label command for Volume "FA0016" Slot 15 ...
3000 OK label. VolBytes=1024 DVD=0 Volume="FA0016" Device="LTO4" (/dev/nst0)
Requesting to mount LTO4 ...
3905 Bizarre wait state 7
Do not forget to mount the drive!!!
2011-07-10 03SD-loki JobId 6: Wrote label to prelabeled Volume "FA0016" 
on device "LTO4" (/dev/nst0)
2011-07-10 03SD-loki JobId 6: New volume "FA0016" mounted on device 
"LTO4" (/dev/nst0) at 10-Jul-2011 03:51.
2011-07-10 03SD-loki JobId 6: Fatal error: block.c:439 Attempt to write 
on read-only Volume. dev="LTO4" (/dev/nst0)
2011-07-10 03SD-loki JobId 6: End of medium on Volume "FA0016" 
Bytes=1,024 Blocks=0 at 10-Jul-2011 03:51.
2011-07-10 03SD-loki JobId 6: Fatal error: Job 6 canceled.
2011-07-10 03SD-loki JobId 6: Fatal error: device.c:192 Catastrophic 
error. Cannot write overflow block to device "LTO4" (/dev/nst0). 
ERR=Input/output error

*
2011-07-10 03SD-loki JobId 6: Despooling elapsed time = 02:32:53, 
Transfer rate = 93.64 M Bytes/second
2011-07-10 03SD-loki JobId 6: Job write elapsed time = 57:37:54, 
Transfer rate = 8.278 M Bytes/second
2011-07-10 03FD-loki JobId 6: Error: bsock.c:393 Write error sending 
65536 bytes to Storage daemon:loki:9103: ERR=Connection reset by peer
2011-07-10 03FD-loki JobId 6: Fatal error: backup.c:1024 Network send 
error to SD. ERR=Connection reset by peer
2011-07-10 03SD-loki JobId 7: Fatal error: block.c:439 Attempt to write 
on read-only Volume. dev="LTO4" (/dev/nst0)
2011-07-10 03SD-loki JobId 7: Fatal error: spool.c:301 Fatal append 
error on device "LTO4" (/dev/nst0): ERR=block.c:1015 Read zero bytes at 
0:0 on device "LTO4" (/dev/nst0).

2011-07-10 03SD-loki JobId 7: Despooling elapsed time = 00:00:01, 
Transfer rate = 858.9 G Bytes/second
*
2011-07-10 03DIR-loki JobId 6: Error: Bacula DIR-loki 5.0.3 (04Aug10): 
10-Jul-2011 03:52:08
   Build OS:               x86_64-unknown-linux-gnu ubuntu 10.04
   JobId:                  6
   Job:                    
JOB-loki_var_ftp_pub_Multimedia_DVD.2011-07-07_17.45.01_08
   Backup Level:           Full
   Client:                 "FD-loki" 5.0.3 (04Aug10) 
x86_64-unknown-linux-gnu,ubuntu,10.04
   FileSet:                "FS-loki_var_ftp_pub_Multimedia_DVD" 
2011-07-06 18:00:01
   Pool:                   "BackupSetFA" (From Run FullPool override)
   Catalog:                "MyCatalog" (From Client resource)
   Storage:                "LTO4" (From Pool resource)
   Scheduled time:         07-Jul-2011 17:45:01
   Start time:             07-Jul-2011 17:50:30
   End time:               10-Jul-2011 03:52:08
   Elapsed time:           2 days 10 hours 1 min 38 secs
   Priority:               50
   FD Files Written:       452
   SD Files Written:       452
   FD Bytes Written:       1,717,640,639,816 (1.717 TB)
   SD Bytes Written:       1,717,632,388,872 (1.717 TB)
   Rate:                   8222.4 KB/s
   Software Compression:   None
   VSS:                    no
   Encryption:             no
   Accurate:               yes
   Volume name(s):         FA0011|FA0012|FA0015
   Volume Session Id:      6
   Volume Session Time:    1310078212
   Last Volume Bytes:      1,024 (1.024 KB)
   Non-fatal FD errors:    1
   SD Errors:              0
   FD termination status:  Error
   SD termination status:  Error
   Termination:            *** Backup Error ***

*
2011-07-10 03SD-loki JobId 7: Job write elapsed time = 57:26:37, 
Transfer rate = 16.61 M Bytes/second
2011-07-10 03FD-loki JobId 7: Error: bsock.c:393 Write error sending 
65536 bytes to Storage daemon:loki:9103: ERR=Connection reset by peer
2011-07-10 03FD-loki JobId 7: Fatal error: backup.c:1024 Network send 
error to SD. ERR=Connection reset by peer
2011-07-10 03SD-loki JobId 5: Fatal error: block.c:439 Attempt to write 
on read-only Volume. dev="LTO4" (/dev/nst0)
2011-07-10 03SD-loki JobId 5: Fatal error: spool.c:301 Fatal append 
error on device "LTO4" (/dev/nst0): ERR=block.c:1015 Read zero bytes at 
0:0 on device "LTO4" (/dev/nst0).

2011-07-10 03SD-loki JobId 5: Despooling elapsed time = 00:00:01, 
Transfer rate = 141.9 G Bytes/second
2011-07-10 03DIR-loki JobId 7: Error: Bacula DIR-loki 5.0.3 (04Aug10): 
10-Jul-2011 03:57:49
   Build OS:               x86_64-unknown-linux-gnu ubuntu 10.04
   JobId:                  7
   Job:                    
JOB-loki_var_ftp_pub_mirrors.2011-07-07_17.45.01_09
   Backup Level:           Full
   Client:                 "FD-loki" 5.0.3 (04Aug10) 
x86_64-unknown-linux-gnu,ubuntu,10.04
   FileSet:                "FS-loki_var_ftp_pub_mirrors" 2011-07-06 18:00:01
   Pool:                   "BackupSetFA" (From Run FullPool override)
   Catalog:                "MyCatalog" (From Client resource)
   Storage:                "LTO4" (From Pool resource)
   Scheduled time:         07-Jul-2011 17:45:01
   Start time:             07-Jul-2011 17:50:31
   End time:               10-Jul-2011 03:57:49
   Elapsed time:           2 days 10 hours 7 mins 18 secs
   Priority:               50
   FD Files Written:       403,525
   SD Files Written:       403,525
   FD Bytes Written:       3,435,178,457,337 (3.435 TB)
   SD Bytes Written:       3,435,250,296,123 (3.435 TB)
   Rate:                   16417.6 KB/s
   Software Compression:   None
   VSS:                    no
   Encryption:             no
   Accurate:               yes
   Volume name(s):         FA0003|FA0004|FA0008|FA0009|FA0013|FA0014
   Volume Session Id:      7
   Volume Session Time:    1310078212
   Last Volume Bytes:      1,024 (1.024 KB)
   Non-fatal FD errors:    1
   SD Errors:              0
   FD termination status:  Error
   SD termination status:  Error
   Termination:            *** Backup Error ***

2011-07-10 03DIR-loki JobId 5: Error: Bacula DIR-loki 5.0.3 (04Aug10): 
10-Jul-2011 03:58:22
   Build OS:               x86_64-unknown-linux-gnu ubuntu 10.04
   JobId:                  5
   Job:                    
JOB-loki_var_ftp_pub_Multimedia_BluRay.2011-07-07_17.45.01_07
   Backup Level:           Full
   Client:                 "FD-loki" 5.0.3 (04Aug10) 
x86_64-unknown-linux-gnu,ubuntu,10.04
   FileSet:                "FS-loki_var_ftp_pub_Multimedia_BluRay" 
2011-07-06 18:00:01
   Pool:                   "BackupSetFA" (From Run FullPool override)
   Catalog:                "MyCatalog" (From Client resource)
   Storage:                "LTO4" (From Pool resource)
   Scheduled time:         07-Jul-2011 17:45:01
   Start time:             07-Jul-2011 17:50:30
   End time:               10-Jul-2011 03:58:22
   Elapsed time:           2 days 10 hours 7 mins 52 secs
   Priority:               50
   FD Files Written:       683
   SD Files Written:       683
   FD Bytes Written:       2,718,316,374,045 (2.718 TB)
   SD Bytes Written:       2,718,316,489,228 (2.718 TB)
   Rate:                   12989.4 KB/s
   Software Compression:   None
   VSS:                    no
   Encryption:             no
   Accurate:               yes
   Volume name(s):         FA0000|FA0001|FA0004|FA0005|FA0009|FA0010
   Volume Session Id:      5
   Volume Session Time:    1310078212
   Last Volume Bytes:      1,024 (1.024 KB)
   Non-fatal FD errors:    0
   SD Errors:              0
   FD termination status:  OK
   SD termination status:  Error
   Termination:            *** Backup Error ***

*umount storage=LTO4
3002 Device "LTO4" (/dev/nst0) unmounted.
------------

------------------------------------------------------------------------------
All of the data generated in your IT infrastructure is seriously valuable.
Why? It contains a definitive record of application performance, security 
threats, fraudulent activity, and more. Splunk takes this data and makes 
sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-d2d-c2
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users