Bacula-users

[Bacula-users] Sense Key : Medium Error Write error

2014-07-09 03:30:12
Subject: [Bacula-users] Sense Key : Medium Error Write error
From: kkuznetsov <bacula-forum AT backupcentral DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 09 Jul 2014 00:26:59 -0700
I have Hp LTO-3 scsi device 
https://h10057.www1.hp.com/ecomcat/hpcatalog/specs/provisioner/05/Q1539B.htm
witch worked for several years. Yesterday i wanted to make a bicg backup (about 
3 TB). I had 10 labeled new casstes (400GB). So i turned on backing up through 
web-bacula
System is openSuse 11
wg-srv-dir Version: 2.4.3
So, 3 tapes worked fine, on a fourth i saw this messages:

 2014-07-07 17:58:24   wg-srv-sd JobId 319: New volume "Cassette_144" mounted 
on device "Ultrium960" (/dev/st0) at 07-Jul-2014 17:58.
  2014-07-07 18:57:16   wg-srv-sd JobId 319: Error: block.c:568 Write error at 
21:1902 on device "Ultrium960" (/dev/st0). ERR=Input/output error.
  2014-07-07 19:12:16   wg-srv-sd JobId 319: Error: Error writing final EOF to 
tape. This Volume may not be readable.
dev.c:1681 ioctl MTWEOF error on "Ultrium960" (/dev/st0). ERR=Input/output 
error.
  2014-07-07 19:12:16   wg-srv-sd JobId 319: End of medium on Volume 
"Cassette_144" Bytes=21,121,357,824 Blocks=327,401 at 07-Jul-2014 19:12.
  2014-07-07 19:12:16   wg-srv-sd JobId 319: 3307 Issuing autochanger "unload 
slot 144, drive 2" command.


So only in several hours tape was pulled off. In a morning i inserted a new 145 
tape. But nothing happend, it still waited for a tape, even it was in already. 
So i canceled job in an hour. After that i attemped to label new tape - and the 
same story - it even does not see a tape inside of it (145), i manually pulled 
off tape (through the button) and inserrted a new tape without label, baclula 
did not see it anyway.

Here is a full log of backup:

 2014-07-07 11:58:42   wg-srv-dir JobId 319: Start Backup JobId 319, 
Job=wg-archive2.2014-07-07_11.58.41
  2014-07-07 11:58:42   wg-srv-dir JobId 319: Using Device "Ultrium960"
  2014-07-07 11:58:42   wg-srv-sd JobId 319: 3307 Issuing autochanger "unload 
slot 129, drive 2" command.
  2014-07-07 11:59:01   wg-srv-sd JobId 319: 3304 Issuing autochanger "load 
slot 141, drive 2" command.
  2014-07-07 12:00:31   wg-srv-sd JobId 319: 3305 Autochanger "load slot 141, 
drive 2", status is OK.
  2014-07-07 12:00:38   wg-srv-sd JobId 319: Volume "Cassette_141" previously 
written, moving to end of data.
  2014-07-07 12:01:37   wg-srv-sd JobId 319: Ready to append to end of Volume 
"Cassette_141" at file=278.
  2014-07-07 12:42:28   wg-srv-sd JobId 319: End of Volume "Cassette_141" at 
413:3145 on device "Ultrium960" (/dev/st0). Write of 64512 bytes got -1.
  2014-07-07 12:42:34   wg-srv-sd JobId 319: Re-read of last block succeeded.
  2014-07-07 12:42:34   wg-srv-sd JobId 319: End of medium on Volume 
"Cassette_141" Bytes=411,986,340,864 Blocks=6,386,196 at 07-Jul-2014 12:42.
  2014-07-07 12:42:34   wg-srv-sd JobId 319: 3307 Issuing autochanger "unload 
slot 141, drive 2" command.
  2014-07-07 12:43:34   wg-srv-sd JobId 319: 3301 Issuing autochanger "loaded? 
drive 2" command.
  2014-07-07 12:43:34   wg-srv-sd JobId 319: 3302 Autochanger "loaded? drive 
2", result: nothing loaded.
  2014-07-07 12:43:34   wg-srv-sd JobId 319: 3304 Issuing autochanger "load 
slot 142, drive 2" command.
  2014-07-07 12:52:05   wg-srv-sd JobId 319: 3305 Autochanger "load slot 142, 
drive 2", status is OK.
  2014-07-07 12:52:16   wg-srv-sd JobId 319: Wrote label to prelabeled Volume 
"Cassette_142" on device "Ultrium960" (/dev/st0)
  2014-07-07 12:52:16   wg-srv-sd JobId 319: New volume "Cassette_142" mounted 
on device "Ultrium960" (/dev/st0) at 07-Jul-2014 12:52.
  2014-07-07 15:18:36   wg-srv-sd JobId 319: End of Volume "Cassette_142" at 
412:11565 on device "Ultrium960" (/dev/st0). Write of 64512 bytes got -1.
  2014-07-07 15:18:46   wg-srv-sd JobId 319: Re-read of last block succeeded.
  2014-07-07 15:18:46   wg-srv-sd JobId 319: End of medium on Volume 
"Cassette_142" Bytes=412,719,713,280 Blocks=6,397,564 at 07-Jul-2014 15:18.
  2014-07-07 15:18:46   wg-srv-sd JobId 319: 3307 Issuing autochanger "unload 
slot 142, drive 2" command.
  2014-07-07 15:19:39   wg-srv-sd JobId 319: 3301 Issuing autochanger "loaded? 
drive 2" command.
  2014-07-07 15:19:39   wg-srv-sd JobId 319: 3302 Autochanger "loaded? drive 
2", result: nothing loaded.
  2014-07-07 15:19:39   wg-srv-sd JobId 319: 3304 Issuing autochanger "load 
slot 143, drive 2" command.
  2014-07-07 15:21:39   wg-srv-sd JobId 319: 3305 Autochanger "load slot 143, 
drive 2", status is OK.
  2014-07-07 15:21:50   wg-srv-sd JobId 319: Wrote label to prelabeled Volume 
"Cassette_143" on device "Ultrium960" (/dev/st0)
  2014-07-07 15:21:50   wg-srv-sd JobId 319: New volume "Cassette_143" mounted 
on device "Ultrium960" (/dev/st0) at 07-Jul-2014 15:21.
  2014-07-07 17:27:02   wg-srv-sd JobId 319: End of Volume "Cassette_143" at 
412:1 on device "Ultrium960" (/dev/st0). Write of 64512 bytes got -1.
  2014-07-07 17:27:10   wg-srv-sd JobId 319: Re-read of last block succeeded.
  2014-07-07 17:27:10   wg-srv-sd JobId 319: End of medium on Volume 
"Cassette_143" Bytes=411,973,696,512 Blocks=6,386,000 at 07-Jul-2014 17:27.
  2014-07-07 17:27:10   wg-srv-sd JobId 319: 3307 Issuing autochanger "unload 
slot 143, drive 2" command.
  2014-07-07 17:28:21   wg-srv-sd JobId 319: 3301 Issuing autochanger "loaded? 
drive 2" command.
  2014-07-07 17:28:21   wg-srv-sd JobId 319: 3302 Autochanger "loaded? drive 
2", result: nothing loaded.
  2014-07-07 17:28:21   wg-srv-sd JobId 319: 3304 Issuing autochanger "load 
slot 144, drive 2" command.
  2014-07-07 17:58:13   wg-srv-sd JobId 319: 3305 Autochanger "load slot 144, 
drive 2", status is OK.
  2014-07-07 17:58:24   wg-srv-sd JobId 319: Wrote label to prelabeled Volume 
"Cassette_144" on device "Ultrium960" (/dev/st0)
  2014-07-07 17:58:24   wg-srv-sd JobId 319: New volume "Cassette_144" mounted 
on device "Ultrium960" (/dev/st0) at 07-Jul-2014 17:58.
  2014-07-07 18:57:16   wg-srv-sd JobId 319: Error: block.c:568 Write error at 
21:1902 on device "Ultrium960" (/dev/st0). ERR=Input/output error.
  2014-07-07 19:12:16   wg-srv-sd JobId 319: Error: Error writing final EOF to 
tape. This Volume may not be readable.
dev.c:1681 ioctl MTWEOF error on "Ultrium960" (/dev/st0). ERR=Input/output 
error.
  2014-07-07 19:12:16   wg-srv-sd JobId 319: End of medium on Volume 
"Cassette_144" Bytes=21,121,357,824 Blocks=327,401 at 07-Jul-2014 19:12.
  2014-07-07 19:12:16   wg-srv-sd JobId 319: 3307 Issuing autochanger "unload 
slot 144, drive 2" command.
  2014-07-08 02:59:14   wg-srv-sd JobId 319: 3301 Issuing autochanger "loaded? 
drive 2" command.
  2014-07-08 02:59:14   wg-srv-sd JobId 319: 3302 Autochanger "loaded? drive 
2", result: nothing loaded.
  2014-07-08 02:59:14   wg-srv-sd JobId 319: 3304 Issuing autochanger "load 
slot 145, drive 2" command.
  2014-07-08 12:21:12   wg-archive2-fd JobId 319: Fatal error: backup.c:892 
Network send error to SD. ERR=Input/output error
  2014-07-08 12:21:12   wg-srv-sd JobId 319: Job 
wg-archive2.2014-07-07_11.58.41 marked to be canceled.
  2014-07-08 12:21:14   wg-srv-sd JobId 319: Job 
wg-archive2.2014-07-07_11.58.41 marked to be canceled.
  2014-07-08 12:21:20   wg-srv-dir JobId 319: Bacula wg-srv-dir 2.4.3 
(10Oct08): 08-Jul-2014 12:21:20
  Build OS:               i686-pc-linux-gnu suse 11.0
  JobId:                  319
  Job:                    wg-archive2.2014-07-07_11.58.41
  Backup Level:           Full
  Client:                 "wg-archive2-fd" 2.4.3 (10Oct08) 
i686-pc-linux-gnu,suse,11.0
  FileSet:                "wg-archive-fs" 2008-11-27 20:02:38
  Pool:                   "Default" (From Job resource)
  Storage:                "Ultrium960" (From Job resource)
  Scheduled time:         07-Jul-2014 11:58:39
  Start time:             07-Jul-2014 11:58:42
  End time:               08-Jul-2014 12:21:20
  Elapsed time:           1 day 22 mins 38 secs
  Priority:               10
  FD Files Written:       727
  SD Files Written:       0
  FD Bytes Written:       980,284,578,636 (980.2 GB)
  SD Bytes Written:       0 (0 B)
  Rate:                   11170.3 KB/s
  Software Compression:   None
  VSS:                    no
  Storage Encryption:     no
  Volume name(s):         Cassette_141|Cassette_142|Cassette_143|Cassette_144
  Volume Session Id:      23
  Volume Session Time:    1375795922
  Last Volume Bytes:      64,512 (64.51 KB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  Canceled
  SD termination status:  Error
  Termination:            Backup Canceled



Anyway in a var/log/messages was this:

Jul  7 18:10:40 wg-srv kernel: st0: Sense Key : Medium Error [deferred]
Jul  7 18:10:40 wg-srv kernel: Info fld=0xfc00
Jul  7 18:10:40 wg-srv kernel: st0: Add. Sense: Write error
Jul  7 18:25:40 wg-srv kernel: st0: Error 80000 (sugg. bt 0x0, driver bt 0x0, 
host bt 0x8).
Jul  7 18:42:15 wg-srv kernel: st0: Error 80000 (sugg. bt 0x0, driver bt 0x0, 
host bt 0x8).
Jul  7 18:57:16 wg-srv kernel: st0: Error 80000 (sugg. bt 0x0, driver bt 0x0, 
host bt 0x8).
Jul  7 19:12:16 wg-srv kernel: st0: Error 80000 (sugg. bt 0x0, driver bt 0x0, 
host bt 0x8).
Jul  7 23:05:37 wg-srv kernel: st0: Error 80000 (sugg. bt 0x0, driver bt 0x0, 
host bt 0x8).
Jul  8 02:58:57 wg-srv kernel: st0: Error 80000 (sugg. bt 0x0, driver bt 0x0, 
host bt 0x8).



Jul  7 18:25:40 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=e3bb8980)
Jul  7 18:25:40 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=e3bb8980)
Jul  7 18:25:50 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=e3bb8980)
Jul  7 18:25:51 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=e3bb8980)
Jul  7 18:26:01 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7a35380)
Jul  7 18:26:01 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7a35380)
Jul  7 18:26:11 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7a35980)
Jul  7 18:26:11 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7a35980)
Jul  7 18:26:21 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=e3bb8580)
Jul  7 18:26:22 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=e3bb8580)
Jul  7 18:26:32 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7a35580)
Jul  7 18:26:32 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7a35580)
Jul  7 18:26:42 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7838480)
Jul  7 18:26:43 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7838480)
Jul  7 18:26:53 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7838980)
Jul  7 18:26:53 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7838980)
Jul  7 18:27:03 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7a35180)
Jul  7 18:27:03 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7a35180)
Jul  7 18:27:13 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7a35680)
Jul  7 18:27:14 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7a35680)
Jul  7 18:42:15 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7838b80)
Jul  7 18:42:15 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7838b80)
Jul  7 18:57:15 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7838180)
Jul  7 18:57:16 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7838180)
Jul  7 19:12:16 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7838180)
Jul  7 19:12:16 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7838180)
Jul  7 23:05:36 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7a35280)
Jul  7 23:05:37 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7a35280)
Jul  8 02:58:57 wg-srv kernel: mptscsih: ioc0: attempting task abort! 
(sc=f7838780)
Jul  8 02:58:57 wg-srv kernel: mptscsih: ioc0: task abort: SUCCESS (sc=f7838780)


mt -f /dev/st0 status
mt: /dev/st0: rmtopen failed: Device or resource busy


/usr/sbin/btape -v -c /etc/bacula/bacula-sd.conf /dev/st0
Tape block granularity is 1024 bytes.
btape: butil.c:285 Using device: "/dev/st0" for writing.
08-Jul 19:31 btape JobId 0: 3301 Issuing autochanger "loaded? drive 2" command.
08-Jul 19:31 btape JobId 0: 3302 Autochanger "loaded? drive 2", result: nothing 
loaded.


And nothing more, just hanging

/usr/sbin/tapeinfo -f /dev/st0
cannot open SCSI device '/dev/st0' - Device or resource busy


But sg0 device is working with tapeinfo(but not with mt and btape):

/usr/sbin/tapeinfo -f /dev/sg0
Product Type: Tape Drive
Vendor ID: 'HP      '
Product ID: 'Ultrium 3-SCSI  '
Revision: 'G64D'
Attached Changer API: No
SerialNumber: 'xxxxxxxxxx'
MinBlock: 1
MaxBlock: 16777215
SCSI ID: 3
SCSI LUN: 0
Ready: yes
BufferedMode: yes
Medium Type: Not Loaded
Density Code: 0x44
BlockSize: 0
DataCompEnabled: yes
DataCompCapable: yes
DataDeCompEnabled: yes
CompType: 0x1
DeCompType: 0x1
BOP: yes
Block Position: 0
Partition 0 Remaining Kbytes: 400308
Partition 0 Size in Kbytes: 400308
ActivePartition: 0
EarlyWarningSize: 0
NumPartitions: 0
MaxPartitions: 0


I even changed scsi card, which is in hp server (we have a backup card), 
reinserted cables, turned off/on LTO-device, but nothing happend, no any 
messages, it is just like lto device is broken. But how can i really be sure 
that device is dead?

+----------------------------------------------------------------------
|This was sent by kkuznetsov AT web.1tv DOT ru via Backup Central.
|Forward SPAM to abuse AT backupcentral DOT com.
+----------------------------------------------------------------------



------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users

<Prev in Thread] Current Thread [Next in Thread>