Bacula-users

Re: [Bacula-users] ERR=Input/output error / only some times

2008-10-14 09:52:44
Subject: Re: [Bacula-users] ERR=Input/output error / only some times
From: Stefan Lubitz <stefan.lubitz AT mackevision DOT de>
To: Arno Lehmann <al AT its-lehmann DOT de>, bacula-users AT lists.sourceforge DOT net
Date: Tue, 14 Oct 2008 15:47:24 +0200
Hi Arno,

thanks for the infos. I found the following messages in /var/log/messages.
Could this show a problem? I am not familiar with this
Until now I have not worked with smartmontools. I have installed them 
and added the /dev/sg3 in the smartd.conf file.
Let's see what happens.
The Tapes I am using are new, also is the Library and all the cabels.

Thanks for further help,
Stefan

Messages:
Oct 14 14:04:56 BACKUP3 kernel: [101067.953408] mptscsih: ioc1: 
attempting task abort! (sc=ffff81011bd15a80)
Oct 14 14:04:56 BACKUP3 kernel: [101067.953413] ch 2:0:4:1: CDB: 
Inquiry: 12 00 00 00 24 00
Oct 14 14:04:58 BACKUP3 kernel: [101070.275853] mptbase: ioc1: 
Initiating recovery
Oct 14 14:05:00 BACKUP3 kernel: [101072.206197] st 2:0:4:0: mptscsih: 
ioc1: completing cmds: fw_channel 0, fw_id 4, sc=ffff810119901580, mf = 
ffff81011bd42bc0, idx=a
Oct 14 14:05:00 BACKUP3 kernel: [101072.206203] ch 2:0:4:1: mptscsih: 
ioc1: completing cmds: fw_channel 0, fw_id 4, sc=ffff81011bd15a80, mf = 
ffff81011bd42c20, idx=b
Oct 14 14:05:03 BACKUP3 kernel: [101076.299678] mptscsih: ioc1: Issue of 
TaskMgmt failed!
Oct 14 14:05:03 BACKUP3 kernel: [101076.299683] mptscsih: ioc1: task 
abort: FAILED (sc=ffff81011bd15a80)
Oct 14 14:05:03 BACKUP3 kernel: [101076.299685] mptscsih: ioc1: 
attempting task abort! (sc=ffff810119901580)
Oct 14 14:05:03 BACKUP3 kernel: [101076.299689] st 2:0:4:0: CDB: 
Read(6): 08 00 00 fc 00 00
Oct 14 14:05:03 BACKUP3 kernel: [101076.299712] mptscsih: ioc1: task 
abort: SUCCESS (sc=ffff810119901580)
Oct 14 14:05:03 BACKUP3 kernel: [101076.299670] mptscsih: ioc1: 
attempting target reset! (sc=ffff81011bd15a80)
Oct 14 14:05:03 BACKUP3 kernel: [101076.299670] ch 2:0:4:1: CDB: 
Inquiry: 12 00 00 00 24 00
Oct 14 14:05:04 BACKUP3 kernel: [101077.027907] mptscsih: ioc1: target 
reset: SUCCESS (sc=ffff81011bd15a80)
Oct 14 14:05:04 BACKUP3 kernel: [101077.027907] st0: Sense Key : Unit 
Attention [current]
Oct 14 14:05:04 BACKUP3 kernel: [101077.027907] st0: Add. Sense: Power 
on, reset, or bus device reset occurred
Oct 14 14:05:04 BACKUP3 kernel: [101077.401595]  target2:0:4: Beginning 
Domain Validation
Oct 14 14:05:04 BACKUP3 kernel: [101077.421570]  target2:0:4: Ending 
Domain Validation
Oct 14 14:05:04 BACKUP3 kernel: [101077.421570]  target2:0:4: FAST-80 
WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
Oct 14 14:05:04 BACKUP3 kernel: [101077.636087]  target2:0:4: Beginning 
Domain Validation
Oct 14 14:05:04 BACKUP3 kernel: [101077.659128]  target2:0:4: Ending 
Domain Validation
Oct 14 14:05:04 BACKUP3 kernel: [101077.659128]  target2:0:4: FAST-80 
WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
Oct 14 14:05:04 BACKUP3 kernel: [101077.659128]  target2:0:5: Beginning 
Domain Validation
Oct 14 14:05:04 BACKUP3 kernel: [101077.663128]  target2:0:5: Domain 
Validation skipping write tests
Oct 14 14:05:04 BACKUP3 kernel: [101077.663128]  target2:0:5: Ending 
Domain Validation
Oct 14 14:05:04 BACKUP3 kernel: [101077.663128]  target2:0:5: FAST-80 
WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
Oct 14 14:16:35 BACKUP3 -- MARK --
Oct 14 14:20:05 BACKUP3 kernel: [102122.127816] mptscsih: ioc1: 
attempting task abort! (sc=ffff81011e54fe40)
Oct 14 14:20:05 BACKUP3 kernel: [102122.127821] st 2:0:4:0: CDB: 
Read(6): 08 00 00 fc 00 00
Oct 14 14:20:05 BACKUP3 kernel: [102122.247790] mptscsih: ioc1: task 
abort: FAILED (sc=ffff81011e54fe40)
Oct 14 14:20:05 BACKUP3 kernel: [102122.247796] mptscsih: ioc1: 
attempting target reset! (sc=ffff81011e54fe40)
Oct 14 14:20:05 BACKUP3 kernel: [102122.247798] st 2:0:4:0: CDB: 
Read(6): 08 00 00 fc 00 00
Oct 14 14:20:07 BACKUP3 kernel: [102124.943766] mptbase: ioc1: 
Initiating recovery
Oct 14 14:20:09 BACKUP3 kernel: [102126.894740] st 2:0:4:0: mptscsih: 
ioc1: completing cmds: fw_channel 0, fw_id 4, sc=ffff81011e54fe40, mf = 
ffff81011bd46760, idx=a9
Oct 14 14:20:12 BACKUP3 kernel: [102131.260388] mptscsih: ioc1: Issue of 
TaskMgmt failed!
Oct 14 14:20:12 BACKUP3 kernel: [102131.260391] mptscsih: ioc1: target 
reset: FAILED (sc=ffff81011e54fe40)
Oct 14 14:20:12 BACKUP3 kernel: [102131.260394] mptscsih: ioc1: 
attempting bus reset! (sc=ffff81011e54fe40)
Oct 14 14:20:12 BACKUP3 kernel: [102131.260396] st 2:0:4:0: CDB: 
Read(6): 08 00 00 fc 00 00
Oct 14 14:20:13 BACKUP3 kernel: [102132.196536] mptscsih: ioc1: bus 
reset: SUCCESS (sc=ffff81011e54fe40)
Oct 14 14:20:23 BACKUP3 kernel: [102143.700212] st0: Error 80000 (sugg. 
bt 0x0, driver bt 0x0, host bt 0x8).
Oct 14 14:20:23 BACKUP3 kernel: [102143.821719]  target2:0:4: Beginning 
Domain Validation
Oct 14 14:20:23 BACKUP3 kernel: [102144.177692]  target2:0:4: Ending 
Domain Validation
Oct 14 14:20:23 BACKUP3 kernel: [102144.177692]  target2:0:4: FAST-80 
WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
Oct 14 14:20:23 BACKUP3 kernel: [102144.390932]  target2:0:4: Beginning 
Domain Validation
Oct 14 14:20:23 BACKUP3 kernel: [102144.398927]  target2:0:4: Ending 
Domain Validation
Oct 14 14:20:23 BACKUP3 kernel: [102144.398927]  target2:0:4: FAST-80 
WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
Oct 14 14:20:23 BACKUP3 kernel: [102144.398927]  target2:0:5: Beginning 
Domain Validation
Oct 14 14:20:24 BACKUP3 kernel: [102144.422927]  target2:0:5: Domain 
Validation skipping write tests
Oct 14 14:20:24 BACKUP3 kernel: [102144.422927]  target2:0:5: Ending 
Domain Validation




Arno Lehmann schrieb:
> Hi,
>
> 14.10.2008 11:54, Stefan Lubitz wrote:
>   
>> Hi all,
>>
>> I have a strange behavior when I run btape.
>> The test abborts at different positions. Sometimes it runs completely. 
>> Sometimes it stops after the rewind or after the "Got EOF on tape".
>> I have no idea where I can search what's wrong. The strange thing is, 
>> that I was able to write (bconsole / run a job) on two tapes.
>> After writing I tried to restore the job, but the restore failed. So I 
>> started from beginning with testing my Autoloader.
>> And guess what......... It failed.
>>     
>
> Looks like a hardware-related problem.
>
>   
>> If I write an EOF manually (mt -f /dev/nst0 weof), the the test will 
>> work one time. The second time it fails again.
>> Is this normal?
>>
>> I start btape with the following syntax:
>> btape -c /dev/bacula/bacula-sd.conf -v -d 6 /dev/nst0
>>
>> Here ist the btape output:
>> btape: btape.c:831 Wrote 1000 blocks of 64412 bytes.
>> btape: btape.c:505 Wrote 1 EOF to "Drive-1" (/dev/nst0)
>> btape: btape.c:847 Wrote 1000 blocks of 64412 bytes.
>> btape: btape.c:505 Wrote 1 EOF to "Drive-1" (/dev/nst0)
>> btape: btape.c:856 Rewind OK.
>> 1000 blocks re-read correctly.
>> Got EOF on tape.
>>
>> Here is the Error Message:
>> btape: btape.c:856 Rewind OK.
>> 13-Oct 15:08 btape JobId 0: Error: block.c:999 Read error on fd=3 at 
>> file:blk 0:976 on device "Drive-1" (/dev/nst0). ERR=Input/output error.
>> btape: btape.c:868 Read block 977 failed! ERR=Input/output error
>>
>> Next try, same result:
>> 13-Oct 16:33 btape JobId 0: Error: block.c:999 Read error on fd=3 at 
>> file:blk 1:897 on device "Drive-1" (/dev/nst0). ERR=Input/output error.
>> btape: btape.c:868 Read block 1898 failed! ERR=Input/output error
>>
>> Can you please help me, where I can start to investigate?
>>     
>
> Start by having a look into the system's log - there might be errors 
> reported there.
>
> This might be a broken tape drive, bad cabling or termination, or a 
> rancid tape.
>
> Also, tapelerts might be present for the tape drive - you can see them 
> using smartmontools or the tapealert program.
>
> Arno
>
>   
>> Thanks,
>> Stefan
>>
>>     
>
>   

-- 
Stefan Lubitz
System Administrator
___________________________________________________________________________
MACKEVISION
Medien Design GmbH, Stuttgart
Forststraße 7
D - 70174 Stuttgart

Tel.: +49 (0) 711/933048-0
Fax.: +49 (0) 711/933048-90

stefan.lubitz AT mackevision DOT de
www.mackevision.de

Geschäftsführer: Armin Pohl, Joachim Lincke
HRB 243735 Amtsgericht Stuttgart
Steuernummer: 99070 / 01935
USt-ID-Nr. DE 169457786


-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users

<Prev in Thread] Current Thread [Next in Thread>