Re: [Bacula-users] read error

2008-05-15 04:24:50
Subject: Re: [Bacula-users] read error
From: Arno Lehmann <al AT its-lehmann DOT de>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 15 May 2008 10:11:05 +0200
Hi,

15.05.2008 01:25, Mark Nienberg wrote:
> I continue to be plagued with the problem shown below.  About 80% of the
> time it works perfectly.  The rest of the time I have to use bconsole to
> unmount, then mount the tape again.  The second read always works and the
> backup continues.  Of course it happens in the middle of the night when
> the changer switches tapes and so I have to intervene the next day.
> 
> The good news is this has nothing to do with the tape change.  The
> mtx-changer script is waiting until the mt command returns a status of
> ONLINE.  It does so successfully, and I can see it in the mtx-changer log.
> This means the drive was able to read the system area of the tape and is
> happy with it.  I don't know why then bacula has trouble reading the data
> portion.

Perhaps a simple extra wait time after the load operation would be enough.
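A minimal sketch of that idea, in Python rather than in the shell of mtx-changer itself: poll "mt status" until the drive reports ONLINE, then sleep a little longer before handing the tape back to Bacula. The function names, the 30-second settle time and the 5-minute timeout are assumptions for illustration, not values taken from the stock mtx-changer script.

```python
import subprocess
import time


def drive_is_online(device="/dev/nst0"):
    """Return True if `mt -f <device> status` mentions ONLINE."""
    result = subprocess.run(
        ["mt", "-f", device, "status"],
        capture_output=True,
        text=True,
    )
    return "ONLINE" in result.stdout


def wait_until_ready(is_online=drive_is_online, settle_seconds=30,
                     poll_seconds=5, timeout_seconds=300, sleep=time.sleep):
    """Wait for ONLINE, then wait settle_seconds more.

    Returns False if the drive never reports ONLINE within timeout_seconds.
    The settle time is a guess; it would have to be tuned for the drive.
    """
    waited = 0
    while not is_online():
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    # The drive says ONLINE, but give it extra time before the first read.
    sleep(settle_seconds)
    return True
```

The status check and the sleep function are passed in as parameters only so the logic can be exercised without a tape drive attached.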

> I'm thinking about writing a script that cron could run every hour to see
> if the device is blocked even if the correct tape is loaded.  If so, it
> could do an unmount/mount.

That would be possible, I guess.
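Roughly, such a cron job could feed commands to bconsole on standard input and look at the storage status. The sketch below makes several assumptions: that the storage resource is called "VXA3drive" in the Director (the log only shows that as the device name), that bconsole is on the PATH, and that a blocked device shows the word "BLOCKED" in the "status storage" output. Check the real output on your system before relying on that marker.

```python
import subprocess

# Assumption: the text that identifies a blocked device in the status output.
BLOCKED_MARKER = "BLOCKED"


def run_bconsole(commands, bconsole="bconsole"):
    """Send a list of console commands to bconsole and return its output."""
    result = subprocess.run(
        [bconsole],
        input="\n".join(commands) + "\n",
        capture_output=True,
        text=True,
    )
    return result.stdout


def device_is_blocked(status_text, marker=BLOCKED_MARKER):
    """Return True if the status output contains the blocked marker."""
    return marker in status_text


def remount_if_blocked(storage, run=run_bconsole):
    """Unmount and mount the storage again if it looks blocked.

    Returns True if a remount was attempted, False otherwise.
    """
    status = run([f"status storage={storage}", "quit"])
    if not device_is_blocked(status):
        return False
    run([f"unmount storage={storage}", f"mount storage={storage}", "quit"])
    return True
```

From cron this would be called as remount_if_blocked("VXA3drive"). Note that the sketch does not verify that the correct tape is actually in the drive, so it would also remount when Bacula is legitimately waiting for a volume that is not in the changer.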

>  Or maybe I could add some command to the changer script to read some 
> data off the tape after it loads or something?

You could even call btape to read the tape label and retry that a few 
times... parsing the output for a valid label shouldn't be too hard.
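Something along these lines, as a sketch only: run btape with its "readlabel" command, look for a label in the output, and retry a few times. The config path, the retry count, the delay and the "VolName" marker are assumptions; compare them against what btape actually prints for a good and a bad read on your drive.

```python
import subprocess
import time

# Assumption: a successfully read label dump contains this field name.
LABEL_MARKER = "VolName"


def read_label_output(device="/dev/nst0",
                      config="/etc/bacula/bacula-sd.conf"):
    """Run btape, issue `readlabel`, and return everything it printed."""
    result = subprocess.run(
        ["btape", "-c", config, device],
        input="readlabel\nquit\n",
        capture_output=True,
        text=True,
    )
    return result.stdout + result.stderr


def label_readable(read_output=read_label_output, attempts=3,
                   delay_seconds=20, sleep=time.sleep, marker=LABEL_MARKER):
    """Try to read the label up to `attempts` times.

    Returns True as soon as one attempt shows the marker, False if none do.
    """
    for attempt in range(1, attempts + 1):
        if marker in read_output():
            return True
        if attempt < attempts:
            # Give the drive some time before the next attempt.
            sleep(delay_seconds)
    return False
```

The changer script could then report the load as successful only when label_readable() returns True.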

>  Or maybe bacula could be improved to 
> try harder (whatever that means) to resolve the issue on its own?

Hmm... Bacula relies on mtx-changer to leave a usable tape in the 
drive, so you're on the right track to look at mtx-changer here.

> Bacula 2.2.8 with a VXA320 drive and packetloader changer.
> 
> Thanks for any ideas.
> 
> Mark

Arno

> 10-May 07:24 khyber-dir JobId 2043: All records pruned from Volume "A0000006"; marking it "Purged"
> 10-May 07:24 khyber-dir JobId 2043: Recycled volume "A0000006"
> 10-May 07:24 khyber-sd JobId 2043: 3307 Issuing autochanger "unload slot 3, drive 0" command.
> 10-May 07:25 khyber-sd JobId 2043: 3304 Issuing autochanger "load slot 5, drive 0" command.
> 10-May 07:26 khyber-sd JobId 2043: 3305 Autochanger "load slot 5, drive 0", status is OK.
> 10-May 07:26 khyber-sd JobId 2043: 3301 Issuing autochanger "loaded? drive 0" command.
> 10-May 07:26 khyber-sd JobId 2043: 3302 Autochanger "loaded? drive 0", result is Slot 5.
> 
> 10-May 07:26 khyber-sd JobId 2043: Error: block.c:995 Read error on fd=7 at file:blk 0:0 on device "VXA3drive" (/dev/nst0). ERR=Input/output error.
> 10-May 07:26 khyber-sd JobId 2043: Error: block.c:995 Read error on fd=7 at file:blk 0:0 on device "VXA3drive" (/dev/nst0). ERR=Input/output error.
> 10-May 07:26 khyber-sd JobId 2043: Error: block.c:995 Read error on fd=7 at file:blk 0:0 on device "VXA3drive" (/dev/nst0). ERR=Input/output error.
> 10-May 07:26 khyber-sd JobId 2043: Error: block.c:995 Read error on fd=7 at file:blk 0:0 on device "VXA3drive" (/dev/nst0). ERR=Input/output error.
> 10-May 07:26 khyber-sd JobId 2043: Error: block.c:995 Read error on fd=7 at file:blk 0:0 on device "VXA3drive" (/dev/nst0). ERR=Input/output error.
> 
> 10-May 07:26 khyber-sd JobId 2043: Please mount Volume "A0000006" or label a new one for:
>      Job:          gecko.2008-05-10_01.05.59
>      Storage:      "VXA3drive" (/dev/nst0)
>      Pool:         WeekBar
>      Media type:   VXA3
> 
> --snip--
> 
> 10-May 09:22 khyber-sd JobId 2043: 3301 Issuing autochanger "loaded? drive 0" command.
> 10-May 09:22 khyber-sd JobId 2043: 3302 Autochanger "loaded? drive 0", result is Slot 5.
> 10-May 09:22 khyber-sd JobId 2043: Recycled volume "A0000006" on device "VXA3drive" (/dev/nst0), all previous data lost.
> 10-May 09:22 khyber-sd JobId 2043: New volume "A0000006" mounted on device "VXA3drive" (/dev/nst0) at 10-May-2008 09:22.
> 10-May 10:11 khyber-sd JobId 2043: Job write elapsed time = 07:00:09, Transfer rate = 8.237 M bytes/second
> 
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft 
> Defy all challenges. Microsoft(R) Visual Studio 2008. 
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Bacula-users mailing list
> Bacula-users AT lists.sourceforge DOT net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
> 

-- 
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de
