Bacula-users

[Bacula-users] SIGSEV with btape

2008-12-01 09:24:25
Subject: [Bacula-users] SIGSEV with btape
From: Willians Vivanco <williansvi AT infomed.sld DOT cu>
To: bacula-users AT lists.sourceforge DOT net
Date: Mon, 01 Dec 2008 09:14:54 -0500
After some failed attemps to report a bug using the BTS
(http://bugs.bacula.org/) I decided to write to your the mail list. I 
will follow
the indications explained in http://www.bacula.org/en/?page=bugs.

----

*Operating System*: debian/GNU Linux [Lenny (kernel 2.6.26-1-686)]

*Version of Bacula*: 2.4.2 (26 July 2008)

*Tape drive*: HP StorageWorks MSL6000 series

*Description of the problem*:
I have a HP StorageWorks MSL6000 series tape library. My bacula-sd (and
the rest of the daemons) are running in a blade server connected to the
tape library trough a FibreChannel SAN. My problem is that the btape
"fill" test fails. Find below the output of all commands I've tried and
some other useful information.

*Results of "_test_" btape test*:

*test

<output command="test">
=== Write, rewind, and re-read test ===

I'm going to write 1000 records and an EOF
then write 1000 records and an EOF, then rewind,
and re-read the data to verify that it is correct.

This is an *essential* feature ...

btape: btape.c:831 Wrote 1000 blocks of 64412 bytes.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:847 Wrote 1000 blocks of 64412 bytes.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:856 Rewind OK.
1000 blocks re-read correctly.
Got EOF on tape.
1000 blocks re-read correctly.
=== Test Succeeded. End Write, rewind, and re-read test ===


=== Write, rewind, and position test ===

I'm going to write 1000 records and an EOF
then write 1000 records and an EOF, then rewind,
and position to a few blocks and verify that it is correct.

This is an *essential* feature ...

btape: btape.c:943 Wrote 1000 blocks of 64412 bytes.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:959 Wrote 1000 blocks of 64412 bytes.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:968 Rewind OK.
Reposition to file:block 0:4
Block 5 re-read correctly.
Reposition to file:block 0:200
Block 201 re-read correctly.
Reposition to file:block 0:999
Block 1000 re-read correctly.
Reposition to file:block 1:0
Block 1001 re-read correctly.
Reposition to file:block 1:600
Block 1601 re-read correctly.
Reposition to file:block 1:999
Block 2000 re-read correctly.
=== Test Succeeded. End Write, rewind, and re-read test ===



=== Append files test ===

This test is essential to Bacula.

I'm going to write one record  in file 0,
                  two records in file 1,
            and three records in file 2

btape: btape.c:475 Rewound "Drive-0" (/dev/nst0)
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:372 open device "Drive-0" (/dev/nst0): OK
btape: btape.c:475 Rewound "Drive-0" (/dev/nst0)
btape: btape.c:1097 Now moving to end of medium.
btape: btape.c:526 Moved to end of medium.
We should be in file 3. I am at file 3. This is correct!

Now the important part, I am going to attempt to append to the tape.

btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:475 Rewound "Drive-0" (/dev/nst0)
Done appending, there should be no I/O errors

Doing Bacula scan of blocks:
1 block of 64448 bytes in file 1
End of File mark.
2 blocks of 64448 bytes in file 2
End of File mark.
3 blocks of 64448 bytes in file 3
End of File mark.
1 block of 64448 bytes in file 4
End of File mark.
Total files=4, blocks=7, bytes = 451,136
End scanning the tape.
We should be in file 4. I am at file 4. This is correct!

The above Bacula scan should have output identical to what follows.
Please double check it ...
=== Sample correct output ===
1 block of 64448 bytes in file 1
End of File mark.
2 blocks of 64448 bytes in file 2
End of File mark.
3 blocks of 64448 bytes in file 3
End of File mark.
1 block of 64448 bytes in file 4
End of File mark.
Total files=4, blocks=7, bytes = 451,136
=== End sample correct output ===

If the above scan output is not identical to the
sample output, you MUST correct the problem
or Bacula will not be able to write multiple Jobs to
the tape.


=== Write, backup, and re-read test ===

I'm going to write three records and an EOF
then backup over the EOF and re-read the last record.
Bacula does this after writing the last block on the
tape to verify that the block was written correctly.

This is not an *essential* feature ...

btape: btape.c:475 Rewound "Drive-0" (/dev/nst0)
btape: btape.c:710 Wrote first record of 64412 bytes.
btape: btape.c:721 Wrote second record of 64412 bytes.
btape: btape.c:732 Wrote third record of 64412 bytes.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:748 Backspaced over EOF OK.
btape: btape.c:753 Backspace record OK.
btape: btape.c:771
Block re-read correct. Test succeeded!
=== End Write, backup, and re-read test ===



=== Forward space files test ===

This test is essential to Bacula.

I'm going to write five files then test forward spacing

btape: btape.c:475 Rewound "Drive-0" (/dev/nst0)
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:1577 Wrote one record of 64412 bytes.
btape: btape.c:1579 Wrote block to device.
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:475 Rewound "Drive-0" (/dev/nst0)
btape: btape.c:1311 Now forward spacing 1 file.
We should be in file 1. I am at file 1. This is correct!
btape: btape.c:1323 Now forward spacing 2 files.
We should be in file 3. I am at file 3. This is correct!
btape: btape.c:475 Rewound "Drive-0" (/dev/nst0)
btape: btape.c:1336 Now forward spacing 4 files.
We should be in file 4. I am at file 4. This is correct!

btape: btape.c:1354 Now forward spacing 1 more file.
We should be in file 5. I am at file 5. This is correct!

=== End Forward space files test ===


Ah, I see you have an autochanger configured.
To test the autochanger you must have a blank tape
that I can write on in Slot 1.

Do you wish to continue with the Autochanger test? (y/n): y


=== Autochanger test ===

3301 Issuing autochanger "loaded" command.
Slot 1 loaded. I am going to unload it.
3302 Issuing autochanger "unload 1 0" command.
unload status=OK 0
3303 Issuing autochanger "load 1 0" command.
3303 Autochanger "load 1 0" status is OK.
btape: btape.c:372 open device "Drive-0" (/dev/nst0): OK
btape: btape.c:1241 Rewound "Drive-0" (/dev/nst0)
btape: btape.c:1248 Wrote EOF to "Drive-0" (/dev/nst0)

The test autochanger worked!!
</output>


I got this in dmesg output after running the btape test command:

[ 1471.898470] st0: Sense Key : Aborted Command [current]
[ 1471.898470] st0: Add. Sense: Information unit iuCRC error detected
[ 1476.199199] st0: Sense Key : Aborted Command [current]
[ 1476.199364] st0: Add. Sense: Information unit iuCRC error detected
[ 1479.457267] st0: Sense Key : Aborted Command [current]
[ 1479.457431] st0: Add. Sense: Information unit iuCRC error detected
[ 1482.704076] st0: Sense Key : Aborted Command [current]
[ 1482.704240] st0: Add. Sense: Information unit iuCRC error detected*


*Results of "_qfill_" btape test* (with 1000 blocks of bytes):

*qfill
btape: btape.c:2431 Test writing blocks of 64512 bytes to tape.
How many blocks do you want to write? (1000):
btape: btape.c:475 Rewound "Drive-0" (/dev/nst0)
btape: btape.c:2448 Begin writing 1000 Bacula blocks to tape ...
++++++++++
btape: btape.c:505 Wrote 1 EOF to "Drive-0" (/dev/nst0)
btape: btape.c:475 Rewound "Drive-0" (/dev/nst0)
1000 blocks of 64448 bytes in file 1
End of File mark.
Total files=1, blocks=1000, bytes = 64,448,000



*Results of "_qfill_" btape test* (with 1001 blocks of bytes, and here 
comes the error, when i try to write more tan 64Mb to the tape):

*qfill
btape: btape.c:2431 Test writing blocks of 64512 bytes to tape.
How many blocks do you want to write? (1000): 1001
btape: btape.c:475 Rewound "Drive-0" (/dev/nst0)
btape: btape.c:2448 Begin writing 1001 Bacula blocks to tape ...
+28-Nov 15:33 btape: ERROR in record.c:253 Failed ASSERT: block->binbuf 
== (uint32_t) (block->bufp - block->buf)
28-Nov 15:33 btape: Fatal Error because: Bacula interrupted by signal 
11: Segmentation violation
Kaboom! btape, btape got signal 11 - Segmentation violation. Attempting 
traceback.
Kaboom! exepath=/root
Calling: /root/btraceback /root/btape 7106
execv: /root/btraceback failed: ERR=No such file or directory
Traceback complete, attempting cleanup ...
#




*Results of "fill" btape test* (It returns the same error of above)

*fill

This command simulates Bacula writing to a tape.
It requires either one or two blank tapes, which it
will label and write.

If you have an autochanger configured, it will use
the tapes that are in slots 1 and 2, otherwise, you will
be prompted to insert the tapes when necessary.

It will print a status approximately
every 322 MB, and write an EOF every 3.2 GB.  If you have
selected the simple test option, after writing the first tape
it will rewind it and re-read the last block written.

If you have selected the multiple tape test, when the first tape
fills, it will ask for a second, and after writing a few more
blocks, it will stop.  Then it will begin re-reading the
two tapes.

This may take a long time -- hours! ...

Do you want to run the simplified test (s) with one tape
or the complete multiple tape (m) test: (s/m) s
Simple test (single tape) selected.
Wrote Volume label for volume "TestVolume1".
Wrote Start of Session label.
15:29:29 Begin writing Bacula records to tape ...
28-Nov 15:29 btape JobId 0: Error: block.c:568 Write error at 1:2 on 
device "Drive-0" (/dev/nst0). ERR=Input/output error.
28-Nov 15:29 btape JobId 0: Error: block.c:318 Volume data error at 
1:4294967295!
Block checksum mismatch in block=4018 len=64512: calc=a0cac187
blk=a16bca86
28-Nov 15:29 btape JobId 0: Error: Re-read last block at EOT failed. 
ERR=block.c:318 Volume data error at 1:4294967295!
Block checksum mismatch in block=4018 len=64512: calc=a0cac187
blk=a16bca86
btape: btape.c:2349 Last block at: 1:1 this_dev_block_num=2
btape: btape.c:2383 End of tape 1:-1. VolumeCapacity=129,024. Write
rate = 6.5 KB/s
28-Nov 15:29 btape: ERROR in record.c:332 Failed ASSERT: remlen >= 
WRITE_RECHDR_LENGTH
28-Nov 15:29 btape: Fatal Error because: Bacula interrupted by signal 
11: Segmentation violation
Kaboom! btape, btape got signal 11 - Segmentation violation. Attempting 
traceback.
Kaboom! exepath=/root
Calling: /root/btraceback /root/btape 6197
execv: /root/btraceback failed: ERR=No such file or directory
Traceback complete, attempting cleanup ...
#

-- 
Some days ago I had reported this issue on IRC. I maded a pastebin with
the output of dmesg after running btape test. On that moment I was
getting a kernel oops, I was using 2.6.18.

I recently upgraded to 2.6.26. This time I got no oops but bacula keeps
failing the same it did with 2.6.18.

I'm curious about the cause of the failed assert.

Regards,
Willians Vivanco






---------------------------------------
    Red Telematica de Salud - Cuba
          CNICM - Infomed

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users

<Prev in Thread] Current Thread [Next in Thread>