Amanda-Users

Drive not sensing EOF, EOT???

2002-12-26 14:36:09
Subject: Drive not sensing EOF, EOT???
From: Chris Dahn <chris.dahn AT computer DOT org>
To: amanda-users AT amanda DOT org
Date: Thu, 26 Dec 2002 14:03:50 -0500
  I think this may be what is causing my infinitely running amrecover 
problem. It seems the drive isn't sensing EOF and/or EOT correctly. Or 
something along those lines, since the tape stops at EOT, but dd keeps right 
on going.

  I really hope someone can set me straight on this. Here's the pertinent 
information: I'm running Linux kernel 2.4.19 and mt-st version 0.7.  I have a 
100GB SDLT drive. I've had this problem on kernel versions back to 2.4.16, I 
believe, although this is the first actual controlled experiment I've run. I 
recently completely reinstalled the server since we needed to upgrade 
software, and I've been getting really flaky amrecover problems, so I just 
put the newest kernel available on it. I was hoping the new kernel would fix 
my amrecover problems, but they persist. I decided to run the following 
experiment to determine if the problem is amanda, the tape drive, or some 
other odd software problem. Dear God, somebody help me.

  Here's my test file:
7176 Dec 21 04:12 sendsize.20021221040023.debug

  First, let's see what the drive says:
$mt -f /dev/sdlt-norw compression 0
$mt -f /dev/sdlt-norw status
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 0 bytes. Density code 0x48 (no translation).
Soft error count since last status=0
General status bits on (41010000):
 BOT ONLINE IM_REP_EN

  I execute the following command:
$dd if=sendsize.20021221040023.debug of=/dev/sdlt-norw
14+1 records in
14+1 records out

  This will not rewind the tape when it's done. Let's see where the tape 
stopped:
$mt -f /dev/sdlt-norw tell
At block 16.
$mt -f /dev/sdlt-norw status
SCSI 2 tape drive:
File number=1, block number=0, partition=0.
Tape block size 0 bytes. Density code 0x48 (no translation).
Soft error count since last status=0
General status bits on (81010000):
 EOF ONLINE IM_REP_EN
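
  The numbers above can be cross-checked with a little arithmetic. This is a 
minimal sketch, assuming dd used its default 512-byte block size (no bs= was 
given) and that the drive counts the filemark written on close as one logical 
position. The function name dd_records is made up for illustration.

```shell
# dd_records SIZE [BS]: print the "full+partial" record count that dd
# would report when copying SIZE bytes with block size BS.
dd_records() {
  size=$1
  bs=${2:-512}                      # dd's default block size is 512 bytes
  full=$(( size / bs ))             # records that fill a whole block
  if [ $(( size % bs )) -gt 0 ]; then
    partial=1                       # one short record holds the remainder
  else
    partial=0
  fi
  echo "${full}+${partial}"
}

dd_records 7176                     # the 7176-byte test file; prints 14+1
```

  That is 15 tape blocks in total (14 full, 1 short). If the filemark is 
counted as well, the position after the write would be 16, which matches 
what "mt tell" reported.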

  So far so good. I rewind the tape:
$mt -f /dev/sdlt-norw rewind

  I check the block count, just to be sure:
$mt -f /dev/sdlt-norw tell
At block 0.

  I then execute this command:
$dd if=/dev/sdlt-norw of=blah

  This will not rewind the tape when it's done. After a while, I CTRL-C this 
command, and dd states the following:
61+0 records in
61+0 records out

  Looking at the file 'blah', I see:
$ls -l blah
31232 Dec 26 06:42 blah

  As it should be after reading 61 records. However, the tape is at:
$mt -f /dev/sdlt-norw tell
At block 16.

  This is odd: it read a lot more than 16 blocks from the tape, or so dd would 
lead me to believe. After all, what is in that 31232 byte file 'blah'?  
Looking at the file, we see that when it gets to the end of the file on tape, 
the tape drive stops, but dd keeps writing the last block that it read over 
and over and over to 'blah':
Total bytes written: 52354334720 (49GB, 100MB/s)
.....
sendsize: getting size via gnutar for /export/ext_raid level 1
sendsize: spawning /usr/libexec/runtar in pipeline
sendsize: argument list: /bin/tar --create --file /dev/null --directory 
/export/ext_raid --one-file-system --listed-incremental 
/var/lib/amanda/gnutar-lists/snoopy.cs.drexel.edu_export_ext_raid_1.new 
--sparse --ignore-failed-read --totals .
Total bytes written: 4605429760 (4.3GB, 41MB/s)
.....
sendsize: pid 16484 finish time Sat Dec 21 04:12:08 2002
es written: 52354334720 (49GB, 100MB/s)
.....
sendsize: getting size via gnutar for /export/ext_raid level 1
sendsize: spawning /usr/libexec/runtar in pipeline
sendsize: argument list: /bin/tar --create --file /dev/null --directory 
/export/ext_raid --one-file-system --listed-incremental 
/var/lib/amanda/gnutar-lists/snoopy.cs.drexel.edu_export_ext_raid_1.new 
--sparse --ignore-failed-read --totals .
Total bytes written: 4605429760 (4.3GB, 41MB/s)
.....
sendsize: pid 16484 finish time Sat Dec 21 04:12:08 2002
es written: 52354334720 (49GB, 100MB/s)
.....
sendsize: getting size via gnutar for /export/ext_raid level 1
sendsize: spawning /usr/libexec/runtar in pipeline
sendsize: argument list: /bin/tar --create --file /dev/null --directory 
/export/ext_raid --one-file-system --listed-incremental 
/var/lib/amanda/gnutar-lists/snoopy.cs.drexel.edu_export_ext_raid_1.new 
--sparse --ignore-failed-read --totals .
Total bytes written: 4605429760 (4.3GB, 41MB/s)
.....
sendsize: pid 16484 finish time Sat Dec 21 04:12:08 2002

  It will continue to do this FOREVER.  This is okay for tar files, since tar 
understands where to stop, and does so.  This also means that I can tar/untar 
to and from a tape without a problem.  I imagine that tar knows when to stop, 
and so it closes the connection to the device, which is why I get my files 
from amrecover properly.
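
  One way to pin down the repetition in a file like 'blah' is to checksum it 
one record at a time and look for checksums that occur more than once. This 
is only a sketch: the function names block_sums and repeated_blocks are made 
up, and it assumes the repeated data is aligned to dd's 512-byte records. If 
the repeated unit is not aligned that way, the checksums will not match even 
though the text visibly repeats.

```shell
# block_sums FILE [BS]: print "index checksum" for each BS-byte block of FILE.
block_sums() {
  file=$1
  bs=${2:-512}
  size=$(wc -c < "$file")
  blocks=$(( (size + bs - 1) / bs ))   # round up to include a short last block
  i=0
  while [ "$i" -lt "$blocks" ]; do
    # Extract block i and keep only the checksum column of cksum's output.
    sum=$(dd if="$file" bs="$bs" skip="$i" count=1 2>/dev/null |
          cksum | awk '{ print $1 }')
    echo "$i $sum"
    i=$(( i + 1 ))
  done
}

# repeated_blocks FILE [BS]: print "count checksum" for every checksum
# that occurs more than once in FILE.
repeated_blocks() {
  block_sums "$1" "${2:-512}" |
    awk '{ seen[$2]++ }
         END { for (s in seen) if (seen[s] > 1) print seen[s], s }'
}
```

  Running "repeated_blocks blah" would list any record that shows up more 
than once. Independently of block alignment, "cmp sendsize.20021221040023.debug 
blah" reports the first byte at which the copy stops matching the original, 
or that the original ended first if 'blah' begins with an exact copy of it.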

  So, now I wanted to make sure all of my end markers are being written, so I 
executed the following:
$mt -f /dev/sdlt-norw rewind
$dd if=/dev/sdlt-norw of=blah2 count=30
30+0 records in
30+0 records out
$mt -f /dev/sdlt-norw tell
At block 16.

  Looking at the file, I notice that it did exactly the same thing that the 
previous dd did.  Does this imply that the EOT marker is being properly 
written, and so the tape drive is stopping at the right place, but something 
isn't being told to dd so it knows to stop?  Am I just doing something 
completely wrong?  I also tried the same test writing 2 files to the tape 
(details omitted for brevity's sake). There is a single NULL character that 
is written between the files, which should properly signify EOF, but dd 
keeps right on reading beyond that to EOT, except that it keeps repeating 
the last block over and over again once it hits EOT.
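
  For reference, a reader normally learns that it has reached end of file 
when a read returns zero bytes, and on a tape that is how a filemark is 
expected to show up. The sketch below makes that check explicit: it reads one 
record at a time through a single open file descriptor, stops at the first 
empty read, and gives up after a fixed number of records so it can never run 
forever. The function name copy_until_eof and the default cap of 1000 records 
are made up for illustration, and this has only been reasoned through against 
regular files, not against this drive.

```shell
# copy_until_eof SRC DST [BS] [MAX]: copy SRC to DST one record at a time.
# Stops at the first zero-byte read or after MAX records, whichever comes
# first, and prints the number of records copied.
copy_until_eof() {
  src=$1
  dst=$2
  bs=${3:-512}
  max=${4:-1000}                    # hard cap so the loop always terminates
  rec=$(mktemp)                     # holds the most recently read record
  : > "$dst"                        # start with an empty output file
  exec 3< "$src"                    # open once so the read position persists
  n=0
  while [ "$n" -lt "$max" ]; do
    dd bs="$bs" count=1 <&3 > "$rec" 2>/dev/null
    [ -s "$rec" ] || break          # empty record: treat as end of file
    cat "$rec" >> "$dst"
    n=$(( n + 1 ))
  done
  exec 3<&-                         # close the descriptor
  rm -f "$rec"
  echo "$n"
}
```

  Called as "copy_until_eof /dev/sdlt-norw blah3" after a rewind, this would 
be expected to print 15 for the test file above if the drive and driver 
report the filemark as an empty read. Hitting the cap instead would suggest 
that the zero-byte read never reaches the reading program.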

  Dear God, somebody help me.

-Chris
