Networker

Subject: Re: [Networker] SDLT320 unreadable tapes
From: John Herlihy <johnh AT XSIDATA.COM DOT AU>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Mon, 19 Jan 2004 22:12:04 +1100
nah - I'm using /dev/ntape/tape#_d1, which is the non-rewind device. If that were 
the cause then all tapes would be affected.

Also - a weird thing is that the suspect tapes will mount fine and accept backups 
fine, but as soon as you remove them from the media db (or try to do a 
restore), you're unable to read the header information on the tape.
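For anyone double-checking their own devices: the names in this thread follow the usual conventions (Tru64 v5 puts no-rewind devices under /dev/ntape and rewind devices under /dev/tape, while rmt-style names use a trailing "n" for no-rewind). A rough sketch of that mapping - the conventions are assumed standard, not verified against every OS release:

```shell
# Rough classifier for the tape device names seen in this thread.
# Tru64 v5: /dev/ntape/* = no-rewind, /dev/tape/* = rewind.
# rmt-style: a trailing "n" (as in /dev/rmt/0cbn) = no-rewind.
classify_tape_dev() {
    case "$1" in
        /dev/ntape/*) echo "no-rewind" ;;
        /dev/tape/*)  echo "rewind" ;;
        /dev/rmt/*n)  echo "no-rewind" ;;
        /dev/rmt/*)   echo "rewind" ;;
        *)            echo "unknown" ;;
    esac
}

classify_tape_dev /dev/ntape/tape0_d1   # no-rewind
classify_tape_dev /dev/rmt/0cbn         # no-rewind
classify_tape_dev /dev/rmt/0cb          # rewind
```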

        -----Original Message----- 
        From: Davina Treiber [mailto:Treiber AT hotpop DOT com] 
        Sent: Mon 19/01/2004 6:53 PM 
        To: Legato NetWorker discussion; John Herlihy 
        Cc: 
        Subject: Re: [Networker] SDLT320 unreadable tapes
        
        

        I haven't worked with Tru64 for a while so the device naming conventions
        aren't fresh in my mind, but is it possible that in some way you are
        using rewind devices? That would account for the abnormally high amounts
        of data written to some volumes, and could also account for the
        corruption you are seeing. Of course if this is the case it's bad news
        in terms of recovering your data. Just a thought...
        
        John Herlihy wrote:
        > Hi,
        > 
        > sorry for the length of this email, but figured I'd chuck all the info
        > in there now - I am seeing an issue where it looks like the Networker
        > headers on the tape are incomplete.
        > 
        > This is the environment:
        > Tru64 v5.1A
        > Networker Power Edition v6.1.3
        > SDLT320 drives
        > 
        > When trying to use "scanner -i <device>" to scan a tape back in, it:
        > 1 - prompts you to enter the name of the volume.
        > 2 - complains that there is no pool named `'
        > 3 - fails in a short amount of time (ie about 5-10 secs)
        > 
        > Here is the scanner output:
        > =================================================
        > osun5680[/]# scanner -s nsr01 -vim /dev/rmt/0cbn
        > scanner: using 'rd=server1:/dev/rmt/0cbn' as the device name
        > scanner: Opened /dev/rmt/0cbn for read
        > scanner: Rewinding...
        > scanner: Rewinding done
        > scanner: Reading the label...
        > scanner: Reading the label done
        > scanner: SYSTEM error: Tape label read: Bad file number
        > scanner: SYSTEM error: Tape label read: Bad file number
        > scanner: scanning for valid records...
        > scanner: read: 131072 bytes
        > scanner: read: 131072 bytes
        > scanner: Found valid record:
        > scanner: volume id 2434907393
        > scanner: file number 110
        > scanner: record number 5930
        > scanner: Enter the volume's name: SU0026
        > scanner: volume name `SU0026'
        > scanner: scanning sdlt320 tape SU0026 on rd=server1:/dev/rmt/0cbn
        > scanner: volume id 2434907393 record size 131072
        > created 1/01/70 10:00:00 expires 1/01/70 10:00:00
        > scanner: adding sdlt320 tape SU0026 to pool
        > scanner: RAP error: There is no pool named `'.
        > scanner: create pool manually after scanner; continuing...
        > scanner: Rewinding...
        > scanner: Rewinding done
        > scanner: setting position from fn 0, rn 0 to fn 2, rn 0
        > scanner: Opened /dev/rmt/0cbn for read
        > scanner: unexpected file number, wanted 2 got 112
        > scanner: adjusting file number from 2 to 112
        > scanner: scanning file 112, record 0
        > scanner: unexpected volume id, wanted 2434907393 got 2434907393
        > scanner: Opened /dev/rmt/0cbn for read
        > scanner: done with sdlt320 tape SU0026
        > scanner: Rewinding...
        > scanner: Rewinding done
        > =================================================
        > 
        > We were able to obtain the header from the tape via the command:
        > dd if=/dev/rmt/0cbn of=/tmp/tapeheader bs=128k count=1
        > 
        > ..and then view it with the command:
        > strings /tmp/tapeheader
        > 
        > Here is the output from 2 problem tapes:
        > =================================================
        > For volume SU0026:
        > VOL1SU0026NETWORKER                                           3
        > setting position from fn %lu, rn %lu to fn %lu,
        >
        > For volume SU0116:
        > VOL1SU0116NETWORKER                                           3
        > setting position from fn %lu, rn %lu to fn %lu,
        >
        > =================================================
        > 
        > This is what the header of a good tape looks like:
        > =================================================
        > VOL1SU0295NETWORKER                                           3
        > setting position from fn %lu, rn %lu to fn %lu,
        > C%D2
        > SU0295
        > volume pool
        > SCRATCH
        > =================================================
        > 
        > ALSO - I'm seeing that an abnormal amount of data is being written
        > to these tapes via the "mminfo -m" output. I don't know about SU0026
        > as it's already been deleted from the media db, but SU0116 has 1202GB
        > on it!! I've looked through the mminfo output and found other tapes
        > which have between 500GB-1848GB!!!
        > 
        > I've checked four of these tapes which contained 1848GB, 921GB,
        > 671GB & 1700GB respectively, and only the 671GB tape was able to
        > be read.
        > 
        > I used "tcopy" to get a listing of the tapes structures, and the one
        > that worked had 2 x 32KB header files before changing to 128KB data
        > blocks while the other 3 only had 128KB blocks.
        > 
        > I'm investigating driver versions at the moment, but can anyone
        > think of what could be causing this? There doesn't appear to be
        > any common trigger (Windows & Unix systems are affected across
        > multiple drives... firmware has been upgraded, etc).
        >
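The VOL1 lines in the dumps above follow the ANSI tape-label layout: characters 1-4 are the label identifier and characters 5-10 the volume serial number. A minimal sketch for pulling the serial out of a header captured with the dd command shown earlier - the field offsets come from the ANSI label format, and the function name is an invention for illustration:

```shell
# Extract the volume serial from an ANSI-style VOL1 record, e.g. the
# first line of "strings /tmp/tapeheader" output shown above.
# Layout: chars 1-4 = label id ("VOL1"), chars 5-10 = volume serial.
vol1_serial() {
    label=$(printf '%s\n' "$1" | cut -c1-4)
    if [ "$label" != "VOL1" ]; then
        echo "not a VOL1 label" >&2
        return 1
    fi
    printf '%s\n' "$1" | cut -c5-10
}

vol1_serial "VOL1SU0026NETWORKER"   # prints SU0026
```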