Re: [Networker] SDLT320 unreadable tapes

2004-01-21 05:53:40
From: "Mark Bradshaw (BTOpenWorld)" <notthehoople AT BTOPENWORLD DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Tue, 20 Jan 2004 20:32:46 +0000

Hi John,

I'm a bit confused here. Your scanner output shows you are using /dev/rmt/0cbn
as a remote device on a Storage Node, but you don't mention a Storage Node in
your environment. Is /dev/rmt/0cbn the problem device, and if so, are you
running scanner on a Solaris Storage Node?

Ah - looking a bit closer, the prompt you are running scanner from is
'osun5680', so I guess it is a Solaris box. Can you flesh out your
environment for us please?

Also, could it be that you are sharing the SDLT drives between Solaris and
Tru64 on a SAN and are suffering from SCSI resets?

Some random thoughts!

Cheers

Mark

> Nah - I'm using /dev/ntape/tape#_d1, which is the non-rewind device. If
> rewind devices were the cause, then all tapes would be affected.
>
> Also - a weird thing is that the suspect tapes will mount fine and accept
> backups fine, but as soon as you remove them from the media db (or try to do
> a restore), you're unable to read the header information on the tape.
>
> -----Original Message-----
> From: Davina Treiber [mailto:Treiber AT hotpop DOT com]
> Sent: Mon 19/01/2004 6:53 PM
> To: Legato NetWorker discussion; John Herlihy
> Cc:
> Subject: Re: [Networker] SDLT320 unreadable tapes
>
>
>
> I haven't worked with Tru64 for a while so the device naming conventions
> aren't fresh in my mind, but is it possible that in some way you are
> using rewind devices? That would account for the abnormally high amounts
> of data written to some volumes, and could also account for the
> corruption you are seeing. Of course if this is the case it's bad news
> in terms of recovering your data. Just a thought...
>
> John Herlihy wrote:
>> Hi,
>>
>> sorry for the length of this email, but figured I'd chuck all the info
>> in there now - I am seeing an issue where it looks like the Networker
>> headers on the tape are incomplete.
>>
>> This is the environment:
>> Tru64 v5.1A
>> Networker Power Edition v6.1.3
>> SDLT320 drives
>>
>> When I use "scanner -i <device>" to scan a tape back in, it:
>> 1 - prompts for the name of the volume.
>> 2 - complains that there is no pool named `'
>> 3 - fails after a short time (i.e. about 5-10 secs)
>>
>> Here is the scanner output:
>> =================================================
>> osun5680[/]# scanner -s nsr01 -vim /dev/rmt/0cbn
>> scanner: using 'rd=server1:/dev/rmt/0cbn' as the device name
>> scanner: Opened /dev/rmt/0cbn for read
>> scanner: Rewinding...
>> scanner: Rewinding done
>> scanner: Reading the label...
>> scanner: Reading the label done
>> scanner: SYSTEM error: Tape label read: Bad file number
>> scanner: SYSTEM error: Tape label read: Bad file number
>> scanner: scanning for valid records...
>> scanner: read: 131072 bytes
>> scanner: read: 131072 bytes
>> scanner: Found valid record:
>> scanner: volume id 2434907393
>> scanner: file number 110
>> scanner: record number 5930
>> scanner: Enter the volume's name: SU0026
>> scanner: volume name `SU0026'
>> scanner: scanning sdlt320 tape SU0026 on rd=server1:/dev/rmt/0cbn
>> scanner: volume id 2434907393 record size 131072
>> created 1/01/70 10:00:00 expires 1/01/70 10:00:00
>> scanner: adding sdlt320 tape SU0026 to pool
>> scanner: RAP error: There is no pool named `'.
>> scanner: create pool manually after scanner; continuing...
>> scanner: Rewinding...
>> scanner: Rewinding done
>> scanner: setting position from fn 0, rn 0 to fn 2, rn 0
>> scanner: Opened /dev/rmt/0cbn for read
>> scanner: unexpected file number, wanted 2 got 112
>> scanner: adjusting file number from 2 to 112
>> scanner: scanning file 112, record 0
>> scanner: unexpected volume id, wanted 2434907393 got 2434907393
>> scanner: Opened /dev/rmt/0cbn for read
>> scanner: done with sdlt320 tape SU0026
>> scanner: Rewinding...
>> scanner: Rewinding done
>> =================================================
>>
>> We were able to obtain the header from the tape via the command:
>> dd if=/dev/rmt/0cbn of=/tmp/tapeheader bs=128k count=1
>>
>> ..and then view it with the command:
>> strings /tmp/tapeheader
>>
>> Here is the output from 2 problem tapes:
>> =================================================
>> For volume SU0026:
>> VOL1SU0026NETWORKER                                           3
>> setting position from fn %lu, rn %lu to fn %lu,
>>
>> For volume SU0116:
>> VOL1SU0116NETWORKER                                           3
>> setting position from fn %lu, rn %lu to fn %lu,
>>
>> =================================================
>>
>> This is what the header of a good tape looks like:
>> =================================================
>> VOL1SU0295NETWORKER                                           3
>> setting position from fn %lu, rn %lu to fn %lu,
>> C%D2
>> SU0295
>> volume pool
>> SCRATCH
>> =================================================
>>
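The good/bad header comparison above can be scripted. Below is a minimal
Python sketch, assuming the first 128KB of the tape has already been dumped to
a file with dd as shown. The function names are hypothetical, and the choice of
"VOL1" and "volume pool" as markers is an assumption based only on the three
dumps quoted here, not on any documented NetWorker label layout.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII of at least min_len characters.

    Similar in spirit to strings(1), but not a byte-for-byte equivalent.
    """
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def header_looks_complete(data: bytes) -> bool:
    """Heuristic: a dump counts as complete if it has a VOL1 label string
    and also a "volume pool" string (assumed marker, see lead-in)."""
    found = printable_strings(data)
    has_vol1 = any(s.startswith("VOL1") for s in found)
    has_pool = any("volume pool" in s for s in found)
    return has_vol1 and has_pool
```

Called on the bytes of /tmp/tapeheader, this would be expected to return False
for dumps like SU0026 and SU0116 and True for one like SU0295, provided the
strings output shown above is representative.
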
>> Also, the "mminfo -m" output shows an abnormal amount of data written to
>> these tapes. I don't know about SU0026 as it's already been deleted from
>> the media db, but SU0116 has 1202GB on it!! I've looked through the mminfo
>> output and found other tapes which have between 500GB and 1848GB!!!
>>
>> I've checked four of these tapes, which contained 1848GB, 921GB, 671GB &
>> 1700GB respectively, and only the 671GB tape could be read.
>>
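To pick out the over-full volumes systematically, something like the Python
sketch below could be used. It assumes the per-volume written amounts have
already been copied out of the mminfo output by hand into a mapping (no mminfo
parsing is attempted). The 160GB figure is the native capacity of an SDLT 320
cartridge; the 2:1 ratio is only the nominal compression figure, and the
function name and the volume name EXAMPLE1 are hypothetical.

```python
SDLT320_NATIVE_GB = 160  # native (uncompressed) capacity of an SDLT 320 cartridge


def suspicious_volumes(written_gb: dict[str, float],
                       max_ratio: float = 2.0) -> dict[str, float]:
    """Return the volumes whose written amount exceeds native capacity
    multiplied by max_ratio (an assumed upper bound on compression)."""
    limit = SDLT320_NATIVE_GB * max_ratio
    return {vol: gb for vol, gb in written_gb.items() if gb > limit}


# SU0116 and its 1202GB come from the message; EXAMPLE1 is made up.
flagged = suspicious_volumes({"SU0116": 1202, "EXAMPLE1": 150})
```

With the default ratio the limit is 320GB, so the 671GB tape that could still
be read would be flagged too. A flag is a reason to check the tape, not proof
that it is damaged.
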
>> I used "tcopy" to get a listing of the tapes' structures. The one that
>> could be read had 2 x 32KB header files before changing to 128KB data
>> blocks, while the other 3 had only 128KB blocks.
>>
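The tcopy observation amounts to a simple rule, sketched below in Python. It
assumes the record sizes at the start of each tape have been read off the
tcopy listing by hand, in tape order (tcopy output itself is not parsed). The
function name and defaults are hypothetical, and treating "2 x 32KB, then
128KB" as the healthy layout rests on the single readable tape described
above.

```python
def starts_with_label_records(record_sizes: list[int],
                              label_size: int = 32 * 1024,
                              label_count: int = 2,
                              data_size: int = 128 * 1024) -> bool:
    """True if the tape begins with label_count records of label_size
    followed by a record of data_size."""
    if len(record_sizes) <= label_count:
        return False  # not enough records to see both label and data
    head = record_sizes[:label_count]
    return (all(size == label_size for size in head)
            and record_sizes[label_count] == data_size)


K = 1024
readable_layout = [32 * K, 32 * K, 128 * K, 128 * K]   # like the 671GB tape
unreadable_layout = [128 * K, 128 * K, 128 * K]        # like the other three
```

Here starts_with_label_records(readable_layout) is True and
starts_with_label_records(unreadable_layout) is False.
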
>> I'm investigating driver versions at the moment, but can anyone think of
>> what could be causing this? There doesn't appear to be any common
>> trigger (Windows & Unix systems are affected across multiple drives...
>> firmware has been upgraded, etc).
>>
>
>
>
