Veritas-bu

[Veritas-bu] DLT drives going down

2006-01-04 09:48:07
Subject: [Veritas-bu] DLT drives going down
From: simon.weaver AT astrium.eads DOT net (WEAVER, Simon)
Date: Wed, 4 Jan 2006 14:48:07 -0000


Actually, just one more thing I noticed: you said they checked the cables,
but what about REPLACING them? I say this because of how long it was working
(June to Oct), and we are now in Jan, so I think it's safe to say it sounds
like H/W.

Simon Weaver
Technical Support
Windows Domain Administrator 

EADS Astrium
Tel: 02392-708598 

Email: Simon.Weaver AT Astrium.eads DOT net 

-----Original Message-----
From: Barber, Layne (Contractor) [mailto:layne.barber.ctr AT csd.disa DOT mil] 
Sent: 04 January 2006 14:34
To: WEAVER, Simon; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] DLT drives going down


They have upgraded the FW to the latest/greatest and checked cables. I agree
on the polling.

  _____  

From: WEAVER, Simon [mailto:simon.weaver AT astrium.eads DOT net] 
Sent: Wednesday, January 04, 2006 08:21
To: Barber, Layne (Contractor); veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] DLT drives going down


Hmmmmm, what about firmware for the SDLT tape drives?
Any loose cables / connections, or is it possible to change the SCSI
connectors?
 
Not too sure why they feel that the software polling a drive could cause a
problem. If anything, I would say polling a device is probably good, as
NetBackup confirms it can see it!
 
The thing that makes me wonder if it's a cable / firmware issue is the
comment "MEDIUM NOT PRESENT".
Thanks

Simon Weaver
Technical Support
Windows Domain Administrator 

EADS Astrium
Tel: 02392-708598 

Email: Simon.Weaver AT Astrium.eads DOT net 

-----Original Message-----
From: Barber, Layne (Contractor) [mailto:layne.barber.ctr AT csd.disa DOT mil] 
Sent: 04 January 2006 14:10
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] DLT drives going down


We have an issue with drives randomly going down every night. NBU 5.0 MP5,
HP-UX 11.11, STK L180 w/ STK 3400 SCSI bridge.
 
For some reason, one or more drives go down at random every night when
backups run. Different tapes and different drives. Backups will be running
fine and then drives begin to go down. These are SDLT320 drives. Once they go
down, you can't use robtest to move the tapes (medium not present error) or
use the robtest unload command (device not present).
 
If we power cycle the SCSI bridge, we can talk to the drives and do whatever
we want. STK is claiming that there is something coming from the host that is
"polling" the library from the physical layer (assume HBA). We have had the
SA for the master/media server disable any polling and load the latest
patches from HP, to no avail. We have changed from auto index to a manual map
index as well.
 
This was working from the end of June up until the second week in October.
 
Thoughts/suggestions?
 
Log snippets from last night:
 


syslog entries
Jan  4 05:37:42 ujachr01 vmunix: SCSI TAPE: dev = 0xcd0801c0 I/O error during close
Jan  4 05:50:10 ujachr01 vmunix: SCSI TAPE: dev = 0xcd0801c0 I/O error during close
Jan  4 11:27:52 ujachr01 vmunix: SCSI TAPE: dev = 0xcd0800c0 I/O error during close
Jan  4 11:34:36 ujachr01 tldcd[18968]: TLD(1) key = 0x5, asc = 0x3a, ascq = 0x0, MEDIUM NOT PRESENT
Jan  4 11:34:36 ujachr01 tldcd[18968]: TLD(1) Move_medium error
Jan  4 11:34:36 ujachr01 tldd[4233]: TLD(1) drive 5 (device 4) is being DOWNED, status: Robotic dismount failure
Jan  4 11:34:36 ujachr01 tldd[4233]: Check integrity of the drive, drive path, and media
 
drive 5 (addr 504) access = 0 Contains Cartridge = yes
Source address = 1119 (slot 120)
Barcode = JA1156
 

Jan  4 11:55:12 ujachr01 tldcd[19684]: TLD(1) key = 0x5, asc = 0x3a, ascq = 0x0, MEDIUM NOT PRESENT
Jan  4 11:55:12 ujachr01 tldcd[19684]: TLD(1) Move_medium error
Jan  4 11:55:12 ujachr01 tldd[4233]: TLD(1) drive 1 (device 0) is being DOWNED, status: Robotic dismount failure
Jan  4 11:55:12 ujachr01 tldd[4233]: Check integrity of the drive, drive path, and media
 
drive 1 (addr 500) access = 0 Contains Cartridge = yes
Source address = 1106 (slot 107)
Barcode = JA1064
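
Since the drives and tapes differ from night to night, it may help to tally
which drives get DOWNED and which sense codes accompany them across several
nights of syslog. The following is only a rough sketch in Python, not anything
NetBackup ships: the regular expressions are assumptions based on the few
tldcd/tldd lines quoted above, the sample text is copied from that excerpt
with the wrapped lines rejoined, and the helper name summarize is made up for
illustration.

```python
import re
from collections import Counter

# Sample lines taken from the syslog excerpt in this thread (wrapped lines rejoined).
SYSLOG = """\
Jan  4 11:34:36 ujachr01 tldcd[18968]: TLD(1) key = 0x5, asc = 0x3a, ascq = 0x0, MEDIUM NOT PRESENT
Jan  4 11:34:36 ujachr01 tldd[4233]: TLD(1) drive 5 (device 4) is being DOWNED, status: Robotic dismount failure
Jan  4 11:55:12 ujachr01 tldcd[19684]: TLD(1) key = 0x5, asc = 0x3a, ascq = 0x0, MEDIUM NOT PRESENT
Jan  4 11:55:12 ujachr01 tldd[4233]: TLD(1) drive 1 (device 0) is being DOWNED, status: Robotic dismount failure
"""

# Assumed line shapes, inferred only from the messages quoted above.
DOWNED_RE = re.compile(
    r"TLD\((\d+)\) drive (\d+) \(device (\d+)\) is being DOWNED, status: (.+)$"
)
SENSE_RE = re.compile(
    r"key = (0x[0-9a-f]+), asc = (0x[0-9a-f]+), ascq = (0x[0-9a-f]+)", re.I
)


def summarize(text):
    """Count DOWNED events per (robot, drive, status) and sense triples."""
    downed = Counter()
    sense = Counter()
    for line in text.splitlines():
        m = DOWNED_RE.search(line)
        if m:
            robot, drive, _device, status = m.groups()
            downed[(int(robot), int(drive), status)] += 1
        m = SENSE_RE.search(line)
        if m:
            # Store key/asc/ascq as integers so 0x3A and 0x3a compare equal.
            sense[tuple(int(x, 16) for x in m.groups())] += 1
    return downed, sense


downed, sense = summarize(SYSLOG)
for (robot, drive, status), count in sorted(downed.items()):
    print(f"TLD({robot}) drive {drive}: downed {count}x ({status})")
for (key, asc, ascq), count in sorted(sense.items()):
    print(f"sense key={key:#x} asc={asc:#x} ascq={ascq:#x}: {count}x")
```

To run it against a real log, read the file into a string and pass that to
summarize in place of the sample. If the same drive or the same sense triple
keeps turning up, that points at one component; an even spread across drives
would fit better with something shared, such as the bridge or the cabling.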


This email is for the intended addressee only.
If you have received it in error then you must not use, retain, disseminate
or otherwise deal with it.
Please notify the sender by return email.
The views of the author may not necessarily constitute the views of EADS
Astrium Limited.
Nothing in this email shall bind EADS Astrium Limited in any contract or
obligation.

EADS Astrium Limited, Registered in England and Wales No. 2449259
Registered Office: Gunnels Wood Road, Stevenage, Hertfordshire, SG1 2AS,
England
        



