Veritas-bu

Re: [Veritas-bu] Linux slowly dropping drives.

2009-01-29 19:09:45
Subject: Re: [Veritas-bu] Linux slowly dropping drives.
From: "Donaldson, Mark" <Mark.Donaldson AT staples DOT com>
To: "Cornely, David" <David_Cornely AT intuit DOT com>, <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Thu, 29 Jan 2009 16:59:58 -0700
Well, the odd thing is that only the Linux box does this.  The 17 other
AIX media servers and the 1 Solaris media server (slow transition) don't
have this problem at all.

-----Original Message-----
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Cornely,
David
Sent: Thursday, January 29, 2009 4:01 PM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Linux slowly dropping drives.

I'm about to configure a new NBU environment using RHEL5 (master & media
server) so this is of some interest to me too.  However, I'll be using a
VTL to emulate LTO-4.

The one thing that caught my eye immediately was the FC bridges.  In the
past I've experienced many problems with bridges in general, regardless
of the tape drive behind them.
Are you able to collect logs from the bridges?  Are there any known
firmware bugs for the bridges?  It might be worth upgrading the firmware
if so...


-----Original Message-----
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of
Donaldson, Mark
Sent: Thursday, January 29, 2009 2:48 PM
To: Rosenkoetter, Gabriel
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Linux slowly dropping drives.

The two that say they need cleaning are different than the ones that are
down.  There's no correlation there.

I just put in 6.5.3 yesterday so maybe the behavior will change (from
6.5.2).

I'm short on logs but I'll see what I can gather.

The drives "up" just fine; it's just that on next use they go back down
again.

Drives are in an ADIC Scalar 10K: LTO2 SCSI drives behind their SNC
fiber bridge. Frankly, I don't remember if the library does cleaning on
its own; something to look into.

There's no touch file.

-----Original Message-----
From: Rosenkoetter, Gabriel [mailto:Gabriel.Rosenkoetter AT radian DOT biz] 
Sent: Thursday, January 29, 2009 12:30 PM
To: Donaldson, Mark
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Linux slowly dropping drives.

One drive may easily have needed cleaning for longer... or are you
saying the two that advertise that are a disjoint set from the one
that's down?
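
Once you have the two lists, the disjoint-set question can be checked
mechanically. The sketch below is only an illustration: the drive names
are hypothetical, and the lists are assumed to be copied by hand from
whatever your drive status and tpclean -L output show.

```python
def overlap(down, needs_cleaning):
    """Return the drives that are both down and flagged for cleaning."""
    return sorted(set(down) & set(needs_cleaning))


# Hypothetical drive names, for illustration only.
down = ["drive03"]
needs_cleaning = ["drive07", "drive11"]

common = overlap(down, needs_cleaning)
print("disjoint" if not common else f"overlap: {common}")
```

An empty result means the two sets are disjoint, i.e. the cleaning flag
does not explain which drives are down.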

What do you see in /var/log/messages when you try to up the drives?

What do you see in /usr/openv/netbackup/logs/bptm/log.MMDDYY and in
/var/log/messages when the drives are downed of their own accord? (Turn
VERBOSE=5 on in bp.conf if you haven't already.)
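
As a rough illustration of the log check above, this sketch builds the
bptm log file name for a given date from the log.MMDDYY pattern and
filters the lines of a log by keyword. The helper names and the keyword
list are assumptions made for this example, not anything NetBackup or
the kernel defines; adjust the keywords to whatever your tape, SCSI and
HBA drivers actually log.

```python
from datetime import date
from pathlib import PurePosixPath

BPTM_LOG_DIR = PurePosixPath("/usr/openv/netbackup/logs/bptm")

# Hypothetical keywords, not an official list.
KEYWORDS = ("scsi", "tape")


def bptm_log_path(day: date) -> PurePosixPath:
    """Return the bptm log path for a day, following the log.MMDDYY pattern."""
    return BPTM_LOG_DIR / day.strftime("log.%m%d%y")


def matching_lines(lines, keywords=KEYWORDS):
    """Yield the lines that contain any keyword, ignoring case."""
    lowered = [k.lower() for k in keywords]
    for line in lines:
        text = line.lower()
        if any(k in text for k in lowered):
            yield line.rstrip("\n")


def scan_file(path, keywords=KEYWORDS):
    """Return the matching lines of a log file, tolerating undecodable bytes."""
    with open(path, errors="replace") as fh:
        return list(matching_lines(fh, keywords))
```

For example, scan_file("/var/log/messages") and
scan_file(bptm_log_path(date.today())) can be run right after a drive
goes down, so the two sets of lines can be compared by timestamp.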

Also, please describe your environment in more detail:

Are these drives in a tape library? If so, does that tape library
perform automatic cleaning of the drives?

Do you have the /usr/openv/volmgr/database/NO_TAPEALERT touch file in
place? (Does that file even still get used under 6.5? I don't see a
parallel setting in nbemmcmd yet...)
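
If you want to script the touch-file check across several media servers,
a minimal sketch follows. It only tests whether the file exists at the
path given above; whether 6.5 still honors the file is a separate
question this does not answer. The function name is made up for the
example.

```python
from pathlib import Path

NO_TAPEALERT = Path("/usr/openv/volmgr/database/NO_TAPEALERT")


def touch_file_present(path: Path = NO_TAPEALERT) -> bool:
    """Return True if the touch file exists; its content is irrelevant."""
    return path.is_file()
```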

(I have something of a vested interest here... I'm about to migrate from
HP-UX 11iv2 to RHEL 5 for our NetBackup servers, so if there's a
fundamental flaw in the Linux ST driver or NetBackup's use of it, I'd
like to know sooner...)

--
gabriel rosenkoetter
Radian Group Inc, Senior Systems Engineer
gabriel.rosenkoetter AT radian DOT biz, 215 231 1556


-----Original Message-----
From: Donaldson, Mark [mailto:Mark.Donaldson AT Staples DOT com]
Sent: Wednesday, January 28, 2009 2:34 PM
To: Justin Piszcz
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Linux slowly dropping drives.

Interesting - I never thought of tpclean.

tpclean shows two in "need cleaning" status but, as I have one drive
down now, there's no correlation with the down drive.

The LTO2 drives are SCSI drives connected via FC through fiber bridges.

I've had this problem through multiple versions of this OS but it's the
only Linux media server in my environment.  (We're using the native LTO2
driver, too - supposed to be part of this OS).

-M

-----Original Message-----
From: Justin Piszcz [mailto:jpiszcz AT lucidpixels DOT com]
Sent: Wednesday, January 28, 2009 11:33 AM
To: Donaldson, Mark
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Linux slowly dropping drives.



On Wed, 28 Jan 2009, Donaldson, Mark wrote:

> We have a dedicated media server built on an AMD box running RHEL 5.2
> (2.6.18-92.1.13.el5 #1 SMP Thu Sep 4 03:51:21 EDT 2008 x86_64).
>
> Over time, our LTO2 drives will go down one by one.  A "scan" doesn't
> seem to show any issues but if I "vmoprcmd -up" them, they'll just go
> down again.  After I collect a half-dozen down drives, I reboot the
> server and they'll be fine again for a while.
>
> Anybody else having this trouble? Have you solved it?
>
> -M
>
> _______________________________________________
> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>

What does tpclean -L say?

Have they ever stayed up in the past?

Do you have a fiber switched environment?

Justin.






_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
