[Veritas-bu] Interesting Things w/ SSO on a Fibre Tape SAN (Win2k)

2002-06-20 20:16:30
Subject: [Veritas-bu] Interesting Things w/ SSO on a Fibre Tape SAN (Win2k)
From: anderson.david AT scrippshealth DOT org (Anderson, David)
Date: Thu, 20 Jun 2002 17:16:30 -0700
Scott,

Thanks for the kick I needed.  I finally printed out the 4.5 Release Notes
and dug in (still don't have a real copy of the manuals).  Page 85, bullet
6, second paragraph:
 
        "The use of SCSI reserve/release is on by default, but can be
disabled by using an entry in the UNIX bp.conf file or in the registry on
Windows NetBackup server."

There are some other interesting "ahhs" in here that I'm finding.  At this
point, however, it looks like the NIC was definitely the problem.  It has
been previously identified by HP/Compaq as a problem.  I finally got one of
the local technical sales reps to show me the internal report (read: company
confidential).  They've known about it for several months now.  A driver fix
was/is promised for 2nd Quarter 2002.  Well, I'm waiting!

Thanks everyone for your help.

David Anderson
ScrippsHealth Information Services

-----Original Message-----
From: scott.kendall AT abbott DOT com [mailto:scott.kendall AT abbott DOT com]
Sent: Thursday, June 20, 2002 4:15 PM
To: Anderson, David
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Interesting Things w/ SSO on a Fibre Tape SAN
(Win2k)


I'm not positive with 4.5, but I know it is off by default on 3.4.  To enable
it, touch the file:

<installpath>/netbackup/db/config/ENABLE_SCSI_RESERVE

There is an entire section on this in the _3 release notes that can be found
on the web (which for some reason actually differ from the release notes that
were part of the actual patch download).
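
For anyone scripting the "touch the file" step, here is a minimal Python
sketch. It assumes only the relative path quoted above; the function name
enable_scsi_reserve and its install_path argument are illustrative and not
part of NetBackup, and the actual install path is left for you to supply.

```python
from pathlib import Path


def enable_scsi_reserve(install_path):
    """Create the empty ENABLE_SCSI_RESERVE touch file under install_path.

    install_path is whatever <installpath> is on your server; this sketch
    does not guess it.
    """
    flag = Path(install_path) / "netbackup" / "db" / "config" / "ENABLE_SCSI_RESERVE"
    # Do not create missing directories: if the config directory is absent,
    # install_path is probably wrong and we should stop rather than guess.
    if not flag.parent.is_dir():
        raise FileNotFoundError(f"config directory not found: {flag.parent}")
    # Like `touch`: creates the file if absent, leaves existing content alone.
    flag.touch(exist_ok=True)
    return flag


# Usage (supply your own install path):
#   enable_scsi_reserve(r"<installpath>")
```

The sketch only creates the flag file. Whether the setting is safe for a
given environment is a separate question, per the limitations listed below.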

There are a lot of limitations and things to be aware of (the FC-SCSI bridge
must support target resets, the passthru driver is required, etc.).  I wouldn't
just enable it as part of a troubleshooting step.  This is a major change that
may or may not work depending on your environment.


- Scott


From:    "Anderson, David" <anderson.david AT scrippshealth DOT org>
Date:    06/17/2002 06:00 PM
To:      "'scott.kendall AT abbott DOT com'" <scott.kendall AT abbott DOT com>
cc:
Subject: RE: [Veritas-bu] Interesting Things w/ SSO on a Fibre Tape SAN (Win2k)

Scott,

Also, you said that it is off by default.  How do you "turn it on"?  There
isn't any reference to this.

David Anderson
ScrippsHealth Information Services
v: (858) 678-6238
tieline: 318-6238


-----Original Message-----
From: scott.kendall AT abbott DOT com [mailto:scott.kendall AT abbott DOT com]
Sent: Monday, June 17, 2002 3:39 PM
To: Anderson, David
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Interesting Things w/ SSO on a Fibre Tape SAN
(Win2k)



We ran into a similar problem with Compaq's GbE card, NC6136.  It would die
(but still had link light) after a large number of connections from clients
were established; it wasn't necessarily related to throughput... more the
number of sessions.  This would KILL SSO.  We are working with Compaq on it
and are told a new driver is in the works.  In the meantime, it works great
with the Intel driver (it's actually an Intel card underneath all the
logo'ing).

As for the SCSI reserve/release... fyi, it is available with 3.4.1 in patch
_3.  In either one, 3.4.1_3 or 4.5, I'm being told that it is off by default.


- Scott


From:    "Anderson, David" <anderson.david AT scrippshealth DOT org>
Sent by: veritas-bu-admin AT mailman.eng.auburn DOT edu
Date:    06/17/2002 10:48 AM
To:      "'veritas-bu AT mailman.eng.auburn DOT edu'"
         <veritas-bu AT mailman.eng.auburn DOT edu>
cc:      "Adams, Tim" <Adams.Tim AT scrippshealth DOT org>,
         "Morris, Brett" <Morris.Brett AT scrippshealth DOT org>
Subject: [Veritas-bu] Interesting Things w/ SSO on a Fibre Tape SAN (Win2k)


First off, I'll apologize for being long winded, but this is too good NOT to
share.

Our environment:
All brand new hardware
- two Compaq ProLiant ML530's, dual CPU, 1.1 G memory
- Windows 2000
- Master Server also acting as a Media Server
- StorageTek L180 (4 drives); also tried a Compaq MSL5026 (2 drives); all SDLT
- Compaq 8EL Fibre Switch, Compaq Modular Data Router (SCSI bridge)
- We had started with NBU 3.4.1 and in desperation, installed 4.5 (the one
with SCSI Reserve/Release).
(by the way, if you're a Windows shop, do the 4.5, and to heck with Motif!)

We have been having a lot of problems with drives randomly dropping off
line.  Each library saw a "SCSI Bus Rewind" request, so it dropped
whatever it was doing and waited for the actual command to rewind (which
would never come).  There was literally nothing that we didn't try, with
marginal support from any of the vendors, by the way, except StorageTek.

There seemed to be no correlation between events in NetBackup or issues
directly related to the hardware.  Over the past (many) weeks, we did notice
that we had occasional network problems.  Specifically, the Master and Media
servers would decide not to communicate on the Gigabit Ethernet LAN.  We
originally had Compaq Gig-NICs, but they seemed to be having a lot of
packet over-run problems, among others.  We installed a 3Com card
instead.  This helped, but did not resolve the issues.

Finally, one of our guys started thinking about SSO being able to
communicate with itself, server to server to coordinate the SCSI
Reserve/Release function.  With this in mind, he installed a second NIC (yes
the original Compaq card) and set the hosts tables so that the Master and
Media servers would use only the dedicated channel to communicate.
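
As an illustration of the hosts-table approach just described, here is a
minimal Python sketch that formats the entries for a dedicated link and
checks that both servers sit on the same private subnet. The server names,
addresses, and subnet are hypothetical, not the ones used at our site. On
Windows 2000 the hosts file lives at %SystemRoot%\system32\drivers\etc\hosts.

```python
import ipaddress


def private_hosts_lines(servers):
    """Format hosts-file lines mapping each server name to its dedicated-link IP."""
    lines = []
    for name, addr in servers.items():
        ip = ipaddress.ip_address(addr)  # raises ValueError on a malformed address
        lines.append(f"{ip}\t{name}")
    return lines


def same_subnet(servers, network):
    """Return True if every server address falls inside the given private network."""
    net = ipaddress.ip_network(network)
    return all(ipaddress.ip_address(addr) in net for addr in servers.values())


# Hypothetical names and addresses for the second (dedicated) NIC on each server.
servers = {"nbu-master": "192.168.50.1", "nbu-media": "192.168.50.2"}

if same_subnet(servers, "192.168.50.0/24"):
    for line in private_hosts_lines(servers):
        print(line)
```

The same entries would go into the hosts file on both servers, so that each
one resolves the other's name to the dedicated-link address rather than the
address on the production LAN.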

We have now run for three days, including a full weekend running over 1500
separate small jobs.  Not a single problem.  I'll admit that it is still too
early to put this matter to bed just yet, but this solution looks very
promising at this point.

Has anyone else seen problems of this sort?

David Anderson
ScrippsHealth Information Services

_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu