[Veritas-bu] Speaking of disappearing drives
2004-05-11 14:33:52
I'll start with our setup first:
Sun 280R
1 Gb NIC
4 GB memory
2 x 900 MHz procs
2 x StorEdge 2G HBAs
Solaris 8 (108528-23)
NetBackup DC 3.4.1 (with a manual mod to sgscan to detect the AIT3 drives
(http://seer.support.veritas.com/docs/246050.htm))
ADIC Scalar1000
12 x AIT3 drives
3 x SNC5100s
Fiber is directly connected from the HBAs to the SNCs
About 2 months ago we upgraded from AIT2 to AIT3 drives, and from JNI HBAs to
StorEdge 2G HBAs. From day 1 we've been having problems with drives randomly
dropping off. It never seems to happen during a backup, but it does happen at
the end of one and, I believe, sometimes when there is no activity at all. If
one drive is down and we leave it down, everything works fine for a week. If we
bring the downed drive back up (with stopltid/tpconfig -> up drive/ltid), then
within anywhere from 10 minutes to a few hours another drive goes down. When a
drive goes down it is still accessible through the robot (using the panel on
the robot), and the SNCs don't detect any drive problems, although they do
occasionally report random "loss of sync" from the HBAs. Any mt commands run
against the down drive show the drive as inaccessible. I've implemented a
temporary band-aid to "fix" the problem by using 'vmoprcmd -up <drive #>',
which has been working pretty well and for some reason has made things a little
more stable - don't ask me why. I've verified the drive is being used after I
up it with this command.
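For anyone wanting to automate the same band-aid, here is a rough sketch of the
idea, not the exact script in use here. It assumes that 'vmoprcmd -d ds' on your
release prints a drive status table whose data rows start with the numeric drive
index, followed by the drive type and then the control field, and that a down
drive shows a control value beginning with DOWN. Verify that against your own
output before trusting it. The function names and the VMOPRCMD variable are made
up for this example.

```shell
#!/bin/sh
# Sketch only. ASSUMPTION: stdin looks like the drive status table from
# `vmoprcmd -d ds`: data rows begin with a numeric drive index, then the
# drive type, then the control field (e.g. DOWN or DOWN-<robot type>).

# Print the index of every drive whose control field starts with DOWN.
down_drives() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^DOWN/ { print $1 }'
}

# Read drive indices from stdin and up each one with the command from the
# post. VMOPRCMD is a hypothetical override (not a NetBackup setting) so the
# loop can be dry-run, e.g. with VMOPRCMD=echo.
up_drives() {
    while read drive; do
        ${VMOPRCMD:-vmoprcmd} -up "$drive"
    done
}
```

Run from cron, the pipeline would be something like
'vmoprcmd -d ds | down_drives | up_drives'. This only hides the symptom, so it
is worth logging each drive it brings up to keep a record of how often drives
drop.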
I also tweaked some of the system resource settings according to Veritas'
recommended system requirements, and I have a case open with both Sun and ADIC
(who really haven't been any help at all). I've also enabled verbose output
from tl8d. I know that there are still some system settings that need to be
adjusted, because whether one backup or 10 backups are running, iowait
averages about 50% and bounces around between 20% and 95% (no backups = 0%
iowait). This is the next area we are concentrating on.
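To put firmer numbers on the iowait figures while settings are being changed,
something like the following could summarize a run of samples. It is a sketch
under an assumption: the input is 'iostat -c' style output, where each sample
line has exactly four numeric columns (us sy wt id) and iowait is the third.
The function name is made up for this example.

```shell
#!/bin/sh
# Sketch only. ASSUMPTION: stdin is `iostat -c <interval> <count>` style
# output, where each sample line has four numeric fields: us sy wt id.
# Header lines ("cpu", "us sy wt id") are skipped because they are not numeric.
iowait_summary() {
    awk 'NF == 4 && $1 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
             n++; sum += $3
             if (n == 1 || $3 < min) min = $3
             if (n == 1 || $3 > max) max = $3
         }
         END {
             if (n) printf "samples=%d min=%d max=%d avg=%.1f\n", n, min, max, sum / n
         }'
}
```

For example, 'iostat -c 5 60 | iowait_summary' would cover five minutes. The
first sample iostat prints is an average since boot, so it is worth dropping
that line before comparing runs.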
I'm not sure, but I suspect that what we're seeing is more of a problem with the
driver for the HBAs. I'm curious whether anyone else has had similar problems
with their setup or has any suggestions.
Thanks,
--
Chris
- [Veritas-bu] Speaking of disappearing drives,
Chris Collier <=