Networker

Re: [Networker] Problem with Legato 6.1.3

2003-07-08 09:52:41
Subject: Re: [Networker] Problem with Legato 6.1.3
From: "David E. Nelson" <david.nelson AT NI DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Tue, 8 Jul 2003 08:51:45 -0500
Hi Davina,

Here's the original msg to Legato Tech Support.  I'd be curious if this ever
made it into their KnowledgeBase thingie.

Regards,
        /\/elson


---------- Forwarded message ----------
Date: Tue, 4 Dec 2001 13:59:06 -0600 (CST)

Hi Peter,

We found the problem!  It was a Cisco switch issue.  Thus far, there isn't a
Cisco BugID associated with it.  I think this should definitely be entered
into the knowledge base, since the cause of the problem is obscure and,
unless you know what to look for, you probably won't find it.

Symptoms:

- Backups would start with acceptable performance numbers, then fall off
slowly, eventually dropping to only a few kB/sec.  This was followed by loss
of connection and NW retries that would repeat the cycle until the number of
retries was exhausted.

- This affected not only NetWorker but also other backup solutions such as
Backup Exec.

- Large batches of file xfers across the network with Windows clients would
fail on *a* file, then continue.  A retry by the individual on the failed
file would succeed.

- Random 'File Not Accessible' error msgs from Windows clients.

- Windows clients would report back 'The Network Name has Been Lost'.

- Rsh commands would appear to "hang" for long periods of time on Solaris
clients.

- Moving a server connection to a different Cisco switch would solve the
problem.


Testing:

Here is the info about the Cat5k uplinks.
Module type:
WS-U5534-GESX 800-02419-01 -B0 28-2416-05 -A0 Dual Port 1000BASE-SX-MMF
Uplink Module -D0
Criteria for determining bad modules:
1) serial number < 22400000  ("show module" or "show version")
2) large number of FCS errors
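
The two criteria above can be sketched as a small check.  This is only an
illustration: the UplinkModule record, its field names, and the FCS threshold
of 1000 are assumptions (the message only says "large number"), and the
values would have to be read off the switch by hand.  The sketch reports
which criteria matched rather than deciding how to combine them.

```python
from dataclasses import dataclass

SERIAL_CUTOFF = 22400000   # criterion 1 from the message above
FCS_THRESHOLD = 1000       # assumed value; the message only says "large number"


@dataclass
class UplinkModule:
    slot: str        # hypothetical label for the module
    serial: int      # serial number as read from the switch
    fcs_errors: int  # FCS error count observed for the module's link


def matched_criteria(module, fcs_threshold=FCS_THRESHOLD):
    """Return the names of the criteria this module matches."""
    matched = []
    if module.serial < SERIAL_CUTOFF:
        matched.append("serial")
    if module.fcs_errors >= fcs_threshold:
        matched.append("fcs")
    return matched
```

A module matching both "serial" and "fcs" would be the strongest candidate.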

Troubleshooting:
If you have any uplink modules with a lot of FCS errors, can you send me
the following info for the ports on both sides of the problem link?
1. sh counters
2. clear counters
   (wait 1 minute)
3. sh counters, so we can see how rapidly the FCS errors are increasing
4. sh asicreg gigmac
5. sh asicreg gigmac_phoenix
6. sh cdp neighbor detail
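
Steps 1-3 amount to measuring a rate of increase.  A minimal sketch of that
arithmetic, assuming the two FCS counts have been copied by hand from the
counter output (the function name and parameters are hypothetical, and
nothing here parses switch output):

```python
def fcs_rate_per_minute(count_before, count_after, interval_seconds=60.0):
    """Return FCS errors per minute between two counter readings.

    count_before is the reading right after the counters were cleared
    (normally 0); count_after is the reading after the wait.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = count_after - count_before
    if delta < 0:
        # The counter was cleared or wrapped between the two readings.
        raise ValueError("second reading is lower than the first")
    return delta * 60.0 / interval_seconds
```

For example, a reading of 0 after the clear and 120 one minute later gives
fcs_rate_per_minute(0, 120), i.e. 120 errors per minute.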

Yes, FCS errors were the main indicator of the bug, and the FCS errors
would be counted on the switch that the affected switch was directly
connected to - not on the affected switch itself.  In other words, if
switch A is connected to switch B, which has the faulty hardware, switch A
would show the FCS errors.
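
The point about switch A and switch B can be sketched as a lookup: FCS errors
seen on a port point at the neighbor on the far end of that link, not at the
switch reporting them.  This is an illustration under assumptions: the two
dictionaries are hypothetical, hand-entered summaries (for instance from the
counter and CDP neighbor output), not parsed switch output.

```python
def suspect_neighbors(links, fcs_by_port):
    """Map each suspected switch to the FCS errors its neighbors counted.

    links:       {(switch, port): neighbor_switch} for each known link end
    fcs_by_port: {(switch, port): fcs_error_count} as seen on that port
    """
    suspects = {}
    for key, count in fcs_by_port.items():
        if count > 0 and key in links:
            neighbor = links[key]  # the far end is the suspect
            suspects[neighbor] = suspects.get(neighbor, 0) + count
    return suspects


# Switch A counts the errors, so switch B is the one to inspect.
links = {("A", "1/1"): "B", ("B", "1/1"): "A"}
fcs_by_port = {("A", "1/1"): 500, ("B", "1/1"): 0}
print(suspect_neighbors(links, fcs_by_port))
```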

The main symptoms we saw were file transfers failing on certain files -
usually .dll files in Windows file copies.  The Windows PC would show an
error similar to "The Network Name has Been Lost", which at one point led us
to think WINS might be acting up.  We also, of course, had NT and Unix
backups failing in the core.  Other complaints were that users could not
copy files to and from nirvana, or could not get to nirvana at all - I
don't know how true these were.  Cisco engineers told me that symptoms
vary among affected customers, but FCS errors or Alignment Errors will
usually show up on the Gigabit ports on the Supervisor Module.

