[Networker] Networker server experiencing frequent unexpected reboots

2009-04-09 17:15:31
Subject: [Networker] Networker server experiencing frequent unexpected reboots
From: stancole <networker-forum AT BACKUPCENTRAL DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Thu, 9 Apr 2009 17:05:56 -0400
I recently updated the HP software to 8.20A, which was just released, but I
have gone through many versions of the HP software. 7.91 was the most stable
and gave me the fewest issues. I am running Windows Server 2003 R2 SP2.
Unfortunately, no crash dumps are being created, so there is nothing to
analyze. Regarding the load on the DL360, I don't think we are anywhere near
its performance limits. We have 20 - 25 clients, and a full backup is approx.
5TB, which only happens on the weekends.

90% of my backups go directly to the Data Domain devices. The only time there
is any real traffic over the fiber is during a clone for offsite or during the
nightly Oracle backups that run on the EDL. Those are only 200GB and are
usually done in 3 or 4 hours. Generally the reboots happen after a large clone
job or other large backup. I went ahead and disabled ASR in the BIOS and am
monitoring that now. I haven't had a reboot since, but I also have not really
had the box under load yet. This weekend will be the true test. As for the
hardware, I also believe this is a hardware or driver issue. The problem is
that I have tried all of the following and the problem still exists:

- Original install on DL580 G1
- reboots started shortly after install
- replaced memory, (no change)
- Re-installed on different DL580 G1 (same problem)
- ordered new DL360 G5 (ran clean for a few months)
- reboots started again.
 - replaced motherboard
 - updated NetWorker software
 - replaced memory
 - messed with the NIC config, switching between teaming configs and running
   without a team
 - rebuilt new DL360 and restored NetWorker server
 - removed MOM client (reboots stopped for 8 months)
- updated the HP firmware and software (reboots started again)
- reverted the HP software and firmware (reboots continued)
- modified the zoning config in our fiber switches
- updated NetWorker software
- re-seated memory
- re-seated HBA cards
- replaced memory
- updated HP software
- updated HBA firmware
- updated HBA drivers
- disabled ASR in BIOS (?)

So after all of this I still have the problem.
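
One way to check the pattern mentioned above (reboots tending to follow a large
clone or backup job) is to line up the reboot times against the job end times.
This is only a rough sketch: it assumes the reboot times are pulled by hand
from the System event log (event ID 6008, unexpected shutdown) and the job end
times from the NetWorker logs. The function name, the two-hour window, and the
sample timestamps are all made up for illustration.

```python
from datetime import datetime, timedelta

def reboots_after_jobs(reboots, job_ends, window=timedelta(hours=2)):
    """Return (reboot, job_end) pairs where the reboot happened within
    `window` after the most recent job finished."""
    matches = []
    for reboot in sorted(reboots):
        # Jobs that finished at or before this reboot
        prior = [j for j in job_ends if j <= reboot]
        if prior:
            last = max(prior)
            if reboot - last <= window:
                matches.append((reboot, last))
    return matches

# Hypothetical sample data, NOT taken from the real server
fmt = "%Y-%m-%d %H:%M"
job_ends = [datetime.strptime(s, fmt)
            for s in ("2009-04-04 23:10", "2009-04-06 03:30")]
reboots = [datetime.strptime(s, fmt)
           for s in ("2009-04-04 23:45", "2009-04-07 14:00")]

matches = reboots_after_jobs(reboots, job_ends)
for reboot, job in matches:
    print(f"reboot at {reboot} followed job ending at {job}")
```

If most of the reboots land inside the window, that would point toward load
on the HBA, NIC, or driver path rather than something random.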

