[Veritas-bu] Tru64 NetBackup Performance

2010-03-10 11:52:40
Subject: [Veritas-bu] Tru64 NetBackup Performance
From: "David McMullin" <David.McMullin AT CBC-Companies DOT com>
To: "veritas-bu AT mailman.eng.auburn DOT edu" <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Wed, 10 Mar 2010 11:51:57 -0500
Perhaps it is an Oracle thing?
I had our DBA remove the "check logical" parameter, and our throughput
improved 10X.

------------------------------

Message: 11
Date: Tue, 9 Mar 2010 13:11:49 -0600
From: Heathe Yeakley <hkyeakley AT gmail DOT com>
Subject: [Veritas-bu] Tru64 NetBackup Performance
To: Veritas-bu AT mailman.eng.auburn DOT edu
Message-ID:
        <a8a60f81003091111i14c7f79fn32d42ec576ab6370 AT mail.gmail DOT com>
Content-Type: text/plain; charset=ISO-8859-1

--== Warning: Wall of text incoming ==--

I have a NetBackup environment consisting of:

-= Local Site =-
1 Red Hat Linux AS 4 Master running NBU 6.0 MP7
2 Red Hat Linux AS 4 Media Servers running NBU 6.0 MP7
3 Tru64 V5.1B (Rev. 2650) SAN Media Servers running NBU 6.0 MP7 (mix of O/S
patch kits 6 and 7)
1 Spectra Logic T380 with 12 IBM LTO4 drives running latest BlueScale patches 
and drive firmware.
1 NetApp 1400 VTL running latest firmware.

-= DR Site =-
1 Red Hat Linux AS 4 Master running NBU 6.0 MP7
1 Tru64 V5.1B (Rev. 2650) SAN Media Server running NBU 6.0 MP7
1 Spectra Logic T200 with 12 IBM LTO4 drives running latest BlueScale patches 
and drive firmware.

Last July we replaced our ADIC i2000 library (LTO2 drives) with a Spectra Logic
T380. Once the library was deployed, we noticed that our Linux systems can
write to it at LTO4 speeds, and even the regular network clients get decent
throughput over a 1 Gb Ethernet network. The 3 Tru64 SAN media servers,
however, absolutely crawl. Even though I have the SAN media server license
installed, I only get about 10 - 20 MB/s on the policies using the Tru64
storage units.

Our main production database sits on a GS1280 (30 CPUs, 114 GB memory), and we
have an ES80 attached to another Spectra Logic library at our DR site. Every
Sunday morning, I back up an RMAN backup to tape, mail the tapes to my DR site,
and restore the RMAN files using a Spectra Logic T200 attached to the ES80,
which also has the SAN Media Server software installed. My GS1280 system takes
15-20 hours to back up, but my DR system can restore the same files in 6-7
hours running at 80 - 110 MB/s. I'm completely baffled how the smaller system
gets such awesome throughput while my production box plods along at
sub-Ethernet speeds.
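
To put those figures side by side, here is a rough back-of-the-envelope
calculation in Python. It assumes the 80 - 110 MB/s restore rate is the
aggregate rate for the whole job and that the restore moves the same amount of
data as the backup; neither is confirmed, so treat the results as a sanity
check only.

```python
# Rough arithmetic from the figures quoted above; nothing here is measured.
SECONDS_PER_HOUR = 3600


def data_size_mb(hours, rate_mb_s):
    """Data moved (in MB) by a transfer lasting `hours` at `rate_mb_s`."""
    return hours * SECONDS_PER_HOUR * rate_mb_s


def average_rate_mb_s(size_mb, hours):
    """Average rate (in MB/s) needed to move `size_mb` in `hours`."""
    return size_mb / (hours * SECONDS_PER_HOUR)


# Restore at the DR site: 6-7 hours at 80-110 MB/s.
smallest_mb = data_size_mb(6, 80)    # shortest restore at the lowest rate
largest_mb = data_size_mb(7, 110)    # longest restore at the highest rate

# Backup on the production GS1280: 15-20 hours for the same data.
slowest = average_rate_mb_s(smallest_mb, 20)
fastest = average_rate_mb_s(largest_mb, 15)

print(f"Implied data size: {smallest_mb / 1e6:.1f} - {largest_mb / 1e6:.1f} TB")
print(f"Implied aggregate backup rate: {slowest:.0f} - {fastest:.0f} MB/s")
```

Under those assumptions the backup set is somewhere around 1.7 - 2.8 TB, and
the production backup averages roughly 24 - 51 MB/s in aggregate, which would
be consistent with a few concurrent streams at 10 - 20 MB/s each.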

I've spent the past several months researching performance and tuning
suggestions, and I've applied settings one at a time whenever I can get an
outage.

To speed up testing, we have another GS1280 with half the CPU and memory of the
production system. It only runs test databases, so it's easier to ask for a
reboot if I want to try tuning a particular kernel parameter or whatnot. I
installed the SAN media server software on this second 1280, and I've been
trying to tune it for NetBackup for the last couple of months.

Within NetBackup, I've tuned the size and number of data buffers, with no
visible effect.
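
For anyone following along, the buffer settings also determine how much shared
memory the media server needs. The sketch below uses the formula I recall from
the NetBackup performance tuning guide (buffer size x buffer count x drives x
multiplexing). The buffer size, buffer count and multiplexing values are
hypothetical examples, not the values on my systems; only the 12 drives come
from my configuration.

```python
def shared_memory_bytes(size_data_buffers, number_data_buffers,
                        drives, mpx_per_drive=1):
    """Estimate the shared memory the data buffers need on one media server.

    Assumed formula: size * number * drives * multiplexing per drive.
    """
    return size_data_buffers * number_data_buffers * drives * mpx_per_drive


# Hypothetical example: 256 KB buffers, 32 of them, 12 drives, no multiplexing.
needed = shared_memory_bytes(262144, 32, drives=12)
print(f"{needed} bytes ({needed / 2**20:.0f} MiB)")
```

If the kernel's shared memory limits are lower than this figure, larger buffer
settings may not help, so it seems worth comparing the two before ruling the
buffers out.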

I've used the hwmgr command to look at the driver and firmware level of just 
about every piece of equipment on both systems, up to and including the 
individual busses. The GS1280 has everything the ES80 does, it just has more of 
it.

I've verified the HBA drivers and firmware on all boxes, and all appear to be
at the latest levels.

I've asked my SAN guys to double check the zoning, LUN masking, configuration 
and firmware levels on the SAN switches here and at my DR site to see if 
there's anything that might be preventing Tru64 from writing to either of my 
libraries at SAN speeds. They have checked and everything seems to be in order 
on both SAN environments. Furthermore, I've asked them to look at port 
utilization on the SAN switches during test backups from the 1280 and they tell 
me that the HBAs are hardly being utilized.

We recently deployed a NetApp VTL, and I was curious whether the VTL would get
better performance (which would indicate some type of incompatibility between
Tru64 and Spectra Logic). I can't find any such incompatibility. If I set up a
test policy to write to the VTL from my test GS1280 and let it write to all 80
virtual drives, no one stream exceeds about 10 - 20 MB/s.

Next, I looked at the fragmentation level of the AdvFS domains on both systems. 
While some are heavily fragmented, the I/O performance on both systems is 100% 
for every file domain I've checked.

The fact that all my clients (Windows, Linux and the handful of Solaris 10)
work well with both libraries makes me think that this is something in Tru64.
If that's true, then I'm trying to figure out what is set correctly on my DR
ES80 that's jacked up on my local 1280.

According to section 1.9 of the Tru64 tuning manual
(http://h30097.www3.hp.com/docs/base_doc/DOCUMENTATION/V51B_HTML/ARH9GCTE/TITLE.HTM)
the 5 most commonly tuned kernel subsystems are: vm, ipc, proc, inet, and 
socket. Furthermore, http://seer.entsupport.symantec.com/docs/235845.htm is a 
technote advising Tru64 kernel changes for NetBackup. I have examined the 
values across all my systems. In most cases, the values on both systems meet or 
exceed tuning suggestions I have found in manufacturer documentation. The two 
or three values I have tuned so far have had no effect.
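
The comparison itself can be scripted. Below is a minimal sketch, assuming the
output of `sysconfig -q <subsystem>` has been saved from each system and looks
like a `subsystem:` header followed by `name = value` lines. The function names
are my own, and every value in the samples is illustrative rather than taken
from a real system.

```python
def parse_sysconfig(text):
    """Parse 'name = value' lines into a dict keyed as 'subsystem.name'."""
    attrs = {}
    subsystem = ""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            subsystem = line[:-1]          # e.g. "ipc:" starts a new subsystem
        elif "=" in line:
            name, value = (part.strip() for part in line.split("=", 1))
            attrs[f"{subsystem}.{name}"] = value
    return attrs


def diff_attrs(local, remote):
    """Return {key: (local_value, remote_value)} for attributes that differ."""
    keys = sorted(set(local) | set(remote))
    return {k: (local.get(k), remote.get(k))
            for k in keys if local.get(k) != remote.get(k)}


# Illustrative captures; replace with saved output from each system.
LOCAL_SAMPLE = """ipc:
sem_mni = 16
sem_msl = 500
"""
DR_SAMPLE = """ipc:
sem_mni = 16
sem_msl = 1000
"""

differences = diff_attrs(parse_sysconfig(LOCAL_SAMPLE),
                         parse_sysconfig(DR_SAMPLE))
for key, (local_value, dr_value) in differences.items():
    print(f"{key}: local={local_value} dr={dr_value}")
```

Running the same diff over the vm, ipc, proc, inet and socket subsystems would
list only the attributes that differ between the fast and slow systems.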

I found this TechNote, which recommends setting the sem_mni and sem_msl values
to 1,000:
http://www.scribd.com/doc/19213788/Net-Backup-6
sem_msl is currently set to 500 on my local 1280, and I think this is perhaps
the only kernel parameter I have not yet tuned. I'm going to ask for an outage
this week to increase this setting to 1,000. If that doesn't work, then I
believe I will be officially stumped.

I've also watched the EVM channels and the binary error log and haven't seen 
anything alarming. The tape drives aren't throwing errors and appear to be 
working fine.

This is leading me to believe that there is something not tuned correctly 
between the Tru64 O/S and the NetBackup client. If it's not in the kernel then 
I simply don't know where else to look.

I'll be posting this to the NetBackup forums on Symantec.com, the ITRC forums 
on HP.com and the NetBackup mailing list.

Can anyone think of any stone I've left unturned? Thanks.




_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu