What are your NUMBER_DATA_BUFFERS and SIZE_DATA_BUFFERS files set to? Also, what firmware are you running on the tape drives?
How long do the 100% busy periods last?
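If it helps to answer the buffer question, here is a minimal Python sketch that reads the two touch files and works out the buffer memory one stream would use (buffer count times buffer size). It assumes the files live in /usr/openv/netbackup/db/config, the usual location on a UNIX media server; the function names are made up for illustration. A missing file comes back as None, which would suggest NetBackup is using its built-in default for that setting.

```python
from pathlib import Path

# Assumption: default location of the NetBackup tuning touch files on UNIX.
CONFIG_DIR = Path("/usr/openv/netbackup/db/config")


def read_buffer_setting(name, config_dir=CONFIG_DIR):
    """Return the integer stored in a touch file, or None if absent or unparsable."""
    path = Path(config_dir) / name
    try:
        return int(path.read_text().split()[0])
    except (OSError, IndexError, ValueError):
        return None


def buffer_memory_bytes(number, size):
    """Buffer memory for one stream: number of buffers times bytes per buffer."""
    return number * size


if __name__ == "__main__":
    number = read_buffer_setting("NUMBER_DATA_BUFFERS")
    size = read_buffer_setting("SIZE_DATA_BUFFERS")
    print("NUMBER_DATA_BUFFERS:", number)
    print("SIZE_DATA_BUFFERS:", size)
    if number is not None and size is not None:
        print("buffer bytes per stream:", buffer_memory_bytes(number, size))
```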
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu [mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu]
On Behalf Of Jim VandeVegt
Sent: Wednesday, May 19, 2010 12:27 PM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] Digging into tape performance, long delays on duplications
Just implemented HP LTO5 tape drives, although I have seen this behavior on the older LTO2 drives as well.
8Gbps QLogic switch. NetBackup 6.5.6 with Vault.
Sun T5240 master/media, Solaris 10 with last Thursday's recommended patch set, 4Gbps HBA (getting an 8 soon).
The main monitoring tool I have at my disposal is 'iostat -xn' on Solaris. I like to use a 6-second interval.
I also consult the switch port statistics which do not show an increasing error count.
When doing tape-to-tape duplication, the tape drive devices show periods of data movement (typically 15-350 MB/s, depending on the size of the image file being processed), followed by several successive iostat intervals of zero throughput in which the %b column reads a constant 100% busy. During these intervals the actv column shows 1.0 and every other column is zero. NetBackup also shows the amount of data transferred as constant. It is these periods I'm trying to understand.
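To put numbers on how long these stalls last, the pattern described above (zero kr/s and kw/s with %b at 100) can be picked out of a saved iostat capture. The following is only a sketch: it assumes the standard 11-column 'iostat -xn' layout (r/s, w/s, kr/s, kw/s, wait, actv, wsvc_t, asvc_t, %w, %b, device) and a fixed sampling interval, and the function name find_stalls is hypothetical.

```python
import sys


def find_stalls(lines, interval_s=6):
    """Return (device, seconds) for each run of consecutive stalled samples.

    A sample counts as stalled when %b is 100 and both kr/s and kw/s are zero.
    """
    run = {}      # device -> consecutive stalled samples in the current run
    stalls = []
    for line in lines:
        fields = line.split()
        if len(fields) != 11:
            continue  # banner line or anything that is not a device row
        try:
            kr, kw = float(fields[2]), float(fields[3])
            busy = float(fields[9])
        except ValueError:
            continue  # column header row
        device = fields[10]
        if busy >= 100 and kr == 0 and kw == 0:
            run[device] = run.get(device, 0) + 1
        elif run.get(device):
            stalls.append((device, run.pop(device) * interval_s))
    # Report runs that were still in progress when the capture ended.
    for device, count in run.items():
        stalls.append((device, count * interval_s))
    return stalls


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as capture:
        for device, seconds in find_stalls(capture):
            print(f"{device}: stalled for about {seconds}s")
```

One way to use it would be to save 'iostat -xn 6' output to a file during a duplication and pass that file name as the only argument. Durations are rounded to whole intervals, so a 6-second interval gives 6-second resolution.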
Specifically, how can I dig into this? Would syslog changes yield more information? Should I increase the verbosity of the NetBackup logs? Would dtrace help?
Is it tape backhitching? (Seems to be far too long a time period.)
On a few occasions, after a few minutes, a message like this shows up in /var/adm/messages:
scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@d/SUNW,qlc@0/fp@0,0/st@w500308c0a0c8b001,0 (st4):
SCSI transport failed: reason 'timeout': giving up
(This seems to indicate a command got dropped somewhere.)
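To see how often these timeouts occur and which drive instances they hit, the messages file can be scanned for the two-line pattern shown above. This is a rough sketch that assumes the warning and the "SCSI transport failed" text always land on consecutive lines, as in the excerpt; the function name find_transport_timeouts is hypothetical.

```python
import re
import sys

# Matches the first line of the warning and captures the device path and st instance.
WARNING_RE = re.compile(r"WARNING: (?P<path>\S+) \((?P<inst>st\d+)\):")
TIMEOUT_TEXT = "SCSI transport failed: reason 'timeout'"


def find_transport_timeouts(lines):
    """Return (instance, device_path) for each st warning followed by a timeout line."""
    hits = []
    pending = None  # warning seen on the previous line, if any
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            pending = (match.group("inst"), match.group("path"))
            continue
        if pending and TIMEOUT_TEXT in line:
            hits.append(pending)
        pending = None
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as log:
        for inst, path in find_transport_timeouts(log):
            print(inst, path)
```

Running it against /var/adm/messages and comparing the hits with the stall times seen in iostat might show whether the long stalls and the timeouts are the same events.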
On most occasions, after a short time (perhaps 15-60 seconds), the data just starts moving again.
Whatever it is, it seems to really kill the overall duplication rate. Thanks,