ADSM-L

Read block size discrepancy between LV image and B/A client backups

2004-04-29 16:01:33
Subject: Read block size discrepancy between LV image and B/A client backups
From: Ted Byrne <ted.byrne AT ADELPHIA DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 29 Apr 2004 15:33:44 -0400

AIX client v. 5.1.5.5
AIX server v. 5.1.8.0
AIX version 5.2.0.0

We have a customer who is seeing a performance discrepancy between
large-file cold backups of Oracle data and LV image backups, both sent
over the SAN to LTO-2 tape drives.  The LV image backups sustain about
70 MB/s.  The B/A client backups get less than 30 MB/s.

Both backups are made using the same server stanza, and data is sent to the
same stgpool on the server, so the TSM configuration should be
identical.  (See below.)  In investigating this issue, the customer
determined that the block size used when reading the data for the image
backup was 256K, whereas for the B/A client backup it was about half of
that.  This is what they reported:

       the size of disk IO for the Oracle filesystems was close to 128k,
       whereas the size of disk IO of the image backup was fixed at 256k.
       Also, the image backup had a lot less disk seek than the
       regular Oracle cold backup through SAN.

The customer believes that the read block size is contributing to, if not
entirely responsible for, the performance discrepancy.  I would expect
the image backup to do less disk seeking, since it is processing the
entire LV, but the discrepancy in read block size is puzzling.
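
One way to check whether read size alone could account for a gap of this
magnitude, independent of TSM, would be to time plain sequential reads of
one of the large Oracle files at 128K and 256K.  The following is only a
rough sketch under stated assumptions: the function names are made up for
illustration, it is not how the TSM client reads data, and filesystem
caching and read-ahead will skew the numbers unless the file is much larger
than memory or the cache is cold.

```python
import os
import time


def read_throughput(path, block_size):
    """Read `path` sequentially in `block_size`-byte requests.

    Returns (bytes_read, elapsed_seconds).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 opens an unbuffered raw file, so each read() call asks
    # the OS for at most block_size bytes in a single request.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


def compare(path, block_sizes=(128 * 1024, 256 * 1024)):
    """Return {block_size: MB/s} for each block size tried (hypothetical helper)."""
    results = {}
    for bs in block_sizes:
        total, elapsed = read_throughput(path, bs)
        mb = total / (1024 * 1024)
        results[bs] = mb / elapsed if elapsed > 0 else float("inf")
    return results
```

Calling compare() on a datafile path returns the measured rate for each
read size.  If the two rates come out close to each other, the read size
by itself is probably not the whole story.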

Other systems that we have tested with similar data have achieved equally
high throughput for the LV image backup and the B/A client backup of large
files, so it's not clear why this environment (mission-critical production,
of course) would get such dramatically different throughput.

I have a PMR open with IBM regarding this issue.  Is anyone aware of a way
to control the block size used for reads by the TSM client?

I'd like to do some client-side tracing to see if we can turn up the reason
for the performance differences, but I would like to be selective about the
tracing, so as to not unduly impact the client machine.  Any suggestions
regarding what traceflags would be best to use?

If anyone has any suggestions regarding how to approach this, I'm all ears.

Thanks,

Ted

   SERVERNAME         LAXU30_DBCOLD
   NOdename           LAXU23_DBCOLD
   PASSWORDAccess     generate
   enablelanfree      yes
   lanfreecommmethod  tcpip
   lanfreetcpport     1500
   TCPPort            1500
   TCPServeraddress   LAXu30.nowhere.com
   TCPCLientport      1503
   HTTPPort           1583
   TCPBuffsize        256
   TCPWindowsize      1024
   TCPNodelay         Yes
   TXNBYTELIMIT       2097152
   LargeCommBuffers   No
   Inclexcl           /usr/tivoli/tsm/client/ba/bin/inclexcl.dbcold.def
   ERRORLOGR          30 D
   errorlogname       /usr/tivoli/tsm/client/ba/bin/dsmerror.dbcold.log
   SCHEDLOGR          30 D
   schedlogname       /usr/tivoli/tsm/client/ba/bin/dsmsched.dbcold.log
   SCHEDMode          Prompted
   ResourceUtilization  2
   DOMAIN.IMAGE       /dev/redo1lv /dev/redo2lv /dev/redo3lv /dev/redo4lv
