Veritas-bu

[Veritas-bu] RMAN performance doth sucketh on media server.

2005-05-23 20:55:38
Subject: [Veritas-bu] RMAN performance doth sucketh on media server.
From: Mark.Donaldson AT cexp DOT com (Mark.Donaldson AT cexp DOT com)
Date: Mon, 23 May 2005 18:55:38 -0600
NET_BUFFER_SIZE is used when communicating to a remote media server and is
therefore not applicable in this situation.

NUMBER_DATA_BUFFERS for me is 16 and SIZE_DATA_BUFFERS is 256K.

Still, I don't think that buffering is an issue since I'm not getting any
significant notes about any waits for full buffers.  If anything, the buffer
is too large because I'm not filling them as fast as I can empty them.
Buffering too small to an LTO2 drive, though, is a guaranteed performance
killer.  

-M

-----Original Message-----
From: yeo-ming.koh AT amd DOT com [mailto:yeo-ming.koh AT amd DOT com]
Sent: Monday, May 23, 2005 6:49 PM
To: Mark.Donaldson AT cexp DOT com; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] RMAN performance doth sucketh on media server.


This will be useful to you.

-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of
Mark.Donaldson AT cexp DOT com
Sent: Tuesday, May 24, 2005 8:42 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] RMAN performance doth sucketh on media server.


I've got a media server with several large Oracle databases on it.  Simply
put, the performance sucks. 

It's an RMAN backup with 4 channels, MPX'd to 4/drive (so we're using one
tape drive).  I'm also making an in-line tape copy to a second tape at the
same time for off-site purposes.  Both tape drives are LTO2 shared in an SSO
environment.

Monitoring iostat, I see a burst of ~2MB/sec or so, usually after the
write_data "start writing data when first buffer is available" is logged,
then a fallback to 100/200/300 KB/sec for the remainder of the job.  The
"waiting on full/empty" buffer notes are few but when they are there, it's
"waiting on full buffer".

Here's the brief report from my summarizer on the policy:

---------------------------------------
## Gathering data.....Done.
## Write to buffer waiting on available buffer:
No Data Found.

## Write to tape waiting on full buffer:
Min: 38363  Avg: 58225  Max: 134966 with 13 samples
---------------------------------------

The bptm logs are showing the job spending a lot of time doing this:

   0.0  0.0  0.0    0.0    0.0   0   0 rmt/13
    0.0    0.0    0.0  


18:16:19.866 [28104] <2> io_write_back_header: drive index 14,
ue-13_1116892156, file num = 88, mpx_headers = 3, copy 1
18:16:19.869 [28104] <2> io_write_back_header: drive index 14,
ue-13_1116892192, file num = 88, mpx_headers = 2, copy 1
18:16:19.870 [28104] <2> io_write_back_header: drive index 14,
ue-13_1116892207, file num = 88, mpx_headers = 1, copy 1
18:16:19.871 [28104] <2> write_data: completed writing backup header, start
writing data when first buffer is available, copy 1
18:16:19.878 [28104] <2> write_data: absolute block position prior to
writing backup header(s) is 878978, copy 2
18:16:19.878 [28104] <2> write_data: block position check: actual 878978,
expected 878978
18:16:19.878 [28104] <2> io_write_back_header: drive index 13,
ue-13_1116892156, file num = 96, mpx_headers = 3, copy 2
18:16:19.881 [28104] <2> io_write_back_header: drive index 13,
ue-13_1116892192, file num = 96, mpx_headers = 2, copy 2
18:16:19.882 [28104] <2> io_write_back_header: drive index 13,
ue-13_1116892207, file num = 96, mpx_headers = 1, copy 2
18:16:19.883 [28104] <2> write_data: completed writing backup header, start
writing data when first buffer is available, copy 2
18:16:19.883 [28104] <2> write_data: first write, twin_index: 0 cindex: 2
dont_process: 1 wrote_backup_hdr: 1 finished_buff: 0
18:16:19.883 [28104] <2> write_data: received first buffer (262144 bytes),
begin writing data
18:16:23.608 [26567] <2> bptm: INITIATING (VERBOSE = 0): -U
18:16:23.610 [26567] <2> db_byid: search for media id 005111
18:16:23.610 [26567] <2> db_byid: 005111 found at offset 2
18:16:23.622 [26567] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:16:23.622 [26567] <2> db_byid: search for media id 005121
18:16:23.622 [26567] <2> db_byid: 005121 found at offset 4
18:16:23.642 [26567] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:16:23.642 [26567] <2> bptm: EXITING with status 0 <----------
18:16:41.852 [28104] <2> write_backup: write_data() returned, exit_status =
0, CINDEX = 2, TWIN_INDEX = 0, backup_status = -7
18:16:41.852 [28104] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.17789)
on drive index 14
18:16:43.199 [28104] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.17789)
on drive index 13

<snip>

18:25:25.096 [29843] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:25.098 [29843] <2> db_byid: search for media id 005111
18:25:25.098 [29843] <2> db_byid: 005111 found at offset 2
18:25:25.110 [29843] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:25.110 [29843] <2> db_byid: search for media id 005121
18:25:25.111 [29843] <2> db_byid: 005121 found at offset 4
18:25:25.130 [29843] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:25.130 [29843] <2> bptm: EXITING with status 0 <----------
18:25:30.625 [29901] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:30.627 [29901] <2> db_byid: search for media id 005111
18:25:30.627 [29901] <2> db_byid: 005111 found at offset 2
18:25:30.639 [29901] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:30.639 [29901] <2> db_byid: search for media id 005121
18:25:30.639 [29901] <2> db_byid: 005121 found at offset 4
18:25:30.659 [29901] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:30.659 [29901] <2> bptm: EXITING with status 0 <----------
18:25:35.991 [29978] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:35.993 [29978] <2> db_byid: search for media id 005111
18:25:35.993 [29978] <2> db_byid: 005111 found at offset 2
18:25:36.008 [29978] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:36.008 [29978] <2> db_byid: search for media id 005121
18:25:36.008 [29978] <2> db_byid: 005121 found at offset 4
18:25:36.028 [29978] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:36.028 [29978] <2> bptm: EXITING with status 0 <----------
18:25:39.407 [29983] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:39.409 [29983] <2> db_byid: search for media id 005111
18:25:39.409 [29983] <2> db_byid: 005111 found at offset 2
18:25:39.428 [29983] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:39.428 [29983] <2> db_byid: search for media id 005121
18:25:39.428 [29983] <2> db_byid: 005121 found at offset 4
18:25:39.448 [29983] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:39.449 [29983] <2> bptm: EXITING with status 0 <----------
18:25:42.651 [29998] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:42.653 [29998] <2> db_byid: search for media id 005111
18:25:42.653 [29998] <2> db_byid: 005111 found at offset 2
18:25:42.667 [29998] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:42.667 [29998] <2> db_byid: search for media id 005121
18:25:42.667 [29998] <2> db_byid: 005121 found at offset 4
18:25:42.687 [29998] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:42.687 [29998] <2> bptm: EXITING with status 0 <----------
18:25:53.231 [57] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:53.233 [57] <2> db_byid: search for media id 005111
18:25:53.233 [57] <2> db_byid: 005111 found at offset 2
18:25:53.246 [57] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:53.246 [57] <2> db_byid: search for media id 005121
18:25:53.246 [57] <2> db_byid: 005121 found at offset 4
18:25:53.266 [57] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:53.267 [57] <2> bptm: EXITING with status 0 <----------
_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu