Veritas-bu

[Veritas-bu] RMAN performance doth sucketh on media server. -- RESEND

2005-05-23 20:52:13
Subject: [Veritas-bu] RMAN performance doth sucketh on media server. -- RESEND
From: Mark.Donaldson AT cexp DOT com (Mark.Donaldson AT cexp DOT com)
Date: Mon, 23 May 2005 18:52:13 -0600
I've got a media server with several large Oracle databases on it.  Simply
put, the performance sucks. 

It's an RMAN backup with 4 channels, MPX'd to 4/drive (so we're using one
tape drive).  I'm also making an in-line tape copy to a second tape at the
same time for off-site purposes.  Both tape drives are LTO2 shared in an SSO
environment.  FILESPERSET have been left to the default behavior.

Monitoring iostat, I see a burst of ~2MB/sec or so, usually after the
write_data "start writing data when first buffer is available" is logged,
then a fallback to 100/200/300 KB/sec for the remainder of the job.  The
%blocked and %wait on the tape device is pretty much zero so it doesn't seem
that the tape device is pushing back at the OS at all.

The "waiting on full/empty" buffer notes are few but when they are there,
it's "waiting on full buffer".  

Here's the brief report from my summarizer on the policy's full/empty
numbers.:
---------------------------------------
## Gathering data.....Done.
## Write to buffer waiting on available buffer:
No Data Found.

## Write to tape waiting on full buffer:
Min: 38363  Avg: 58225  Max: 134966 with 13 samples
---------------------------------------


The bptm logs are showing the job spending a lot of time doing this:

18:34:11.261 [2740] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:34:11.261 [2740] <2> db_byid: search for media id 005121
18:34:11.261 [2740] <2> db_byid: 005121 found at offset 4
18:34:11.280 [2740] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:34:11.280 [2740] <2> bptm: EXITING with status 0 <----------


A search of the knowlege base doesn't turn up much useful on this message.
The DBA's are getting bitchy on this, anybody got any suggestions?

-M

PS: The versions are NB 5.1.2 on Solaris for both this media server & master
server.


=======  bptm log snippet  ===========

18:16:19.866 [28104] <2> io_write_back_header: drive index 14,
ue-13_1116892156, file num = 88, mpx_headers = 3, copy 1
18:16:19.869 [28104] <2> io_write_back_header: drive index 14,
ue-13_1116892192, file num = 88, mpx_headers = 2, copy 1
18:16:19.870 [28104] <2> io_write_back_header: drive index 14,
ue-13_1116892207, file num = 88, mpx_headers = 1, copy 1
18:16:19.871 [28104] <2> write_data: completed writing backup header, start
writing data when first buffer is available, copy 1
18:16:19.878 [28104] <2> write_data: absolute block position prior to
writing backup header(s) is 878978, copy 2
18:16:19.878 [28104] <2> write_data: block position check: actual 878978,
expected 878978
18:16:19.878 [28104] <2> io_write_back_header: drive index 13,
ue-13_1116892156, file num = 96, mpx_headers = 3, copy 2
18:16:19.881 [28104] <2> io_write_back_header: drive index 13,
ue-13_1116892192, file num = 96, mpx_headers = 2, copy 2
18:16:19.882 [28104] <2> io_write_back_header: drive index 13,
ue-13_1116892207, file num = 96, mpx_headers = 1, copy 2
18:16:19.883 [28104] <2> write_data: completed writing backup header, start
writing data when first buffer is available, copy 2
18:16:19.883 [28104] <2> write_data: first write, twin_index: 0 cindex: 2
dont_process: 1 wrote_backup_hdr: 1 finished_buff: 0
18:16:19.883 [28104] <2> write_data: received first buffer (262144 bytes),
begin writing data
18:16:23.608 [26567] <2> bptm: INITIATING (VERBOSE = 0): -U
18:16:23.610 [26567] <2> db_byid: search for media id 005111
18:16:23.610 [26567] <2> db_byid: 005111 found at offset 2
18:16:23.622 [26567] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:16:23.622 [26567] <2> db_byid: search for media id 005121
18:16:23.622 [26567] <2> db_byid: 005121 found at offset 4
18:16:23.642 [26567] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:16:23.642 [26567] <2> bptm: EXITING with status 0 <----------
18:16:41.852 [28104] <2> write_backup: write_data() returned, exit_status =
0, CINDEX = 2, TWIN_INDEX = 0, backup_status = -7
18:16:41.852 [28104] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.17789)
on drive index 14
18:16:43.199 [28104] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.17789)
on drive index 13

<snip>

18:25:25.096 [29843] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:25.098 [29843] <2> db_byid: search for media id 005111
18:25:25.098 [29843] <2> db_byid: 005111 found at offset 2
18:25:25.110 [29843] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:25.110 [29843] <2> db_byid: search for media id 005121
18:25:25.111 [29843] <2> db_byid: 005121 found at offset 4
18:25:25.130 [29843] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:25.130 [29843] <2> bptm: EXITING with status 0 <----------
18:25:30.625 [29901] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:30.627 [29901] <2> db_byid: search for media id 005111
18:25:30.627 [29901] <2> db_byid: 005111 found at offset 2
18:25:30.639 [29901] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:30.639 [29901] <2> db_byid: search for media id 005121
18:25:30.639 [29901] <2> db_byid: 005121 found at offset 4
18:25:30.659 [29901] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:30.659 [29901] <2> bptm: EXITING with status 0 <----------
18:25:35.991 [29978] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:35.993 [29978] <2> db_byid: search for media id 005111
18:25:35.993 [29978] <2> db_byid: 005111 found at offset 2
18:25:36.008 [29978] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:36.008 [29978] <2> db_byid: search for media id 005121
18:25:36.008 [29978] <2> db_byid: 005121 found at offset 4
18:25:36.028 [29978] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:36.028 [29978] <2> bptm: EXITING with status 0 <----------
18:25:39.407 [29983] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:39.409 [29983] <2> db_byid: search for media id 005111
18:25:39.409 [29983] <2> db_byid: 005111 found at offset 2
18:25:39.428 [29983] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:39.428 [29983] <2> db_byid: search for media id 005121
18:25:39.428 [29983] <2> db_byid: 005121 found at offset 4
18:25:39.448 [29983] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:39.449 [29983] <2> bptm: EXITING with status 0 <----------
18:25:42.651 [29998] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:42.653 [29998] <2> db_byid: search for media id 005111
18:25:42.653 [29998] <2> db_byid: 005111 found at offset 2
18:25:42.667 [29998] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:42.667 [29998] <2> db_byid: search for media id 005121
18:25:42.667 [29998] <2> db_byid: 005121 found at offset 4
18:25:42.687 [29998] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:42.687 [29998] <2> bptm: EXITING with status 0 <----------
18:25:53.231 [57] <2> bptm: INITIATING (VERBOSE = 0): -U
18:25:53.233 [57] <2> db_byid: search for media id 005111
18:25:53.233 [57] <2> db_byid: 005111 found at offset 2
18:25:53.246 [57] <2> db_lock_media: unable to lock media at offset 2
(005111)
18:25:53.246 [57] <2> db_byid: search for media id 005121
18:25:53.246 [57] <2> db_byid: 005121 found at offset 4
18:25:53.266 [57] <2> db_lock_media: unable to lock media at offset 4
(005121)
18:25:53.267 [57] <2> bptm: EXITING with status 0 <----------

<Prev in Thread] Current Thread [Next in Thread>
  • [Veritas-bu] RMAN performance doth sucketh on media server. -- RESEND, Mark.Donaldson AT cexp DOT com <=