Veritas-bu

[Veritas-bu] RMAN performance doth sucketh on media server.

2005-05-23 22:57:07
Subject: [Veritas-bu] RMAN performance doth sucketh on media server.
From: len.boyle AT sas DOT com (len boyle)
Date: Mon, 23 May 2005 22:57:07 -0400
Hello Mark

I do not know the answers, but have a few questions. It appears that the 
bottleneck is not on the netbackup side of things.

It sounds like the database is having trouble feeding the data to netbackup. 
does the mpx 4 cause problems with
data base performance?
What are the non-tape perf stats?
What is the method you are using to move the data from rman to netbackup? Is 
this a stupid question? If it is a socket connection, does that mean that 
NET_BUFFER_SIZE still counts. Can  rman/netbackup use shared memory?

What is the fastest connection that the greater backup crowd reaches with 
rman backups?

len





----- Original Message ----- 
From: <Mark.Donaldson AT cexp DOT com>
To: <veritas-bu AT mailman.eng.auburn DOT edu>
Sent: Monday, May 23, 2005 8:55 PM
Subject: RE: [Veritas-bu] RMAN performance doth sucketh on media server.


> NET_BUFFER_SIZE is used when communicating to a remote media server and is
> therefore not applicable in this situation.
>
> NUMBER_DATA_BUFFERS for me is 16 and SIZE_DATA_BUFFERS is 256K.
>
> Still, I don't think that buffering is an issue since I'm not getting any
> significant notes about any waits for full buffers.  If anything, the 
> buffer
> is too large because I'm not filling them as fast as I can empty them.
> Buffering too small to an LTO2 drive, though, is a guaranteed performance
> killer.
>
> -M
>
> -----Original Message-----
> From: yeo-ming.koh AT amd DOT com [mailto:yeo-ming.koh AT amd DOT com]
> Sent: Monday, May 23, 2005 6:49 PM
> To: Mark.Donaldson AT cexp DOT com; veritas-bu AT mailman.eng.auburn DOT edu
> Subject: RE: [Veritas-bu] RMAN performance doth sucketh on media server.
>
>
> This will be useful to you.
>
> -----Original Message-----
> From: veritas-bu-admin AT mailman.eng.auburn DOT edu
> [mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of
> Mark.Donaldson AT cexp DOT com
> Sent: Tuesday, May 24, 2005 8:42 AM
> To: veritas-bu AT mailman.eng.auburn DOT edu
> Subject: [Veritas-bu] RMAN performance doth sucketh on media server.
>
>
> I've got a media server with several large Oracle databases on it.  Simply
> put, the performance sucks.
>
> It's an RMAN backup with 4 channels, MPX'd to 4/drive (so we're using one
> tape drive).  I'm also making an in-line tape copy to a second tape at the
> same time for off-site purposes.  Both tape drives are LTO2 shared in an 
> SSO
> environment.
>
> Monitoring iostat, I see a burst of ~2MB/sec or so, usually after the
> write_data "start writing data when first buffer is available" is logged,
> then a fallback to 100/200/300 KB/sec for the remainder of the job.  The
> "waiting on full/empty" buffer notes are few but when they are there, it's
> "waiting on full buffer".
>
> Here's the brief report from my summarizer on the policy:
>
> ---------------------------------------
> ## Gathering data.....Done.
> ## Write to buffer waiting on available buffer:
> No Data Found.
>
> ## Write to tape waiting on full buffer:
> Min: 38363  Avg: 58225  Max: 134966 with 13 samples
> ---------------------------------------
>
> The bptm logs are showing the job spending a lot of time doing this:
>
>   0.0  0.0  0.0    0.0    0.0   0   0 rmt/13
>    0.0    0.0    0.0
>
>
> 18:16:19.866 [28104] <2> io_write_back_header: drive index 14,
> ue-13_1116892156, file num = 88, mpx_headers = 3, copy 1
> 18:16:19.869 [28104] <2> io_write_back_header: drive index 14,
> ue-13_1116892192, file num = 88, mpx_headers = 2, copy 1
> 18:16:19.870 [28104] <2> io_write_back_header: drive index 14,
> ue-13_1116892207, file num = 88, mpx_headers = 1, copy 1
> 18:16:19.871 [28104] <2> write_data: completed writing backup header, 
> start
> writing data when first buffer is available, copy 1
> 18:16:19.878 [28104] <2> write_data: absolute block position prior to
> writing backup header(s) is 878978, copy 2
> 18:16:19.878 [28104] <2> write_data: block position check: actual 878978,
> expected 878978
> 18:16:19.878 [28104] <2> io_write_back_header: drive index 13,
> ue-13_1116892156, file num = 96, mpx_headers = 3, copy 2
> 18:16:19.881 [28104] <2> io_write_back_header: drive index 13,
> ue-13_1116892192, file num = 96, mpx_headers = 2, copy 2
> 18:16:19.882 [28104] <2> io_write_back_header: drive index 13,
> ue-13_1116892207, file num = 96, mpx_headers = 1, copy 2
> 18:16:19.883 [28104] <2> write_data: completed writing backup header, 
> start
> writing data when first buffer is available, copy 2
> 18:16:19.883 [28104] <2> write_data: first write, twin_index: 0 cindex: 2
> dont_process: 1 wrote_backup_hdr: 1 finished_buff: 0
> 18:16:19.883 [28104] <2> write_data: received first buffer (262144 bytes),
> begin writing data
> 18:16:23.608 [26567] <2> bptm: INITIATING (VERBOSE = 0): -U
> 18:16:23.610 [26567] <2> db_byid: search for media id 005111
> 18:16:23.610 [26567] <2> db_byid: 005111 found at offset 2
> 18:16:23.622 [26567] <2> db_lock_media: unable to lock media at offset 2
> (005111)
> 18:16:23.622 [26567] <2> db_byid: search for media id 005121
> 18:16:23.622 [26567] <2> db_byid: 005121 found at offset 4
> 18:16:23.642 [26567] <2> db_lock_media: unable to lock media at offset 4
> (005121)
> 18:16:23.642 [26567] <2> bptm: EXITING with status 0 <----------
> 18:16:41.852 [28104] <2> write_backup: write_data() returned, exit_status 
> =
> 0, CINDEX = 2, TWIN_INDEX = 0, backup_status = -7
> 18:16:41.852 [28104] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.17789)
> on drive index 14
> 18:16:43.199 [28104] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.17789)
> on drive index 13
>
> <snip>
>
> 18:25:25.096 [29843] <2> bptm: INITIATING (VERBOSE = 0): -U
> 18:25:25.098 [29843] <2> db_byid: search for media id 005111
> 18:25:25.098 [29843] <2> db_byid: 005111 found at offset 2
> 18:25:25.110 [29843] <2> db_lock_media: unable to lock media at offset 2
> (005111)
> 18:25:25.110 [29843] <2> db_byid: search for media id 005121
> 18:25:25.111 [29843] <2> db_byid: 005121 found at offset 4
> 18:25:25.130 [29843] <2> db_lock_media: unable to lock media at offset 4
> (005121)
> 18:25:25.130 [29843] <2> bptm: EXITING with status 0 <----------
> 18:25:30.625 [29901] <2> bptm: INITIATING (VERBOSE = 0): -U
> 18:25:30.627 [29901] <2> db_byid: search for media id 005111
> 18:25:30.627 [29901] <2> db_byid: 005111 found at offset 2
> 18:25:30.639 [29901] <2> db_lock_media: unable to lock media at offset 2
> (005111)
> 18:25:30.639 [29901] <2> db_byid: search for media id 005121
> 18:25:30.639 [29901] <2> db_byid: 005121 found at offset 4
> 18:25:30.659 [29901] <2> db_lock_media: unable to lock media at offset 4
> (005121)
> 18:25:30.659 [29901] <2> bptm: EXITING with status 0 <----------
> 18:25:35.991 [29978] <2> bptm: INITIATING (VERBOSE = 0): -U
> 18:25:35.993 [29978] <2> db_byid: search for media id 005111
> 18:25:35.993 [29978] <2> db_byid: 005111 found at offset 2
> 18:25:36.008 [29978] <2> db_lock_media: unable to lock media at offset 2
> (005111)
> 18:25:36.008 [29978] <2> db_byid: search for media id 005121
> 18:25:36.008 [29978] <2> db_byid: 005121 found at offset 4
> 18:25:36.028 [29978] <2> db_lock_media: unable to lock media at offset 4
> (005121)
> 18:25:36.028 [29978] <2> bptm: EXITING with status 0 <----------
> 18:25:39.407 [29983] <2> bptm: INITIATING (VERBOSE = 0): -U
> 18:25:39.409 [29983] <2> db_byid: search for media id 005111
> 18:25:39.409 [29983] <2> db_byid: 005111 found at offset 2
> 18:25:39.428 [29983] <2> db_lock_media: unable to lock media at offset 2
> (005111)
> 18:25:39.428 [29983] <2> db_byid: search for media id 005121
> 18:25:39.428 [29983] <2> db_byid: 005121 found at offset 4
> 18:25:39.448 [29983] <2> db_lock_media: unable to lock media at offset 4
> (005121)
> 18:25:39.449 [29983] <2> bptm: EXITING with status 0 <----------
> 18:25:42.651 [29998] <2> bptm: INITIATING (VERBOSE = 0): -U
> 18:25:42.653 [29998] <2> db_byid: search for media id 005111
> 18:25:42.653 [29998] <2> db_byid: 005111 found at offset 2
> 18:25:42.667 [29998] <2> db_lock_media: unable to lock media at offset 2
> (005111)
> 18:25:42.667 [29998] <2> db_byid: search for media id 005121
> 18:25:42.667 [29998] <2> db_byid: 005121 found at offset 4
> 18:25:42.687 [29998] <2> db_lock_media: unable to lock media at offset 4
> (005121)
> 18:25:42.687 [29998] <2> bptm: EXITING with status 0 <----------
> 18:25:53.231 [57] <2> bptm: INITIATING (VERBOSE = 0): -U
> 18:25:53.233 [57] <2> db_byid: search for media id 005111
> 18:25:53.233 [57] <2> db_byid: 005111 found at offset 2
> 18:25:53.246 [57] <2> db_lock_media: unable to lock media at offset 2
> (005111)
> 18:25:53.246 [57] <2> db_byid: search for media id 005121
> 18:25:53.246 [57] <2> db_byid: 005121 found at offset 4
> 18:25:53.266 [57] <2> db_lock_media: unable to lock media at offset 4
> (005121)
> 18:25:53.267 [57] <2> bptm: EXITING with status 0 <----------
> _______________________________________________
> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>
>
> _______________________________________________
> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>