Veritas-bu

[Veritas-bu] Duplication - 6 hours for one 230M fragment!

2002-04-02 17:18:43
Subject: [Veritas-bu] Duplication - 6 hours for one 230M fragment!
From: Mark.Donaldson AT experianems DOT com (Donaldson, Mark)
Date: Tue, 2 Apr 2002 15:18:43 -0700
Yeah, it's 262144 as of yesterday.  Before this it was the default 32k.
-Mark

-----Original Message-----
From: Karl.Rossing AT Federated DOT CA [mailto:Karl.Rossing AT Federated DOT CA]
Sent: Tuesday, April 02, 2002 1:04 PM
To: Donaldson, Mark
Cc: Veritasbu (E-mail)
Subject: Re: [Veritas-bu] Duplication - 6 hours for one 230M fragment!


What is your NET_BUFFER_SZ set to?

If it's not set try this:
winnipeg:/opt/openv/netbackup>$ cat /opt/openv/netbackup/NET_BUFFER_SZ
262144

Previously I had it set to the default value (I think it's 32k). Apparently
this needs to be set high when doing a duplication.

Karl-Frédéric Rössing
Technical Analyst
Federated Insurance Companies of Canada
http://www.federated.ca




"Donaldson, Mark" <Mark.Donaldson AT experianems DOT com>
Sent by: veritas-bu-admin AT mailman.eng.auburn DOT edu
04/02/2002 12:11 PM

        To:     "Veritasbu (E-mail)" <veritas-bu AT mailman.eng.auburn DOT edu>
        cc:
        Subject:        [Veritas-bu] Duplication - 6 hours for one 230M fragment!


I create offsite backups by duplicating a select set of backup images. These images were laid down as multiplexed backups from a half dozen servers in several different classes. I have only one master/media server (v3.4.1 on Sol 2.6), so the duplications are taking place within the library (DTL7000) that created them.

When I duplicate these images, I specifically don't use the -mpx option on bpduplicate. It seemed to put more tape drives in service and I'm trying to keep some of them in reserve for other activities.

My problem: the duplication of 10 images (less than 350G) has been running for 5 days now. Way too long - big time way too long. One image duplication (~5Gig, 280000 files) took 17 hours to duplicate.

If you look at the bptm log snippet below, it's more than six hours from start to finish for one fragment. From "read_backup: begin" to "read_backup: successfully read" is 371 minutes. The machine says the read-rate was 180 Kbytes/sec but some quick math says 234496 KB in 371 minutes averages merely 6.4 KB/sec.
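
The quick math can be rechecked with a short sketch. It uses only the two figures from the message (234496 KB and 371 minutes); the function name is illustrative. Worked this way the average comes out nearer 10.5 KB/sec than 6.4, which is still far below the 180 Kbytes/sec the log reports.

```python
def average_rate_kb_per_sec(kbytes: int, minutes: float) -> float:
    """Average throughput in KB/sec over a wall-clock interval given in minutes."""
    return kbytes / (minutes * 60)


# Figures quoted in the message: one 234496 KB fragment read in 371 minutes.
rate = average_rate_kb_per_sec(234496, 371)
print(f"{rate:.1f} KB/sec")
```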

By the way, all the fragments for this image were arranged sequentially on the same tape according to the logged positioning info so I don't think it's a positioning/mechanical delay.

Most of the big time delays were flanked by this pair of notes:
02:58:20 [25598] <2> get_tape_position_for_read: absolute block position prior to reading is 82012
09:07:54 [25598] <2> read_data: stopping mpx read because 234496 Kbytes of 234496 were read
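
One way to locate such delays across a whole bptm log is to flag consecutive entries that are far apart in time. The sketch below is only an illustration: it assumes every entry follows the `HH:MM:SS [pid] <level> message` shape of the two lines quoted above, and the `find_gaps` name and the 30-minute threshold are hypothetical choices, not anything NetBackup provides.

```python
import re
from datetime import datetime, timedelta

# Assumed entry shape, inferred from the two quoted lines only.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) \[(\d+)\] <(\d+)> (.*)$")


def find_gaps(lines, threshold=timedelta(minutes=30)):
    """Return (gap, message_before, message_after) for slow consecutive entries."""
    gaps = []
    prev = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not a timestamped entry
        stamp = datetime.strptime(m.group(1), "%H:%M:%S")
        if prev is not None:
            delta = stamp - prev[0]
            if delta < timedelta(0):
                delta += timedelta(days=1)  # entries straddle midnight
            if delta >= threshold:
                gaps.append((delta, prev[1], m.group(4)))
        prev = (stamp, m.group(4))
    return gaps


sample = [
    "02:58:20 [25598] <2> get_tape_position_for_read: absolute block position prior to reading is 82012",
    "09:07:54 [25598] <2> read_data: stopping mpx read because 234496 Kbytes of 234496 were read",
]
for gap, before, after in find_gaps(sample):
    print(gap, "|", before, "->", after)
```

For the pair quoted above this reports a single gap of a little over six hours between the positioning note and the end of the read.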

I've been searching the support site and found technote 243197 about NET_BUFFER_SZ and adjusted that yesterday to match my SIZE_DATA_BUFFERS setting. I haven't seen a significant change from this but, frankly, it's still early - the first bptm child since this change just started about 2 hours ago.
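
A minimal sketch for confirming that the two settings agree, assuming each is a plain-text file holding a single integer. The NET_BUFFER_SZ path is the one shown in Karl's reply; the SIZE_DATA_BUFFERS path is an assumption not given in this thread, and both helper names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Path taken from the reply quoted in this thread.
NET_BUFFER_SZ = "/opt/openv/netbackup/NET_BUFFER_SZ"
# Assumed location; not stated anywhere in this thread.
SIZE_DATA_BUFFERS = "/usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS"


def read_buffer_setting(path: str) -> Optional[int]:
    """Return the integer stored in a settings file, or None if absent or empty."""
    p = Path(path)
    if not p.is_file():
        return None
    text = p.read_text().strip()
    return int(text) if text else None


def settings_match(net_path: str, data_path: str) -> bool:
    """True only when both files exist and hold the same value."""
    net = read_buffer_setting(net_path)
    data = read_buffer_setting(data_path)
    return net is not None and net == data


if __name__ == "__main__":
    print("NET_BUFFER_SZ     =", read_buffer_setting(NET_BUFFER_SZ))
    print("SIZE_DATA_BUFFERS =", read_buffer_setting(SIZE_DATA_BUFFERS))
    print("match:", settings_match(NET_BUFFER_SZ, SIZE_DATA_BUFFERS))
```

A missing file is reported as None rather than an error, since an unset value means the default is in effect.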

So - basically - does anybody know what's going on with this?
-Mark

<snip>
