On Wednesday 07 March 2007, Kenneth Berry wrote:
>Hello list,
>
>I have been working on a performance problem for about a month and I am
>out of ideas; I hope someone can help me. What is happening is that
>backups of large DLEs that start very early dump at a very slow rate,
>whereas dumps of DLEs on the same host, physical drive, etc. that start
>hours later dump at an expected rate. The amstatus snapshot below was
>taken at 9:30:00. The backups start at midnight; notice that the dump
>rate for mbthome:/home/public is about 830K/sec, but the DLE
>mbthome:/home_a_i, which has been running for 21 minutes, is at a dump
>rate of about 9500K/sec.
>
>-------------------------------------------------
>mbthome:/ 0 51860k finished (0:32:54)
>mbthome:/boot 0 8431k finished (0:29:27)
>mbthome:/home/public 0 46747710k dumping 26966780k ( 57.69%) (0:28:31)
>mbthome:/home_a_i 0 20529050k dumping 11973280k ( 58.32%) (9:09:34)
>mbthome:/home_j_k 0 39011140k writing to tape (9:09:34)
>mbthome:/home_l_o 0 13542210k wait for dumping
>mbthome:/home_p 0 781310k finished (1:01:03)
>mbthome:/home_q_z 0 10639260k wait for dumping
>mbthome:/home_rest 0 10k finished (0:28:59)
>mbthome:/usr 0 606581k finished (0:58:04)
>mbthome:/var 0 330491k finished (0:38:43)
>-------------------------------------------------
>
>I have changed many things and have not found the solution yet. I moved
>all the data from a hardware RAID5 to a single large drive on a
>different SCSI controller, and played with the maxdumps parameter: it is
>set to two for the snapshot above, while for the data below maxdumps was
>set to 1 (the default). The top utility shows very high system CPU
>levels, which settle down when the dumps finish; are mine higher than
>expected while amdump is running?
>
>-------------------------------------------------
>top - 09:48:04 up 2 days, 18:43, 2 users, load average: 5.89, 5.49, 5.37
>Tasks: 152 total, 1 running, 151 sleeping, 0 stopped, 0 zombie
> Cpu0 : 9.3% us, 61.3% sy, 2.3% ni, 0.0% id, 20.9% wa, 0.0% hi, 6.3% si
> Cpu1 : 1.0% us, 91.1% sy, 0.7% ni, 0.0% id, 6.0% wa, 0.0% hi, 1.3% si
>Mem: 2075988k total, 2069940k used, 6048k free, 14444k buffers
>Swap: 1020116k total, 0k used, 1020116k free, 1934516k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>21343 amanda 16 0 2716 948 1780 S 89.8 0.0 197:24.39 dumper
>21344 amanda 16 0 2976 1096 1868 S 18.5 0.1 76:27.43 dumper
>28982 amanda 16 0 3028 804 1664 S 13.7 0.0 0:18.33 sendbackup
>21341 amanda 15 0 2728 1536 2432 D 12.0 0.1 20:17.89 taper
>28984 root 16 0 6660 1900 4304 D 10.1 0.1 0:14.84 tar
>21342 amanda 15 0 2720 1532 2432 S 7.5 0.1 10:57.67 taper
>28985 amanda 16 0 2532 420 1372 S 4.6 0.0 0:06.96 tar
>-------------------------------------------------
>
>Finally, so you can see the data for a full backup run, I have included
>below the report from last night. Notice that most DLEs with long dump
>times are dumped at a very slow rate. I am sure I didn't include
>something that will be needed to solve this problem; please email me
>back and I will provide the requested info.
>
>
>
>-------------------------------------------------
>DUMP SUMMARY:
>                                                DUMPER STATS    TAPER STATS
>HOSTNAME  DISK         L  ORIG-KB   OUT-KB COMP% MMM:SS   KB/s MMM:SS    KB/s
>--------- ------------ - -------- -------- ----- ------ ------ ------ -------
>mbtas00   /            0  2177990   816539  37.5   8:12 1659.0   1:03 13057.1
>mbtas00   /back01      0 13836710  4785625  34.6  28:37 2786.4   6:26 12392.8
>mbtas00   /boot        0    10880     6346  58.3   0:02 3898.6   0:02  2919.1
>mbtas00   /opt         0 18274620  8971682  49.1  50:24 2966.9   6:14 23968.4
>mbtas00   /var         0   165760    22406  13.5   0:35  637.0   0:02 10429.8
>mbtas01   /            0  3436250  1438229  41.9   7:04 3392.3   1:59 12131.7
>mbtas01   /boot        0    10880     6346  58.3   0:02 4023.7   0:04  1659.7
>mbtas01   /opt         0 31402600 15084957  48.0  64:58 3869.4  15:17 16454.1
>mbtas01   /u           0       10        1  10.0   0:00    0.0   0:01     0.9
>mbtas01   /var         0   221740    29565  13.3   0:20 1488.1   0:03  8748.2
>mbtdb01   /            0  2090070   806698  38.6   4:43 2848.7   0:17 48362.3
>mbtdb01   /boot        0    10880     6344  58.3   0:01 5897.3   0:02  3488.9
>mbtdb01   /opt         0 23508850  7524669  32.0  29:32 4246.4   2:53 43470.9
>mbtdb01   -ata/arch01  0 26888040  7923108  29.5  30:49 4285.1   3:02 43647.0
>mbtdb01   -ata/arch02  0 26888040  7923108  29.5  30:53 4275.3   3:03 43404.1
>mbtdb01   -ata/back01  0 73353208 19585192  26.7 394:16  827.9  24:06 13543.6
>mbtdb01   -ata/back02  0 70543632 17685384  25.1  70:47 4163.9   5:55 49858.2
>mbtdb01   -ata/back03  0 70451400 18625938  26.4 190:15 1631.7  12:59 23901.9
>mbtdb01   -ata/back04  0 70472880 18959600  26.9 164:06 1925.6   8:18 38060.7
>mbtdb01   -ata/data24  0   225760    79360  35.2   0:16 4897.4   0:01 57126.9
>mbtdb01   -ata/data25  0  2627960   801797  30.5   2:57 4527.8   0:20 40786.2
>mbtdb01   /var         0   178450    22837  12.8   0:14 1618.9   0:01 15461.3
>mbtdbmon  /            0 29334230 14565741  49.7 329:34  736.6  14:39 16575.8
>mbtdbmon  /boot        0    26240     8218  31.3   0:02 4734.5   0:02  3558.7
>mbtddb01  /            0 12425380  7549686  60.8 207:27  606.6   9:44 12921.1
>mbtddb01  /boot        0    32340    18843  58.3   0:08 2280.6   0:02  8575.3
>mbtdev    /            0   200110    96922  48.4   0:27 3527.7   0:04 26065.5
>mbtdev    /boot        0     6100     3814  62.5   0:01 6704.0   0:01  2595.2
>mbtdev    /opt         0  9152340  7164046  78.3  19:38 6080.9   9:05 13156.2
>mbtdev    /usr         0  1861820   695789  37.4   3:38 3187.3   0:55 12582.7
>mbtdev    /var         0   286040    66504  23.2   0:21 3224.4   0:02 32729.4
>mbthome   /            0   162160    51489  31.8   0:38 1345.5   0:02 28477.5
>mbthome   /boot        0     9020     8432  93.5   0:02 3672.4   0:02  3553.0
>mbthome   -ome/public  0 46745112 37080424  79.3 437:57 1411.1  25:54 23858.1
>mbthome   /home_a_i    0 20962230 15266312  72.8  63:24 4013.7  10:18 24709.6
>mbthome   /home_j_k    0 39011140 21733976  55.7 106:02 3416.2  14:42 24635.9
>mbthome   /home_l_o    0 13542210 10397536  76.8  41:00 4226.0   6:53 25175.6
>mbthome   /home_p      0   781120   358604  45.9   1:55 3108.2   0:11 33516.1
>mbthome   /home_q_z    0 10638460  6287909  59.1  29:23 3565.9   3:05 34043.1
>mbthome   /home_rest   0       10        1  10.0   0:00    0.0   0:01     0.8
>mbthome   /usr         0  1628430   606585  37.2   5:37 1800.7   0:18 32936.3
>mbthome   /var         0   589220   312805  53.1   1:46 2951.8   0:11 28095.3
>mbtns1    /            0   109530    41446  37.8   0:33 1261.4   0:02 20104.2
>mbtns1    /boot        0     5640     3368  59.7   0:02 1926.8   0:02  1594.6
>mbtns1    /usr         0  1377340   520768  37.8  17:16  502.5   0:42 12254.3
>mbtns1    /var         0   432520   284235  65.7   5:54  802.2   0:17 16970.1
>mbtns2    /            0   244140   132484  54.3   0:55 2427.6   0:11 11976.6
>mbtns2    /boot        0     5050     4060  80.4   0:01 5025.0   0:02  1927.8
>mbtns2    /opt         0   103040    34858  33.8   0:12 2896.9   0:02 14753.5
>mbtns2    /usr         0  1462440   595361  40.7  20:42  479.4   0:47 12767.9
>mbtns2    /var         0  1250850   603571  48.3  28:12  356.7   0:41 14751.6
>mbtpas02  /            0  4030540  2190711  54.4  79:56  456.8   2:27 14904.4
>mbtpas02  /boot        0    10260     8524  83.1   0:03 2582.8   0:02  3824.2
>mbttdb02  /            0  3057860  1398209  45.7  11:34 2014.6   1:00 23217.1
>mbttdb02  /boot        0    26950    22452  83.3   0:07 3279.7   0:03  8598.5
>mbttdb03  /            0  3912070  2074878  53.0  16:51 2053.0   2:30 13841.5
>mbttdb03  /boot        0    20200    16808  83.2   0:05 3122.1   0:03  6549.9
>mbtts00   /            0 19628450 10877320  55.4  96:32 1878.0  15:11 11940.7
>mbtts00   /boot        0    10780     6243  57.9   0:02 2998.0   0:02  2886.2
>mbtts01   /            0 23442880 11531924  49.2 113:19 1696.1   7:40 25089.6
>mbtts01   /boot        0    10780     6243  57.9   0:07  881.1   0:03  2456.7
>mbtvm01   /            0  1268570   495797  39.1   2:26 3406.7   1:10  7090.0
>mbtvm01   /boot        0    25150    22218  88.3   0:04 6238.0   0:02  9008.5
>
>(brought to you by Amanda version 2.4.4p2)
>-------------------------------------------------
Warning: word-wrap is turned off to preserve the formatting below.

This somewhat waddles and quacks like a duck, so I expect it is one, Kenneth.
The 'duck' in this case is not enough holding disk, so the bigger DLEs
are going direct to tape, one at a time only, and if they are also being
compressed, the gzip being done will eat a ton of CPU. I see several
entries above with indicated compression of over 80%. Those should be
switched to a dumptype that doesn't compress, as the gain from all that
gzip wheel-spinning just isn't worth it. This change alone could cut the
runtime by as much as 75%.
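A sketch of what I mean; the dumptype name, holding disk path, and sizes
here are invented for illustration, not taken from your config:

```
# Enough holding disk that the big DLEs don't have to go direct to tape:
holdingdisk hd1 {
    comment "main holding disk"
    directory "/dumps/amanda"    # path on the Amanda server (example)
    use 200 Gb                   # needs room for the largest DLEs
    chunksize 1 Gb
}

# A dumptype with software compression turned off, for DLEs that
# barely compress (the ones showing COMP% over 80 above):
define dumptype nocomp-user {
    comment "user data, no gzip"
    compress none
    priority medium
}
```

Point the poorly-compressing DLEs at something like nocomp-user in your
disklist and the dumpers stop burning CPU on them.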
You could also adjust some of the parameters in the format string in
your amanda.conf that controls the spacing of the report above, so as to
better separate the data and make it easier to read. I have used this
string for several years fairly successfully:
====from my amanda.conf======
columnspec "HostName=0:7,Disk=1:29,Level=1:2,OrigKB=1:6,OutKB=1:6,Compress=1:5,DumpTime=1:5,DumpRate=1:7,TapeTime=1:5,TapeRate=1:7"

# "displayunit" is a recently added item; the default has always been K
# until now. Set it to "m" to make the report use megabytes instead:
displayunit "m"
===========
So I get emailed reports formatted like this:
===========
DUMP SUMMARY:
                                                       DUMPER STATS   TAPER STATS
HOSTNAME DISK                   L ORIG-MB OUT-MB COMP% MMM:SS    KB/s MMM:SS    KB/s
-------- ---------------------- - ------- ------ ----- ------ ------- ------ -------
coyote   /GenesAmandaHelper-0.5 0    1611    709  44.0    5:1  2319.5    0:4 17141.6
coyote   /bin                   0       7      7    --    0:0  1149.8    0:0 17806.5
coyote   /boot                  0      84     84    --    0:0 11677.5    0:0 76451.7
coyote   /dev                   0       0      0    --    0:0   717.5    0:0 13236.3
coyote   /etc                   0     146     17  11.8    0:3   499.1    0:0 96074.5
Those entries with -- in the compression column are not being compressed.
But I haven't figured out how to adjust the header formatting in reporter.c
to match the data and make it stick across version changes, since I play
the canary-in-the-coal-mine act in this play by running the latest
snapshots of the stable tree; currently the 20070222 snapshot of 2.5.1p3.
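And a footnote on reading that mangled summary: the dumper KB/s column
is just OUT-KB divided by the dump time, so a few lines of script can
recompute it and flag the slow DLEs. Purely illustrative; the rows
below are copied from the report above:

```python
# Recompute Amanda's dumper KB/s column: OUT-KB divided by dump time.
# Illustrative only; the sample rows come from the DUMP SUMMARY above.

def mmss_to_seconds(t):
    """Convert an Amanda MMM:SS time string like '437:57' to seconds."""
    minutes, seconds = t.split(":")
    return int(minutes) * 60 + int(seconds)

# (host, disk, OUT-KB, dump time, KB/s as reported)
rows = [
    ("mbthome", "/home/public", 37080424, "437:57", 1411.1),
    ("mbtdb01", "-ata/back01", 19585192, "394:16", 827.9),
    ("mbthome", "/home_a_i", 15266312, "63:24", 4013.7),
]

for host, disk, out_kb, t, reported in rows:
    rate = out_kb / mmss_to_seconds(t)           # matches the report column
    flag = "SLOW" if rate < 1000 else "ok"
    print(f"{host}:{disk}  computed {rate:7.1f} KB/s  (report: {reported})  {flag}")
```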
I hope this helps.
Cheers, Gene
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
We are Microsoft. Unix is irrelevant. Openness is futile. Prepare
to be assimilated.