Amanda-Users

Re: amanda dumps on Sun E250

2008-03-18 16:40:18
From: Chris Hoogendyk <hoogendyk AT bio.umass DOT edu>
To: Brian Cuttler <brian AT wadsworth DOT org>
Date: Tue, 18 Mar 2008 16:32:53 -0400
hmm. a bit complicated.

I'd certainly like to have a couple of T2000's. ;-)

Anyway, I have an E250 running Amanda 2.5.1p3 with Solaris 9. I have an AIT5 tape library. AIT5 is rated at 24MB/s, which is slower than your faster tape drives. However, I'm able to drive it at nearly full speed (when I am driving it). My bottlenecks are the activity on the other servers that I'm pulling from and the network. Once I get things on the holding disk, they zing out to tape. But, the tape experiences a lot of idle time while data is being assembled for it.

I believe your speed to tape is less than mine. It looks like you are backing up over 500G. If I plugged that into my tape speed, I would be doing it in about 10 hours. Of course, that assumes other factors aren't bottlenecking, which you say they aren't. And your tapes should be faster.
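
As a rough cross-check of the tape speed comparison, the figures from the amstatus output quoted further down (117682128k taped, taper busy 15:22:25, 560839789k estimated) can be plugged into a few lines of Python. This is only a sketch. It assumes 1 MB = 1024 KB, treats the 24MB/s AIT5 rating as sustained with no idle time, and ignores the DLE that was still being written, so the results are bounds rather than measurements. The function and constant names are made up for illustration.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an amstatus-style H:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def rate_kb_per_s(kbytes: float, busy_hms: str) -> float:
    """Average rate in KB/s over the given busy time."""
    return kbytes / hms_to_seconds(busy_hms)


def hours_at_rate(kbytes: float, kb_per_s: float) -> float:
    """Hours needed to move kbytes at a constant rate."""
    return kbytes / kb_per_s / 3600


# Values copied from the amstatus SUMMARY in the quoted message.
TAPED_KB = 117_682_128        # "taped" real size
TAPER_BUSY = "15:22:25"       # "taper busy"
ESTIMATED_KB = 560_839_789    # "estimated" total

# Assumption: rated AIT5 speed, with 1 MB taken as 1024 KB.
AIT5_KB_PER_S = 24 * 1024

observed = rate_kb_per_s(TAPED_KB, TAPER_BUSY)
print(f"observed taper rate: {observed / 1024:.2f} MB/s")
print(f"full run at observed rate: {hours_at_rate(ESTIMATED_KB, observed):.1f} h")
print(f"full run at rated AIT5 speed: {hours_at_rate(ESTIMATED_KB, AIT5_KB_PER_S):.1f} h")
```

The rated-speed figure is a best-case floor. Any idle time on the tape or contention elsewhere pushes a real run above it.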

So, where to from there? I have two 300G ultraSCSI/320 10k-rpm Seagate Cheetah holding drives. They are mounted internally. I have a PCI Ultra320 dual SCSI expansion card that I added to the E250. The tape library is connected through that. I bought it through our authorized Sun reseller, and it's a Sun branded card.

At the moment, we are running 100Mb/s ethernet using the onboard connector (hme0). We are about to switch to GigE using PCI cards that we bought from the same reseller. They are also Sun branded cards. In discussing with their engineer how to configure this, we decided that we would keep the Ultra320 SCSI card in PCI slot 3, which is 66MHz, and put the GigE card (single) into one of the other slots, which are 33MHz.
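
The slot choice can be sanity-checked against theoretical peak figures. The sketch below assumes standard parallel PCI arithmetic (clock times bus width) and the nominal interface rates of GigE (1000 Mb/s) and Ultra320 SCSI (320 MB/s per channel). The bus width of the E250 slots is not stated in this thread, so it is left as a parameter and both 32 and 64 bits are shown. Real throughput will be lower than these peaks.

```python
def pci_peak_mb_s(clock_mhz: float, width_bits: int) -> float:
    """Theoretical peak of a parallel PCI bus in MB/s (1 MB = 10**6 bytes)."""
    return clock_mhz * width_bits / 8


GIGE_MB_S = 1000 / 8      # nominal GigE line rate in MB/s
ULTRA320_MB_S = 320.0     # nominal rate of one Ultra320 channel in MB/s

for clock in (33, 66):
    for width in (32, 64):  # assumption: slot width is not given in the thread
        peak = pci_peak_mb_s(clock, width)
        print(
            f"{clock} MHz x {width}-bit: {peak:.0f} MB/s peak, "
            f"GigE fits: {peak >= GIGE_MB_S}, "
            f"one Ultra320 channel fits: {peak >= ULTRA320_MB_S}"
        )
```

Under these assumptions even a 33MHz slot covers the GigE line rate, while the SCSI card has the most to gain from the faster slot, which is consistent with the placement described above.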

I should note that I'm not backing up as much data as you are, and I'm doing server-side compression. Also, my backup server is just a backup server, built from scratch for that purpose. I don't have any leftover software/hardware/configuration stuff. Fresh install of Solaris 9. I'm running mon (from kernel.org), but that is really minimalist, and you can hardly tell it is running.

I'm inclined to think you should upgrade Amanda. While it might not affect this particular issue, 2.4.4 is a bit old. Also, the email report with taper stats by DLE and overall tape performance would be easier to grab stats from than the amstatus report, but maybe that's just me.

I'm stuck with hand-me-down E250's, because I can't get either department to squeeze any money out of their budget for upgrades. While I think a newer server would handle some things faster, I also think the E250 ought to be able to drive the tape faster than you are experiencing.


---------------

Chris Hoogendyk

-
  O__  ---- Systems Administrator
 c/ /'_ --- Biology & Geology Departments
(*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst
<hoogendyk AT bio.umass DOT edu>

---------------
Erdös 4



Brian Cuttler wrote:
Hi amanda users,

I'm running dumps on a Sun E250 server. This system has been
demoted from Lotus Notes and local amanda server to just an
amanda server, but has picked up additional clients.

Rather than being client/server for itself it is server for
itself as well as two Lotus Notes systems, both Sun T2000 servers.

The bottleneck in performance is apparently amanda-work area to
tape. It takes forever to dump the data to tape once it's on the
work area. I don't think this is an amanda issue, I think it's a
system bus issue; the backplane seems to run at only 100MHz.

I am running both LTO (embedded in a StorEdge L9 library) and LTO3
(embedded in a C2 library). I have had these drives on other systems,
or similar drives on other systems, with much better performance.

Does anyone know where to look for the proof/smoking-gun that
says "this is the wrong platform" or of any tuning I can perform,
either system-wise or amanda feature, that might improve the
throughput to tape?

We seem to produce completed DLEs on the work area more quickly than
we can put them to tape; the tape is busy constantly once the first
DLE starts to flush to tape. I could add more work area, which
might reduce I/O and CPU load on clients sooner, but will not
complete the amanda run any sooner since the bottleneck is the
tape drives.

For reference, the E250 is running Solaris 9, the T2000 systems run
Solaris 10, and the Amanda server and clients are 2.4.4. The C2/LTO3
runs amanda 5x/week and the L9/LTO runs once on the weekend. Well,
that is what we wanted; the amanda jobs are exceeding 24 hours.

amstatus notes
Using /usr/local/etc/amanda/notes/log/amdump from Mon Mar 17 19:30:00 EST 2008

nwcapp:/               0  2104256k finished (22:21:20)
nwcapp:/nexport        0    84192k finished (19:35:22)
wcapp:/                0 11179776k finished (12:54:54)
wcapp:/db              0 71562976k finished (8:18:18)
wcapp:/db2             0  9246208k finished (12:33:30)
wcapp:/export          0  5415904k finished (2:13:04)
wcnotes:/              0 18889568k writing to tape (12:54:55)
wcnotes:/export        1  7453632k finished (23:23:30)
wcnotes:/maildb2/five  0 110095430k dumping 103184224k ( 93.72%) (19:35:22)
wcnotes:/maildb2/four  0 15375740k dump done (11:27:26), wait for writing to tape
wcnotes:/maildb2/one   0 62708540k wait for dumping
wcnotes:/maildb2/three 0 43173850k dumping 2487392k ( 5.76%) (12:54:55)
wcnotes:/maildb2/two   0 56212730k wait for dumping
wcnotes:/space         0   895424k finished (21:52:57)
wcnotes:maildbAD       1 55338550k wait for dumping
wcnotes:maildbEK       1 48909540k wait for dumping
wcnotes:maildbLQ       0 32725550k dump done (12:33:07), wait for writing to tape
wcnotes:maildbRZ       0  9739760k finished (2:03:09)

SUMMARY          part      real  estimated
                           size       size
partition       :  18
estimated       :  18            560839789k
flush           :   0         0k
failed          :   0                    0k           (  0.00%)
wait for dumping:   4            223169360k           ( 39.79%)
dumping to tape :   0                    0k           (  0.00%)
dumping         :   2 105671616k 153269280k ( 68.95%) ( 18.84%)
dumped          :  12 184672986k 184401149k (100.15%) ( 32.93%)
wait for writing:   2  48101290k  47701260k (100.84%) (  8.58%)
wait to flush   :   0         0k         0k (100.00%) (  0.00%)
writing to tape :   1  18889568k  18889517k (100.00%) (  3.37%)
failed to tape  :   0         0k         0k (  0.00%) (  0.00%)
taped           :   9 117682128k 117810372k ( 99.89%) ( 20.98%)
6 dumpers idle  : no-diskspace
taper writing, tapeq: 2
network free kps:    114152
holding space   :   2102533k (  0.95%)
 dumper0 busy   :  8:56:50  ( 51.61%)
 dumper1 busy   : 10:41:22  ( 61.65%)
 dumper2 busy   :  9:47:27  ( 56.47%)
 dumper3 busy   :  0:33:58  (  3.27%)
 dumper4 busy   :  7:47:41  ( 44.96%)
 dumper5 busy   :  0:39:35  (  3.81%)
 dumper6 busy   : 17:19:47  ( 99.95%)
 dumper7 busy   :  8:00:03  ( 46.15%)
   taper busy   : 15:22:25  ( 88.67%)
 0 dumpers busy :  0:00:00  (  0.00%)
 1 dumper busy  :  0:21:30  (  2.07%)        no-diskspace:  0:21:30  (100.00%)
 2 dumpers busy :  5:47:38  ( 33.42%)        no-diskspace:  5:47:23  ( 99.93%)
                                               start-wait:  0:00:15  (  0.07%)
 3 dumpers busy :  2:58:30  ( 17.16%)        no-diskspace:  2:57:59  ( 99.72%)
                                               start-wait:  0:00:30  (  0.28%)
 4 dumpers busy :  1:43:47  (  9.98%)        no-diskspace:  1:43:32  ( 99.76%)
                                               start-wait:  0:00:15  (  0.24%)
 5 dumpers busy :  3:57:23  ( 22.82%)        no-diskspace:  3:57:03  ( 99.86%)
                                               start-wait:  0:00:20  (  0.14%)
 6 dumpers busy :  1:51:45  ( 10.74%)        no-diskspace:  1:51:20  ( 99.63%)
                                               start-wait:  0:00:24  (  0.37%)
 7 dumpers busy :  0:00:38  (  0.06%)        no-diskspace:  0:00:22  ( 59.68%)
                                               start-wait:  0:00:15  ( 40.32%)
 8 dumpers busy :  0:39:04  (  3.76%)            not-idle:  0:39:04  (100.00%)


---
   Brian R Cuttler                 brian.cuttler AT wadsworth DOT org
   Computer Systems Support        (v) 518 486-1697
   Wadsworth Center                (f) 518 473-6384
   NYS Department of Health        Help Desk 518 473-0773
