Amanda-Users

Very Strange slowness issue

2003-05-12 09:26:04
Subject: Very Strange slowness issue
From: Tim Champ <champ AT umbc DOT edu>
To: amanda-users AT amanda DOT org
Date: Mon, 12 May 2003 09:22:55 -0400 (EDT)
Hello to all!  I'm new to the list, but have been using amanda for 3
years.  And this is the first stumper I've had.  I've read the archives,
and fixed any problems I could find there, but yet I still have this
problem.  Let me explain:

My tape server is an SGI, running IRIX.  It works fine, and fast, as do
all of my clients, expect one - a Sun Ultra 2.  The Ultra 2 has two CPU's,
and the latest PROM that Solaris 7 had to give it.  It also has client
2.4.4  (it had 2.4.0p1, but I upgraded it, yet still have same problem).

THe problem is that I get an average dump speed of 11-15kb per second when
dumping filesystems from this machine to the servers holding disk.  Now, I
did have a problem with the network, the port on the cisco was set to not
auto-neg, and was throwing all kinds of errors.  That's been fixed, and if
I transfer files between the Ultra 2 and other machines via scp or ftp, it
is very fast.  Like 7.5MB per second (which I can live with!).  When I run
"netstat -i" I get no
collisions anymore.

Also, if I run ufsdump on the machine locally, and dump from one disk to
another (on the same chain), I get 3-4MB per second.  IF I do that and
pipe it through gzip (as amanda does) I get 400KB per second.  But, if
amanda does it, I get 11KB per second.  As you might think, this means it
takes an eternity to dump a disk.

Now - these are the things that may be causing the problem:

The machine worked fine until a planned downtime where I updated the PROM,
and added a second CPU.  Also, I split up the disks among the two SCSI
busses.  Now, I get no errors from any disks, and I have syslog logging on
debug.  The CPU seems fine, as it hasn't crashed once, or had any errors.
Also, I've tried running it with the 32-bit and 64-bit kernels, with no
differences in speed.

I'm stumped.  One think I've noticed is that while amanda is dumping, it
seems to use six "dumps" which seems strange, but I've been told that it's
normal.  One spawns another, which spawns 4 more.  The sendbackup.debug
shows how long it takes, as you can see here:

sendbackup: debug 1 pid 10373 ruid 110 euid 110: start at Sun May 11
18:51:58 2003
/var/amanda/libexec/sendbackup: version 2.4.4
  parsed request as: program `DUMP'
                     disk `c0t0d0s1'
                     device `c0t0d0s1'
                     level 1
                     since 2003:5:8:3:10:18
                     options `|;bsd-auth;'
sendbackup: try_socksize: send buffer size is 65536
sendbackup: time 0.002: stream_server: waiting for connection:
0.0.0.0.42029
sendbackup: time 0.002: stream_server: waiting for connection:
0.0.0.0.42030
sendbackup: time 0.003: waiting for connect on 42029, then 42030
sendbackup: time 0.012: stream_accept: connection from 130.85.110.208.1111
sendbackup: time 0.015: stream_accept: connection from 130.85.110.208.1112
sendbackup: time 0.015: got all connections
sendbackup: time 0.020: dumping device '/dev/rdsk/c0t0d0s1' with 'ufs'
sendbackup: time 0.020: spawning /var/amanda/libexec/rundump in pipeline
sendbackup: argument list: dump 1usf 1048576 - /dev/rdsk/c0t0d0s1
sendbackup: time 0.637:  93:  normal(|):   DUMP: Writing 32 Kilobyte
records
sendbackup: time 0.641:  93:  normal(|):   DUMP: Date of this level 1
dump: Sun May 11 18:51:58 2003
sendbackup: time 0.645:  93:  normal(|):   DUMP: Date of last level 0
dump: Wed May 07 23:10:19 2003
sendbackup: time 0.650:  93:  normal(|):   DUMP: Dumping
/dev/rdsk/c0t0d0s1 (eds-sun1.engr.umbc.edu:/var) to standard output.
sendbackup: time 1.312:  93:  normal(|):   DUMP: Mapping (Pass I) [regular
files]
sendbackup: time 2.169:  93:  normal(|):   DUMP: Mapping (Pass II)
[directories]sendbackup: time 4.749:  93:  normal(|):   DUMP: Mapping
(Pass II) [directories]sendbackup: time 7.583:  93:  normal(|):   DUMP:
Estimated 154810 blocks (75.59MB) on 0.00 tapes.
sendbackup: time 7.607:  93:  normal(|):   DUMP: Dumping (Pass III)
[directories]
sendbackup: time 10.804:  93:  normal(|):   DUMP: Dumping (Pass IV)
[regular files]
sendbackup: time 856.726:  93:  normal(|):   DUMP: 18.52% done, finished
in 1:02sendbackup: time 1495.793:  93:  normal(|):   DUMP: 36.43% done,
finished in 0:43
sendbackup: time 2180.537:  93:  normal(|):   DUMP: 81.50% done, finished
in 0:08
sendbackup: time 2499.071:  61:    size(|):   DUMP: 154494 blocks
(75.44MB) on 1 volume at 31 KB/sec
sendbackup: time 2499.075:  93:  normal(|):   DUMP: DUMP IS DONE
sendbackup: time 2499.079:  93:  normal(|):   DUMP: Level 1 dump on Sun
May 11 18:51:58 2003
sendbackup: time 2499.086: pid 10373 finish time Sun May 11 19:33:37 2003


AND - here is a ps from the machine (with 6 "dump" processes running,
which I don't have a "dump" command on my machine.  I have ufsdump, which
shows as running when I do a top, but not when I do a ps):

    root 12236 12235  0 04:29:43 ?        0:03 dump 1usf 1048576 -
/dev/rdsk/c0t2d0s6
    root 12234 12232  0 04:29:33 ?        0:01 dump 1usf 1048576 -
/dev/rdsk/c0t2d0s6
    root 12235 12234  0 04:29:43 ?        0:02 dump 1usf 1048576 -
/dev/rdsk/c0t2d0s6
    root 12237 12235  0 04:29:43 ?        0:03 dump 1usf 1048576 -
/dev/rdsk/c0t2d0s6
    root 12238 12235  0 04:29:43 ?        0:03 dump 1usf 1048576 -
/dev/rdsk/c0t2d0s6
    root 13213   350  0 09:18:43 pts/0    0:00 grep dump
    root 12239 12235  0 04:29:43 ?        0:12 dump 1usf 1048576 -
/dev/rdsk/c0t2d0s6


If you want more info, let me know.  Any help is greatly appreciated, as
this has me, and the System Admin staff here at UMBC stumped.

Thanks!
Tim

______________________________________________________________________________

Tim Champ                       IT Support Specialist/Systems Administrator

        UMBC is not responsible for my opinions or comments




<Prev in Thread] Current Thread [Next in Thread>