Troubleshooting a slowdown problem?
2004-06-17 12:05:51
A couple of months ago, I added a server, centernet, to my Amanda backups.
Since then, about once every week or two, the backup has run for 16-20 hours
instead of its normal time of less than 8. It's running right now, and has
been since 8:00pm last night:
amanda@admin:~ > amstatus DailySet1
Using /var/log/amanda/DailySet1/amdump from Wed Jun 16 20:00:00 EDT 2004
admin://db/c$ 0 462720k finished (20:56:41)
admin://db/e$ 1 10k finished (20:16:16)
admin://db/f$ 1 2559660k finished (20:51:25)
admin://db/f$/inetsrv/webpub/images 1 30k finished (20:16:07)
admin:sda1 0 3410k finished (20:17:09)
admin:sda3 0 3683924k wait for dumping
admin:sdb1 0 24270k finished (20:17:05)
centernet:sda1 0 4846k finished (20:06:03)
centernet:sda2 0 715715k finished (2:15:47)
centernet:sda3 0 110883k finished (20:58:42)
centernet:sda5 0 1812031k dumping 1332992k ( 73.56%) (2:02:34)
centernet:sda6 1 73k finished (20:03:37)
centernet:sda7 0 564k finished (20:04:03)
centernet:sda9 0 30391k finished (20:18:59)
mailinglists:hda1 0 2198k finished (20:04:30)
mailinglists:hda2 0 399339k finished (23:03:41)
mailinglists:hda7 0 743827k finished (4:41:56)
SUMMARY part real estimated
size size
partition : 17
estimated : 17 11566387k
flush : 0 0k
failed : 0 0k ( 0.00%)
wait for dumping: 1 3683924k ( 31.85%)
dumping to tape : 0 0k ( 0.00%)
dumping : 1 1332992k 1812031k ( 73.56%) ( 11.52%)
dumped : 15 5057936k 6070432k ( 83.32%) ( 43.73%)
wait for writing: 0 0k 0k ( 0.00%) ( 0.00%)
wait to flush : 0 0k 0k (100.00%) ( 0.00%)
writing to tape : 0 0k 0k ( 0.00%) ( 0.00%)
failed to tape : 0 0k 0k ( 0.00%) ( 0.00%)
taped : 15 5057936k 6070432k ( 83.32%) ( 43.73%)
7 dumpers idle : no-hold
taper idle
network free kps: 26362
holding space : 34661948k ( 95.03%)
dumper0 busy : 8:13:03 ( 95.12%)
dumper1 busy : 6:24:10 ( 74.11%)
dumper2 busy : 2:52:40 ( 33.31%)
taper busy : 1:07:38 ( 13.05%)
0 dumpers busy : 0:00:00 ( 0.00%)
1 dumper busy : 0:13:42 ( 2.64%) no-hold: 0:13:42 (100.00%)
2 dumpers busy : 7:57:50 ( 92.18%) client-constrained: 5:31:58 ( 69.47%)
no-hold: 2:25:40 ( 30.49%)
start-wait: 0:00:11 ( 0.04%)
3 dumpers busy : 0:26:50 ( 5.18%) client-constrained: 0:26:47 ( 99.77%)
start-wait: 0:00:03 ( 0.23%)
amanda@admin:~ >
What can be determined from this status about why the backup of
centernet:sda5 is so slow? Actually, I'm not sure how to read the times in
parentheses: did centernet:sda1 or centernet:sda3 take over 20 hours, or is
that 20 minutes, or is it a time of day?
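One way to test the time-of-day reading is to convert each parenthesized value into an offset from the 20:00:00 start shown in the amstatus header. The sketch below assumes the values are wall-clock times and that the run lasts less than 24 hours; the output itself does not label them. The names RUN_START, LINE_RE and finish_offsets are made up for this illustration, and only a few of the lines above are included.

```python
import re
from datetime import datetime, timedelta

# A few "finished" lines copied from the amstatus output above.
STATUS = """\
centernet:sda1 0 4846k finished (20:06:03)
centernet:sda2 0 715715k finished (2:15:47)
centernet:sda3 0 110883k finished (20:58:42)
mailinglists:hda2 0 399339k finished (23:03:41)
mailinglists:hda7 0 743827k finished (4:41:56)
"""

# Start of the run, taken from the "Using ... amdump from" header line.
RUN_START = datetime(2004, 6, 16, 20, 0, 0)

# disk name, dump level, size in k, then the parenthesized H:MM:SS value.
LINE_RE = re.compile(r"^(\S+)\s+(\d)\s+(\d+)k finished \((\d+):(\d+):(\d+)\)$")


def finish_offsets(status, run_start):
    """Return (disk, time since run_start) pairs, sorted by offset.

    ASSUMPTION: the parenthesized value is a time of day, and the whole
    run lasts less than 24 hours.
    """
    results = []
    for line in status.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        hour, minute, second = (int(m.group(i)) for i in (4, 5, 6))
        finished = run_start.replace(hour=hour, minute=minute, second=second)
        if finished < run_start:
            # A clock time earlier than the start belongs to the next day.
            finished += timedelta(days=1)
        results.append((m.group(1), finished - run_start))
    return sorted(results, key=lambda item: item[1])


for disk, elapsed in finish_offsets(STATUS, RUN_START):
    print(f"{disk:20s} finished {elapsed} after start")
```

Under that assumption, centernet:sda1 and centernet:sda3 finished within the first hour, and the latest of these five, mailinglists:hda7, finished about 8 hours 42 minutes after the start.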
ps on centernet doesn't show anything abnormal:
cn2:~# ps aux |grep amanda
amanda 27333 0.0 0.2 1680 636 ? S 02:01 0:00 /usr/local/libexec/sendbackup
amanda 27335 1.9 0.2 1596 600 ? S 02:01 11:08 /bin/gzip --fast
amanda 27336 0.0 0.1 1904 364 ? S 02:01 0:00 dump 0usf 1048576 - /dev/sda5
amanda 27337 0.0 0.2 1956 664 ? S 02:01 0:07 dump 0usf 1048576 - /dev/sda5
amanda 27338 0.0 0.1 1904 488 ? S 02:01 0:12 dump 0usf 1048576 - /dev/sda5
amanda 27339 0.0 0.1 1904 504 ? S 02:01 0:11 dump 0usf 1048576 - /dev/sda5
amanda 27340 0.0 0.1 1904 480 ? S 02:01 0:12 dump 0usf 1048576 - /dev/sda5
root 29630 0.0 0.1 1336 436 pts/1 S 11:43 0:00 grep amanda
cn2:~#
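As a rough cross-check of the ps numbers, the accumulated CPU time of gzip can be compared with the wall-clock time it has been running. This is only a sketch: it assumes the START time of the grep process (11:43) approximates when ps was run, that both timestamps fall on the same day, and that the TIME column is in MM:SS form. The helper names to_seconds and cpu_fraction are invented for the example.

```python
from datetime import datetime


def to_seconds(cpu_time):
    """Convert a ps TIME value of the form MM:SS into seconds."""
    minutes, seconds = cpu_time.split(":")
    return int(minutes) * 60 + int(seconds)


def cpu_fraction(start, now, cpu_time):
    """Fraction of wall-clock time spent on CPU.

    ASSUMPTION: start and now are HH:MM stamps on the same day.
    """
    fmt = "%H:%M"
    elapsed = (datetime.strptime(now, fmt) - datetime.strptime(start, fmt)).total_seconds()
    return to_seconds(cpu_time) / elapsed


# gzip line: START 02:01, TIME 11:08; grep line: START 11:43.
gzip_share = cpu_fraction("02:01", "11:43", "11:08")
print(f"gzip used about {gzip_share:.1%} of one CPU")  # prints 1.9%
```

The result agrees with the 1.9 in the %CPU column: over roughly 9 hours 42 minutes, gzip has used about 11 minutes of CPU, and the dump processes far less, so all of them have spent nearly the whole time waiting rather than computing.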
And the partitions on this host aren't outrageous:
cn2:~# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 7.3G 1.5G 5.5G 21% /
/dev/sda1 7.6M 5.6M 1.6M 78% /boot
/dev/sda3 4.6G 291M 4.0G 7% /usr
/dev/sda5 4.6G 2.0G 2.3G 46% /opt/analog/logdata
/dev/sda6 3.7G 794M 2.7G 23% /var/www/centernet/htdocs
/dev/sda7 3.7G 5.3M 3.4G 1% /var/lib/mysql
/dev/sda9 2.8G 572M 2.0G 22% /var/www/centernet/logs
cn2:~#
Here are the relevant parts of disklist and amanda.conf:
amanda@admin:/etc/amanda/DailySet1 > grep centernet disklist
centernet sda1 comp-user # /boot
centernet sda2 comp-user # /
centernet sda3 comp-user # /usr
centernet sda5 comp-user # /opt/analog/logdata
centernet sda6 comp-user # /var/www/centernet/htdocs
centernet sda7 comp-user # /var/lib/mysql
centernet sda9 comp-user # /var/www/centernet/logs
amanda@admin:/etc/amanda/DailySet1 >
amanda@admin:/etc/amanda/DailySet1 > egrep -v "(^( |\t)*#|^$)" amanda.conf
org "JHU/CCP" # your organization name for reports
mailto "isgalert AT jhuccp DOT org" # space separated list of operators at your site
dumpuser "amanda" # the user to run dumps under
inparallel 8 # maximum dumpers that will run in parallel (max 63)
dumporder "tttttttt" # specify the priority order of each dumper
netusage 25000 Kbps # maximum net bandwidth for Amanda, in KB per sec
dumpcycle 3 # the number of days in the normal dump cycle
runspercycle 3 # the number of amdump runs in dumpcycle days
tapecycle 25 tapes # the number of tapes in rotation
bumpsize 20 Mb # minimum savings (threshold) to bump level 1 -> 2
bumpdays 1 # minimum days at each level
bumpmult 4 # threshold = bumpsize * bumpmult^(level-1)
etimeout 300 # number of seconds per filesystem for estimates.
dtimeout 1800 # number of idle seconds before a dump is aborted.
ctimeout 30 # maximum number of seconds that amcheck waits
tapebufs 20
tapedev "/dev/nst0" # the no-rewind tape device to be used
rawtapedev "/dev/null" # the raw device to be used (ftape only)
tapetype Python-DDS3 # what kind of tape it is (see tapetypes below)
labelstr "^DailySet1[0-9][0-9]*$" # label constraint regex: all tapes must match
holdingdisk hd1 {
comment "main holding disk"
directory "/var/amanda" # where the holding disk is
use -0Mb # how much space can we use on it. Use everything.
chunksize 1Gb # size of chunk if you want big dump to be
}
holdingdisk hd2 {
directory "/dumps2/amanda"
use -0 Mb
}
reserve 50 # percent
autoflush yes #
infofile "/var/log/amanda/DailySet1/curinfo" # database DIRECTORY
logdir "/var/log/amanda/DailySet1" # log directory
indexdir "/var/log/amanda/DailySet1/index" # index directory
define tapetype Python-DDS3 {
comment "Dell Python with DDS-3 tapes"
length 11570 mbytes
filemark 0 kbytes
speed 1078 kps
lbl-templ "/usr/local/etc/amanda/DailySet1/3holeJHUCCP.ps"
}
define dumptype global {
comment "Global definitions"
}
define dumptype comp-user {
global
comment "Non-root partitions on reasonably fast machines"
compress client fast
priority medium
}
define interface local {
comment "a local disk"
use 1000 kbps
}
define interface le0 {
comment "10 Mbps ethernet"
use 400 kbps
}
amanda@admin:/etc/amanda/DailySet1 >
Thanks for any suggestions on what's happening, and how to fix it. Please let
me know if there's some other diagnostic I should run to further define this
problem.
-Kevin Zembower
-----
E. Kevin Zembower
Unix Administrator
Johns Hopkins University/Center for Communications Programs
111 Market Place, Suite 310
Baltimore, MD 21202
410-659-6139