Amanda-Users

Re: how to tell amdump to forget dumps

2008-06-03 02:36:31
Subject: Re: how to tell amdump to forget dumps
From: jehan procaccia <jehan.procaccia AT it-sudparis DOT eu>
To: Jean-Louis Martineau <martineau AT zmanda DOT com>
Date: Tue, 03 Jun 2008 08:28:50 +0200
Jean-Louis Martineau wrote:
There is no clean way to stop a DLE that is dumping; you must kill the dumper process. Locate which dumper process is doing the dump in the 'amdump' log file and kill that process. Dumper processes are not restarted, so this will work only if you started with a large number of dumper processes (the 'inparallel' setting).

Maybe it is an amstatus bug that shows they are dumping while they are not.

We need a lot more information to find what is going wrong:
Do you have Amanda processes running on the client? Do they take CPU cycles? (ps -ef)
No dump process:
[root@cobra3 ~]
$ ps auwx | grep dump
nothing

Do you have Amanda processes running on the server? Do they take CPU cycles? (ps -ef)
None on the server either:
[dumpy@backup ~]
$ ps auwx | grep dump   or "grep ama"

System log error
nothing relevant to amanda in /var/log/messages
amdump log file
[root@backup /var/lib/amanda/int]
$ cat log | grep p1v2f1
DISK planner cobra3 /p1v2f1
INFO planner Full dump of cobra3:/p1v2f1 promoted from 9 days ahead.

[root@backup /var/lib/amanda/int]
$ tail -5 log
SUCCESS dumper lugdunum /boot 20080602 1 [sec 0.296 kb 1 kps 3.4 orig-kb 20]
SUCCESS chunker lugdunum /boot 20080602 1 [sec 0.353 kb 1 kps 93.4]
SUCCESS taper lugdunum /boot 20080602 1 [sec 0.017 kb 64 kps 3582.0 {wr: writers 2 rdwait 0.000 wrwait 0.005 filemark 0.011}]
INFO taper Received signal 1
INFO taper Received signal 1


And for the cobra3 /p1v2f1 DLE, I see this in the amdump log:

planner time 7.728: got result for host cobra3 disk /p1v2f1: 0 -> 12293555K, 1 -> 633695K, -1 -> -2K
...
pondering cobra3:/p1v2f1... next_level0 9 last_level 1 (not due for a full dump, picking an incr level)
  pick: size 633695 level 1 days 4 (thresh 2458711K, 1 days)
curr level 1 size 697064 total size 6196271 total_lev0 0 balanced-lev0size 8578837
...
cobra3 /p1v2f1 pri 1 lev 1 size 697064
...
no try degas:/data4 7 0 9 = 68
  try cobra3:/p1v2f1 15 0 9 = 244
no try pissaro:/data2 8 0 9 = 83
...
no try gaia:/boot 3 0 16 = 21
promote: moving cobra3:/p1v2f1 up, total_lev0 9913805, total_size 144024772
  try bell.int-diplomes.org:/export 3 0 16 = 21
...
DUMP cobra3 fffffeff9ffe7f0000 /p1v2f1 20080602 1 0 1970:1:1:0:0:0 9913805 9086 1 2008:5:14:19:24:39 697064 2058
...
driver: send-cmd time 905.042 to chunker6: START 20080602
driver: send-cmd time 905.042 to chunker6: PORT-WRITE 08-00018 /holddisk/disk/20080602151249/cobra3._p1v2f1.0 cobra3 fffffeff9ffe7f0000 /p1v2f1 0 1970:1:1:0:0:0 4194304 DUMP 9913952 |;auth=BSD;compress-fast;index;
chunker: pid 16059 executable chunker6 version 2.5.0p2
chunker: try_socksize: receive buffer size is 65536
chunker: bind_portrange2: trying port=533
chunker: stream_server: waiting for connection: 0.0.0.0.52796
driver: result time 905.059 from chunker6: PORT 52796
driver: send-cmd time 905.059 to dumper6: PORT-DUMP 08-00018 52796 cobra3 fffffeff9ffe7f0000 /p1v2f1 NODEVICE 0 1970:1:1:0:0:0 DUMP |;auth=BSD;compress-fast;index;
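The advice to "locate which dumper process is doing the dump in the amdump log" can be sketched as a small script. This is only an illustration under assumptions: it relies on the `driver: send-cmd ... to dumperN: PORT-DUMP` line format shown in the excerpt above, and it assumes the dumper announces its pid with a line shaped like the `chunker: pid ... executable chunker6` line (no such dumper line appears in this excerpt). The function name `find_dumper` is hypothetical, and the script only reports the pid; it does not kill anything.

```python
import re

# Format taken from the "driver: send-cmd ... PORT-DUMP" line in the log excerpt.
SEND_RE = re.compile(
    r"driver: send-cmd time \S+ to (dumper\d+): PORT-DUMP \S+ \d+ (\S+) \S+ (\S+) "
)
# ASSUMPTION: dumpers log their pid like the chunker line does
# ("chunker: pid 16059 executable chunker6 ..."); not confirmed by the excerpt.
PID_RE = re.compile(r"dumper: pid (\d+) executable (dumper\d+)")


def find_dumper(log_lines, host, disk):
    """Return (dumper_name, pid_or_None) for the last PORT-DUMP sent for host:disk.

    Returns None if the log never hands this DLE to a dumper.
    """
    pids = {}      # dumper name -> most recently announced pid
    dumper = None  # dumper name that last received this DLE
    for line in log_lines:
        m = PID_RE.search(line)
        if m:
            pids[m.group(2)] = int(m.group(1))
            continue
        m = SEND_RE.search(line)
        if m and m.group(2) == host and m.group(3) == disk:
            dumper = m.group(1)
    if dumper is None:
        return None
    return dumper, pids.get(dumper)


if __name__ == "__main__":
    # Path taken from the amstatus output quoted in this thread.
    with open("/var/lib/amanda/int/amdump") as log:
        print(find_dumper(log, "cobra3", "/p1v2f1"))
```

If a pid comes back, it would still be worth confirming with ps that the process exists and really is a dumper before sending it a signal.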

amandad.*.debug files from the clients

[root@cobra3 /var/log/amanda]
$ cat amandad.20080602152754.debug
amandad: debug 1 pid 27480 ruid 33 euid 33: start at Mon Jun 2 15:27:54 2008
amandad: version 2.4.5p1
amandad: build: VERSION="Amanda-2.4.5p1"
amandad:        BUILT_DATE="Fri Nov 24 11:04:47 CET 2006"
amandad: BUILT_MACH="Linux cobra3.int-evry.fr 2.6.9-42.0.3.ELsmp #1 SMP Mon Sep 25 17:28:02 EDT 2006 i686 i686 i386 GNU/Linux"
amandad:        CC="gcc"
amandad: CONFIGURE_COMMAND="'./configure' '--build=i686-redhat-linux-gnu' '--host=i686-redhat-linux-gnu' '--target=i386-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib' '--libexecdir=/usr/lib/amanda' '--localstatedir=/var/lib' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--enable-shared' '--disable-dependency-tracking' '--with-index-server=localhost' '--with-tape-server=localhost' '--with-config=DailySet1' '--with-gnutar-listdir=/var/lib/amanda/gnutar-lists' '--with-smbclient=/usr/bin/smbclient' '--with-amandahosts' '--with-user=amanda' '--with-group=disk' '--with-tmpdir=/var/log/amanda' '--with-gnutar=/bin/tar'"
amandad: paths: bindir="/usr/bin" sbindir="/usr/sbin"
amandad:        libexecdir="/usr/lib/amanda" mandir="/usr/share/man"
amandad:        AMANDA_TMPDIR="/var/log/amanda"
amandad:        AMANDA_DBGDIR="/var/log/amanda" CONFIG_DIR="/etc/amanda"
amandad:        DEV_PREFIX="/dev/" RDEV_PREFIX="/dev/" DUMP="/sbin/dump"
amandad:        RESTORE="/sbin/restore" VDUMP=UNDEF VRESTORE=UNDEF
amandad:        XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF VXRESTORE=UNDEF
amandad:        SAMBA_CLIENT="/usr/bin/smbclient" GNUTAR="/bin/tar"
amandad:        COMPRESS_PATH="/bin/gzip" UNCOMPRESS_PATH="/bin/gzip"
amandad:        LPRCMD="/usr/bin/lpr" MAILER="/usr/bin/Mail"
amandad:        listed_incr_dir="/var/lib/amanda/gnutar-lists"
amandad: defs:  DEFAULT_SERVER="localhost" DEFAULT_CONFIG="DailySet1"
amandad:        DEFAULT_TAPE_SERVER="localhost"
amandad:        DEFAULT_TAPE_DEVICE="/dev/null" HAVE_MMAP HAVE_SYSVSHM
amandad:        LOCKING=POSIX_FCNTL SETPGRP_VOID DEBUG_CODE
amandad:        AMANDA_DEBUG_DAYS=4 BSD_SECURITY USE_AMANDAHOSTS
amandad:        CLIENT_LOGIN="amanda" FORCE_USERID HAVE_GZIP
amandad:        COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
amandad:        COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
amandad: time 0.000: got packet:
--------
Amanda 2.5 REQ HANDLE 000-00000001 SEQ 1213071295
SECURITY USER dumpy
SERVICE sendbackup
OPTIONS features=fffffeff9ffeffff07;hostname=cobra3;
DUMP /p1v2f1  0 1970:1:1:0:0:0 OPTIONS |;auth=BSD;compress-fast;index;
--------

amandad: time 0.000: sending ack:
----
Amanda 2.4 ACK HANDLE 000-00000001 SEQ 1213071295
----

amandad: time 0.004: bsd security: remote host backup.int-evry.fr user dumpy local user amanda
amandad: time 0.015: amandahosts security check passed
amandad: time 0.015: running service "/usr/lib/amanda/sendbackup"
amandad: time 0.023: sending REP packet:
----
Amanda 2.4 REP HANDLE 000-00000001 SEQ 1213071295
CONNECT DATA 35631 MESG 35632 INDEX 35633
OPTIONS features=fffffeff9ffe7f;
----

amandad: time 0.024: got packet:
----
Amanda 2.5 ACK HANDLE 000-00000001 SEQ 1213071295
----

amandad: time 0.024: pid 27480 finish time Mon Jun  2 15:27:54 2008

amdump log file from the server
shown above ...
dumper.*.debug file from the server.
I don't see these files!?

I hope this helps to debug my problem; please let me know if you see something relevant.
Thanks.

Jean-Louis

jehan procaccia wrote:
hello,
I still have problems with certain large partition backups; amdump ends this way:
sendbackup: time 33685.307: 87: normal(|): DUMP: 60.89% done at 2453 kB/s, finished in 5:59
sendbackup: time 33753.599: index tee cannot write [Broken pipe]

Although I have tweaked etimeout/dtimeout, firewall settings, and tcp_keepalives (http://wiki.zmanda.com/index.php/Results_missing), it still freezes on large level 0 dumps :-( (and even on large level 1 dumps in this example).

Apart from a solution to that problem (which I'd appreciate above all ;-) ), how can I tell a running amdump to forget these "[Broken pipe]" DLEs and move on to the others that are in "wait for dumping"?
Here are the "frozen" dumps (frozen for 4 hours, although dtimeout is 2400!?):
$ amstatus int --dumping
Using /var/lib/amanda/int/amdump from lun jun  2 15:12:49 CEST 2008
cobra3:/p1v2f1 0 9681m dumping 6916m ( 71.44%) (15:27:54)
cobra3:/p1v4f1 0 15496m dumping 7336m ( 47.35%) (15:21:44)
pasargades:/var/spool/imap2 1 7854m dumping 4263m ( 54.28%) (15:25:58)
pasargades:/var/spool/imap3 1 9394m dumping 5355m ( 57.01%) (15:21:44)
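To see at a glance how long each of these entries has been in the "dumping" state, the amstatus output can be parsed with a short script. This is a minimal sketch that assumes the entry layout shown above (host:disk, level, estimated size, "dumping", amount done, percentage, start time); the function name `running_dumps` is hypothetical. As far as the amanda.conf documentation describes it, dtimeout is an idle (data) timeout rather than a cap on total dump time, so a long elapsed time by itself does not prove that dtimeout should already have fired.

```python
import re
from datetime import datetime, timedelta

# One "dumping" entry, e.g.:
#   cobra3:/p1v2f1 0 9681m dumping 6916m ( 71.44%) (15:27:54)
ENTRY_RE = re.compile(
    r"([^\s:]+):(\S+)\s+(\d+)\s+(\d+)m\s+dumping\s+(\d+)m\s+"
    r"\(\s*([\d.]+)%\)\s+\((\d+):(\d+):(\d+)\)"
)


def running_dumps(status_text, now):
    """Return a list of (dle, level, percent_done, elapsed) tuples.

    `elapsed` is the time between the start time printed by amstatus and
    `now`. Entries are found anywhere in the text, so this works whether
    they are one per line or run together on a single line.
    """
    results = []
    for m in ENTRY_RE.finditer(status_text):
        host, disk, level, _est_mb, _done_mb, pct, hh, mm, ss = m.groups()
        started = now.replace(
            hour=int(hh), minute=int(mm), second=int(ss), microsecond=0
        )
        if started > now:
            # The dump started before midnight and `now` is the next day.
            started -= timedelta(days=1)
        results.append((f"{host}:{disk}", int(level), float(pct), now - started))
    return results


if __name__ == "__main__":
    import sys

    # Usage: amstatus int --dumping | python this_script.py
    for dle, level, pct, elapsed in running_dumps(sys.stdin.read(), datetime.now()):
        print(f"{dle} level {level}: {pct:.2f}% done, started {elapsed} ago")
```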

And here is a sample of the other DLEs waiting to be dumped:
$ amstatus int --waitdumping
Using /var/lib/amanda/int/amdump from lun jun  2 15:12:49 CEST 2008
cobra3:/p2v5f1                             1       59m wait for dumping
cobra3:/p2v5f2                             1      144m wait for dumping
cobra3:/usr                                1       27m wait for dumping
colibri:/data1                             1      521m wait for dumping
colibri:/data2                             0      215m wait for dumping
...

I would really appreciate being able to let these dumps start within the current amdump run, instead of doing an amcleanup, which will use another tape (a disk vtape in my case), and then starting yet another amdump.

Thanks for any advice.


