Amanda-Users

defunct jobs - again

2009-07-08 09:05:50
Subject: defunct jobs - again
From: Brian Cuttler <brian AT wadsworth DOT org>
To: amanda-users AT amanda DOT org, Chris Knight <knight AT wadsworth DOT org>
Date: Wed, 8 Jul 2009 08:54:24 -0400
Server is Solaris 10 x86 (on an X4500), running Amanda 2.6.1p1.
The Lyra client runs the same Amanda version, also on Solaris 10,
but on SPARC, on a T1000.

Both Lyra:/ and Curie:/ are standard comp-root, ufs file systems.
(Other DLEs on both systems use zfs-snapshots).

Lyra (the amanda client) has defunct jobs again. I removed all of
the cruft a couple of days ago and amanda ran successfully, but
the issue has recurred.

As a side issue, I don't know why curie:/ has issues; I did a
manual ufsdump yesterday with no problems at all.

I can include logs from the other machine, but didn't want to overwhelm
the initial post.

                                                thank you,

                                                Brian


[lyra] ~ 98> ps -ef | grep amanda
  amanda 27802 15684   0 19:25:26 ?           5:04 /usr/local/libexec/amanda/amandad
  amanda 28625 27802   0        - ?           0:00 <defunct>
  amanda   134 27802   0        - ?           0:00 <defunct>
  amanda 29178 27802   0        - ?           0:00 <defunct>
   brian   973   960   0 08:38:33 pts/3       0:00 grep amanda
  amanda   224 27802   0        - ?           0:00 <defunct>
  amanda 29539 27802   0        - ?           0:00 <defunct>
  amanda   710 27802   0        - ?           0:00 <defunct>
  amanda 29060 27802   0        - ?           0:00 <defunct>
  amanda 27804 27802   0        - ?           0:00 <defunct>
  amanda 27825 27802   0        - ?           0:00 <defunct>
  amanda 29010 27802   0        - ?           0:00 <defunct>
  amanda 29059 27802   0        - ?           0:13 <defunct>
  amanda   708 27802   0        - ?           0:00 <defunct>
  amanda 29177 27802   0        - ?           0:00 <defunct>
  amanda 27937 27802   0        - ?           0:00 <defunct>
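
In the listing above every defunct entry has PPID 27802, the amandad
process. A listing like this can be tallied by parent PID to confirm
which process is not reaping its children. The following is only a
minimal Python sketch, not part of Amanda: the helper name
`defunct_by_parent` and the shortened sample text (a few lines copied
from the listing) are illustrative assumptions.

```python
from collections import Counter

# A few lines copied from the ps listing above (sample data only).
SAMPLE = """\
  amanda 27802 15684   0 19:25:26 ?           5:04 /usr/local/libexec/amanda/amandad
  amanda 28625 27802   0        - ?           0:00 <defunct>
  amanda   134 27802   0        - ?           0:00 <defunct>
   brian   973   960   0 08:38:33 pts/3       0:00 grep amanda
  amanda 29059 27802   0        - ?           0:13 <defunct>
"""


def defunct_by_parent(ps_text):
    """Count <defunct> entries per parent PID in `ps -ef` style output."""
    counts = Counter()
    for line in ps_text.splitlines():
        fields = line.split()
        # Expected columns: UID PID PPID C STIME TTY TIME CMD
        if len(fields) >= 3 and fields[-1] == "<defunct>" and fields[2].isdigit():
            counts[int(fields[2])] += 1
    return counts


if __name__ == "__main__":
    for ppid, n in defunct_by_parent(SAMPLE).items():
        print(f"parent {ppid}: {n} defunct children")
```

Feeding it the full `ps -ef` output instead of the sample would give
the count per parent for the whole system.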


----- Forwarded message from Amanda/Curie <amanda AT wadsworth DOT org> -----

FAILURE DUMP SUMMARY:
   lyra  / lev 1  FAILED [spawn /bin/gzip: dup2 out: Bad file number]
   lyra  / lev 1  FAILED [missing size line from sendbackup]
   curie / lev 0  FAILED [spawn /usr/bin/gzip: dup2 out: Bad file number]
   curie / lev 0  FAILED [spawn /usr/bin/gzip: dup2 out: Bad file number]


** I'm unworried about the strange messages in this section, this is
   UFS on a proxy server and things are in flux. Not that I'm happy
   about it, but I understand it.

STRANGE DUMP SUMMARY:
   squidzone2 /sqcache2/squidguard lev 1  STRANGE (see below)
   squidzone1 /sqcache1/var        lev 0  STRANGE (see below)
   squidzone2 /sqcache2/var        lev 0  STRANGE (see below)
   trel       /trelAM              lev 0  STRANGE (see below)


STATISTICS:
                          Total       Full      Incr.
                        --------   --------   --------
Estimate Time (hrs:min)    1:03
Run Time (hrs:min)        13:41
Dump Time (hrs:min)       27:34      17:29      10:05
Output Size (meg)      525906.2   377945.6   147960.6
Original Size (meg)    639688.7   388110.3   251578.4
Avg Compressed Size (%)    58.8       81.9       52.9   (level:#disks ...)
Filesystems Dumped           47         10         37   (1:36 2:1)
Avg Dump Rate (k/s)      5425.8     6148.1     4173.4

Tape Time (hrs:min)       12:27       8:59       3:28
Tape Size (meg)        525906.1   377945.6   147960.6
Tape Used (%)              65.5       47.1       18.4   (level:#disks ...)
Filesystems Taped            47         10         37   (1:36 2:1)
   (level:#chunks ...)
Chunks Taped                 47         10         37   (1:36 2:1)
Avg Tp Write Rate (k/s) 12020.5    11963.8    12167.9

USAGE BY TAPE:
  Label         Time      Size      %    Nb    Nc
  Curie03      12:27   525906M   65.5    47    47


FAILED DUMP DETAILS:

/--  lyra / lev 1 FAILED [spawn /bin/gzip: dup2 out: Bad file number]
sendbackup: start [lyra:/ level 1]
sendbackup: error [spawn /bin/gzip: dup2 out: Bad file number]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/bin/gzip -dc |/usr/sbin/ufsrestore -xpGf - ...
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
|   DUMP: Date of this level 1 dump: Tue Jul 07 19:56:33 2009
|   DUMP: Date of last level 0 dump: Thu Jul 02 20:13:30 2009
|   DUMP: Dumping /dev/rdsk/c0t0d0s4 (lyra:/) to standard output.
|   DUMP: Mapping (Pass I) [regular files]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Writing 32 Kilobyte records
|   DUMP: Estimated 3395280 blocks (1657.85MB) on 0.02 tapes.
? sendbackup: index tee cannot write [Broken pipe]
?   DUMP: Write error 0 blocks into volume 1
?   DUMP: Write error on standard output
|   DUMP: Cannot recover
|   DUMP: The ENTIRE dump is aborted.
? dump (29063) /usr/sbin/ufsdump returned 3
? index index returned 1
? compress (29061) compress returned 1
sendbackup: error [dump (29063) /usr/sbin/ufsdump returned 3, compress (29061) compress returned 1]
\--------

/--  lyra / lev 1 FAILED [missing size line from sendbackup]
? dumper: strange [missing size line from sendbackup]
\--------

/--  curie / lev 0 FAILED [spawn /usr/bin/gzip: dup2 out: Bad file number]
sendbackup: start [curie:/ level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/usr/sbin/ufsrestore -xpGf - ...
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
sendbackup: error [spawn /usr/bin/gzip: dup2 out: Bad file number]
|   DUMP: Date of this level 0 dump: Tue Jul 07 20:27:58 2009
|   DUMP: Date of last level 0 dump: the epoch
|   DUMP: Dumping /dev/md/rdsk/d10 (curie:/) to standard output.
|   DUMP: Mapping (Pass I) [regular files]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Writing 32 Kilobyte records
|   DUMP: Estimated 15992982 blocks (7809.07MB) on 0.12 tapes.
? sendbackup: index tee cannot write [Broken pipe]
?   DUMP: Write error 0 blocks into volume 1
?   DUMP: Write error on standard output
|   DUMP: Cannot recover
|   DUMP: The ENTIRE dump is aborted.
? dump (17423) /usr/sbin/ufsdump returned 3
? index index returned 1
? compress (17421) compress returned 1
sendbackup: error [dump (17423) /usr/sbin/ufsdump returned 3, compress (17421) compress returned 1]
\--------

/--  curie / lev 0 FAILED [spawn /usr/bin/gzip: dup2 out: Bad file number]
sendbackup: start [curie:/ level 0]
sendbackup: error [spawn /usr/bin/gzip: dup2 out: Bad file number]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/usr/sbin/ufsrestore -xpGf - ...
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
|   DUMP: Date of this level 0 dump: Tue Jul 07 20:28:23 2009
|   DUMP: Date of last level 0 dump: the epoch
|   DUMP: Dumping /dev/md/rdsk/d10 (curie:/) to standard output.
|   DUMP: Mapping (Pass I) [regular files]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Writing 32 Kilobyte records
|   DUMP: Estimated 15993052 blocks (7809.11MB) on 0.12 tapes.
? sendbackup: index tee cannot write [Broken pipe]
?   DUMP: Write error 0 blocks into volume 1
?   DUMP: Write error on standard output
|   DUMP: Cannot recover
|   DUMP: The ENTIRE dump is aborted.
? dump (17438) /usr/sbin/ufsdump returned 3
? index index returned 1
? compress (17436) compress returned 1
sendbackup: error [dump (17438) /usr/sbin/ufsdump returned 3, compress (17436) compress returned 1]
\--------
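
For reference, "Bad file number" is the Solaris message text for EBADF,
the error dup2() reports when the descriptor it is asked to duplicate
is not open. The sketch below reproduces that error in isolation. It is
an illustration only: it is not Amanda's sendbackup code, the function
name `dup2_on_closed_fd` is hypothetical, and it does not establish why
the descriptor was invalid in the failures above.

```python
import errno
import os


def dup2_on_closed_fd():
    """Call dup2() with a closed source descriptor and return the errno."""
    read_end, write_end = os.pipe()
    # Create the target first so it cannot reuse write_end's number.
    target = os.dup(read_end)
    os.close(write_end)  # write_end is now an invalid descriptor
    try:
        os.dup2(write_end, target)
    except OSError as exc:
        return exc.errno
    finally:
        os.close(read_end)
        os.close(target)
    return None


if __name__ == "__main__":
    err = dup2_on_closed_fd()
    print(errno.errorcode.get(err), os.strerror(err))
```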


STRANGE DUMP DETAILS:

/--  squidzone2 /sqcache2/squidguard lev 1 STRANGE
sendbackup: start [squidzone2:/sqcache2/squidguard level 1]
sendbackup: info BACKUP=/usr/local/bin/gtar
sendbackup: info RECOVER_CMD=/usr/local/bin/gtar -xpGf - ...
sendbackup: info end
? gtar: ./log/squidGuard.log: file changed as we read it
| Total bytes written: 1095045120 (1.1GiB, 6.3MiB/s)
sendbackup: size 1069380
sendbackup: end
\--------

/--  squidzone1 /sqcache1/var lev 0 STRANGE
sendbackup: start [squidzone1:/sqcache1/var level 0]
sendbackup: info BACKUP=/usr/local/bin/gtar
sendbackup: info RECOVER_CMD=/usr/local/bin/gtar -xpGf - ...
sendbackup: info end
? gtar: ./logs/access.log: file changed as we read it
| Total bytes written: 8797532160 (8.2GiB, 8.7MiB/s)
sendbackup: size 8591340
sendbackup: end
\--------

/--  squidzone2 /sqcache2/var lev 0 STRANGE
sendbackup: start [squidzone2:/sqcache2/var level 0]
sendbackup: info BACKUP=/usr/local/bin/gtar
sendbackup: info RECOVER_CMD=/usr/local/bin/gtar -xpGf - ...
sendbackup: info end
? gtar: ./logs/access.log: file changed as we read it
| Total bytes written: 10004336640 (9.4GiB, 5.7MiB/s)
sendbackup: size 9769860
sendbackup: end
\--------

/--  trel /trelAM lev 0 STRANGE
sendbackup: start [trel:/trelAM level 0]
sendbackup: info BACKUP=/usr/bin/xtar
sendbackup: info RECOVER_CMD=/usr/bin/xtar -f... -
sendbackup: info end
? convertToESUTF8: unsupport composite char: 0338 found
? convertToESUTF8: unsupport composite char: 0338 found
| Total bytes written: 71725670400 (67GB, 5.2MB/s)
sendbackup: size 70044600
sendbackup: end
\--------


NOTES:
  planner: Forcing full dump of trel:/Users as directed.
  planner: Forcing full dump of trel:/trelAM as directed.
  planner: Forcing full dump of trel:/trelNZ as directed.
  planner: disk mailserv:/usr1, estimate of level 0 failed.
  planner: Incremental of lyra:/db4 bumped to level 2.
  taper: tape Curie03 kb 538527908 fm 47 [OK]
  big estimate: lyra /db4 2
                est: 411M    out 294M


DUMP SUMMARY:
                                       DUMPER STATS               TAPER STATS 
HOSTNAME     DISK        L ORIG-MB  OUT-MB  COMP%  MMM:SS   KB/s MMM:SS   KB/s
-------------------------- ------------------------------------- -------------
c110         /           1       7       1   12.8    0:25   37.3   0:00 12589.2
c110         /opt        1       0       0    1.6    0:41    0.1   0:00  804.0
curie        /           0 FAILED --------------------------------------------
curie        /export     1   96060   74970   78.0  128:10 9983.4 100:12 12769.9
curie        /thump/flar 1       0       0    --     0:01    0.7   0:00  100.8
curie        -ump/source 0   36474   29879   81.9   53:55 9458.7  39:49 12805.3
curie        -p/vmfs-bak 1       0       0    --     0:02    0.6   0:00   72.7
dnix         /dev/sda1   1    2059    2059    --     6:31 5389.6   5:08 6839.7
everest      /images3    1       0       0    --     0:01    1.3   0:00   90.9
finsen       /           1     212      39   18.4    2:35  257.1   0:05 7464.4
finsen       /export     1   33152   18256   55.1  100:53 3088.2  24:00 12981.5
gatem        /           0   43556   43556    --   208:42 3561.7  58:16 12759.0
gatem        /usr1       1       1       1    --     0:54   20.7   0:00 12746.8
h220         /           1      96       7    6.9    1:56   58.8   0:03 2659.5
h220         /opt        1     593      43    7.3    5:01  147.5   0:03 12752.2
huginn       /           1   27547   27547    --    66:44 7044.8  40:49 11519.5
ldap1        /           1     794      41    5.2    2:30  281.2   0:04 11093.2
ldap1        -xport/home 0    6041    5142   85.1   23:32 3730.1  25:02 3506.2
ldap1        /usr1       1     470      39    8.4    1:45  382.4   0:05 7549.8
lyra         /           1 FAILED --------------------------------------------
lyra         /3rdparty   1      15       1    4.9    1:51    6.7   0:00 12226.1
lyra         /db1        1   15050    2390   15.9   38:39 1055.5   3:10 12892.8
lyra         /db2        1   15648    1830   11.7   32:36  957.7   2:26 12814.4
lyra         /db3        1   21094    3367   16.0   46:42 1230.7   4:30 12780.6
lyra         /db4        2    5146     294    5.7   12:51  390.9   0:24 12521.7
lyra         -port/home0 1       2       0    8.3    0:20    6.9   0:00 10584.6
lyra         /ndevelop   0    9474    7037   74.3   56:27 2127.3   9:33 12585.3
lyra         /space      1     171      37   21.5    6:15  100.4   0:04 10393.5
mailserv     /           1    3032     653   21.5    8:33 1302.5   1:17 8737.9
mailserv     /usr1       1   26616   14147   53.2   90:15 2675.4  18:48 12840.5
muninn       /           1     576     168   29.1    6:23  449.2   0:26 6593.6
muninn       /var        1     958      80    8.3    1:44  786.6   0:07 12338.1
ngato        /           1       0       0    5.8    0:55    0.5   0:00 9098.0
nlascar      /           1       0       0   32.2    0:09   13.6   0:00 10985.5
nlascar      /boot       1       0       0    --     0:01    1.8   0:00  453.0
nlascar      /data       0     833    1658  199.0    4:40 6073.8   3:24 8325.0
nlascar      /var        1     134      39   29.4    0:29 1390.1   0:03 12027.7
panther      /           1       1       0    7.3    2:30    0.7   0:00 10512.1
panther      /data       1       0       0    --     0:01    0.7   0:00  185.5
pavlov       /           1       1       0    4.1    2:08    0.4   0:00 8320.8
squidone     /           1     159      19   11.7    5:23   59.0   0:02 8960.5
squidtwo     /           1      55       4    7.7   23:44    3.0   0:00 12647.6
squidzone1   -squidguard 1     884     884    --     2:32 5960.7   2:57 5099.4
squidzone1   -cache1/var 0    8390    8390    --    16:10 8860.9  30:37 4677.0
squidzone2   -squidguard 1    1044    1044    --     2:53 6164.6   2:48 6365.6
squidzone2   -cache2/var 0    9541    9541    --    28:16 5760.5  13:11 12354.9
trel         /Users      0    3258    2201   67.5   21:40 1734.1   3:34 10515.3
trel         /trelAM     0   68403   68403    --   218:20 5346.8  89:30 13043.5
trel         /trelNZ     0  202139  202139    --   417:28 8263.7 266:14 12958.3

(brought to you by Amanda version 2.6.1p1)

----- End forwarded message -----
---
   Brian R Cuttler                 brian.cuttler AT wadsworth DOT org
   Computer Systems Support        (v) 518 486-1697
   Wadsworth Center                (f) 518 473-6384
   NYS Department of Health        Help Desk 518 473-0773



IMPORTANT NOTICE: This e-mail and any attachments may contain
confidential or sensitive information which is, or may be, legally
privileged or otherwise protected by law from further disclosure.  It
is intended only for the addressee.  If you received this in error or
from someone who was not authorized to send it to you, please do not
distribute, copy or use it or any attachments.  Please notify the
sender immediately by reply e-mail and delete this from your
system. Thank you for your cooperation.


