Amanda-Users

More Debian Amanda package upgrade issues (this time with amdump)

2006-11-13 22:33:08
Subject: More Debian Amanda package upgrade issues (this time with amdump)
From: Frank Smith <fsmith AT hoovers DOT com>
To: amanda-users AT amanda DOT org
Date: Mon, 13 Nov 2006 21:23:24 -0600
Thanks to Jean-Louis for the reply on my amcheck issues.

I got some really strange results from my first amdump run last
night after my upgrade.

>> I updated my home Amanda server and clients (Debian etch
>> packages) from 2.5.0p2-2.1 to 2.5.1p1-2, and tar from 1.15.91-2 to
>> 1.16-1.

FAILURE AND STRANGE DUMP SUMMARY:
  server   /  lev 2  STRANGE
  clientA  /  lev 0  FAILED [/bin/tar returned 1]
  clientA  /  lev 0  FAILED [/bin/tar returned 1]
...

Why is clientA / reported twice? Probably because it failed twice,
see comments farther down.

FAILED AND STRANGE DUMP DETAILS:

/--  server / lev 2 STRANGE
sendbackup: start [k6:/ level 2]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/gzip -dc |/bin/tar -f - ...
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? gtar: ./hdb/holdingdisk/daily/20061031053001: Cannot savedir: No such file or 
directory
? gtar: ./hdb/holdingdisk/daily/20061031053001/: Warning: Cannot stat: No such 
file or directory
? gtar: ./hdb/holdingdisk/daily/20061101053002: Cannot savedir: No such file or 
directory
? gtar: ./hdb/holdingdisk/daily/20061101053002/: Warning: Cannot stat: No such 
file or directory
? gtar: ./hdb/holdingdisk/daily/20061102053001: Cannot savedir: No such file or 
directory
....
many more lines like these.
For one thing, /hdb is is a mount point for a second hard disk, so
tar has been escaping the root filesystem, The other thing is that
the directories were all flushed to tape the day before, so they
ceased to exist before the time of last night's backup.  Is Amanda
trying to check everything in the level 1 index file instead of
just backing up what currently exists?

 /--  clientA / lev 0 FAILED [/bin/tar returned 1]
sendbackup: start [artair-gw-b:/ level 0]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/gzip -dc |/bin/tar -f - ...
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? gtar: ./var/cache/bind/named.log: file changed as we read it
? gtar: ./var/lib/amavis/tmp/amavis-20061113T050013-05034/email.txt: Warning: 
Cannot stat: No such file or directory
? gtar: ./var/lib/amavis/tmp/amavis-20061113T050013-05034/email.txt: Warning: 
Cannot stat: No such file or directory
| gtar: ./var/run/amavis/amavisd.sock: socket ignored
| gtar: ./var/run/apache2/cgisock.25079: socket ignored
| gtar: ./var/run/clamav/clamd.ctl: socket ignored
| gtar: ./var/run/greylistd/socket: socket ignored
| gtar: ./var/run/mysqld/mysqld.sock: socket ignored
| Total bytes written: 4298557440 (4.1GiB, 9.5MiB/s)
sendbackup: error [/bin/tar returned 1]
\--------

/--  clientA / lev 0 FAILED [/bin/tar returned 1]
sendbackup: start [artair-gw-b:/ level 0]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/gzip -dc |/bin/tar -f - ...
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? gtar: ./var/cache/bind/named.log: file changed as we read it
| gtar: ./var/run/amavis/amavisd.sock: socket ignored
| gtar: ./var/run/apache2/cgisock.25079: socket ignored
| gtar: ./var/run/clamav/clamd.ctl: socket ignored
| gtar: ./var/run/greylistd/socket: socket ignored
| gtar: ./var/run/mysqld/mysqld.sock: socket ignored
| Total bytes written: 4298608640 (4.1GiB, 9.0MiB/s)
sendbackup: error [/bin/tar returned 1]
\--------

OK, why did it try backing up / on clientA twice?  It didn't just
report it twice, it ran it twice (notice the slightly different
sizes and a changed temp file during one run).

Now the really odd part:
NOTES:
  planner: Last full dump of clientA:/home/user1 on tape  overwritten in 1 run.
  planner: Last full dump of clientA:/home/user2 on tape  overwritten in 1 run.
  planner: Last full dump of clientA:/home/user3 on tape  overwritten in 1 run.
  planner: Last full dump of clientA:/home/user4 on tape  overwritten in 1 run.
  planner: Last full dump of clientA:/home/user5 on tape  overwritten in 1 run.
  planner: Last full dump of clientB:/ on tape  overwritten in 1 run.
  taper: tape_rdlabel: tape open: /dev/nst0: Input/output error

I've got a tapecycle of 45, and ususally flush once a week, so
why does Amanda thing I'm about to overwrite most of my level 0s?
And why are the tape names blank?  The taper error is expected, I
didn't have a tape in the drive.

  big estimate: clientA /home/user8 1
                est: 64k    out 1k
  big estimate: clientA /home/user2 1
                est: 1056k    out 107k
  big estimate: clientA /home/user4 1
                est: 64k    out 18k
  big estimate: clientA /home/user5 1
                est: 64k    out 5k
  big estimate: clientA /home/user6 1
                est: 64k    out 1k
  big estimate: clientA /home/user7 1
                est: 64k    out 1k
  small estimate: server / 2
                  est: 224128k    out 249679k
  small estimate: clientB / 1
                  est: 1856k    out 3375k

I've never seen this estimate section before, guess I need to go read
up on it and see what it means.

DUMP SUMMARY:
                                               DUMPER STATS                 
TAPER STATS
HOSTNAME     DISK               L  ORIG-kB   OUT-kB  COMP%  MMM:SS    KB/s 
MMM:SS   KB/s
--------------------------------- ---------------------------------------- 
-------------
clientA  /                  0      N/A  2775998    --     N/A     N/A    N/A    
N/A  PARTIAL
clientA  /home/user1        1   123420    85756   69.5    0:13  6369.7   N/A    
N/A
clientA  /home/user8        1       10        1   10.0    0:00    34.5   N/A    
N/A
clientA  /home/user2        1     3710      107    2.9    0:03    42.5   N/A    
N/A
clientA  /home/user3        1    25280     8023   31.7    0:02  5163.1   N/A    
N/A
clientA  /home/user4        1      190       18    9.5    0:00   329.4   N/A    
N/A
clientA  /home/user5        1      110        5    4.5    0:00   164.4   N/A    
N/A
clientA  /home/user9        1    17160     8328   48.5    0:02  5390.3   N/A    
N/A
clientA  /home/user6        1       10        1   10.0    0:00    46.0   N/A    
N/A
clientA  /home/user7        1       10        1   10.0    0:00    36.8   N/A    
N/A
server   /                  2   601820   249679   41.5    1:30  2763.1   N/A    
N/A
clientB  /                  1    10680     3375   31.6    0:47   227.1   N/A    
N/A

(brought to you by Amanda version 2.5.1p1)

What does 'PARTIAL' mean?  ClientA / was reported as about 4.1GB
during both its failed runs above, so is the 2.7GB reported in the
dump summary the compressed size (the dumptype uses compress-fast)
or was it in reality a partial dump?

I've really been surprised by the number of issues I've encountered
making a minor version change (2.5.0->2.5.1), especially since the
upgrade from 2.4.2 to 2.5 went fairly smoothly.  I'm sure part of
the problems are a result of tar 1.16.  One relevant changelog entry
for tar (1.15.92) is:
* After creating an archive, tar exits with code 1 if some files were
  changed while being read.  Previous versions exited with code 2 (fatal
  error), and only if some files were truncated while being archived.

Does this mean that many backups will now be marked as FAILED instead
of STRANGE?

Frank

-- 
Frank Smith                                      fsmith AT hoovers DOT com
Sr. Systems Administrator                       Voice: 512-374-4673
Hoover's Online                                   Fax: 512-374-4501