Amanda-Users

2006-08-21 09:54:47
Subject: Re: {Spam?} backup still messed up...
From: Martin Hepworth <martinh AT solid-state-logic DOT com>
To: "Amanda Users (E-mail)" <amanda-users AT amanda DOT org>
Date: Mon, 21 Aug 2006 13:47:36 +0100
Kristian Rink wrote:
Hello all;

after getting last week's issues fixed by downgrading amanda
(now 2.5.0-20060424) as well as tar (now 1.14-2.2), the Thursday and Friday
incremental dumps worked out. However, the weekend dump once again didn't
do what it is supposed to. I kept an eye on the amanda processes all
weekend and it _seemed_ fine, but in the end it ain't. Some excerpts
and log dumps:


- according to amstatus, nothing has been written to tape:

backer:/tmp# /opt/sbin/amstatus --config Full
Using /var/log/amanda/Full/amdump.1 from Sa Aug 19 03:00:02 CEST 2006


[...]
backer:backervar             getting estimate
backer:jka                   getting estimate
backer:planc1    0304732110k partial estimate done
backer:planc3    0118291530k estimate done
backer:refast                getting estimate

SUMMARY          part      real  estimated
                           size       size
partition       :   7
estimated       :   2            423023640k
flush           :   0         0k
failed          :   0                    0k           (  0.00%)
wait for dumping:   0                    0k           (  0.00%)
dumping to tape :   0                    0k           (  0.00%)
dumping         :   0         0k         0k (  0.00%) (  0.00%)
dumped          :   0         0k         0k (  0.00%) (  0.00%)
wait for writing:   0         0k         0k (  0.00%) (  0.00%)
wait to flush   :   0         0k         0k (100.00%) (  0.00%)
writing to tape :   0         0k         0k (  0.00%) (  0.00%)
failed to tape  :   0         0k         0k (  0.00%) (  0.00%)
taped           :   0         0k         0k (  0.00%) (  0.00%)
  tape 1        :   0         0k         0k (  0.00%) Back00
5 dumpers idle  : not-idle
taper idle
network free kps:       600
holding space   :         0k (  0.00%)
 0 dumpers busy :  0:00:00  (  0.00%)





- Likewise, amverify (could this input/output error be caused by a
defective drive / tape or is this just the result of the tape being
empty?):


Tapes:  Back00
Errors found: Back00 ():
amrestore: missing file header block
amrestore: missing file header block
amrestore: error reading file header: Input/output error
** No header
0+0 in
0+0 out
[...]



- amandad debug log: Is the timeout caused by tar taking too long to
actually calculate the size of the DLE in question?

[...] amandad: time 17777.161: sending PREP pkt: <<<<<
OPTIONS features=fffffeff9ffeffff07;
planc3 0 SIZE 118291530
planc3 1 SIZE 2544890
planc1 0 SIZE 304732110
amandad: time 21600.749: /opt//libexec/sendsize timed out waiting for REP data
amandad: time 21600.761: sending NAK pkt: <<<<<
ERROR timeout on reply pipe
amandad: time 21600.761: pid 23250 finish time Sat Aug 19 09:00:03 2006



- sendsize, again, is filled with error messages like those below, even
though the files definitely are around and usually an arbitrary
application also is capable of opening / reading them without a problem.


sendsize[23820]: time 9689.225: getting size via gnutar for planc1 level 0
sendsize[23268]: time 9689.226: waiting for any estimate child: 1 running
sendsize[23820]: time 9689.324: spawning /opt//libexec/runtar in pipeline
sendsize[23820]: argument list: /bin/tar --create --file /dev/null --directory /backup/PV/ --one-file-system --listed-incremental /opt//var/amanda/gnutar-lists/backer.planconnect.netplanc1_0.new --sparse --ignore-failed-read --totals .
sendsize[23820]: time 16883.005: /bin/tar: ./docs/work/d2845/d736/qw78xmm4/file.doc: Warning: Cannot seek to 0: Bad file descriptor
sendsize[23820]: time 16883.016: /bin/tar: ./docs/work/d2845/d754/vsbed359/class.all: Warning: Cannot seek to 0: Bad file descriptor
sendsize[23820]: time 16883.016: /bin/tar: ./docs/work/d2845/d754/vsbed359/file.doc: Warning: Cannot seek to 0: Bad file descriptor




Again, same questions - hardware error? Tape error? Misconfiguration?
Just now I tried to manually dump some data to the drive, which ended
like this:


backup:/tmp# tar czvf /dev/nst0 /backup/PV/
[...]
/backup/PV/docs/work/d0301/d880/t37adzo8/
/backup/PV/docs/work/d0301/d880/t37adzo8/class.all
/backup/PV/docs/work/d0301/d880/t37adzo8/file.hpgl
tar: /dev/nst0: Cannot open: Input/output error

Ideas, enlightenment, hints, ... very much appreciated... :/

Thanks and bye,
Kristian


Yeah, this looks like the estimate timeout I had problems with when using tar.

Have a look at the client logs to see how long the estimate phase took, and adjust the etimeout parameter in amanda.conf accordingly. (I had to raise mine for an Apple Mac: SATA drives and dual PPCs, but tar is really slow ;-( )
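
To put a number on "how long the estimate phase took", something like the following rough Python sketch could be run over a sendsize debug log. It assumes the "program[pid]: time SECONDS:" prefix seen in the excerpts in this thread, and that (as I understand the amanda.conf man page) a positive etimeout is a per-DLE value multiplied by the number of DLEs on the client, while a negative value is a total per client. The function names and the 1.5 safety margin are made up for illustration and are not part of Amanda.

```python
import math
import re

# Matches debug-log prefixes such as "sendsize[23820]: time 9689.225:".
# \s+ after "time" tolerates the line wrapping that mail clients introduce.
STAMP = re.compile(r"(\w+)\[(\d+)\]:\s+time\s+(\d+(?:\.\d+)?):")


def estimate_spans(log_text):
    """Return {(program, pid): (first_time, last_time)}, times in seconds."""
    spans = {}
    for program, pid, stamp in STAMP.findall(log_text):
        t = float(stamp)
        key = (program, int(pid))
        first, last = spans.get(key, (t, t))
        spans[key] = (min(first, t), max(last, t))
    return spans


def planner_wait(etimeout, n_dles):
    """Total seconds the planner waits for one client (assumed semantics).

    Positive etimeout: per DLE, so multiply by the number of DLEs.
    Negative etimeout: absolute value is the total for the client.
    """
    return etimeout * n_dles if etimeout >= 0 else -etimeout


def per_dle_etimeout(elapsed, n_dles, margin=1.5):
    """Hypothetical helper: smallest per-DLE etimeout covering elapsed * margin."""
    return math.ceil(elapsed * margin / n_dles)


if __name__ == "__main__":
    sample = (
        "sendsize[23820]: time 9689.225: getting size via gnutar for planc1 level 0\n"
        "sendsize[23820]: time\n"
        "16883.016: /bin/tar: ./docs/work/d2845/d754/vsbed359/file.doc: Warning\n"
    )
    for (program, pid), (first, last) in estimate_spans(sample).items():
        print(f"{program}[{pid}] was active for {last - first:.0f} s")
```

The span per process is only a lower bound on the estimate time if the log was cut off by the timeout, as it was here, so the margin should be generous.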


--
Martin Hepworth
Senior Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300
