Hello all;
after getting last week's issues fixed, downgrading amanda
(now 2.5.0-20060424) as well as tar (now 1.14-2.2), the Thursday and Friday
incremental dumps worked. However, the weekend dump once again didn't
do what it is supposed to. I kept an eye on the amanda processes all
weekend and everything _seemed_ fine, but in the end it wasn't. Some excerpts
and log dumps:
- according to amstatus, nothing has been written to tape:
backer:/tmp# /opt/sbin/amstatus --config Full
Using /var/log/amanda/Full/amdump.1 from Sa Aug 19 03:00:02 CEST 2006
[...]
backer:backervar getting estimate
backer:jka getting estimate
backer:planc1 0304732110k partial estimate done
backer:planc3 0118291530k estimate done
backer:refast getting estimate
SUMMARY part real estimated
size size
partition : 7
estimated : 2 423023640k
flush : 0 0k
failed : 0 0k ( 0.00%)
wait for dumping: 0 0k ( 0.00%)
dumping to tape : 0 0k ( 0.00%)
dumping : 0 0k 0k ( 0.00%) ( 0.00%)
dumped : 0 0k 0k ( 0.00%) ( 0.00%)
wait for writing: 0 0k 0k ( 0.00%) ( 0.00%)
wait to flush : 0 0k 0k (100.00%) ( 0.00%)
writing to tape : 0 0k 0k ( 0.00%) ( 0.00%)
failed to tape : 0 0k 0k ( 0.00%) ( 0.00%)
taped : 0 0k 0k ( 0.00%) ( 0.00%)
tape 1 : 0 0k 0k ( 0.00%) Back00
5 dumpers idle : not-idle
taper idle
network free kps: 600
holding space : 0k ( 0.00%)
0 dumpers busy : 0:00:00 ( 0.00%)
- Likewise, amverify (could this input/output error be caused by a
defective drive / tape or is this just the result of the tape being
empty?):
Tapes: Back00
Errors found:
Back00 ():
amrestore: missing file header block
amrestore: missing file header block
amrestore: error reading file header: Input/output error
** No header
0+0 in
0+0 out
[...]
- amandad debug log: Is the timeout caused by tar taking too long to
actually calculate the size of the DLE in question?
[...]
amandad: time 17777.161: sending PREP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffff07;
planc3 0 SIZE 118291530
planc3 1 SIZE 2544890
planc1 0 SIZE 304732110
amandad: time 21600.749: /opt//libexec/sendsize timed out waiting for REP data
amandad: time 21600.761: sending NAK pkt:
<<<<<
ERROR timeout on reply pipe
amandad: time 21600.761: pid 23250 finish time Sat Aug 19 09:00:03 2006
- sendsize debug log: again, it is filled with error messages like those
below, even though the files definitely exist and other applications can
usually open and read them without a problem.
sendsize[23820]: time 9689.225: getting size via gnutar for planc1 level 0
sendsize[23268]: time 9689.226: waiting for any estimate child: 1 running
sendsize[23820]: time 9689.324: spawning /opt//libexec/runtar in pipeline
sendsize[23820]: argument list: /bin/tar --create --file /dev/null --directory /backup/PV/ --one-file-system --listed-incremental /opt//var/amanda/gnutar-lists/backer.planconnect.netplanc1_0.new --sparse --ignore-failed-read --totals .
sendsize[23820]: time 16883.005: /bin/tar: ./docs/work/d2845/d736/qw78xmm4/file.doc: Warning: Cannot seek to 0: Bad file descriptor
sendsize[23820]: time 16883.016: /bin/tar: ./docs/work/d2845/d754/vsbed359/class.all: Warning: Cannot seek to 0: Bad file descriptor
sendsize[23820]: time 16883.016: /bin/tar: ./docs/work/d2845/d754/vsbed359/file.doc: Warning: Cannot seek to 0: Bad file descriptor
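To put the "time" stamps from the amandad and sendsize excerpts above into perspective, here is a rough sketch that converts them to hours:minutes:seconds. It assumes the stamps are seconds since the respective process started, which the 03:00:02 start and the 09:00:03 finish time seem to support; the helper names stamp_seconds and hms are made up for this example.

```python
import re

# Matches the "time <seconds>:" stamp found on amandad/sendsize debug lines.
STAMP = re.compile(r"time\s+(\d+\.\d+):")

def stamp_seconds(line):
    """Return the stamp of a debug line as float seconds, or None if absent."""
    match = STAMP.search(line)
    return float(match.group(1)) if match else None

def hms(seconds):
    """Format a number of seconds as H:MM:SS, dropping the fraction."""
    total = int(seconds)
    return f"{total // 3600}:{total % 3600 // 60:02d}:{total % 60:02d}"

# Lines copied from the excerpts above (unwrapped).
lines = [
    "sendsize[23820]: time 9689.324: spawning /opt//libexec/runtar in pipeline",
    "sendsize[23820]: time 16883.005: /bin/tar: ./docs/work/d2845/d736/qw78xmm4/file.doc: Warning: Cannot seek to 0: Bad file descriptor",
    "amandad: time 17777.161: sending PREP pkt:",
    "amandad: time 21600.749: /opt//libexec/sendsize timed out waiting for REP data",
]

for line in lines:
    # Print the elapsed time next to the process name that logged the line.
    print(hms(stamp_seconds(line)), line.split(":", 1)[0])
```

If that assumption holds, the timeout hit six hours into the run, and the tar estimate for planc1 had already been running for about two hours when the first warnings showed up.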
Again, the same questions: hardware error? Tape error? Misconfiguration?
Just now I tried to manually dump some data to the drive, which ended like
this:
backup:/tmp# tar czvf /dev/nst0 /backup/PV/
[...]
/backup/PV/docs/work/d0301/d880/t37adzo8/
/backup/PV/docs/work/d0301/d880/t37adzo8/class.all
/backup/PV/docs/work/d0301/d880/t37adzo8/file.hpgl
tar: /dev/nst0: Cannot open: Input/output error
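Since tar already fails at the open, a minimal probe like the one below might help to separate "the device cannot be opened at all" from "the open works and the error comes later". This is only a sketch: it opens the device read-only and writes nothing, probe_open is a name made up for this example, and I am assuming that a read-only open hits the same error as tar's open for writing.

```python
import errno
import os

def probe_open(path):
    """Try to open path read-only; return None on success, else the errno name."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        # Map the numeric errno to its symbolic name, e.g. 5 -> "EIO".
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return None

if __name__ == "__main__":
    # /dev/nst0 is the non-rewinding tape device used in the tar call above.
    result = probe_open("/dev/nst0")
    print("open ok" if result is None else f"open failed: {result}")
```

An EIO here would match the "Input/output error" from tar and point at the drive, the tape or the cabling rather than at amanda.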
Ideas, enlightenment, hints, ... all very much appreciated... :/
Thanks and bye,
Kristian