Amanda-Users

2.4.5: dumping directly to tape failure: no retry?

2005-11-01 11:45:13
Subject: 2.4.5: dumping directly to tape failure: no retry?
From: Jean-Francois Malouin <Jean-Francois.Malouin AT bic.mni.mcgill DOT ca>
To: AMANDA users <amanda-users AT amanda DOT org>
Date: Tue, 1 Nov 2005 11:37:02 -0500
Hi,

[ BTW, I noticed a thread addressing this question yesterday or so 
  but I don't want to hijack it... ]

I just upgraded to 2.4.5 and noticed that if a DLE is sent directly to
tape it is not retried if it hits EOT (a full dump that won't fit in
the holdding space). I have runtapes = 10 and never run into this
problem before, or at least with the previous version I was running,
2.4.4p3-20040805. OS: irix-6.5.x with an STK L80 library and LTO1
Ultriums.

Here is some excerpts form the amdump logfile. In the following look
at yorick:/data/mril/mril2. It's the one that failed.

To me it looks like it was being dumped directly to tape label
stk_40-conf2-000028, it hit EOT and was not retried. amanda
just went on with the other DLEs in her schedule.

----
amdump logfile:
----

taper: slot 11: date 20050925 label stk_40-conf2-000028 (exact label match)
taper: read label `stk_40-conf2-000028' date `20050925'
taper: wrote label `stk_40-conf2-000028' date `20051101'
driver: result time 98.468 from taper: TAPER-OK 

    [...]

FAILED QUEUE: empty
DONE QUEUE:

    [...]

 10: yorick     /data/mril/mril2

    [...]

ANALYZING ESTIMATES...

    [...]

GENERATING SCHEDULE:
--------
DUMP yorick fffffeff9ffe7f /data/mril/mril2 20051101 2 0
1970:1:1:0:0:0 958730408789 3 2005:10:28:10:51:39 95873040 281565

    [ a full dump of ~92GB is scheduled ]
--------

    [...]

driver: send-cmd time 15078.866 to dumper0: FILE-DUMP 01-00029
/holddisk/conf2/stk_40/20051101031508/shadow._.1 shadow 
fffffeff9ffe0f / NODEVICE 1 2005:10:30:12:33:6 2097056 DUMP 265696 
|;bsd-auth;index;

    [...]

driver: finished-cmd time 15241.837 taper wrote shadow:/
driver: dumping yorick:/data/mril/mril2 directly to tape
driver: send-cmd time 15241.838 to taper: PORT-WRITE 00-00031 yorick
fffffeff9ffe7f /data/mril/mril2 0 20051101
driver: result time 15241.842 from taper: PORT 16614
driver: send-cmd time 15241.842 to dumper0: PORT-DUMP 00-00031 16614
yorick fffffeff9ffe7f /data/mril/mril2 NODEVICE 0 1970:1:1:0:0:0 DUMP
|;bsd-auth;index;

   [ Here EOT is encountered, next tape is loaded 
     and then 3 DLEs are dumped to tape but not the one
    that hit EOT ]

taper: writing end marker. [stk_40-conf2-000028 ERR kb 103821568 fm 16]
changer: opening pipe to: /opt/amanda/amanda2/libexec/chg-zd-mtx -info
changer: got exit: 0 str: 11 10 1 1
changer_query: changer return was 10 1 1
changer_query: searchable = 1
changer_find: looking for stk_40-conf2-000029 changer is searchable = 1
changer_search: stk_40-conf2-000029
changer: opening pipe to: /opt/amanda/amanda2/libexec/chg-zd-mtx
-search stk_40-conf2-000029
changer: got exit: 0 str: 12 /hw/tape/tps12d2nrnsv
taper: slot 12: date 20050926 label stk_40-conf2-000029 (exact label match)
taper: read label `stk_40-conf2-000029' date `20050926'
taper: wrote label `stk_40-conf2-000029' date `20051101'
dumper: driver: result time 23806.848 from dumper0: FAILED 00-00031
["data write : Broken pipe"]
driver: result time 23806.849 from taper: TRY-AGAIN 00-00031 [writing
file: No space left on device]
driver: error time 23806.888 serial gen mismatch
driver: dumping yorick:/data/noel/noel1 directly to tape
driver: send-cmd time 23806.888 to taper: PORT-WRITE 00-00032 yorick
fffffeff9ffe7f /data/noel/noel1 1 20051101

    [...]

driver: result time 23806.891 from taper: PORT 16475
driver: send-cmd time 23806.891 to dumper0: PORT-DUMP 00-00032 16475
yorick fffffeff9ffe7f /data/noel/noel1 NODEVICE 1 2005:10:30:15:52:23 DUMP
|;bsd-auth;index;
driver: state time 23806.891 free kps: 1023791 space: 52428800 taper:
writing idle-dumpers: 9 qlen tapeq: 0 runq: 2 roomq: 0 wakeup: 86400
driver-idle: not-idle
driver: interface-state time 23806.891 if : free 1022391 if LE0: free
400 if LOCAL: free 1000
driver: hdisk-state time 23806.891 hdisk 0: free 52428800 dumpers 0
dumper: time 8564.873taper: s: streamtream_accept: connection from
_client127.0.0.1: connected to .16650127.0.0.1taper: t.16475
ry_socksize: dumper: receive stream_clientbuffer size is : our side is
32768
0.0.0.0.16650

    [...]

driver: result time 23815.032 from dumper0: DONE 00-00032 37 36 8 [sec
7.917 kb 36 kps 4.5 orig-kb 37]
taper: reader-side: got label stk_40-conf2-000029 filenum 1
driver: result time 23819.646 from taper: DONE 00-00032
stk_40-conf2-000029 1 [sec 12.748 kb 37 kps 2.9 
{wr: writers 3 rdwait 5.444 wrwait 4.611 filemark 2.691} ]
driver: error time 23819.646 serial gen mismatch
driver: dumping yorick:/data/neuroling/neuroling1 directly to tape

    [...]

taper: DONE [idle wait: 13960.445 secs]
taper: writing end marker. [stk_40-conf2-000029 OK kb 352 fm 3]
driver: FINISHED time 23883.755
amdump: end at Tue Nov  1 09:53:11 EST 2005

I hit a bug or a feature?


TIA
jf