Amanda-Users

Re: 2.4.5: dumping directly to tape failure: no retry?

2005-11-02 18:48:51
Subject: Re: 2.4.5: dumping directly to tape failure: no retry?
From: Gene Heskett <gene.heskett AT verizon DOT net>
To: amanda-users AT amanda DOT org
Date: Wed, 02 Nov 2005 18:38:42 -0500
On Tuesday 01 November 2005 11:37, Jean-Francois Malouin wrote:
>Hi,
>
>[ BTW, I noticed a thread addressing this question yesterday or so
>  but I don't want to hijack it... ]
>
>I just upgraded to 2.4.5 and noticed that if a DLE is sent directly to
>tape it is not retried if it hits EOT (a full dump that won't fit in
>the holdding space). I have runtapes = 10 and never run into this
>problem before, or at least with the previous version I was running,
>2.4.4p3-20040805. OS: irix-6.5.x with an STK L80 library and LTO1
>Ultriums.

If it hits EOT while writing directly to the tape, then the file is
lost.  Thats been true for as long as I've been using amanda, something
like 6 or 7 years now.  So that behaviour is not new. In order for
amanda to try it on a new tape, it would have to restart the DLE from
scratch and amanda has never to my knowledge attempted to do that.

>Here is some excerpts form the amdump logfile. In the following look
>at yorick:/data/mril/mril2. It's the one that failed.
>
>To me it looks like it was being dumped directly to tape label
>stk_40-conf2-000028, it hit EOT and was not retried. amanda
>just went on with the other DLEs in her schedule.
>
>----
>amdump logfile:
>----
>
>taper: slot 11: date 20050925 label stk_40-conf2-000028 (exact label
> match) taper: read label `stk_40-conf2-000028' date `20050925'
>taper: wrote label `stk_40-conf2-000028' date `20051101'
>driver: result time 98.468 from taper: TAPER-OK
>
>    [...]
>
>FAILED QUEUE: empty
>DONE QUEUE:
>
>    [...]
>
> 10: yorick     /data/mril/mril2
>
>    [...]
>
>ANALYZING ESTIMATES...
>
>    [...]
>
>GENERATING SCHEDULE:
>--------
>DUMP yorick fffffeff9ffe7f /data/mril/mril2 20051101 2 0
>1970:1:1:0:0:0 958730408789 3 2005:10:28:10:51:39 95873040 281565
>
>    [ a full dump of ~92GB is scheduled ]
>--------
>
>    [...]
>
>driver: send-cmd time 15078.866 to dumper0: FILE-DUMP 01-00029
>/holddisk/conf2/stk_40/20051101031508/shadow._.1 shadow
>fffffeff9ffe0f / NODEVICE 1 2005:10:30:12:33:6 2097056 DUMP 265696
>
>|;bsd-auth;index;
>
>    [...]
>
>driver: finished-cmd time 15241.837 taper wrote shadow:/
>driver: dumping yorick:/data/mril/mril2 directly to tape
>driver: send-cmd time 15241.838 to taper: PORT-WRITE 00-00031 yorick
>fffffeff9ffe7f /data/mril/mril2 0 20051101
>driver: result time 15241.842 from taper: PORT 16614
>driver: send-cmd time 15241.842 to dumper0: PORT-DUMP 00-00031 16614
>yorick fffffeff9ffe7f /data/mril/mril2 NODEVICE 0 1970:1:1:0:0:0 DUMP
>
>|;bsd-auth;index;
>
>   [ Here EOT is encountered, next tape is loaded
>     and then 3 DLEs are dumped to tape but not the one
>    that hit EOT ]
>
>taper: writing end marker. [stk_40-conf2-000028 ERR kb 103821568 fm 16]
>changer: opening pipe to: /opt/amanda/amanda2/libexec/chg-zd-mtx -info
>changer: got exit: 0 str: 11 10 1 1
>changer_query: changer return was 10 1 1
>changer_query: searchable = 1
>changer_find: looking for stk_40-conf2-000029 changer is searchable = 1
>changer_search: stk_40-conf2-000029
>changer: opening pipe to: /opt/amanda/amanda2/libexec/chg-zd-mtx
>-search stk_40-conf2-000029
>changer: got exit: 0 str: 12 /hw/tape/tps12d2nrnsv
>taper: slot 12: date 20050926 label stk_40-conf2-000029 (exact label
> match) taper: read label `stk_40-conf2-000029' date `20050926'
>taper: wrote label `stk_40-conf2-000029' date `20051101'
>dumper: driver: result time 23806.848 from dumper0: FAILED 00-00031
>["data write : Broken pipe"]
>driver: result time 23806.849 from taper: TRY-AGAIN 00-00031 [writing
>file: No space left on device]
>driver: error time 23806.888 serial gen mismatch
>driver: dumping yorick:/data/noel/noel1 directly to tape
>driver: send-cmd time 23806.888 to taper: PORT-WRITE 00-00032 yorick
>fffffeff9ffe7f /data/noel/noel1 1 20051101
>
>    [...]
>
>driver: result time 23806.891 from taper: PORT 16475
>driver: send-cmd time 23806.891 to dumper0: PORT-DUMP 00-00032 16475
>yorick fffffeff9ffe7f /data/noel/noel1 NODEVICE 1 2005:10:30:15:52:23
> DUMP
>
>|;bsd-auth;index;
>
>driver: state time 23806.891 free kps: 1023791 space: 52428800 taper:
>writing idle-dumpers: 9 qlen tapeq: 0 runq: 2 roomq: 0 wakeup: 86400
>driver-idle: not-idle
>driver: interface-state time 23806.891 if : free 1022391 if LE0: free
>400 if LOCAL: free 1000
>driver: hdisk-state time 23806.891 hdisk 0: free 52428800 dumpers 0
>dumper: time 8564.873taper: s: streamtream_accept: connection from
>_client127.0.0.1: connected to .16650127.0.0.1taper: t.16475
>ry_socksize: dumper: receive stream_clientbuffer size is : our side is
>32768
>0.0.0.0.16650
>
>    [...]
>
>driver: result time 23815.032 from dumper0: DONE 00-00032 37 36 8 [sec
>7.917 kb 36 kps 4.5 orig-kb 37]
>taper: reader-side: got label stk_40-conf2-000029 filenum 1
>driver: result time 23819.646 from taper: DONE 00-00032
>stk_40-conf2-000029 1 [sec 12.748 kb 37 kps 2.9
>{wr: writers 3 rdwait 5.444 wrwait 4.611 filemark 2.691} ]
>driver: error time 23819.646 serial gen mismatch
>driver: dumping yorick:/data/neuroling/neuroling1 directly to tape
>
>    [...]
>
>taper: DONE [idle wait: 13960.445 secs]
>taper: writing end marker. [stk_40-conf2-000029 OK kb 352 fm 3]
>driver: FINISHED time 23883.755
>amdump: end at Tue Nov  1 09:53:11 EST 2005
>
>I hit a bug or a feature?
>
>
>TIA
>jf

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.35% setiathome rank, not too shabby for a WV hillbilly
Free OpenDocument reader/writer/converter download:
http://www.openoffice.org
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.