Amanda-Users

Re: out of tape with vtapes

2006-02-09 05:23:45
Subject: Re: out of tape with vtapes
From: Paul Bijnens <paul.bijnens AT xplanation DOT com>
To: uwe.kaufmann AT infoconsult DOT nu
Date: Thu, 09 Feb 2006 11:14:33 +0100
uwe.kaufmann AT infoconsult DOT nu wrote:

I have a general question concerning the vtapes mechanisms and vtape length.

I configured an amanda server ("minerva", amanda-2.4.5-2, SuSE 10.0) with
"chg-disk" and 14 slots.

In the beginning there are a lot of "lev 0 failed", obviously, because I am
dealing with >100G data and each slot has only 18G (to fit on tape later
on).

But where does the "lev 0 FAILED [out of tape]" come from? I thought that
Amanda would properly calculate the relation between data and tape length.

Amanda does an estimate.  One of the main reasons that initial
estimates are wrong is compression.
When you have a dumptype with compression enabled, the estimate phase does:
- Run the dump program, which gives an estimate of the UNcompressed backup
- Multiply that value by the expected compression ratio that was learned
  from history

For a DLE that has history (i.e. not a new DLE), you can see that value
with the command "amadmin Config info", giving an output like:

 $ amadmin test info katastrov /space

 Current info for katastrov /space:
   Stats: dump rates (kps), Full:  588.0, 603.0, 575.0
                     Incremental:  386.0, 230.0, 195.0
           compressed size, Full:  41.6%, 41.6%, 39.1%
                     Incremental:  20.6%,  9.1%, 20.1%
   Dumps: lev datestmp  tape             file   origK   compK secs
           0  20060202  Daily-05          114 6985920 2903984 4931
           1  20060207  Daily-08           95 1415860  284916 1455
           2  20060209  Daily-10           99 1375710  283200  733

The above example has 3 historical values of the compression ratio for
full backups and 3 others for incremental backups.
The new expected compression ratio is calculated as a weighted average
of those three values, with the newest ratio having the most weight.
That prediction is usually very good, unless the contents of the DLE
suddenly change in nature: e.g. zip files replacing very compressible
subdirectory trees (what people tend to do when they are told to clean up :-) ).
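
As a rough sketch of such a weighted average, in Python: the weights
(3, 2, 1) and the newest-first ordering of the history are assumptions
made for illustration only; all that is stated above is that the newest
ratio weighs most.  The function name is made up as well.

```python
def expected_ratio(history, default=0.50):
    """Weighted average of past compression ratios, newest first.

    The weights (3, 2, 1 for three values) are an assumption for
    illustration, not taken from the Amanda sources.
    """
    weights = range(len(history), 0, -1)
    total = sum(w * r for w, r in zip(weights, history))
    norm = sum(weights)
    # With no history, fall back to the default ratio.
    return total / norm if norm else default


# "compressed size, Full" values from the amadmin output above.
full_history = [0.416, 0.416, 0.391]
print(f"{expected_ratio(full_history):.3f}")
print(f"{expected_ratio([]):.3f}")
```

With those assumed weights the full-backup prediction lands close to
41%, and a DLE without history falls back to the default.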

When there is no historical data, Amanda assumes 50% compression.
That may be tuned with the amanda.conf parameter "comprate".

Now, let's see what happened in your config, below.


Thank you for any enlightenment of my poor spirit.

BTW: Great wiki and docs, thanks to all who spent effort, really helpful!

Cheers
Uwe


Additional info
--------------------
The report:

These dumps were to tape slot3.
*** A TAPE ERROR OCCURRED: [[writing file: No space left on device]].
Some dumps may have been left in the holding disk.
Run amflush to flush them to tape.
The next tape Amanda expects to use is: a new tape.
The next new tape already labelled is: slot4.

FAILURE AND STRANGE DUMP SUMMARY:
  hercules   /samba/LAGERVERSAND lev 0 FAILED [dumps too big, 109335 KB, but
cannot incremental dump new disk]
  hercules   //Edi/TC4000/IBDasi lev 0 FAILED [dumps too big, 111909 KB, but
cannot incremental dump new disk]
  hercules   /KT_EIN_PRO lev 0 FAILED [dumps too big, 576825 KB, but cannot
incremental dump new disk]
  hercules   /US_LE lev 0 FAILED [dumps too big, 584480 KB, but cannot
incremental dump new disk]
  hercules   /K_EIN_NO4 lev 0 FAILED [dumps too big, 596835 KB, but cannot
incremental dump new disk]

The above errors are to be expected indeed, as you already explained.


  hercules   /KONST lev 0 FAILED [out of tape]

And one DLE did not fit on the tape.
(I'll explain below how we can even improve on this, by letting a smaller
DLE fail instead of this one!)




STATISTICS:
                          Total       Full      Incr.
                        --------   --------   --------
Estimate Time (hrs:min)    0:01
Run Time (hrs:min)         4:14
Dump Time (hrs:min)        4:09       4:09       0:00
Output Size (meg)       18852.6    18851.7        0.9
Original Size (meg)     36786.0    36780.9        5.1
Avg Compressed Size (%)    51.2       51.3       16.7   (level:#disks ...)

So the Original size (= before compression) was estimated
and turned out to be 36786 Mbytes.
Assuming the default compression rate of 50% (because these were all
new DLEs), Amanda expected to generate only 18393 Mbytes
of output.  (We should actually use the values of the estimate
run, which can again be a little bit different from the backup run;
you find those values in the "amdump.1" file.)

Your tapetype was defined as

> define tapetype HARD-DISK {
>     comment "Dump onto hard disk vtapes"
>     length 18432 mbytes
> }


So, Amanda scheduled up to 18393 Mbytes. The rest was left for the
next run ("dumps too big... but cannot incremental dump new disk").
Actually a pretty good approximation.
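
The arithmetic behind that, as a small Python check.  The numbers are
taken from the report and the tapetype above; the variable names are
mine, and it uses the backup-run totals rather than the real estimate
values from "amdump.1".

```python
original_mb = 36786.0    # "Original Size (meg)", Total column
assumed_ratio = 0.50     # default when a DLE has no history
tape_length_mb = 18432   # "length 18432 mbytes" in the tapetype

expected_output_mb = original_mb * assumed_ratio
print(expected_output_mb)                    # 18393.0
print(expected_output_mb <= tape_length_mb)  # True
```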

In real life the compression turned out to be 51.2% (computed as
100 * after / before; beware that many compression programs
report the complement of this: what was saved, instead of what is left).
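
A quick way to redo that calculation from the STATISTICS block, again
as a Python sketch with my own variable names.  It also shows how far
the real output ended up from what a 50% ratio would have given.

```python
original_mb = 36786.0   # "Original Size (meg)", Total column
output_mb = 18852.6     # "Output Size (meg)", Total column

left_pct = 100 * output_mb / original_mb       # what is left after compression
saved_pct = 100 - left_pct                     # what many tools report instead
overshoot_mb = output_mb - original_mb * 0.50  # versus the 50% assumption

print(f"left {left_pct:.1f}%, saved {saved_pct:.1f}%, "
      f"overshoot {overshoot_mb:.1f} MB")
```

A difference of about one percentage point on this much data is
several hundred Mbytes.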


Filesystems Dumped           24         16          8   (1:8)
Avg Dump Rate (k/s)      1290.8     1291.0      412.0

Tape Time (hrs:min)        0:10       0:10       0:00
Tape Size (meg)         16074.0    16073.1        0.9
Tape Used (%)              87.3       87.3        0.0   (level:#disks ...)
Filesystems Taped            23         15          8)
Avg Tp Write Rate (k/s) 28260.4    28271.6     3385.5

USAGE BY TAPE:
  Label       Time      Size      %    Nb
  slot3       0:10 16459733k   87.3    23

This is the size of all the usable backup images.  Amanda did
try to write one more, but while doing that, it bumped into end of
tape and that backup image is not usable for a restore, so not counted
in the percentage above.


NOTES:
  planner: Adding new disk hercules:/samba/CONTROLLING.
[...]
  planner: Adding new disk hercules:/K_EIN_REST.
  taper: tape slot3 kb 18874336 fm 24 writing file: No space left on device

Here you see when Amanda really hit the end of tape: at 18874336 kb,
or 18431 Mbyte (compare this to the length of 18432 Mbytes in the tapetype).
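
The conversion, as a short Python check (assuming 1 Mbyte = 1024 KB,
which is what makes the numbers line up):

```python
kb_written = 18874336    # "kb" value from the taper line above
tape_length_mb = 18432   # tapetype length

mb_written = kb_written / 1024
shortfall_kb = tape_length_mb * 1024 - kb_written
print(f"{mb_written:.2f} MB written, "
      f"{shortfall_kb} KB short of the configured length")
```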


  driver: going into degraded mode because of tape error.

DUMP SUMMARY:
                                     DUMPER STATS                TAPER STATS
HOSTNAME   DISK       L   ORIG-kB    OUT-kB COMP% MMM:SS   KB/s  MMM:SS   KB/s
----------------------- --------------------------------------- --------------
hercules   -00/IBDasi 0 FAILED -----------------------------------------------
hercules   /KONST     0   7004350   2845316  40.6  48:12  984.0  FAILED ------

This DLE did make it to holdingdisk, but did not make it to tape.
As you can see, it was 2.8 Gbyte.

[...]

(brought to you by Amanda version 2.4.5)


--------------------
My amanda.conf:

org "normal"          # your organization name for reports
dumpuser "amanda"     # the user to run dumps under
[...]
--------------------
The slots so far:

minerva:/amandatapes/normal # du -hx --max-depth=1 .
19G     ./slot1  <= first amdump
8.5G    ./slot2  <= amflush
19G     ./slot3  <= second amdump
2.8G    ./slot4  <= amflush

So you had to flush 2.8 Gbyte (the one DLE above) to the fourth tape.

The reason was that Amanda's estimate was slightly off: it assumed 50%
compression instead of the actual 51.2%.  Next time Amanda will estimate
better.  But there is always a small chance of error, of course.

But, as I said, we can improve the setup a little bit more.
Note that the last failed image was 2.8 Gbyte.  And that whole image
must be written again to the next tape.  It would be better if
Amanda did not put the larger dumps at the end of the tape, if possible.

Just add the amanda.conf parameter "taperalgo largestfit" to
the setup.  For a complete explanation, also read:

http://wiki.zmanda.com/index.php/Filling_a_tape_to_100%25
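
To illustrate the idea behind "largestfit", here is a minimal Python
sketch: whenever there is a choice, write the largest waiting dump that
still fits in the space left on the tape.  This is only my illustration
of the principle, not Amanda's actual taper code.  The function name,
the free space figure and the two "/example_*" entries are made up;
only the /KONST size comes from the report above.

```python
def pick_largest_fit(pending_kb, free_kb):
    """Return the name of the largest pending dump that fits, or None."""
    fitting = {name: size for name, size in pending_kb.items()
               if size <= free_kb}
    return max(fitting, key=fitting.get) if fitting else None


# /KONST is from the report; the other two entries are hypothetical.
pending = {"/KONST": 2845316, "/example_a": 900000, "/example_b": 150000}
free = 3000000  # hypothetical KB left on the tape

order = []
while (name := pick_largest_fit(pending, free)) is not None:
    free -= pending.pop(name)
    order.append(name)

print(order, free)
```

In this made-up situation the big dump goes to tape first, a small one
fills the remainder, and only the middle-sized one is left for the next
tape.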


--
Paul Bijnens, xplanation Technology Services        Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out          *
***********************************************************************

