Amanda-Users

Re: determining runtime and tape usage

2003-12-29 10:21:39
Subject: Re: determining runtime and tape usage
From: Gene Heskett <gene.heskett AT verizon DOT net>
To: Georg Rehfeld <georg.rehfeld AT gmx DOT de>
Date: Mon, 29 Dec 2003 10:18:05 -0500
On Monday 29 December 2003 09:03, Georg Rehfeld wrote:
>Hello Karsten, Gene, all,
>
>Gene Heskett wrote:
>> On Saturday 20 December 2003 02:30, Georg Rehfeld wrote:
>>> Hallo Karsten, dear Amanda users,
>>>
>>>> I am wondering if it is possible to check how much data has
>>>> already been written in the current run of amdump/amflush.
>>>>
>>>> The output of amstatus gives just an overview per filesystem,
>>>> but I am interested in how much of that particular filesystem is
>>>> already on tape, and for example the estimated time remaining.
>>>
>>> ... advertisement about amstatus.cgi cut ...
>>>
>>> But: amstatus, while a backup is running
>>> - CAN look at the file size of the holding disk files, even while
>>>   they are still being written; with some error, sure, but good
>>>   enough for a valuable estimate most of the time.
>>> - CAN estimate (very roughly) times of dumps done directly to tape
>>>   from calculations of average dump/tape speed and the estimated
>>>   size of the dump. This may be _very_ rough, as the
>>>   stop/rewind/start times of a tape that is not streaming are
>>>   simply unknown to Amanda at all.
>>
>> Yes, this would be the great unknown if the machine is so slow as
>> to not be able to keep the drive streaming.  Certainly not a
>> problem here with a 400k/sec drive and a 1450mhz machine.  But,
>> given the write speed obtained on those disklist entries that have
>> already finished, it seems to me that a reasonable estimate of the
>> current file's progress, based on (current time - start time) and
>> that transfer rate, could be guessed at and displayed.
>>
>> That would not take any mods to amanda. :-)
>
>I just found out, that amdump files already contain estimates about
>the time needed to dump each filesystem. They are in the GENERATING
>SCHEDULE: section in the DUMP lines, the 10th value is the estimated
>dump time in seconds. The Amanda planner uses collected statistics
>from previous dumps for this, if available, else uses a default
>dump rate for calculation.
>
>Unless the default must be used, the estimated times are very close
>to reality, far better than any estimate possible from the other
> values available to amstatus. So far amstatus has not used these
> times.
>
>For display of progress in % and time to finish a dump I will use
>these times (in the CGI part only for now).
>
>There seem to be no similar figures for taping, and the tape speed
>seems to vary widely:
>
>size  speed    speed
>   MB   MB/s    graph
>
>  102    1.8    *******
>  108    2.0    ********
>  113    1.7    *******
>  141    2.4    **********
>  227    1.9    ********
>  228    2.1    ********
>  251    1.0    ****
>  253    2.1    ********
>  292    7.0    ****************************
>  294    1.9    ********
>  328    3.2    *************
>  339    1.9    ********
>  450    2.9    ************
>  473    4.0    ****************
>  587    3.4    *************
>  590    5.5    **********************
>  837    4.7    *******************
>1029    2.4    **********
>3132    6.5    **************************
>3456    8.6    **********************************
>3512    7.4    ******************************
>
>(All sizes reasonably large, all taping done from holding disk, all
>values from one backup run)
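The planner estimates Georg describes could be pulled out of an amdump file with something like the sketch below. The field layout (host and disk in the 2nd and 3rd fields, estimated seconds in the 10th) is assumed from his description, not checked against Amanda's source, and the sample lines are made up:

```python
#!/usr/bin/env python3
"""Sketch: extract the planner's estimated dump times from an amdump file.

Assumed (per the description above, not verified against Amanda's source):
the estimates sit in the "GENERATING SCHEDULE:" section, on lines starting
with "DUMP", with host and disk as the 2nd and 3rd whitespace-separated
values and the estimated dump time in seconds as the 10th.
"""

def estimated_dump_times(lines):
    """Return {(host, disk): estimated_seconds} parsed from amdump lines."""
    estimates = {}
    in_schedule = False
    for line in lines:
        if line.startswith("GENERATING SCHEDULE:"):
            in_schedule = True
            continue
        if in_schedule:
            fields = line.split()
            if not fields or fields[0] != "DUMP":
                in_schedule = False          # schedule section ended
                continue
            if len(fields) >= 10:
                host, disk = fields[1], fields[2]
                estimates[(host, disk)] = int(fields[9])
    return estimates

# Hypothetical sample lines, shaped only to match the assumed layout.
sample = [
    "GENERATING SCHEDULE:",
    "DUMP clienta /home 20031229 1 0 1970:1:1:0:0:0 102400 20 614",
    "DUMP clientb /var 20031229 0 0 1970:1:1:0:0:0 512000 20 3072",
    "ENDFLUSH",
]
print(estimated_dump_times(sample))
```

Feeding a real amdump file in would just be `estimated_dump_times(open(path))`, given the assumed layout holds.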

This seems very unusual to me.  If the drive is capable of 400k/s like 
mine is, then it seems to me the actual speed of taper itself should 
be considerably more consistent than what is shown here.  Something 
else is going on, or possibly the data is being misinterpreted.  

One source of error might be that if the next file to be written is 
still being gzipped, and there is no other file remaining on the 
holding disk for taper to write, taper could be adding the 
"waiting for gzip" time to its write time.  Here, observing my own 
runs, I have noted that the drive has stopped and will occasionally 
sit waiting for data for 10 or more minutes.  When this occurs, I can 
find an instance of gzip running while taper is sleeping.  This may 
happen more than once per run and, depending on the mix of gzipped vs. 
raw dumps in the run, usually does.
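A fixed per-file wait like that would produce exactly the shape of Georg's table: small files look slow, large files approach the drive's streaming rate. A toy model of the effect, with a made-up streaming rate and overhead rather than measured values:

```python
# Toy model of "fixed wait skews apparent taper speed": if every file
# pays a fixed overhead (waiting for gzip, repositioning) before the
# drive streams at its true rate, small files appear much slower.
# Both constants below are illustrative assumptions, not measurements.

TRUE_SPEED_MB_S = 8.0   # assumed streaming rate of the drive
OVERHEAD_S = 60.0       # assumed fixed per-file wait before streaming

def apparent_speed(size_mb):
    """MB/s as taper would report it: size over (wait + write time)."""
    return size_mb / (OVERHEAD_S + size_mb / TRUE_SPEED_MB_S)

for size in (100, 250, 500, 1000, 3500):
    print(f"{size:5d} MB -> {apparent_speed(size):4.1f} MB/s")
```

With these made-up numbers the apparent rate climbs from about 1.4 MB/s at 100 MB to about 7 MB/s at 3.5 GB, much like the 1:9 spread in the table above that narrows for large files.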

The intriguing thing is that a slow client (500mhz) is all done 
while the server itself is waiting for an available file to tape.  
Both use gzip ("compress client best") for those DLEs that are 
compressible.

One might be able to determine whether this is the case by comparing 
who holds a file lock on the holding disk file (by parsing the output 
of lsof | grep prgname) vs. who is sleeping with no visible file 
locks.  If taper has none because the file it wants is not yet closed 
by gzip, then perhaps it should not be considered to have actually 
started?
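That lsof check could be sketched like this: list which commands have a given holding-disk file open, so "taper sleeping with nothing open" can be told apart from "taper actually writing". The path and the process names checked are hypothetical:

```python
# Sketch of the lsof idea above; the holding-disk path and the process
# names ("gzip", "taper") are hypothetical examples.
import subprocess

def holders(path):
    """Return the set of command names that have `path` open, per lsof."""
    try:
        out = subprocess.run(["lsof", path], capture_output=True,
                             text=True, check=False)
    except FileNotFoundError:
        return set()                           # lsof not installed
    commands = set()
    for line in out.stdout.splitlines()[1:]:   # skip lsof's header row
        fields = line.split()
        if fields:
            commands.add(fields[0])            # COMMAND column
    return commands

holding_file = "/amanda/holding/20031229/somehost._home.1"  # hypothetical
who = holders(holding_file)
if "gzip" in who and "taper" not in who:
    print("taper is still waiting on gzip; don't count it as started yet")
```

A status tool polling this per holding file could then defer taper's start time until taper itself shows up as a holder.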

This could be alleviated if taper did not record its start time until 
it has successfully opened the file; however, that's just a SWAG on 
my part, I have not walked through that code.  That's a question 
Jean-Louis could probably answer immediately; otherwise, walk through 
the taper code and see.

Just a few thoughts.  Possibly wrong...

>The ratio of tape speeds is nearly 1:9; the tendency is that larger
>files tape faster: looking only at sizes > 500 MB, the ratio is only
>1:3.6.
>
>Thus estimating tape time/percentage remains _very_ rough IMO
>without modifying Amanda to write taper progress to the amdump file
>when taping from the holding disk.
>
>When taping directly from a partition, the situation might be much
> better: the mentioned estimated dump times for those partitions
> already account for non-streaming tapes (with fast tapes) and also
> for slow tapes, where the dumper must wait for the tape.  This is
> what I believe, though I haven't verified it enough to be sure.
>
>regards
>
>Georg

-- 
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz  512M
99.22% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.

