Amanda-Users

Re: hardware vs software compression (was Re: amflush/amcheck not in sync?)

2003-04-25 11:11:56
Subject: Re: hardware vs software compression (was Re: amflush/amcheck not in sync?)
From: Paul Bijnens <paul.bijnens AT xplanation DOT com>
To: smt AT corning DOT com, Russell Adams <RLAdams AT kelsey-seybold DOT com>
Date: Fri, 25 Apr 2003 17:06:59 +0200


I know nothing about VMS, but I believe some version of Perl runs
on it.  I have attached a Perl script that I used to measure
the throughput in different cases.  This script was the idea that
led to the implementation of the "-c" option in the latest amtapetype.

The documentation is at the beginning of the file; you can also type
"gendata -h" to see the options.

Run it a few times with a scratch tape and compare the results:

        gendata -v -n 200    > /your/tape/device-nohwc
        gendata -v -n 200 -C > /your/tape/device-nohwc
        gendata -v -n 200    > /your/tape/device-withhwc
        gendata -v -n 200 -C > /your/tape/device-withhwc

(You may also want to run it with "-c" instead of "-C", and/or increase
the size "-n 200" if your computer is very fast.)
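
If you want to script the runs, something like the following could do it.
This is only a sketch: run_cases and the GENDATA variable are names made up
for this example, and it assumes gendata is on your PATH and that the device
path you pass in is a scratch tape.

```shell
# Sketch only: GENDATA and the device path are placeholders/assumptions.
# Runs gendata once per data type (random, -c, -C) against one device;
# gendata's own throughput summary still goes to stderr.
run_cases() {
    dev=$1
    size=${2:-200}                      # Mbyte per run, as in "-n 200"
    for flag in "" -c -C; do
        echo "== gendata -v -n $size $flag > $dev" >&2
        # $flag is deliberately unquoted so the empty case adds no argument
        ${GENDATA:-gendata} -v -n "$size" $flag > "$dev" || return 1
    done
}
```

For example, "run_cases /your/tape/device-nohwc" followed by
"run_cases /your/tape/device-withhwc" covers both device settings.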

For comparison, you can also run it without sending the data to the
tape device, to see how fast it could be:

        gendata -v -n 200 | cat > /dev/null
        gendata -v -n 200 | gzip | cat > /dev/null
        gendata -v -n 200 -c | gzip | cat > /dev/null
        gendata -v -n 200 -C | gzip | cat > /dev/null
        gendata -v -n 200 -C | gzip > /your/tape-withorwithout-hwc
        gendata -v -n 200 | wc
    etc.
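
The usage text in the script quotes compression figures for compress only.
To check what gzip makes of each data type, the saving can be measured
without a tape.  A minimal sketch, assuming a shell with mktemp, gzip and wc
available; gzip_saving is a name made up for this example:

```shell
# Print the percentage gzip saves on the data read from stdin.
# The input is buffered in a temporary file, so keep the sample small.
gzip_saving() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    insz=$(wc -c < "$tmp")                  # bytes read
    outsz=$(gzip -c < "$tmp" | wc -c)       # bytes after gzip
    rm -f "$tmp"
    if [ "$insz" -eq 0 ]; then
        echo "gzip_saving: no input" >&2
        return 1
    fi
    # integer percent; about 0 (or slightly negative) for random data
    echo $(( (insz - outsz) * 100 / insz ))
}
```

For example, "gendata -q -n 10 -C | gzip_saving" prints the saving for the
very compressible data, and leaving out "-C" shows the random case.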

The above is Unix syntax and programs, as you probably already guessed.
I don't know what you have to do (or whether it is possible) to get
the equivalent of output redirection and pipes on VMS.  Maybe it's
just a starting point.  Feel free to modify the script, or throw it
in /dev/null.

--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit,  ZZ, :q, :q!,  M-Z, ^X^C,  logoff, logout, close, bye,  /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* kill -9 1,  Alt-F4,  Ctrl-Alt-Del,  AltGr-NumLock,  Stop-A,  ...    *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out          *
***********************************************************************

#!/usr/bin/perl

# (c) 1996 Paul Bijnens
# This program may be freely distributed.

# generate a stream of random characters that cannot be compressed
#       (compress will do about -32% on the output stream)
# give feedback


# bug: even without opt_n it will stop after MAXINT Mbyte

# When using perl 4, yes this program is that old :-)
#   do "getopts.pl" || die "Can't include getopts.pl";
#   do Getopts('qvcCn:b:')  ||  die <<'EOS';


use Getopt::Std;


getopts('qvcCn:b:')  ||  die <<'EOS';
Usage: gendata [-vcCq] [-b #] [-n #]
  -q    be quiet, no summary at end
  -v    verbose - give feedback every 10MB (time & bytes)
  -c    generate compressible data (compress: about 42% compression)
  -C    generate very compressible data (compress: about 85% compression)
  -b #u output block size (def. 5k, max 1Mbyte); unit can be empty, "k" or "M"
  -n #  stop after # Mbyte of output
EOS

$lagspan = 5;

$n = int($opt_n);

$obs = 0;
if ($opt_b) {
    ($obs, $unit) = ($opt_b =~ /^([\d.]+)(.*)/);
    if ($unit eq "") {
        ; # ok
    } elsif ($unit =~ /^[kK]$/) {
        $obs *= 1024;
    } elsif ($unit =~ /^[mM]$/) {
        $obs *= 1024 * 1024;
    } else {
        die "$0: Bad unit in output block size\n";
    }
}

$obs = 5 * 1024  if ($obs <= 0);

#
# generate some data in $buffer
#

srand();

for ($i = 0; $i < 32*1024; $i++) {
    $buf1 .= chr(rand(256));
    $buf2 .= chr(rand(256)) unless ($opt_C);
    $buf3 .= chr(rand(256)) unless ($opt_C);
    $buf4 .= chr(rand(256)) unless ($opt_C);
}

if ($opt_C) {
    $buf = unpack("b*", $buf1);
} elsif ($opt_c) {
    $buf = unpack("h*", $buf1 . $buf3 . $buf2 . $buf4);
} else {
    $buf = $buf1 . $buf3 . $buf2 . $buf4;
    $buf .= $buf;
}

undef $buf1;
undef $buf2;
undef $buf3;
undef $buf4;

$buf .= $buf;
$buf .= $buf;

$obs = length($buf)  if ($obs > length($buf));  # Max 1 Mbyte now

$buf .= substr($buf, 0, $obs);  # make sure we can write complete blocks
                                # in 1 syswrite call

#
# look at the clock and start emitting data as fast as we can
#

$starttime = time;

unless ($opt_q) {
    $SIG{'INT'} = 'atend';
    $SIG{'PIPE'} = 'atend';
    $SIG{__DIE__} = 'atend';
}

# loop 10MB at a time

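$s = 1024 * 1024;
while (1) {
    # keep the timestamps of the last $lagspan rounds for the moving average
    shift(@times)  if (scalar(@times) >= $lagspan);
    push(@times, time);
    &progress  if ($opt_v);
   
    for (1..10) {
        # $s is the offset into $buf; carry any overshoot into the next Mbyte
        $s -= 1024 * 1024;
        do {
            syswrite(STDOUT, $buf, $obs, $s) || die("$!\n");
        } while (($s += $obs) < 1024 * 1024);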
        if (--$n == 0) {
            &atend  unless($opt_q);
            exit 0;
        }
    }
}

&atend  unless $opt_q;
exit;


#######################################################################


#
# show some progress on the screen
#
sub progress {
    $lag = $times[$#times] - $times[0];
    # ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
    ($sec,$min,$hour) = localtime(time);
    printf STDERR "%.2d:%.2d:%.2d %6dM  -  %6.3f MB/s\n",
                $hour,$min,$sec, ($opt_n - $n),
                ($lag == 0) ? 0 : 10 * ($#times) / $lag ;
                #($lag == 0) ? 0 : 10 * (scalar(@times)-1) / $lag ;
}


#
# show final statistics
#
sub atend {

    &progress  if ($opt_v);

    $elapsedtime = time - $starttime;

    printf STDERR "Total bytes output: %dMB%s\n",
        ($opt_n - $n),
        $opt_C ? " (highly compressible)": ($opt_c?" (compressible)":"");
    printf STDERR "Elapsed time: %dh %dm %ds - average throughput %.3f MB/s\n",
        int($elapsedtime / 3600),
        int(($elapsedtime / 60) % 60),
        int($elapsedtime % 60),
        ($elapsedtime == 0) ? 0 : ($opt_n - $n) / $elapsedtime;
    exit;
}