ADSM-L

Re: Using the DBI::TSM module, memory problem

2006-12-04 14:06:13
Subject: Re: Using the DBI::TSM module, memory problem
From: "Allen S. Rout" <asr AT UFL DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 4 Dec 2006 14:05:24 -0500
>> On Mon, 4 Dec 2006 19:46:07 +0100, Hans Christian Riksheim <HCR AT STERIA 
>> DOT NO> said:

> I essentially want to "stream" the output of the sql through a
> filter while it is returned from the TSM-server, not glob it up in
> memory first and then process it later. Have not found out yet with
> DBI.

And you won't.

TSM::Function.pm:

    138 sub tsm_execute {
    139     my ($sth,$statement)=@_;
    140
    [...]
    160     while (<$ch>) {
    161         $errstr.=$_ if m/^[A-Z][A-Z][A-Z]\d\d\d\d[^I]/;
    162         chomp;

In Perl, that means it's going to read until it's done.  I've noodled
around a bit thinking about ways to buffer or stall that reading, but
I can't guaruntee that you won't just have the entire stream pile up
somewhere else in the system;  will the server stop the SELECT process
just becaus the thing listening to dsmadmc's STDOUT has stalled? Ew.

My recommendation is that, if you're going to do big dataset stuff,
don't bother with PERL for your primary analysis (and I'm a perl
addict, BTW).  do something like


dsmadmc -dataonly=yes -comma | awk  [...]

Many of the questions I ask of my server come down to

dsmadmc | cut [somehow] | sort | uniq -c | sort -rn

which streams nicely.  Or you can even

dsmadmc | bzip2 -c > tempfile

and then mess with that.  Remember, database selects tend to compress
VERY nicely.



> Some database guy suggested use of cursors which make it possible to
> fetch one row at a time and have the database keep track of your
> progress, but I don't know the first thing about cursors and I don't
> think the TSM database is designed for those needs anyway.

Agreed. The ODBC path you're already working at is probably a better
solution for the big queries.


- Allen S. Rout