ADSM-L

Re: Using the DBI::TSM module, memory problem

2006-12-05 15:36:29
Subject: Re: Using the DBI::TSM module, memory problem
From: Laurent Bendavid <bendavid.laurent AT FREE DOT FR>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 5 Dec 2006 21:35:58 +0100
Allen S. Rout a écrit :
On Mon, 4 Dec 2006 19:46:07 +0100, Hans Christian Riksheim <HCR AT STERIA DOT 
NO> said:

I essentially want to "stream" the output of the sql through a
filter while it is returned from the TSM-server, not glob it up in
memory first and then process it later. Have not found out yet with
DBI.

And you won't.

TSM::Function.pm:

    138 sub tsm_execute {
    139     my ($sth,$statement)=@_;
    140
    [...]
    160     while (<$ch>) {
    161         $errstr.=$_ if m/^[A-Z][A-Z][A-Z]\d\d\d\d[^I]/;
    162         chomp;

In Perl, that means it's going to read until it's done.  I've noodled
around a bit thinking about ways to buffer or stall that reading, but
I can't guaruntee that you won't just have the entire stream pile up
somewhere else in the system;  will the server stop the SELECT process
just becaus the thing listening to dsmadmc's STDOUT has stalled? Ew.

My recommendation is that, if you're going to do big dataset stuff,
don't bother with PERL for your primary analysis (and I'm a perl
addict, BTW).  do something like


dsmadmc -dataonly=yes -comma | awk  [...]

Many of the questions I ask of my server come down to

dsmadmc | cut [somehow] | sort | uniq -c | sort -rn

which streams nicely.  Or you can even

dsmadmc | bzip2 -c > tempfile

and then mess with that.  Remember, database selects tend to compress
VERY nicely.



Some database guy suggested use of cursors which make it possible to
fetch one row at a time and have the database keep track of your
progress, but I don't know the first thing about cursors and I don't
think the TSM database is designed for those needs anyway.

Agreed. The ODBC path you're already working at is probably a better
solution for the big queries.


- Allen S. Rout



Hello,
I had two problems when I wrote this module.

First, some people use this module on Windows platform. So, I store all the result of dsmadmc command in memory with qx() statement.

Second, DBI need to return the dataset as a memory perl structure. For instance, for AIX, il you need to retrieve more than 256Mo you could get few problems.

It's not a Perl limitation but the or my implementation for DBI (any good Idea is accepted).

In Perl, you could do it as is (only on UNIX) :

my $cmd = "dsmadmc ....";

open my $cmd_h, "$cmd |" or die "Cannot fork: $!\n";
while (<$cmd_h>) {
   # You have one line in $_
   # Do what you want with $_ but don't store it in memory
   # write result in an other file... print it to STDOUT, ...
}
close $cmd_h or die "Error in '$cmd' command: $!\n";