ADSM-L

Re: [ADSM-L] SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA

2016-04-29 15:54:04
Subject: Re: [ADSM-L] SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA
From: Roger Deschner <rogerd AT UIC DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 29 Apr 2016 14:51:52 -0500
This is pretty much what I do as well. A huge advantage in doing it this
way, at the filespace level, is that it is MUCH faster than counting
individual files. We have about 1 billion files divided between 3 TSM
servers, compared to only several thousand filespaces.

I call it CHURN rather than ABUSE. Some servers and some applications
are going to have a high churn ratio (occupancy/filespace), with a lot
of inactive versions as a rule, such as email servers. The ratio can be
very high (5+) and still be reasonable for an email server's mail
storage filespace. Our email servers are in a management class where the
copy groups have a higher RETEXTRA setting, so when a user calls and
says "Thunderbird ate my inbox! It happened last week. Help!" I can
restore it for them from the inactive versions. OTOH, the ratio for a
typical Windows PC client workstation should be around 0.6 to 0.8, and
for a relatively static file server should be close to 1.0. I study
these numbers a lot, and the sample numbers Maurice has below are in the
same range that I see. I only get suspicious when ((occupancy /
filespace) > (RETEXTRA / 2)). YMMV.

I may use your SELECT as being much simpler than what I do to arrive at
the same number. Thanks, Maurice van 't Loo!

Roger Deschner      University of Illinois at Chicago     rogerd AT uic DOT edu
======I have not lost my mind -- it is backed up on tape somewhere.=====


On Thu, 28 Apr 2016, Maurice van 't Loo wrote:

>Hello Gary,
>
>Just guessing the actual reason, it might be that they want to know the
>amount of TSM storage compared with the amount of storage on the clients.
>While counting the number of objects doesn't give you much information,
>maybe it's best to compare the filespaces with occupancy. That is what I
>call the "abuse factor"
>
>NODE
>FS
>FS_GB              OCC_GB                 ABUSE
>-------------------------------
>-------------------------------------------------------------
>-----------------     ---------------     -----------------
>XXXXXXXXXXXXXXXXXXXXXX
>/csminstall/AIX/images
>35.06              149.36                   4.2
>XXXXXXXXXXXXXXXXXXX
>/nim/aix54
>39.13               39.15                   1.0
>XXXXXXXXXXXXXXXX
>/csminstall/AIX/aix610
>37.07               36.99                   0.9
>XXXXXXXXXXXXXXXX
>/home/dvpt
>14.07               35.39                   2.5
>XXXXXXXXXXXXX
>/build
>24.28               33.24                   1.3
>XXXXXXXXXXXXXXXX
>/db2/dvptdb
>1.57               30.48                  19.3
>XXXXXXXXXXXXXXXXXX
>/home/dvpt
>30.57               30.42                   0.9
>XXXXXXXXXXXXXXXXX
>/csminstall/AIX/products
>25.65               25.77                   1.0
>XXXXXXXXXXXXXX
>/build
>22.53               21.89                   0.9
>XXXXXXXXXXXX
>/build
>22.53               21.89                   0.9
>
>Copy/paste this in mono-type font (notepad) to get it better readable.
>I use this to hunt for missing excludes (mssql databases not excluded) but
>I can also use it to calculate how much space active and inactive I roughly
>have. Of course filespaces with excludes gives some mismatch.
>
>SQL used for above output:
>select cast(substr(f.NODE_NAME,1,30) as char(30)) as
>NODE,cast(substr(f.FILESPACE_NAME,1,30) as char(30)) as
>FS,dec(f.CAPACITY*f.PCT_UTIL/100/1024,14,2) as
>FS_GB,dec(sum(o.PHYSICAL_MB)/1024,12,2) as
>OCC_GB,dec(dec(sum(o.PHYSICAL_MB),14,1)/dec(f.CAPACITY*f.PCT_UTIL/100,16,1),14,1)
>as ABUSE from filespaces as f,occupancy as o where f.NODE_NAME=o.NODE_NAME
>and f.FILESPACE_NAME=o.FILESPACE_NAME and f.CAPACITY>0 and f.PCT_UTIL>0 and
>o.STGPOOL_NAME in (select stgpool_name from stgpools where
>pooltype='PRIMARY') and o.TYPE='Bkup' group by
>o.NODE_NAME,o.FILESPACE_NAME,f.NODE_NAME,f.FILESPACE_NAME,f.CAPACITY,f.PCT_UTIL
>order by 4 desc fetch first 10 rows only
>
>For all nodes ordered by nodename:
>select cast(substr(f.NODE_NAME,1,30) as char(30)) as
>NODE,cast(substr(f.FILESPACE_NAME,1,30) as char(30)) as
>FS,dec(f.CAPACITY*f.PCT_UTIL/100/1024,14,2) as
>FS_GB,dec(sum(o.PHYSICAL_MB)/1024,12,2) as
>OCC_GB,dec(dec(sum(o.PHYSICAL_MB),14,1)/dec(f.CAPACITY*f.PCT_UTIL/100,16,1),14,1)
>as ABUSE from filespaces as f,occupancy as o where f.NODE_NAME=o.NODE_NAME
>and f.FILESPACE_NAME=o.FILESPACE_NAME and f.CAPACITY>0 and f.PCT_UTIL>0 and
>o.STGPOOL_NAME in (select stgpool_name from stgpools where
>pooltype='PRIMARY') and o.TYPE='Bkup' group by
>o.NODE_NAME,o.FILESPACE_NAME,f.NODE_NAME,f.FILESPACE_NAME,f.CAPACITY,f.PCT_UTIL
>order by 1
>
>Regards,
>Maurice van 't Loo
>
>http://mvantloo.nl/maupack.php
>Personal pack of selects (in scripts)
>
>
>2016-04-22 16:35 GMT+02:00 Schneider, Jim <JSchneider AT essendant DOT com>:
>
>> You can also start the session with -virtualnodename.  It works with dsmc
>> or dsmj, and avoids the need for the proxy setting.
>>
>> Jim Schneider
>> Essendant
>>
>> -----Original Message-----
>> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf 
>> Of
>> Skylar Thompson
>> Sent: Friday, April 22, 2016 9:30 AM
>> To: ADSM-L AT VM.MARIST DOT EDU
>> Subject: Re: [ADSM-L] SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA
>>
>> You can also GRANT PROXY and then use -ASNODE from one of your own nodes,
>> using your node's password. I think the general node type has to match
>> (i.e. any UNIX can proxy to any UNIX, but not Windows).
>>
>> On Fri, Apr 22, 2016 at 02:20:38PM +0000, Schneider, Jim wrote:
>> > Use a server you can access and modify the nodename in the options file,
>> assuming you know the password.
>> >
>> > Jim Schneider
>> > Essendant
>> >
>> > -----Original Message-----
>> > From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On 
>> > Behalf
>> > Of Lee, Gary
>> > Sent: Friday, April 22, 2016 9:11 AM
>> > To: ADSM-L AT VM.MARIST DOT EDU
>> > Subject: Re: [ADSM-L] SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA
>> >
>> > Wish I could do that.  This comes from three levels above me in
>> management.
>> > Trying to buy more storage to sell to departments.
>> > Don't ask me, I have no clue what they are doing.
>> >
>> > I'll look into the q backup on client side, but don't have access to all
>> of them.
>> >
>> >
>> > -----Original Message-----
>> > From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On 
>> > Behalf
>> > Of Skylar Thompson
>> > Sent: Friday, April 22, 2016 10:00 AM
>> > To: ADSM-L AT VM.MARIST DOT EDU
>> > Subject: Re: [ADSM-L] SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA
>> >
>> > If you have access to the clients, you can use QUERY BACKUP and parse
>> the A/I column.
>> >
>> > Honestly, though, when we've gotten this query, I've managed to push
>> this back on the customers; it's not TSM's problem what's active or
>> inactive, it's the customers' applications that are actually responsible
>> for it.
>> > Obviously you need a pretty good relationship with your customers to
>> make that case, but in the end it's caused our customers to think more
>> carefully about workflow in general.
>> >
>> > On Fri, Apr 22, 2016 at 01:51:20PM +0000, Lee, Gary wrote:
>> > > Just got a request for the amount of active versus inactive data on
>> our tsm servers.
>> > >
>> > > Is there a better way than traversing the backups table and summing?
>> > > That would be a mighty long query.
>> > >
>> > > We have three servers, and approximately 300 clients about 200 tB
>> total data.
>> > >
>> > > Thanks for any suggestions.
>> >
>> > --
>> > -- Skylar Thompson (skylar2 AT u.washington DOT edu)
>> > -- Genome Sciences Department, System Administrator
>> > -- Foege Building S046, (206)-685-7354
>> > -- University of Washington School of Medicine
>> >
>> > **********************************************************************
>> > Information contained in this e-mail message and in any attachments
>> thereto is confidential. If you are not the intended recipient, please
>> destroy this message, delete any copies held on your systems, notify the
>> sender immediately, and refrain from using or disclosing all or any part of
>> its content to any other person.
>>
>> --
>> -- Skylar Thompson (skylar2 AT u.washington DOT edu)
>> -- Genome Sciences Department, System Administrator
>> -- Foege Building S046, (206)-685-7354
>> -- University of Washington School of Medicine
>>
>