Veritas-bu

[Veritas-bu] Monitoring perfomance at the buffer level

2004-08-12 08:59:46
From: dave.markham AT icl DOT net (Dave Markham)
Date: Thu, 12 Aug 2004 13:59:46 +0100
Ed Wilts wrote:

>On Tue, Aug 10, 2004 at 03:59:35PM -0600, Mark.Donaldson AT cexp DOT com wrote:
>
>>Here's a quick and dirty script that sweeps the bptm logs on a media server
>>for a supplied policy name and reports the "fill_buffer, waiting on empty
>>buffer" and "write_backup, waiting on full buffer" statistics.
>>
>>Output looks like this:
>>
>>  > policy_perf Hot_PRD
>>## Gathering data..........Done.
>>## Write to buffer waiting on available buffer:
>>Min: 0  Avg: 356  Max: 5877 with 285 samples
>>
>>## Write to tape waiting on full buffer:
>>Min: 0  Avg: 43373  Max: 290583 with 7 samples
>>
>
>I've added a section to optionally pass in a date so I can go back
>through previous days' logs. Here's a sample:
>
>[root@osiris ewilts]# ./perf.sh osiris-vpn 081004
>Using /usr/openv/netbackup/logs/bptm/log.081004
>## Gathering data.................................................Done.
>## Write to buffer waiting on available buffer:
>Min: 0  Avg: 1216  Max: 42479 with 150 samples
>
>## Write to tape waiting on full buffer:
>Min: 317  Avg: 34382  Max: 157723 with 48 samples
>
>>If the Write to Buffer is waiting for an available empty buffer a whole
>>bunch, then perhaps you should increase your buffer count.  If your tape
>>writing process is waiting on a full buffer a lot, then you're starving your
>>tape drives and you should find a way to increase the delivery of client
>>data to your media server or increase your multiplexing factor.
>>
>
>So what's a "whole bunch"?  Is what I'm seeing an issue I should deal
>with?  Don't things like incrementals really slow down the tape
>processing?
>
>Can it be broken down by host instead of by policy?  Having multiple
>hosts per policy would make it difficult to target a system to fix.
>There's also the minor issue of not knowing which hosts or policies even
>have buffer messages in bptm.  The script is an excellent start though.
>
>My overall issue is that although we have GigE connections between many
>hosts and the media servers, and we are trying to drive 8 SDLT220 drives
>in an L700, we almost never exceed 11 MB/s of traffic coming into the
>media servers. It's like there's a cap there that we just haven't been
>able to remove.
>
>Thanks,
>        .../Ed
>
>  
>
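The Min/Avg/Max lines in the quoted output can be produced from any list of wait counts. The following is only a minimal sketch of that summary step, not the code from the original script: the function name `summarize` is made up, and how the counts are extracted from the bptm log is deliberately left out.

```shell
#!/bin/sh
# summarize: read one non-negative integer per line on stdin and print a
# Min/Avg/Max line in the same layout as the quoted output.
# Hypothetical helper; pulling the counts out of the bptm log is not shown.
summarize() {
    awk '
        NR == 1 { min = $1; max = $1 }
        {
            sum += $1
            if ($1 < min) min = $1
            if ($1 > max) max = $1
        }
        END {
            if (NR == 0) { print "No samples"; exit }
            # %d truncates the average to a whole number
            printf "Min: %d  Avg: %d  Max: %d with %d samples\n", min, sum / NR, max, NR
        }'
}

# Example with made-up wait counts:
printf '0\n10\n50\n' | summarize
```

On Solaris, `nawk` may be needed in place of `awk`.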
I just saw this script at the beginning of this thread. My log file for 
today has no policy info yet, so I quickly added a dirty way to check 
yesterday's log with a -y flag:

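policy=$1
yesterday=${2:-undef}   # optional second argument; "-y" selects yesterday's log


today=`date +%m%d%y`
TMPFILEf=/tmp/`basename $0`.tmp.f
TMPFILEw=/tmp/`basename $0`.tmp.w

# Remove temp files left over from a previous run
rm -f $TMPFILEf $TMPFILEw

if [ "$yesterday" = "-y" ]
then
        # The log suffix is mmddyy, so "expr $today - 1" would decrement the
        # year digits, not the day. Compute the date 24 hours ago instead.
        # Assumes perl with the POSIX module is installed on the media server.
        today=`perl -MPOSIX -e 'print strftime("%m%d%y", localtime(time - 86400))'`
fi

echo "## Gathering data.\c"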
...............script continues
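
Because the bptm log suffix is mmddyy, subtracting 1 from it as a plain number changes the year digits rather than the day, so stepping back one day needs calendar arithmetic. Here is a minimal sketch of that in shell and awk only. The function name `prev_day` is made up, and two-digit years are assumed to mean 20YY, so every year divisible by 4 is treated as a leap year.

```shell
#!/bin/sh
# prev_day MMDDYY: print the previous calendar day in the same format.
# Hypothetical helper. Assumes years 2000-2099, where "divisible by 4"
# is a sufficient leap-year test.
prev_day() {
    echo "$1" | awk '{
        mm = substr($0, 1, 2) + 0
        dd = substr($0, 3, 2) + 0
        yy = substr($0, 5, 2) + 0
        split("31 28 31 30 31 30 31 31 30 31 30 31", days, " ")
        if (yy % 4 == 0) days[2] = 29
        dd--
        if (dd < 1) {
            # Roll back into the previous month, and the previous year if needed
            mm--
            if (mm < 1) { mm = 12; yy = (yy + 99) % 100 }
            dd = days[mm]
        }
        printf "%02d%02d%02d\n", mm, dd, yy
    }'
}

prev_day 081204    # 081104
prev_day 030104    # 022904 (2004 is a leap year)
prev_day 010104    # 123103
```

The result could be used as the log suffix, for example `log.` followed by the output of `prev_day $today`. On Solaris, `nawk` may be needed in place of `awk`.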



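On the 11 MB/s ceiling mentioned in the quoted message: a 100 Mbps link tops out at 12.5 MB/s before protocol overhead, so a gigabit interface that has negotiated down to 100 Mbps would look much like that cap. This is only one possible cause. The arithmetic, as a small sketch (the function name `link_mbytes` is made up for illustration):

```shell
#!/bin/sh
# link_mbytes MBPS: print the theoretical ceiling of a link in MB/s,
# ignoring all protocol overhead (8 bits per byte).
# Hypothetical helper, for illustration only.
link_mbytes() {
    awk -v mbps="$1" 'BEGIN { printf "%.1f\n", mbps / 8 }'
}

link_mbytes 100     # 12.5
link_mbytes 1000    # 125.0
```
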
In answer to your capping question: have you checked the gigabit ndd 
settings for the device? This may help.

This is nicked from a script I wrote to set device parameters 
automatically at boot, or to check them when run manually.

do_check_ge()
{
        DEV=$1
        INST=$2

        # Select the instance that the following ndd queries apply to
        ndd -set $DEV instance ${INST}
        echo "+-------------------+"
        if [ "`ndd $DEV link_status`" = 0 ];then
                echo "$DEV${INST} status is down"
        else
                echo "$DEV${INST} status is up"
        fi

        if [ "`ndd $DEV link_speed`" = 1000 ];then
                echo "$DEV${INST} link speed 1000 Mbps"
        else
                echo "$DEV${INST} link speed is not 1000 Mbps"
        fi

        if [ "`ndd $DEV link_mode`" = 0 ];then
                echo "$DEV${INST} link mode Half-Duplex"
        else
                echo "$DEV${INST} link mode Full-Duplex"
        fi

        if [ "`ndd $DEV adv_1000autoneg_cap`" = 0 ];then
                echo "$DEV${INST} Auto-Negotiation-OFF"
        else
                echo "$DEV${INST} Auto-Negotiation-ON"
        fi
        if [ "`ndd $DEV adv_pauseTX`" = 0 ];then
                echo "$DEV${INST} Transmit PAUSE Not Capable(default)"
        else
                echo "$DEV${INST} Transmit PAUSE Capable"
        fi
        if [ "`ndd $DEV adv_pauseRX`" = 0 ];then
                echo "$DEV${INST} Receive PAUSE Not Capable"
        else
                echo "$DEV${INST} Receive PAUSE Capable(default)"
        fi
}

do_set_ge()
{
        DEV=$1
        INST=$2
        ndd -set $DEV instance ${INST}
        echo "Setting $DEV${INST} adv_1000autoneg_cap 0"
        ndd -set $DEV adv_1000autoneg_cap 0
        echo "Setting $DEV${INST} adv_1000fdx_cap 1"
        ndd -set $DEV adv_1000fdx_cap 1
        echo "Setting $DEV${INST} adv_1000hdx_cap 0"
        ndd -set $DEV adv_1000hdx_cap 0
        echo "Setting $DEV${INST} adv_pauseTX 0"
        ndd -set $DEV adv_pauseTX 0
        echo "Setting $DEV${INST} adv_pauseRX 1"
        ndd -set $DEV adv_pauseRX 1
}

# Workings

case "$1" in
'check')

### Ge gigabit interface different
        GE_=`nawk '$NF == "\"ge\"" {print $2}' /etc/path_to_inst | uniq`
        if [ "$GE_" != "" ];then
                for x in ${GE_};do
                        do_check_ge /dev/ge $x
                done

                ANS=`ckyorn -p "Do you want to force all nics 1000 Mbps, Full-Duplex, Auto Negotiation off?~"`
                if [ "$ANS" = y ] || [ "$ANS" = Y ] || [ "$ANS" = YES ] || [ "$ANS" = yes ];then
                        echo "Setting Interfaces"
                        for x in ${GE_};do
                                do_set_ge /dev/ge $x
                        done
                fi
        fi
;;

'start')

### Ge gigabit interface different
        GE_=`nawk '$NF == "\"ge\"" {print $2}' /etc/path_to_inst | uniq`
        if [ "$GE_" != "" ];then
                for x in ${GE_};do
                        do_set_ge /dev/ge $x
                done
        fi

;;

*)
        echo "Usage: $0 { check | start }"
        exit 1
esac
exit 0
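
The nawk line in the script above picks the ge instance numbers out of /etc/path_to_inst, where each line holds a quoted physical device path, an instance number, and a quoted driver name. A minimal sketch of that step run against sample input follows. The device paths are made up for illustration, the function name `ge_instances` is not from the script, and plain `awk` stands in for `nawk`.

```shell
#!/bin/sh
# ge_instances: read path_to_inst-style lines on stdin and print the
# instance number of every line whose last field is "ge" (quotes included).
# Hypothetical wrapper around the same awk test the script uses.
ge_instances() {
    awk '$NF == "\"ge\"" { print $2 }' | uniq
}

# Sample input; these device paths are invented for illustration.
ge_instances <<'EOF'
"/pci@1f,0/pci@1/network@1" 0 "ge"
"/pci@1f,0/pci@1/scsi@2" 0 "glm"
"/pci@1f,0/pci@2/network@1" 1 "ge"
EOF
```

With this sample input the function prints 0 and 1, one per line, which is the list the script then loops over.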



Thanks