Veritas-bu

Re: [Veritas-bu] Drive utilization in Netbackup

2009-08-14 12:26:48
Subject: Re: [Veritas-bu] Drive utilization in Netbackup
From: "Iverson, Jerald" <Jerald.Iverson AT invesco DOT com>
To: <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Fri, 14 Aug 2009 11:23:40 -0500
we've been running the "vmoprcmd -d ds" command cron'd in a script every
5 mins for over 6 years.  the script will also up a drive once. if the
drive goes down again then we need to look at it before it is up'd
again.  running it on 1 media server at each site shows the status of
all drives on all media servers at that site.  i can look at the output
in a semi-graphical fashion and get an indication of what the drives are
doing (B=backup, D=duplication, 0=catalog tape, d=down).  the scripts
are modified "mark" scripts (tracking & graphing drive utilization).  i
believe his graphing put the time horizontally, i like to have mine
scroll down (matrix-like):

                a a a a a a a a a a a a a a a a a a a a
date   time     0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
                0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
090814 06:45:01 B B - d - - - B B - - B B B B - B - - B
090814 06:50:02 B B - d - - - B B - - B B B B - B - - B
090814 06:55:01 B B - d - - - - B - - B B B B - - - - B
090814 07:00:01 B B - d - - - B B - - B B B B - B - - B
090814 07:05:01 B B D d B B D B B - B B B B B - B - - B
090814 07:10:01 B B D d B B D B B - B B B B B - B - - B
090814 07:15:01 B B D d B B D B B - B B B B B - B - - B
090814 07:20:01 B B D d B B D B B - B B B B B - B - - B
090814 07:25:01 B B D d - - - B B - B B B B B - B - - B
090814 07:30:01 B B - d - - - B B - B B B B B - B - - B
090814 07:35:01 B B - d - - - B B - B B B B B - B - - B
090814 07:40:01 B B - d - - - B B - B B B B B - B - - B
090814 07:45:01 B B - d - - - B B - B B B B B - B - - B
090814 07:50:01 B B - d - - - B B - B B B B B - B - - B
090814 07:55:01 B B - d - - - B B - B B B B B - B - - B
090814 08:00:01 B B - d - - - B B - B B B B B - B - - B
090814 08:05:01 B B - d - - B B B - B B B B B - B - - B
090814 08:10:01 B B - d - - B B B - B B B B B - B - - B
090814 08:15:01 B B - d - - B B B - B B B B B - B - - B
090814 08:20:01 B B - d 0 0 B B B - B B B B B - B - - B
090814 08:25:01 B B - d 0 - B B B - B B B B B - B - - B
090814 08:30:01 B B - d 0 - B B B - B B - B B - B - - B
090814 08:35:01 B - - d 0 - B B B - B B - B B - B - - B
090814 08:40:01 B - - d 0 - B B B - B B - B B - B - - B
090814 08:45:01 B - - d 0 - B B B - B B - B B - B - - B
090814 08:50:01 B - - d 0 - B B B - B B - B B - B - - B
090814 08:55:01 B - - d 0 - B B B - B B - B B - B - - B
090814 09:00:01 B - - d 0 - B B B - - B - B - - B - - B
090814 09:05:01 B - D d 0 - B B B - - B - B - - B - - B
090814 09:10:01 B - D d 0 - B B B - - B - B - - B - - B
090814 09:15:01 B - D d 0 - B B B - - B - B - - B - - B
090814 09:20:01 B - D d 0 B - B B - - B - B - - B - - B
090814 09:25:01 - - D d 0 B - B B - - B B B - - B - - B
090814 09:35:01 - - D d 0 B - B B - - B B B - - B - - -
090814 09:40:01 - B D d 0 B - B B - B B B B - - - - - -
090814 09:45:01 - B D d 0 B - B B - B B B B - - B - - -
                              3 f f       i r r 3 r r f
                              0 a a       s 2 2 0 2 2 i
                              5 s s       i d 0 5 0 0 l
                              0 9 9       l 2 0 0 0 0 e
                              a 6 6       o     b     0
                                0 0       n     /     3
                                b a       7     r      
                                                2      
                                                d      
                                                2      

i can also sum the usage to show management that we possibly need more
drives.  if backups should only run from 6pm-6am, then a drive shouldn't
be in use more than 50% of the time (12 of every 24 hours).  we have an
ndmp filer drive almost always in use:

start day: 090803, end day: 090810, start time: 18:00:00, end time 18:00:00.

             time pts                  drive used
drive        in range day:hr:mn       pts  day:hr:mn   %
adic_drv00      2,016   7:00:00        144   0:12:00  07%  
adic_drv01      2,016   7:00:00      1,385   4:19:25  68%  
adic_drv02      2,016   7:00:00      1,517   5:06:25  75%  
adic_drv03      2,016   7:00:00          0   0:00:00  00%  
adic_drv04      2,016   7:00:00      1,432   4:23:20  71%  
adic_drv05      2,016   7:00:00      1,791   6:05:15  88%  
adic_drv06      2,016   7:00:00      1,241   4:07:25  61%  
adic_drv07      2,016   7:00:00      1,859   6:10:55  92%    fas3050a
adic_drv08      2,016   7:00:00        994   3:10:50  49%    fas960b
adic_drv09      2,016   7:00:00      1,119   3:21:15  55%    fas960a
adic_drv10      2,016   7:00:00      1,296   4:12:00  64%  
adic_drv11      2,016   7:00:00      1,264   4:09:20  62%  
adic_drv12      2,016   7:00:00        453   1:13:45  22%  
adic_drv13      2,016   7:00:00        585   2:00:45  29%    isilon7
adic_drv14      2,016   7:00:00        635   2:04:55  31%    r2d2
adic_drv15      2,016   7:00:00        632   2:04:40  31%    r200
adic_drv16      2,016   7:00:00      1,108   3:20:20  54%    fas3050b / r2d2
adic_drv17      2,016   7:00:00        637   2:05:05  31%    r200
adic_drv18      2,016   7:00:00        624   2:04:00  30%    r200
adic_drv19      2,016   7:00:00        527   1:19:55  26%    file03
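
The summing step can be sketched roughly as below. This is an illustration, not the original script. It assumes the matrix rows shown earlier are the input, that every code other than "-" and "d" counts as "used", and that the percentage is truncated rather than rounded (the table shows 1,791 of 2,016 points as 88%). The function name and the interval_min parameter are hypothetical, and the start/end window filtering is left out.

```python
def summarize(lines, drives, interval_min=5):
    """Count sample points per drive and convert them to time and percent."""
    used = {d: 0 for d in drives}
    points = 0
    for line in lines:
        fields = line.split()
        # Data rows start with a numeric date (YYMMDD) and carry one code
        # per drive; header and footer label lines are skipped.
        if len(fields) != 2 + len(drives) or not fields[0].isdigit():
            continue
        points += 1
        for d, code in zip(drives, fields[2:]):
            if code not in ("-", "d"):  # assumption: idle and down are unused
                used[d] += 1
    report = {}
    for d in drives:
        minutes = used[d] * interval_min
        days, rem = divmod(minutes, 1440)
        hours, mins = divmod(rem, 60)
        pct = 100 * used[d] // points if points else 0  # truncated percent
        report[d] = (used[d], f"{days}:{hours:02d}:{mins:02d}", pct)
    return points, report
```

As a check against the table above: 1,385 points at 5 minutes each is 6,925 minutes, or 4:19:25, and 1,385 of 2,016 points truncates to 68%, which matches the adic_drv01 row.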

because we use linux for our media servers, every mount/unmount command
is logged in /var/log/messages for the media server that controls the
robot arm:
Aug 14 10:32:01 md1 tldcd[2698]: Processing UNMOUNT, TLD(0) drive 8, slot 313, barcode LB3008
Aug 14 10:32:23 md1 tldcd[2698]: Processing MOUNT, TLD(0) drive 8, slot 313, barcode LB3008
Aug 14 10:33:26 md1 tldcd[2698]: Processing UNMOUNT, TLD(0) drive 9, slot 441, barcode LB3537
Aug 14 10:33:48 md1 tldcd[2698]: Processing MOUNT, TLD(0) drive 9, slot 67, barcode 000029
Aug 14 10:35:56 md1 tldcd[2698]: Processing UNMOUNT, TLD(0) drive 9, slot 67, barcode 000029
Aug 14 10:36:30 md1 tldcd[2698]: Processing MOUNT, TLD(0) drive 9, slot 237, barcode LB3056

so i wrote another script to gather that data and calculate drive usage
at a more accurate level.  most %'s are pretty close to the 5 minute
cron gatherer script data.  a couple are off compared with the table
above because a drive was down on one media server and not another:

start day: 08/03/2009, time: 18:00:00, end day: 08/10/2009, time 18:00:00.
start: 1249340400, end: 1249945200, first: 1249203815, last: 1250186891.
drive        day:hr:mn:sec  day:hr:mn:sec   %
adic_drv00     7:00:00:00     2:15:47:54  37%  
adic_drv01     7:00:00:00     5:12:01:03  78%  
adic_drv02     7:00:00:00     5:07:24:36  75%  
adic_drv03     7:00:00:00       00:00:00  00%  
adic_drv04     7:00:00:00     5:00:57:27  71%  
adic_drv05     7:00:00:00     6:05:27:05  88%  
adic_drv06     7:00:00:00     4:09:18:36  62%  
adic_drv07     7:00:00:00     6:10:50:26  92%    fas3050a
adic_drv08     7:00:00:00     3:12:02:02  50%    fas960b
adic_drv09     7:00:00:00     3:22:55:32  56%    fas960a
adic_drv10     7:00:00:00     4:16:25:26  66%  
adic_drv11     7:00:00:00     4:11:16:09  63%  
adic_drv12     7:00:00:00     5:06:48:56  75%  
adic_drv13     7:00:00:00     2:02:22:25  29%    isilon7
adic_drv14     7:00:00:00     2:05:36:55  31%    r2d2
adic_drv15     7:00:00:00     2:06:19:19  32%    r200
adic_drv16     7:00:00:00     3:23:25:58  56%    fas3050b / r2d2
adic_drv17     7:00:00:00     2:07:00:43  32%    r200
adic_drv18     7:00:00:00     2:05:33:14  31%    r200
adic_drv19     7:00:00:00     1:21:01:04  26%    file03
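
A minimal sketch of the log-based calculation, assuming the tldcd lines look exactly like the /var/log/messages samples above (one entry per line), is shown here. It is not the original script. The regular expression, the function name and the year parameter are assumptions (syslog timestamps carry no year, and the month abbreviation is parsed with an English locale). It simply adds up the seconds between each MOUNT and the next UNMOUNT on the same robot and drive.

```python
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ tldcd\[\d+\]: "
    r"Processing (MOUNT|UNMOUNT), TLD\((\d+)\) drive (\d+),"
)

def mounted_seconds(lines, year):
    """Sum the seconds each (robot, drive) spent between MOUNT and UNMOUNT."""
    mounted_at = {}  # (robot, drive) -> time of the still-open MOUNT
    totals = {}      # (robot, drive) -> total seconds mounted
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a tldcd MOUNT/UNMOUNT line
        stamp, action, robot, drive = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        key = (int(robot), int(drive))
        if action == "MOUNT":
            mounted_at[key] = when
        elif key in mounted_at:
            # UNMOUNT closes the open MOUNT; an UNMOUNT with no MOUNT is ignored.
            delta = when - mounted_at.pop(key)
            totals[key] = totals.get(key, 0) + int(delta.total_seconds())
    return totals
```

On the six sample lines above this gives 128 seconds for drive 9 (mounted 10:33:48, unmounted 10:35:56). Mounts still open at the end of the log are not counted, and clipping to a start/end window is left out of the sketch.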

i'd recommend starting to gather the data.  it doesn't cost anything.
then you can figure out how you want to display it, analyze it, and show
trends.



-----Original Message-----
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of
judy_hinchcliffe AT administaff DOT com
Sent: Friday, August 14, 2009 9:34 AM
To: william.d.brown AT gsk DOT com; veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Drive utilization in Netbackup

Note: NOM does not seem able to do the drive use reports if the drives
are SSO-ed

-----Original Message-----
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of
william.d.brown AT gsk DOT com
Sent: Friday, August 14, 2009 3:10 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Drive utilisation in Netbackup

Well NOM can do this for you if your servers are 6.x.  It has a couple of
built-in reports, one for 'drives in use' (as in currently) and one for
'drive usage' for which you set a time frame, like 'last 24 hours' or
'last 7 days'.

I agree that I've seen people use DIY with vmoprcmd triggered at regular
intervals.  Also, some people use that just to 'UP' down drives
automatically every hour or so.  It would work for pre-6.x versions as
well, but is a bit more challenging than NOM if you have a number of
domains.

Another approach, if you use FC-connected drives, is to think about what
tools you have that can extract reports from the SAN switches.  If you use
Brocade you could run their free SAN Health tool for a limited period.
Once you work out which ports the drives are connected to you can make a
sensible extract from the Excel it generates.  What that gives, which is
otherwise hard to get, is the actual throughput at the fibre port, so it
doesn't matter how many jobs are MPX to the drive.

You can infer utilisation (as in used/idle) from seeing the peaks & 
troughs in the graphs.

Other framework products can also get this information using SNMP.

William D L Brown


veritas-bu-bounces AT mailman.eng.auburn DOT edu wrote on 12/08/2009 14:52:53:

> Hi Michael,
> 
> Funny you ask, I am looking at the same task.  It has been suggested in
> the list in the past, to run;
> 
> vmoprcmd -d ds
> 
> To check for AVR,TLD (or whether a tape is in the drive).
> 
> The problem here though is you would have to sample often on each media
> server.
> 
> Anyone else have any suggestions on this one?
> 
> Justin.
> 
> On Wed, 12 Aug 2009, michael.ketley AT orange-ftgroup DOT com wrote:
> 
> >
> > Does anyone have a useful script or a simple way of extracting 'drive
> > utilisation' from Netbackup.
> >
> > We use ACS, and up until recently could extract the information using
> > Bocada reporting, unfortunately this has stopped working.
> >
> > Not after an exact science but some suggestions as to improving my
> > current methods.
> >
> > Many thanks
> > Mike
