Subject: [Networker] Utilization of Tapes
From: "Eichelberger, Jon" <jon.eichelberger AT SAP DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Sun, 25 Jul 2004 20:47:04 +0200
Hi, All.

I have a problem with the utilization of the tapes in my jukeboxes.
Our policy here is to mark all tapes used the night before as read-only
every morning and send them offsite.  The problem is that sometimes a
tape is sent offsite and does not come back for a month or more even
though - for example - less than 1% of its capacity has been used.
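(For context, the morning run is essentially this shell sketch, minus
the error handling; the pool name XXX is a placeholder, and the
mminfo/nsrmm/nsrjb flags are from my 6.1.3 man pages, so double-check
them against your own install:)

  # mark everything written since yesterday read-only, then eject it
  VOLS=`mminfo -q 'pool=XXX,savetime>yesterday' -r volume | sort -u`
  for v in $VOLS; do
      nsrmm -y -o readonly $v   # mark the volume read-only
      nsrjb -w $v               # withdraw it from the jukebox
  done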

From my observations, this is caused by the fact that I have too much
hardware (not really).  What I mean is that I have a NetWorker server
with 4 dedicated tape drives and 6 storage nodes, 2 of which have 6
tape drives each and 4 of which have 5 tape drives each.  Just to
distribute the backup load, client A has storage node 1 as the first
node in its list, client B has storage node 2 as the first node in its
list, and so on, round-robin, for 300 clients.  So when client A needs
to do a backup, it loads a tape from pool XXX and writes the backup on
storage node 1.  Since that tape is still mounted (but idle) when
client B needs to back up to pool XXX, a different tape is labeled and
mounted on storage node 2, because client B has storage node 2 at the
top of its list of storage nodes.

What happens in the above example is that I end up sending two slightly
used tapes offsite instead of one fairly well used tape.  Multiply that
across 6 storage nodes and about 10 pools and, in the worst case, a
night produces up to 60 barely used tapes offsite instead of roughly
10 well used ones.  This leads to a lack of usable tapes in my
jukeboxes.
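For what it's worth, the damage is easy to see in the media database
the next morning.  A minimal query (field names per the mminfo man
page; the pool name is again a placeholder):

  mminfo -q 'pool=XXX' -r 'volume,written,%used'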

My ideas are:
1. Find some way to eject a tape as soon as it is done being used, so
that, for example, storage node 2 will mount the tape that client A (on
storage node 1) already wrote to.  I have found no way yet, at a
programmable/command-line level, to sense that a tape drive is in the
"writing, done" state.  I could write something to eject the tapes if I
could detect this.  The grenade approach would be to just try to unload
all the drives and swallow the errors from the ones that are still busy
(see the first sketch after this list).  BTW, I don't use AlphaStor or
SmartMedia; we have our reasons.
2. Restrict which tape drives can be used for a particular pool (see
the second sketch after this list).  The problem there is that a failed
drive can cause a real bottleneck, and once you restrict the tape
drives a given pool can use, you usually have to do the same for most
or all of the pools, so that a request for a pool YYY tape does not
grab one of the few drives dedicated to pool XXX.  I have no good rules
or heuristics for setting this up, and I don't even know whether it
would help.
3. I thought about giving all clients the same storage node list in the
same order - storage node 1, storage node 2, ... - but I ruled that out
because NetWorker 6.1.3 on Solaris 8 (which is what I run) really
fixates on the first storage node in the list.  I don't know what 7.x
does - probably nothing different.  It would be nice if the
sessions-per-drive limit were really obeyed and NetWorker would just
roll down to the next storage node without waiting for a timeout.  I'd
also end up working a few tape drives to death while others sat idle.
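
Two sketches of what I mean.  First, sensing drive state: I'm assuming
here that the "message" attribute of the NSR device resource carries
the same "writing, done" text that nwadmin displays - that's an
assumption, not something I've verified, so test it with a plain "show"
before relying on it:

  # write the nsradmin query to a temp file, run it, and unload any
  # drive whose status message says it is done writing; this assumes
  # "name" precedes "message" in the printed output
  printf 'show name;message\nprint type: NSR device\n' > /tmp/devq.$$
  nsradmin -i /tmp/devq.$$ |
    awk -F': ' '/name:/ {dev=$2} /message:.*done/ {print dev}' |
    sed 's/;*$//' |
    while read d; do
        nsrjb -u -f "$d"      # unload the now-idle drive
    done
  rm -f /tmp/devq.$$

The grenade version is just the unload loop run against every drive,
with the errors from busy drives thrown away:

  for d in /dev/rmt/0cbn /dev/rmt/1cbn; do   # example device paths
      nsrjb -u -f $d 2>/dev/null
  done

Second, restricting drives per pool.  I'm assuming the NSR pool
resource has a "devices" attribute that nsradmin can update - again,
check with "show" on your own server before updating anything (device
paths below are examples only):

  nsradmin> . type: NSR pool; name: XXX
  nsradmin> show devices
  nsradmin> print
  nsradmin> update devices: /dev/rmt/0cbn, rd=node1:/dev/rmt/0cbn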

So, any ideas?

Jon
