Subject: Re: [Veritas-bu] VTL & NetBackup Best Practice
From: "Meidal, Knut" <kmeidal AT amgen DOT com>
To: "veritas-bu AT mailman.eng.auburn DOT edu" <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Fri, 7 Mar 2008 11:14:40 -0800

Since Stuart is soliciting feedback from his former co-workers…:

 

We do have VTLs in our environment, and a rather complex setup. I’m not sure our environment would qualify as a “best practice”, however.

 

We’re on NetBackup 5.1, Solaris master, 4 Linux media servers, 4 Windows media servers. 13 NetApp filers are SAN-connected and zoned to the same VTL.

We have 6 NetApp NearStore VTL700s with a total of 256TB of usable disk. That is before the data compression is factored in.

On the back-end we have a 7-node Decru DataFort cluster and a 20-drive ADIC Scalar i2000.

 

Each of the 8 media servers controls 3 virtual libraries. These 24 virtual libraries are spread among the 6 VTL controllers for load balancing and redundancy.
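
One way to picture that spread (the mapping below is my illustration, not our exact assignment):

# Illustrative spread of 8 media servers x 3 virtual libraries across
# 6 VTL controllers: each server's 3 libraries land on 3 different
# controllers, and each controller ends up hosting 24/6 = 4 libraries.
controllers = [f"vtl{c}" for c in range(6)]

n = 0
for s in range(8):
    libs = [controllers[(n + k) % 6] for k in range(3)]
    print(f"media{s} -> {libs}")
    n += 3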

In addition, we have one virtual library that is dedicated to NDMP usage. I’ll get back to that in a moment.

 

Starting out with this new environment, the tape type on the VTLs was 40GB in size. The idea was precisely as others have pointed out: a higher number of small tapes wastes less space and eases congestion, making it feasible to forego multiplexing. Also, in the beginning, NetBackup Vault was in charge of duplicating images from VTL tapes to physical LTO tapes. A 16-drive SSO pool was configured.
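
To put rough numbers on that trade-off (my arithmetic; the 400GB comparison size, roughly native LTO-3, is my own illustration):

# 256 TB of usable VTL disk carved into small vs. large virtual tapes.
usable_gb = 256 * 1024
for tape_gb in (40, 400):
    count = usable_gb // tape_gb
    # A partially written (or partially expired) tape strands, on
    # average, on the order of half a tape's worth of capacity.
    print(f"{tape_gb:>4} GB tapes: {count:>5} of them, "
          f"~{tape_gb // 2} GB stranded per open tape")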

Conceptually, this Vault-based approach had a few good points:

  • NetBackup would have control and knowledge of all copies, virtual and physical
  • Prioritizations could be made for the Vaulting and offsite copying
  • Restores could be done directly from physical tape without first importing to a VTL
  • The large number of virtual libraries (24) means that a given VTL outage will not take down the entire environment or cripple a media server

 

There were, however, issues with this, as Stuart points out. The BPDUPLICATE function in NetBackup 5.1 is very slow and inefficient. We had a lot of small images, and there was about 14 seconds of housekeeping time for each image, so for small or empty images the throughput is very low. I cannot comment on the BPDUPLICATE function in later versions, or on how it behaves with DSU image duplication.
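
To illustrate what that housekeeping time does to small images (the 14 seconds is what we observed; the 50MB/sec streaming rate is just an assumed figure for the example):

# Effective duplication rate with a fixed per-image housekeeping cost.
overhead_s  = 14.0   # observed bpduplicate housekeeping per image
stream_mb_s = 50.0   # assumed streaming rate once data is moving

for image_mb in (10, 100, 1000, 10000):
    total_s = overhead_s + image_mb / stream_mb_s
    print(f"{image_mb:>6} MB image: {image_mb / total_s:5.1f} MB/s effective")

# Small images crawl at well under 10 MB/s; only multi-GB images get
# anywhere near the streaming rate.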

We converted our environment from Vault to using the “Direct Tape Copy” feature of our VTLs.

Performance skyrocketed: total throughput at the time increased from about 1TB/day with Vault to about 24TB/day with DTC. How often do you get to experience a 24 TIMES performance increase? Essentially free, too! We are currently sending approximately 44TB/day to tape, averaged over a month.

 

Good and bad things about using the DTC function:

Good:

  • Performance is good
  • No performance penalty on the media servers; they have less work to do

Bad:

  • NetBackup loses track of what data is on disk or physical tape. To NetBackup, it’s all on a tape…
  • Little possibility to prioritize which tapes or policies are cloned first
  • Requires additional steps to import data back into the VTL before a restore can start

For us, at the present time, the performance benefits outweigh the downsides. Your mileage may vary.

 

Continuing my musings on “best practices”: I’m undecided about “many small, self-contained virtual libraries” vs. “fewer, larger, shared virtual libraries”.

There are good and bad points to both approaches; we do value the redundancy of having many smaller virtual libraries. There is additional operator involvement in this, though, like having to assign tapes to all of them, balancing the assignments against expected usage, etc.

We have written (well, Stuart actually) scripts that will assign scratch tapes in the libraries as needed, based on low/high watermarks. That way, managing 6 or 24 virtual libraries doesn’t make much of a difference.
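
For anyone wanting to do something similar, here is a minimal sketch of the idea in Python. It is not our actual script: vmquery and vmchange are standard Media Manager commands, but the pool names, pool numbers, watermarks, and output parsing below are placeholders, so check the man pages on your release before running anything like this.

# Watermark-based scratch assignment (sketch, not our production script).
# For each virtual library's volume pool: if it holds fewer than LOW
# tapes, reassign tapes from a global scratch pool until HIGH is reached.
import subprocess

VMQUERY  = "/usr/openv/volmgr/bin/vmquery"
VMCHANGE = "/usr/openv/volmgr/bin/vmchange"
LOW, HIGH = 10, 25   # per-library watermarks (made-up values)

def media_in_pool(pool_name):
    # 'vmquery -b -pn <pool>' prints a brief listing, one volume per
    # line, media ID in the first column; header size varies by release.
    out = subprocess.run([VMQUERY, "-b", "-pn", pool_name],
                         capture_output=True, text=True, check=True).stdout
    lines = [ln for ln in out.splitlines() if ln.strip()]
    return [ln.split()[0] for ln in lines[2:]]   # skip the header

def top_up(library_pool_name, library_pool_number, scratch_pool="Scratch"):
    have = len(media_in_pool(library_pool_name))
    if have >= LOW:
        return
    for media_id in media_in_pool(scratch_pool)[:HIGH - have]:
        # Reassign one tape: 'vmchange -p <pool number> -m <media id>'.
        subprocess.run([VMCHANGE, "-p", str(library_pool_number),
                        "-m", media_id], check=True)

# Example (placeholder pool name and number): top_up("VTL01_pool", 7)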

 

Back to the case for NetApp filers and backups with NDMP for a minute:

VTLs are pretty much a perfect complement to filers. What we have is 13 SAN-attached NetApp filers, zoned in (round-robin) to 4 front-end ports on a VTL.

I have set up ONE large virtual library, with 52 virtual drives. My Solaris master server is the robot control host of this large library, but does NOT have access to any of the drives.

Each filer has exclusive control of 4 virtual drives.
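
Spelled out, the layout looks like this (names and the round-robin order are made up for illustration; the actual zoning is done on the fabric):

# 13 filers x 4 dedicated virtual drives = 52 drives in ONE shared
# library, with filers zoned round-robin across 4 VTL front-end ports.
filers = [f"filer{i:02d}" for i in range(1, 14)]
ports  = [f"fe-port{p}"  for p in range(1, 5)]

layout = {
    filer: {
        "port":   ports[i % len(ports)],                        # round-robin
        "drives": [f"vdrive{4 * i + d:02d}" for d in range(4)], # exclusive
    }
    for i, filer in enumerate(filers)
}

print(sum(len(v["drives"]) for v in layout.values()))   # 52 drives total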

There is NO drive sharing taking place in this scenario. Even if certain ONTAP and NetBackup combinations can handle drive sharing, I believe that having dedicated virtual drives in a shared library makes more sense. All the benefits, none of the downsides.

I have one storage unit configured per filer, utilizing the virtual drives each filer owns in the shared library.

This simplifies scratch tape management, and seems to work great.

A nice side-effect is that when doing redirected restores from one filer to another, the destination filer mounts the virtual tape directly in its own assigned drive and reads the data directly over the SAN. No data is sent over the network.

 

Now; is this something that could be considered “best practice”?

It’s working well for us, and I would recommend a setup like that for others.

The downside is that if the connection between the VTL and the Solaris machine or a filer is disrupted, there is a greater impact on all backup operations on that virtual library, as the robotics will be down.

 

It’s quite possible that similar ‘semi-sharing’ could work well for regular media servers also. The library, slots and scratch tapes are shared, but drives aren’t.

I would recommend testing it to see if it works, and if not, just configure a sensible number of self-contained virtual libraries as needed.

 

Knut Meidal

 

 


From: veritas-bu-bounces AT mailman.eng.auburn DOT edu [mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Stuart Liddle
Sent: Thursday, March 06, 2008 3:51 PM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] VTL & NetBackup Best Practice

 

OK… since I did not hear from any of my former co-workers at my previous job on this subject, I will chime in.

 

We were using VTLs and we did do multiplexing.  We kept the number of drives per VTL down to 8, and we had 6 VTLs.  All of the VTLs were connected to one ADIC i2000 tape library with LTO-3 tape drives.  The physical library was split into partitions, one for each VTL.

 

At first we were using Vault to go to physical tape, and we had set up our virtual tapes to be smaller than physical tape.  This did not work very well (it was slow and it did not scale), and we ended up switching to a method where we did the copy to physical tape off the back end of the VTL (NetApp, in case you were wondering).  This Direct Tape Copy method has worked very well, and we were getting tape drive speeds around 50MB/sec, as opposed to Vault, which averaged around 10MB/sec.  (BTW, at last count, we were doing over 1PB per month to our VTLs, or somewhere around 300TB/week.)

 

As another poster stated, you do need to over-subscribe, but that’s really not a problem.  Restores from the VTL are very quick if the tapes have not expired off.  If they have expired off, all you need to do is start the import of the physical tape (the barcodes of the virtual tapes are the same as those of the physical tapes they get copied to), and as soon as the import has started you can begin the restore.  I set up a script to check for available tapes per VTL and then assign new tapes as they became available in the physical library.

 

-Stuart Liddle

 

_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu