Subject: Re: [Networker] Maximum number of save sessions?
From: Dave Mussulman <mussulma AT UIUC DOT EDU>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 19 Mar 2007 12:46:59 -0500

On Fri, Mar 16, 2007 at 09:06:31PM -0400, George Sinclair wrote:
> I'm still testing out these LTO3 drives, and I've found - and it's 
> probably no surprise - that in order to push the drives to a reasonable 
> performance level (even 50-60 MB/s), I have to increase the target 
> sessions to about 12. This wasn't the case before with the older LTO1s, 
> where we typically used 4-5 target sessions, and we're getting decent 
> performance from our SDLT-600 drives, running on the other snode, at 5 
> sessions each for a total of 20. But we're backing up directly over 
> gigabit ethernet, so we don't have a front end VTL, or some such thing, 
> where the network can be taken out of the equation - at least not yet.

That's consistent with my deployment of LTO3.  I'm currently running
with two LTO3 drives and have parallelism set to 16 to get "decent"
utilization speed on the drives.  Whether one or both drives are
running, under load we're getting near the 100 MB/s I would expect from
a single-gigE connection to the server.  That was after tuning parallelism
down and cranking it back up again to find a sweet spot.
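
For what it's worth, here's the back-of-the-envelope math behind that
sweet-spot hunt, as a quick Python sketch.  Both numbers are assumptions
for illustration (roughly LTO3's native rate, and a plausible
per-session average over gigE), not measurements from my site:

    # Rough session math; both constants are assumptions, not
    # measurements.
    LTO3_NATIVE_MBS = 80.0  # assumed native streaming rate of one LTO3
    PER_SESSION_MBS = 5.0   # assumed average rate of one save session

    sessions = LTO3_NATIVE_MBS / PER_SESSION_MBS
    print("~%.0f sessions to keep one drive streaming" % sessions)
    # -> ~16 sessions, which is about where my parallelism ended up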


> I found that once I hit 16 sessions, the LTO-3 drive could easily top 80 
> MB/s, but at 12, it averaged anywhere from 50-60 MB/s, hitting upwards 
> of 73 MB/s sometimes.  Of course, it screams when backups are run from 
> the host itself (99 MB/s or better), but with gigabit ethernet, I'm 
> probably lucky to get 95 MB/s coming in to the snode, period. The SCSI 
> HBAs can handle the load (they're dual channel 320 MB/s each, total = 
> 640 MB/s), and the snode host can handle it, but I can't push data fast 
> enough to make the drives really burn unless I up the target sessions. 
> If I do that, though, then I would quickly exceed the 32 session limit 
> when carried out over 4 drives. That's a bummer. I know upping the 
> target sessions will increase recovery time, and I'm not sure if the 
> faster read speed on LTO-3 drives would compensate?
> 
> I'd like to add 4 more drives to the library for a total of 8, mostly to 
> allow cloning operations that might run in parallel with the backups. I 
> thought having more drives would help out. Right now, with 4 drives, all 
> the drives are typically in use once the backups are running, so cloning 
> other tapes would have to be done during the day, which is fine but it 
> could overlap into the evening and affect backups or vice versa since 
> various drives that might otherwise be available would then be in use. 
> But even with only 4 drives, it seems I would be limited to 8 sessions 
> each, for a total of 32?

The total number of target sessions across all of your drives can exceed
the maximum parallelism set in the server config; the catch is that I
think NetWorker will only ever run the server's max parallelism in
concurrent sessions.  You could end up with a scenario where some drives
never get used under load, because a single drive is taking most of the
allowed sessions.  (I'm trying to recall if I've ever seen this
enforced - I know I've streamed to three devices, each configured for
16 target sessions, but I didn't count to see if it was artificially
limited to 32.  I'm running 7.2.1 on the server.)
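
To make that concrete, here's a toy model in Python.  To be clear, this
is just my mental picture of the behavior, NOT NetWorker's actual
scheduler, and the dispatch order is a guess: per-device target
sessions can sum past the server's parallelism, but only the
server-wide cap ever runs at once.

    # Toy model of the session cap; NOT NetWorker's real scheduler.
    SERVER_PARALLELISM = 32                   # assumed server setting
    TARGETS = {"drive0": 16, "drive1": 16, "drive2": 16}  # sums to 48

    def dispatch(server_max, targets):
        # Hand out sessions round-robin until every drive hits its
        # target or the server-wide cap is reached.
        active = dict.fromkeys(targets, 0)
        left = server_max
        while left:
            gave = False
            for dev, cap in targets.items():
                if left and active[dev] < cap:
                    active[dev] += 1
                    left -= 1
                    gave = True
            if not gave:
                break
        return active

    print(dispatch(SERVER_PARALLELISM, TARGETS))
    # 48 target sessions configured, but only 32 ever run.  With a
    # greedy fill instead of round-robin, one drive could sit idle,
    # which is the starvation scenario above.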

I know that even with a "high" level of parallelism to the drive, LTO3
restore speeds aren't too bad (compared to DLT).  That's an easy thing
to test in your environment to see whether your configuration meets
your restore objectives.  I'm comfortable with where I have it
configured.

Looking at the larger picture, clients at FastEthernet speeds deliver
maybe 10 or 11 MB/s each, so it's going to take 10+ sessions to max out
an LTO3 drive.  Given that you can probably only get those speeds on a
full backup, you're going to need more than 10 sessions, and of course,
overall you're limited by the server's networking speed.  That won't
scale by adding more drives -- in fact, it gets worse.

It sounds like you've figured all this out, but are waiting for the
obvious point: you should be backing up to server-local disk first, and
staging to tape.  That's the design that keeps data available as fast
as LTO3 can consume it, and keeps slow clients from impacting your tape
environment, both by shoe-shining the drive (although that's supposed
to be better in LTO than DLT) and by monopolizing a tape mount for a
single slow session that could be used for something else.  If you want
to spend $36k (4 * $9k, the approximate cost of an HP drive the last
time I was quoted) to improve your backup environment, you're better
off putting some sort of staging disk or VTL in the middle.  It's a
win-win design.
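
To put a rough number on the shoe-shining cost, here's a simple
stop/start model in Python.  Every constant is an assumption (I don't
have LTO3's real speed-matching floor or reposition time handy), so
treat it as the shape of the problem, not a measurement:

    # All constants are assumptions for illustration only.
    STREAM_FLOOR_MBS = 27.0  # assumed minimum rate for streaming
    REPOSITION_SEC = 5.0     # assumed cost of one stop/rewind/restart
    BUFFER_MB = 128.0        # assumed data written per stop/start cycle

    def effective_rate(incoming_mbs):
        # At or above the floor, the drive streams with no penalty.
        if incoming_mbs >= STREAM_FLOOR_MBS:
            return incoming_mbs
        # Below it, the drive writes a burst, stops, and repositions.
        fill_time = BUFFER_MB / incoming_mbs
        return BUFFER_MB / (fill_time + REPOSITION_SEC)

    for rate in (5, 15, 30, 60):
        print("%2d MB/s in -> %5.1f MB/s to tape"
              % (rate, effective_rate(rate)))
    # Disk doesn't shoe-shine, so a staging area takes the whole
    # penalty out and the later disk-to-tape pass streams at full
    # speed.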

Of course, you'll need to look into how EMC licensing nickels and dimes
that out (for either advanced disk storage or more library options),
which will add to the cost.  Knowing I was switching to a D2D2T
environment is one of the reasons we're migrating to TSM -- those were
features we got "for free", and data migration management through TSM
is easier.  Just to show how different products do different things:
TSM practically demands smart D2D2T, because it doesn't do
multiplexing.  (Yet it seems many TSM deployments have large libraries
-- hundreds of slots, tens of drives -- which are probably useful for
other aspects of TSM.)

Dave

