ADSM-L

Re: Slow restore for large NT client outcome.. appeal to Tivoli

2000-09-21 03:29:16
Subject: Re: Slow restore for large NT client outcome.. appeal to Tivoli
From: Mike Glassman - Admin <admin AT IAA.GOV DOT IL>
Date: Thu, 21 Sep 2000 08:57:47 +0200
Kelly,

I can't speak for Arcserve, as you couldn't pay me to go near it, but
BackupExec (BE) I do know.

Restore of a 600MB directory (talking small here) on ADSM to a Netware
server takes up to (no exaggeration here) 6 hours. And this is after we made
all sorts of changes to the system (not me, our AS400 guy, as that's where
it sits; I just complain).

Under BE, the same 600MB takes under 45 minutes.

In both cases the backup system sits on a separate machine, not on the one
being backed up.
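
For a sense of scale, the two figures above can be turned into effective transfer rates. This is only a back-of-the-envelope sketch in Python: it assumes 1 MB = 1024 KB, treats "up to 6 hours" and "under 45 minutes" as the elapsed times, and the function name is purely illustrative.

```python
def throughput_kb_per_s(size_mb: float, elapsed_minutes: float) -> float:
    """Effective transfer rate in KB/s for size_mb moved in elapsed_minutes."""
    return size_mb * 1024 / (elapsed_minutes * 60)


# Figures quoted above: the same 600MB directory restored by each product.
adsm = throughput_kb_per_s(600, 6 * 60)  # ADSM: up to 6 hours
be = throughput_kb_per_s(600, 45)        # BackupExec: under 45 minutes

print(f"ADSM: {adsm:.1f} KB/s, BE: {be:.1f} KB/s, ratio: {be / adsm:.1f}x")
```

Since the 45 minutes is an upper bound and the 6 hours a worst case, the ratio is a rough indication rather than a measurement.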

Mike

> -----Original Message-----
> From: Kelly J. Lipp [SMTP:lipp AT storsol DOT com]
> Sent: Wednesday, September 20, 2000 23:38
> To:   ADSM-L AT VM.MARIST DOT EDU
> Subject:      Re: Slow restore for large NT client outcome.. appeal to
> Tivoli
> 
> Could someone with experience doing large restores with ArcServe or
> BackupExec provide some performance numbers?  I've been in shops where the
> backups were taking a very long time.  Longer than my TSM backup took.  I
> never witnessed a restore, but how can it be better?
> 
> I want the facts.  I'm tired of hearing about how much faster ArcServe and
> BackupExec are (in theory) compared to TSM in reality.
> 
> I'm sick and tired of it and I won't take it anymore!
> 
> This is what happens when you TSM 24 hours per day.  Your brain.  Your brain
> on TSM.  Not a pretty picture.
> 
> Kelly J. Lipp
> Storage Solutions Specialists, Inc.
> PO Box 51313
> Colorado Springs CO 80949-1313
> (719) 531-5926
> Fax: (719) 260-5991
> Email: lipp AT storsol DOT com
> www.storsol.com
> www.storserver.com
> 
> 
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU]On Behalf Of
> Keith E. Pruitt
> Sent: Wednesday, September 20, 2000 12:03 PM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: Slow restore for large NT client outcome.. appeal to Tivo
> 
> 
> Jeff, we too have a problem with small files. At first I thought it was a
> Netware thing because the servers holding the greatest number of files
> are the Netware servers. But reading emails from several users, I see that
> I may have a future problem on the NT side. We store Word and WordPerfect
> docs on two Netware 5 machines, and each server holds about 1.8 million
> files. Needless to say, these files are not that big. It took over 11 hours
> to back up each of the servers, and they total around 30GB per server. We
> were forced to perform a full backup because our director and other new
> admins don't understand or feel comfortable with the "incremental forever"
> logic. I would hate to see what a restore would look like.
> 
> In contrast, we just backed up a directory on an NT server we are using for
> our Backoffice conversion, and that dir totals 35GB. That took 2h20m. We
> also performed a large restore from one AIX machine to another of about
> 25GB: less than 2 hours to restore. We have tweaked our Netware and AIX ADSM
> server according to performance guides and other suggestions and still have
> issues with small files.
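
The numbers in the message above can be put side by side as rates. The following Python sketch uses only the sizes, durations, and file count quoted there; it assumes 1 GB = 1024 MB, and the `summarize` helper and its labels are illustrative, not part of any ADSM/TSM tooling.

```python
def summarize(label, size_gb, hours, files=None):
    """Return MB/s, plus files/s and average KB per file when a file count is known."""
    seconds = hours * 3600
    stats = {"label": label, "mb_per_s": size_gb * 1024 / seconds}
    if files is not None:
        stats["files_per_s"] = files / seconds
        stats["avg_file_kb"] = size_gb * 1024 * 1024 / files
    return stats


runs = [
    # "Over 11 hours", so the real rate is somewhat lower than computed here.
    summarize("Netware full backup (per server)", 30, 11, files=1_800_000),
    summarize("NT directory backup", 35, 2 + 20 / 60),
    # "Less than 2 hours", so the real rate is somewhat higher than computed here.
    summarize("AIX-to-AIX restore", 25, 2),
]

for r in runs:
    line = f"{r['label']}: {r['mb_per_s']:.2f} MB/s"
    if "files_per_s" in r:
        line += f", {r['files_per_s']:.0f} files/s, avg {r['avg_file_kb']:.1f} KB/file"
    print(line)
```

The Netware case works out to well under 1 MB/s at an average file size of under 20 KB, against roughly 4 MB/s for the large-file runs, which is consistent with a per-file cost dominating rather than raw bandwidth.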
> 
> We will be moving our documents from Netware to NT soon, and our NT guys
> like to refer to ADSM as crap. They are used to Arcserve but are now raving
> about BackupExec. It is going to be extremely difficult to explain if our
> huge machine can't keep up with their backup server. I know that overall
> ADSM is a better and more stable product, but what do you do when you have
> a mixture of servers with large databases (ADSM's favorite) and (the more
> common) servers with small files that Arcserve and others like? I'm hoping
> another ADSM/TSM user has some tricks or tweaks that can help in this area.
> Anyone from any universities out there?
> 
> ____________________Reply Separator____________________
> Subject:    Slow restore for large NT client outcome.. appeal to Tivoli
> 
> Author: Jeff Connor <connorj AT NIAGARAMOHAWK DOT COM>
> Date:       09/20/2000 12:21 PM
> 
> Our NT group was a hard sell for replacing Arcserve with TSM.
> Since the switch, I have taken quite a beating about TSM restore
> performance.  Our NT admins take the position, "we'll try TSM but
> if the performance doesn't improve we are going with a tried and
> true solution like Compaq Enterprise Backup.  TSM seems to us
> like a UNIX product trying to make it in the NT space.  It is not
> typically selected by companies for NT backup and recovery".
> Not a word for word quote but generally sums up their position.
> The Compaq solution would use Arcserve from what I've been told.
> 
> I know Tivoli/IBM have tried to address the small files issue
> with things like small file aggregation but I haven't noticed
> much improvement from version to version for big restores of
> servers with small files.  I've heard different reasons for slow
> performance with small files over the years like the amount of
> TSM database lookups, NT file system processing/inefficiencies,
> etc.  I have suggested to our NT admins that we break that big D: partition
> into multiple smaller partitions so I can collocate by filespace
> and restore multiple drives concurrently.  Frankly, they are not
> interested in changing the way they configure their servers to
> accommodate the backup software.  They feel they would not have
> to do this with Arcserve or other more common NT backup products.
> I've tried tests using share names for folders and performing
> backups/restores using the UNC name, collocating the data by
> filespace and running concurrent restores.  My tests showed
> improved elapsed time but this scheme would be tough to maintain.
> In a full server restore scenario, I'd need to create the folders
> and shares for the target restore which means we'd need to keep
> track of that info some place.  I'd constantly have to monitor
> growth in all the folders to make sure I've carved up the drive
> in fairly equal parts to optimize for restore, etc.  Not a good
> solution either.
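
The "carve up the drive in fairly equal parts" bookkeeping described above can be sketched as a simple balancing problem. The Python below is a hedged illustration only: it uses a greedy largest-first heuristic (not anything TSM provides), and the folder names and sizes are hypothetical, not taken from the thread.

```python
import heapq


def balance_folders(folder_sizes_gb, streams):
    """Assign folders to restore streams, largest first, always onto the lightest stream."""
    heap = [(0.0, i) for i in range(streams)]  # (total GB so far, stream index)
    heapq.heapify(heap)
    assignment = [[] for _ in range(streams)]
    by_size = sorted(folder_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        total, idx = heapq.heappop(heap)  # lightest stream so far
        assignment[idx].append(name)
        heapq.heappush(heap, (total + size, idx))
    totals = [0.0] * streams
    for total, idx in heap:
        totals[idx] = total
    return assignment, totals


# Hypothetical folder sizes in GB; none of these names or numbers come from the thread.
folders = {"groups": 40, "users": 35, "apps": 20, "projects": 15, "archive": 10}
assignment, totals = balance_folders(folders, streams=3)

for idx, (names, total) in enumerate(zip(assignment, totals)):
    print(f"stream {idx}: {total:.0f} GB -> {names}")
```

Even with such a helper, the plan would have to be recomputed as folders grow, which is the maintenance burden the message objects to.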
> 
> Does anyone else see the poor performance for restoring clients
> with lots of small files and feel that this is a problem Tivoli
> needs to address?  I do.  If this issue is not resolved then I
> won't be able to keep using TSM to back up our NT servers.
> 
> Thanks,
> Jeff Connor
> Niagara Mohawk Power Corp.
> 
> 
> ---------------------- Forwarded by Jeffrey P Connor/IT/NMPC on
> 09/20/2000 10:32 AM ---------------------------
> 
> 
> Jeffrey P Connor
> 09/13/2000 01:20 PM
> 
> To:   ADSM-L AT VM.MARIST DOT EDU
> cc:
> 
> Subject:  Slow restore for large NT client.. help!
> 
> 
>      We are in the process of restoring a subdirectory of a very
> large NT client file space (D:) and it is running really slow.  I
> thought I'd see if any of you have some ideas as to where we can
> look for bottlenecks.
> The client config is:
>      Compaq ProLiant 5500
>      400MB RAM
>      two 400MHz Xeon processors
>      ~160GB of disk in a Compaq disc array made up of 18.2GB drives
>      Windows NT 4.0 SP6a
>      TSM client for NT 3.7.2.01
>      Applicable TSM client options:
>           tcpwindowsize   63
>           tcpbuffsize     31
>           tcpnodelay      yes
>           txnbytelimit    25600
> 
> TSM server config
>      TSM for OS/390 V3.7.1.0
>      OS/390 2.6
>      9672-R55
>      TSM server DB cache hit ratio 98.5%
>      Applicable TSM server options:
>           TXNGROUPMAX          256
>           Databufferpoolsize   262144
> 
> 
> 
> Network path:
>      NT Client --100Mbit Ethernet--> Switch --100Mbit Ethernet-->
>      Cisco 7513 rtr --155Mbit ATM--> Cisco 5500 ATM switch -->
>      IBM 2216 --ESCON--> S/390 TSM Server
> 
> 
> Now that you have the background here's what we are seeing.
> 
> Only 4.7GB have been transferred in 4 hours.  We are attempting to
> restore one subdirectory on the D: drive first.
> TSM command line client command entered was:
>      RES -subd=y \\filecluster2\d$\groups\ugitoper\*
> The D: drive has approximately 2,000,000 files.  Lots of small
> files.  NT client is a file and print server.
> A network sniffer trace shows mostly large chunks of data sent,
> no retransmits, then the NT client appears to throttle back,
> decreasing the TCP window size as if it could not accept the data
> as fast as TSM was sending it.  Window sizes go to zero at times,
> then bounce back to a large window size (64512).
> NT perfmon shows plenty of memory and cpu with minimal disk
> queueing.
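
To put the observation above in numbers, the sketch below compares the observed restore rate with the ceiling a 64512-byte TCP window would allow. It is only a rough Python illustration: the 5 ms round-trip time is an assumption (no RTT is given in the thread), the bound is the usual one-window-per-round-trip estimate, and the function names are made up for this example.

```python
def observed_mb_per_s(gigabytes, hours):
    """Average rate in MB/s, assuming 1 GB = 1024 MB."""
    return gigabytes * 1024 / (hours * 3600)


def window_limited_mb_per_s(window_bytes, rtt_ms):
    """Upper bound for one TCP connection: at most one full window per round trip."""
    return window_bytes / (rtt_ms / 1000) / (1024 * 1024)


observed = observed_mb_per_s(4.7, 4)                 # 4.7GB in 4 hours, from the thread
ceiling = window_limited_mb_per_s(64512, rtt_ms=5)   # 5 ms RTT is assumed, not measured

print(f"observed: {observed:.2f} MB/s, window-limited ceiling: {ceiling:.2f} MB/s")
```

Under that assumed RTT the window ceiling is far above the observed rate, which would fit the trace: the window only becomes the limit when the receiver shrinks it toward zero. A measured RTT should replace the assumed value before drawing conclusions.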
> 
> 
> 
> This brings me to my question.  What tools can I use or what
> metrics in perfmon can I check to see "under the covers"
> to determine what is slowing us down?  The network support staff
> feels the network bandwidth is there and that the NT client is
> throttling things back.  The NT support staff says the NT
> client machine is not overwhelmed in terms of CPU, memory, disk,
> etc.; they feel TSM is the problem.
> 
> What could be the bottleneck on the NT client and what tool can I
> use to find it?
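
One way to separate file system cost from network and TSM cost is to time small-file creation locally on the target volume, with no backup software involved. The Python sketch below is a rough probe under stated assumptions: the 16 KB file size and the count of 500 are arbitrary choices, the function name is illustrative, and it does not reproduce what the TSM client does (no security attributes, no directory tree).

```python
import os
import tempfile
import time


def small_file_write_rate(directory, count=500, size_bytes=16 * 1024):
    """Create `count` files of `size_bytes` each in `directory`; return files per second."""
    payload = b"\0" * size_bytes
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(directory, f"f{i:06d}.dat"), "wb") as fh:
            fh.write(payload)
    elapsed = time.perf_counter() - start
    return count / elapsed


# A temporary directory is used here so cleanup is automatic; to test a specific
# volume, pass a directory on that volume via tempfile.TemporaryDirectory(dir=...).
with tempfile.TemporaryDirectory() as target:
    rate = small_file_write_rate(target)
    print(f"{rate:.0f} files/s at 16 KB each")
```

If the local rate is close to the files-per-second rate seen during the restore, the client file system is a likely suspect; if it is far higher, the time is more likely going to the server or the network path.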
> 
> Thanks in advance for your assistance,
> Jeff Connor
> Niagara Mohawk Power Corp