Slow AIX Restores

littlepunk

Hi,

I'm looking after a customer who backs up an AIX server (V 5.3) to a Windows TSM Server (V 5.5.0) using the latest BA Client. They run both file-level and TDP for Oracle backups.

Backups are fast. Restores (file-level or TDP for Oracle) are very slow, running at only 14 GB/hr, whereas I get about 60 GB/hr for restores to Windows servers. It doesn't matter whether I restore from the diskpool or the tape pool; throughput is the same. FTP and other file copies to and from the AIX server give good results.
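To put those rates next to the Gigabit link, here is a minimal sketch of the unit conversion. It assumes 1 GB = 1024 MB, and the helper names are hypothetical. Under that assumption 14 GB/hr is roughly 4 MB/s (about 32 Mbit/s) and 60 GB/hr is roughly 17 MB/s (about 137 Mbit/s), both well under what a Gigabit network can carry.

```python
# Convert the restore rates quoted in this post into MB/s and Mbit/s.
# Assumption: 1 GB = 1024 MB. The function names are illustrative only.

def gb_per_hour_to_mb_per_sec(gb_per_hour: float) -> float:
    """GB per hour -> MB per second (3600 seconds in an hour)."""
    return gb_per_hour * 1024 / 3600


def mb_per_sec_to_mbit_per_sec(mb_per_sec: float) -> float:
    """MB per second -> Mbit per second (8 bits per byte)."""
    return mb_per_sec * 8


for label, rate in (("AIX restore", 14), ("Windows restore", 60)):
    mb_s = gb_per_hour_to_mb_per_sec(rate)
    mbit_s = mb_per_sec_to_mbit_per_sec(mb_s)
    print(f"{label}: {rate} GB/hr = {mb_s:.1f} MB/s = {mbit_s:.0f} Mbit/s")
```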

We've had IBM in and run instrument tracing (both ends) and no one is any the wiser.

We've checked CPU/Memory/Disk IO/Network and can't find any problems or slowness except when restoring. The objects being backed up are large-ish files (between 20 and 200GB). We've been restoring one file at a time for testing. Network is Gb all the way (OS level file copies bear this out).

We've been through the various Performance and Tuning guides and experimented with settings such as TCPWINDOWSIZE, but nothing has made a significant difference.

I've seen a number of other articles complaining of restore performance and no solutions so I'm not very encouraged.

Any help would be greatly appreciated!!
 
Are you forcing TSM to recreate directory structures? Are you overwriting existing files?

-Aaron
 
What happens if you use "dd" to create a file of similar size in the same filesystem you are restoring to, what sort of performance do you get then?
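If dd isn't handy, a rough equivalent of that sequential-write check can be sketched in Python. This is only an approximation of a dd test, not a replacement for it: the function name, block size, and file sizes are arbitrary choices for illustration, and the file is synced to disk before the clock stops so the figure is not just a cache-fill rate.

```python
import os
import tempfile
import time


def sequential_write_mb_per_sec(directory: str, total_mb: int, block_kb: int = 1024) -> float:
    """Write total_mb of zeros to a temp file in `directory` and return MB/s."""
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # count the time to reach disk, not just the cache
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the test file
    return (blocks * block_kb / 1024) / elapsed


if __name__ == "__main__":
    # Point `directory` at the filesystem you restore into; the temp dir is a placeholder.
    rate = sequential_write_mb_per_sec(tempfile.gettempdir(), 16)
    print(f"{rate:.1f} MB/s")
```

For a meaningful comparison, use a file size closer to the files being restored and run it on the same filesystem the restore targets.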

What shows up in nmon on the client and tsm server when doing the restore?
 
Both file-level and TDP are slow, so it's probably not a DIRMC issue. Do you use separate node names for the file-level and TDP backups? How is the collocation set up? If you don't separate the 'nodes', and collocation is not set, you could be mixing files on tape and get poor performance. Did you do a trace? (You'll want to do it on both the client and server at the same time.)
 
Thanks for the questions, answers below ...

To keep testing simple we are restoring a single file into an empty folder, no folder restores. We were varying the file size to see if that made a difference. It doesn't.

dd or any other disk/file level operation gives excellent performance on the same filesystem.

I'm checking with the AIX admin around NMON. We did run PerfMon on the TSM servers and also went through some instrument tracing. Nothing obvious there.

Yes we use separate node names for the file level and TDP backups. During testing we have been restoring from diskpools and tapepools (after forcing migration). Other than waiting for media mounts the throughput has been poor in both scenarios. When performing these same tests on a Windows server and using same disk and tape pools the performance has been fine.

We've just upgraded the client to 5.5.2.0, but that hasn't helped.

Keep the questions coming, it's good to exercise the grey matter.
 
What tsm comms options do you have set? Primarily tcp window size etc.

What sort of connection is this on? Gigabit on both client and server? You said you checked FTP with both get and put, so it could be TSM-specific network settings.
 
Do you have the summary trace output from the TSM client? That should give us an indication on where to investigate further.
 
Thanks for all the responses and sorry for the late reply. The problem has been solved. It was a matter of juggling the TCPWINDOWSIZE setting at the client and server ends.

Setting this to 512 on our AIX clients and on our TSM Server (Windows) had a dramatic effect and brought restore speeds back to what we were expecting.

Interestingly, most of our Windows clients still have a value of 64 set, but their restores behave normally.
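For anyone wondering why the window size can matter this much: a single TCP stream can have at most one window of data in flight per round trip, so the window puts a ceiling on throughput. The sketch below works that ceiling out for the two TCPWINDOWSIZE values mentioned above, treating the value as kilobytes. The round-trip time is a made-up figure, not something we measured, and I can't say for certain this was the exact mechanism in our case. Note also that windows larger than 64 KB only take effect if TCP window scaling (RFC 1323) is enabled on both ends.

```python
def window_limited_mb_per_sec(window_kb: float, rtt_ms: float) -> float:
    """Upper bound on one TCP stream's throughput: one window per round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    bytes_per_sec = (window_kb * 1024) / (rtt_ms / 1000)
    return bytes_per_sec / (1024 * 1024)


# Assumption: an effective round-trip time of 2 ms. This is hypothetical,
# chosen only to show how the ceiling scales with the window size.
HYPOTHETICAL_RTT_MS = 2.0

for window_kb in (64, 512):
    ceiling = window_limited_mb_per_sec(window_kb, HYPOTHETICAL_RTT_MS)
    print(f"TCPWINDOWSIZE {window_kb}: at most {ceiling:.2f} MB/s at {HYPOTHETICAL_RTT_MS} ms RTT")
```

Whatever the real round-trip time is, going from 64 to 512 raises this ceiling eightfold; anything that adds latency per round trip lowers it in proportion.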
 