ADSM-L

Re: restore speed question

2005-12-22 09:13:31
Subject: Re: restore speed question
From: Leigh Reed <L.Reed AT MDX.AC DOT UK>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 22 Dec 2005 14:12:55 +0000
Sorry Alex, I understand the deal now.

I have restored SQL and Exchange databases from LTO tape to NTFS
filesystems (of similar size to same filesystem) and also monitored the
IP network throughput and FC throughput during this operation. I have
found the performance to be uniform across the duration.

As Richard has stated, you are into elimination territory now. Before
Richard's post, I was going to suggest checking the LTO drive firmware
to see if there were any known bugs in your release, or moving the
data from the current tape to another to eliminate that specific piece
of media. However, as Richard suggested, moving the data to disk and
then restoring from disk takes the tape drives right out of the
equation and moves the focus straight onto the network and the target
machine's disk & filesystem.
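
To look at the target disk and filesystem on their own, something like the
rough Python sketch below might help. It is not a TSM tool and nobody in this
thread has used it; the function names, the 64MB chunk size and the 1MB/sec
floor are all assumptions picked for illustration. It writes a large file in
chunks, forces each chunk to disk, and records MB/sec per chunk, so you can
see whether the disk alone shows the same fast-then-slow pattern.

```python
import os
import time

MB = 1024 * 1024


def measure_write_throughput(path, total_mb=1024, chunk_mb=64):
    """Write total_mb of data to path in chunk_mb pieces.

    Returns a list with the observed MB/sec for each chunk written.
    """
    # Repeat one random 1MB block so the buffer is cheap to build.
    buffer = os.urandom(MB) * chunk_mb
    rates = []
    written = 0
    with open(path, "wb") as f:
        while written < total_mb:
            n = min(chunk_mb, total_mb - written)
            piece = buffer[: n * MB]
            start = time.perf_counter()
            f.write(piece)
            f.flush()
            os.fsync(f.fileno())  # push the chunk out of the OS cache
            elapsed = time.perf_counter() - start
            rates.append(n / elapsed if elapsed > 0 else float("inf"))
            written += n
    return rates


def slow_chunks(rates, floor_mb_s=1.0):
    """Return the indexes of chunks that were written below floor_mb_s."""
    return [i for i, rate in enumerate(rates) if rate < floor_mb_s]
```

You would call measure_write_throughput() with a path on the restore target
drive (the path is yours to choose, and the file should be deleted
afterwards), then pass the result to slow_chunks(). If chunks drop to a few
hundred K/sec with TSM out of the picture, the disk or something sitting on
top of it (a virus scanner, for example) is the place to look.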

I know you have clarified the situation, but just so I'm sure: the 32GB
file that you are restoring is a single 32GB NTFS file, and you're not
restoring a TSM image or backupset?

Leigh

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Alexander Lazarevich
Sent: 22 December 2005 13:09
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] restore speed question

Thanks for the responses all, but it's not a tape mounting issue. I wasn't
clear enough in my original post, but I am watching the actlog while the
restore is taking place, and I'm sitting next to the library, so I can
tell when it's doing anything: remounting, rewinding, etc. What I'm
saying is this:

The server is restoring a single 32GB file, and starts doing so at
30+MB/sec. At some point, DURING the restore of that SAME 32GB file, the
server suddenly slows down the restore, to 200-300K/sec. The server has
NOT switched tapes, and is NOT rewinding even the SAME tape. It is still
restoring that same 32GB file, but suddenly does so at a slower speed.

I know the drives have some kind of burst speed and normal speed. Maybe
something is wacked out with that function?

Any other ideas?

Alex

On Thu, 22 Dec 2005, Leigh Reed wrote:

> Alex
>
> I hate restores that don't go as fast as I want them to, especially when
> it's 3 o'clock in the morning, so I'll have a stab at what might be
> wrong. The nature of your problem does seem very intermittent and the
> fact that some times you do achieve an acceptable speed makes it
> difficult.
>
> Firstly, I think you need to know what primary pool tapes your data is
> across. As Troy mentioned, if you are not collocating (or collocating by
> group), then the data is going to be spread across a large number of
> tapes. Even if you are collocating (all data on one tape), remember that
> you are restoring the active data only, the tape will contain all the
> previous and deleted versions (depending upon your backup copy group
> parameters). During the restore, the tape will have to skip between
> these; while this is happening, your aggregate network performance will
> decrease, as nothing is being restored.
>
> The following command will list the primary volumes that the node data
> is across
>
> select volume_name from volumeusage where node_name='xxxxxxx' and
> copy_type='backup' and stgpool_name='PRIMARY_TAPE_POOL' group by
> volume_name
>
> If this returns a large number of tapes, then you have 2 options
> available to you. Use a 'multi-thread' restore, by increasing the
> resourceutilization setting in the client dsm.opt file and also
> increasing the MAXNUMMP parameter. This will enable you to restore
> multiple tapes concurrently (depending on the number of drives that you
> have available). Please note that multi-threading only works with No
> Query Restores.
>
> The second option is as Troy alluded to with a MOVE NODEDATA, but if
> memory serves me right, the elusive 'Active only' switch is still not
> available, therefore the tape restore will still have to skip through
> the data that is not active.
>
> If all of the above is completely evident to you, then we are back to
> the old favourite; try FTP'ing a large directory of files from the TSM
> server to the target restore server, this should test out your network
> and filesystem performance.
>
> The only other suggestion would be to take a look at what your TSM
> server is doing at the time of the restore.
> - are you doing the restore at night when a large number of backups are
> occurring
> - is expiration running at the time of the restore
> - during the restore, keep issuing 'q sess' commands and see if the
> restore is 'clocking up' recw, sendw, commw time.
>
> One other thing I have just remembered, if you are doing a full BMR and
> you have restored the OS first and rebooted, your restored OS may have
> virus scanning enabled and if it is set to scan on write, when you
> restore the remaining drive(s), every file will be scanned before it is
> written, this will definitely slow down your restore. Task manager
> should show the virus scanner chewing up CPU.
>
> HTH
> Merry Xmas One and All.
> Leigh
>
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
> Alexander Lazarevich
> Sent: 21 December 2005 19:47
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: [ADSM-L] restore speed question
>
> TSM server 5.3.1 on 2K server. Libraries are one Overland Neo 4100 with 2
> LTO2 drives, and an Overland Neo 4100 with 2 LTO3 drives.
>
> I'm restoring a windows client workspace. Client is running TSM backup
> client version 5.3.0. Originally, the client was 5.1.9.0, and it was with
> this version that we first created the backup of the workspace drive on
> the client.
>
> Now I'm trying to restore that workspace filespace to the new system. The
> restore started fine, 30MB+/sec in our GigE network. But at times the
> restore speed slows to a halt, and restore speeds are less than 1MB/sec,
> sometimes only 200-300K/sec. Then a little later it will start to go
> 30MB/sec again. It is switching back and forth, sometimes in the middle of
> a file! The server is currently trying to restore an 86MB file, but it's
> doing so only at 300K/sec. Being that the workspace is 244GB, this is
> unacceptable speed.
>
> The client data is on the LTO2 library. There is absolutely nothing
> unusual in the logs that would indicate any kind of problem on the drive or
> the tape. No errors are being reported whatsoever.
>
> The client filespace is NOT compressed on the client side. Compression
> happens on the drives (HP).
>
> The client hardware is excellent, dual AMD opteron, with 3G SATA drives in
> striped RAID, XP Pro. Plus the client at times restores at 30+MB/sec so I
> know it can do it.
>
> Network on the client is not busy, and the network switch is not
> saturated, in fact there is very little network activity.
>
> It just seems like the server decides to go fast at times and then
> sometimes very slow. But with nothing in the logs I don't know what to
> troubleshoot. Any idea where to start troubleshooting this problem? Anyone
> seen this type of behavior before?
>
> Thanks!
>
> Alex
>
