ADSM-L

Very slow restores (days), hours to locate files

2005-07-06 10:12:18
Subject: Very slow restores (days), hours to locate files
From: Robin Sharpe <Robin_Sharpe AT BERLEX DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 6 Jul 2005 10:10:28 -0400
Hi guys,

We're having problems restoring some Windows servers (W2K).
The servers in question had some disk problems and are being rebuilt, so
the Windows admins are restoring the C: drive.  It is an 8GB drive and less
than 50% used, so only 4GB to restore.  It has taken several days to
restore.  I know one of our problems is that the data is spread over
hundreds of volumes (literally... I counted 310 from a volumeusage query).
Another problem is that we have an overflowed library, but we have loaded
all of the tapes from the Windows storage pool.  What I don't understand is
why it takes so long to locate a file once the tape is mounted.  We have
seen the same tape mounted for hours before any data is transferred.  Here
is an excerpt from a "q se f=d" of a restore that is running right now:

               Sess Number: 1,143
              Comm. Method: TCP/IP
                Sess State: Run
                 Wait Time: 0 S
                Bytes Sent: 670.9 M
               Bytes Recvd: 58.2 K
                 Sess Type: Node
                  Platform: WinNT
               Client Name: WANO01
       Media Access Status: Current input volume(s):  200658,(2279 Seconds)
                 User Name:
 Date/Time First Data Sent:
    Proxy By Storage Agent:
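
For anyone who wants to track the mount time automatically, here is a minimal Python sketch that pulls the volume name and mounted seconds out of the "Media Access Status" line shown above. The regular expression is an assumption based only on this one line of output; other server levels may format it differently, and the function name is made up for illustration.

```python
import re

# Assumed layout, taken from the single sample line in this post:
#   Media Access Status: Current input volume(s):  200658,(2279 Seconds)
MEDIA_RE = re.compile(r"Current input volume\(s\):\s*([^,\s]+),\((\d+) Seconds\)")


def parse_media_access(line):
    """Return (volume, seconds_mounted), or None if the line does not match."""
    match = MEDIA_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


sample = "       Media Access Status: Current input volume(s):  200658,(2279 Seconds)"
volume, seconds = parse_media_access(sample)
print(f"{volume} mounted for {seconds / 60:.1f} minutes")
# prints: 200658 mounted for 38.0 minutes
```
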

This restore has been running for almost 12 hours now (the Windows admins
have been restarting the restores periodically).  NO DATA has been
transferred from that tape in the 38 minutes it has been mounted.  I know
this from running lsof and looking at the file offset, which indicates the
number of bytes transferred.
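
The offset check can be sketched in Python as follows. This assumes you have already copied two offset values out of lsof by hand, a few minutes apart; lsof prints offsets with a "0t" prefix for decimal and "0x" for hex. The function names and the sample values are hypothetical, not taken from my actual output.

```python
def parse_lsof_offset(text):
    """Convert an lsof offset such as '0t1024' or '0x400' to an integer."""
    if text.startswith("0t"):
        return int(text[2:], 10)  # decimal offset
    if text.startswith("0x"):
        return int(text[2:], 16)  # hexadecimal offset
    raise ValueError(f"unrecognised offset: {text!r}")


def bytes_moved(before, after):
    """Bytes transferred between two offset samples of the same open file."""
    return parse_lsof_offset(after) - parse_lsof_offset(before)


def is_stalled(before, after):
    """True when the offset has not advanced between the two samples."""
    return bytes_moved(before, after) == 0


# Hypothetical samples: the offset is identical in both, so nothing was read.
print(is_stalled("0t703594496", "0t703594496"))
```

If the two samples are equal, no data moved through that file descriptor in the interval, which is what I am seeing while the tape sits mounted.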

I know that when I restore a single file, it can be found within seconds of
mounting a tape (these are all LTO-2)... so, why does it take so long in
this case?  Is TSM actually reading the entire tape?  If so, wouldn't I see
lots of data being transferred?  Or is there some kind of SCSI command that
allows the drive to read and compare the data it gets?  I thought TSM
stored actual locations of the files in the DB, so it could quickly find
any file (or aggregate) without reading the whole tape... I've been
searching the literature, and I can't find any details on this.

The TSM server is on HP-UX 11i, with IBM LTO-2 drives, fiber attached, in an
STK L700 library.  Also, my DB is huge (314GB), and for the last year we
have been unable to delete anything, so we have many versions of volatile
files.  We are planning to split our environment into several TSM servers,
and in the short term our Windows admins will start doing weekly selective
backups of the C: drives to consolidate active versions onto a few tapes.

Thanks for any thoughts on this....

Robin Sharpe
Berlex Labs