ADSM-L

Re: Very slow restores (days), hours to locate files

2005-07-06 13:50:01
From: Matthias Feyerabend <M.Feyerabend AT GSI DOT DE>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 6 Jul 2005 19:49:39 +0200
Which firmware level are your IBM LTO2 FC drives running?
Last year we had big problems with firmware 38D0, until we switched to
4770.
You can check for this problem by using tapeutil to skip to EOD (end of
data) and seeing how long it takes.
It should take a few minutes; with our old firmware it took an hour or more.
Just a guess.
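
The EOD check can be timed with a small wrapper. This is only a sketch: the device path /dev/rmt/0m and the invocation "tapeutil -f <device> seod" are assumptions that should be checked against the tapeutil documentation for your driver level, and it should only be run against a tape that can safely be repositioned.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(cmd)
    return result.returncode, time.monotonic() - start


# Example call (assumed device path and subcommand, adjust for your system):
#   code, elapsed = time_command(["tapeutil", "-f", "/dev/rmt/0m", "seod"])
#   print(f"exit code {code}, elapsed {elapsed / 60:.1f} minutes")
```

An elapsed time of a few minutes would match what we saw after the firmware change; an hour or more would match what we saw before it.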

Regards Matthias

Robin Sharpe wrote:

Hi guys,

We're having problems restoring some Windows servers (W2K)...
The servers in question had some disk problems and are being rebuilt, so
the Windows admins are restoring the C: drive.  It is an 8GB drive and less
than 50% used, so only 4GB to restore.  It has taken several days to
restore.  I know one of our problems is that the data is spread over
hundreds of volumes (literally... I counted 310 from a volumeusage query).
Another problem is that we have an overflowed library, but we have loaded
all of the tapes from the Windows storage pool.  What I don't understand is
why it takes so long to locate a file once the tape is mounted.  We have
seen the same tape mounted for hours before any data is transferred.  Here
is an excerpt from a "q se f=d" of a restore that is running right now:

              Sess Number: 1,143
             Comm. Method: TCP/IP
               Sess State: Run
                Wait Time: 0 S
               Bytes Sent: 670.9 M
              Bytes Recvd: 58.2 K
                Sess Type: Node
                 Platform: WinNT
              Client Name: WANO01
      Media Access Status: Current input volume(s):  200658,(2279 Seconds)
                User Name:
Date/Time First Data Sent:
   Proxy By Storage Agent:

This restore has been running for almost 12 hours now (they have been
restarting them periodically).  There has been NO DATA transferred from
that tape in the 38 minutes it has been mounted... I know this from doing
an lsof command and looking at the offset which indicates the number of
bytes transferred.
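
As a rough sanity check, the figures quoted in this message can be put together in a few lines of Python. This is a sketch under stated assumptions: it assumes the 670.9 M in "Bytes Sent" accumulated over the roughly 12 hours mentioned, and that all 310 volumes hold part of the 4GB being restored; neither is confirmed by the session output.

```python
import re

# Line copied from the "q se f=d" excerpt above.
status_line = "Media Access Status: Current input volume(s):  200658,(2279 Seconds)"


def parse_media_access(line):
    """Return (volume, seconds_mounted) from a Media Access Status line, or None."""
    match = re.search(r"volume\(s\):\s*(\w+),\((\d+) Seconds\)", line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


volume, seconds = parse_media_access(status_line)
print(f"volume {volume} mounted for {seconds / 60:.0f} minutes")

# Figures quoted in the message.
data_to_restore_mb = 4 * 1024   # "only 4GB to restore"
volumes = 310                   # counted from the volumeusage query
print(f"average data per volume: {data_to_restore_mb / volumes:.1f} MB")

# ASSUMPTION: the 670.9 M "Bytes Sent" accumulated over about 12 hours.
bytes_sent_mb = 670.9
hours_running = 12
rate_mb_per_hour = bytes_sent_mb / hours_running
print(f"average rate: {rate_mb_per_hour:.1f} MB/hour")
days = data_to_restore_mb / rate_mb_per_hour / 24
print(f"projected time for 4GB at that rate: {days:.1f} days")
```

Under those assumptions each tape contributes only about 13 MB on average, and the projected total is on the order of three days, which is consistent with the "several days" observed.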

I know that when I restore a single file, it can be found within seconds of
mounting a tape (these are all LTO-2)... so, why does it take so long in
this case?  Is TSM actually reading the entire tape?  If so, wouldn't I see
lots of data being transferred?  Or is there some kind of SCSI command that
allows the drive to read and compare the data it gets?  I thought TSM
stored actual locations of the files in the DB, so it could quickly find
any file (or aggregate) without reading the whole tape... I've been
searching the literature, and I can't find any details on this.

The TSM server is on HP-UX 11i, IBM LTO-2 drives, fiber attached, in a STK
L700 library.  Also, my DB is huge (314GB), and we are currently (for the
last year) unable to delete anything, so we have many versions of volatile
files.  We are planning to split our environment into several TSM servers,
and in the short term, our Windows admins will start doing weekly selective
backups of the C: drives to consolidate active versions on a few tapes.

Thanks for any thoughts on this....

Robin Sharpe
Berlex Labs




--
Matthias Feyerabend                     | M.Feyerabend AT gsi DOT de
Gesellschaft fuer Schwerionenforschung  | phone +49-6159-71-2519
Planckstr. 1                            | private +49-6151-718781
D-62291 Darmstadt                       | fax   +49-6159-71-2519