ADSM-L

Re: Curious - Any Suggestions

2003-06-26 14:52:46
Subject: Re: Curious - Any Suggestions
From: David Longo <David.Longo AT HEALTH-FIRST DOT ORG>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 26 Jun 2003 14:51:37 -0400
Looking at your data briefly, I have seen this happen a couple of times.
It was with an IBM 3575-L32 library with C-XL drives connected to an
RS/6000 F50, AIX 4.3.3 and TSM 3.7.4.0.  Tapes would sit at one spot for a
long time (an hour or so), but there would be no errors in the actlog or
AIX errpt - no messages at all.  Basically it was a drive problem - like the
drive was running in real slow motion.

The IBM CE didn't believe me and ran diagnostics with his own tape.  Instead
of taking a couple of minutes or so, the diag ran for 10-15 minutes,
giving no error and not finishing.  He replaced the drive and it was o.k.

I know I had this at least twice.
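
Since nothing gets logged when this happens, one way to catch it is to
compare two snapshots of the q pro Status text taken some minutes apart.
Below is a minimal sketch in Python, assuming the Status text looks like
the output quoted further down.  The function names parse_progress and
is_stalled are made up for illustration and are not part of TSM, and the
sketch does not run any TSM command itself; it only compares text you
have already captured.

```python
import re


def parse_progress(status: str) -> dict:
    """Pull the progress counters out of a 'q pro' Status string.

    Assumes the field labels seen in the quoted output; line wrapping
    inside the status text is tolerated.
    """
    text = " ".join(status.split())  # collapse wrapped lines into one

    def grab(label: str):
        m = re.search(re.escape(label) + r":\s*(\d[\d,]*)", text)
        return int(m.group(1).replace(",", "")) if m else None

    return {
        "files": grab("Files Backed Up"),
        "bytes": grab("Bytes Backed Up"),
        "current_file": grab("Current Physical File (bytes)"),
    }


def is_stalled(earlier: dict, later: dict) -> bool:
    """True when every counter was found and none moved between samples."""
    if any(v is None for v in earlier.values()):
        return False  # could not parse, so make no claim either way
    return earlier == later
```

Unchanged counters only suggest a stall.  A single very large file may
hold the numbers steady for a while, so how long to wait between samples
is a judgment call.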


David B. Longo
System Administrator
Health First, Inc.
3300 Fiske Blvd.
Rockledge, FL 32955-4305
PH      321.434.5536
Pager  321.634.8230
Fax:    321.434.5509
david.longo AT health-first DOT org


>>> MTailor AT CARILION DOT COM 06/26/03 02:37PM >>>
Hello, everyone!

ITSM 5.1.6.4
AIX 4.3.3.10
IBM pSeries 6M1 w/8GB RAM & 6-processors
Diskpool: 455GB
Tape Library: IBM 3494 w/6 - 3590H1A-FC drives; each drive is direct
attached to 6M1
Nightly backup: 1-2TB - 1.5TB

Running a backup stg tapepool copypool maxpr=2 and have this strange
situation:

tsm: TSM01>q pro

 Process     Process Description      Status
  Number
--------     --------------------     -------------------------------------------------
     222     Backup Storage Pool      Primary Pool TAPEPOOL, Copy Pool COPYPOOL, Files
                                      Backed Up: 196796, Bytes Backed Up:
                                      385,138,382,699, Unreadable Files: 0, Unreadable
                                      Bytes: 0. Current Physical File (bytes):
                                      1,677,934,931
                                      Current input volume: T11272.
                                      Current output volume: T10122.


tsm: TSM01>q sess

  Sess     Comm.      Sess         Wait       Bytes       Bytes     Sess      Platform     Client Name
Number     Method     State        Time        Sent       Recvd     Type
------     ------     ------     ------     -------     -------     -----     --------     --------------------
10,310     Tcp/Ip     Run          0 S        3.1 K         334     Admin     AIX          ADMIN

tsm: TSM01>q db f=d

          Available Space (MB): 64,948
        Assigned Capacity (MB): 41,940
        Maximum Extension (MB): 23,008
        Maximum Reduction (MB): 10,000
             Page Size (bytes): 4,096
            Total Usable Pages: 10,736,640
                    Used Pages: 4,526,680
                      Pct Util: 42.2
                 Max. Pct Util: 42.2
              Physical Volumes: 26
             Buffer Pool Pages: 32,768
         Total Buffer Requests: 252,290,353
                Cache Hit Pct.: 99.51
               Cache Wait Pct.: 0.00
           Backup in Progress?: No
    Type of Backup In Progress:
  Incrementals Since Last Full: 0
Changed Since Last Backup (MB): 853.36
            Percentage Changed: 4.83
Last Complete Backup Date/Time: 06/26/03   00:30:54

tsm: TSM01>q log f=d

       Available Space (MB): 8,024
     Assigned Capacity (MB): 8,024
     Maximum Extension (MB): 0
     Maximum Reduction (MB): 8,020
          Page Size (bytes): 4,096
         Total Usable Pages: 2,053,632
                 Used Pages: 281
                   Pct Util: 0.0
              Max. Pct Util: 15.5
           Physical Volumes: 4
             Log Pool Pages: 512
         Log Pool Pct. Util: 1.80
         Log Pool Pct. Wait: 0.00
Cumulative Consumption (MB): 482,846.17
Consumption Reset Date/Time: 03/17/03   12:13:14

Given the above, this system is sitting essentially idle; but in
the q pro output above, the Files Backed Up count has been sitting at the
same number, with the same Current Physical File, for more than 30 minutes.
Surely this should have completed by now!?!

JFYI, given that I issued the command with maxpr=2 there should have
been two processes; one of the processes completed successfully.

There are no errors in the actlog.  When I go through the 3494's web
interface, there are no commands in queue.  All drives are available,
all paths are available, no errors on FC adapters, or network
interfaces.  No errors on either volume.  I know this should really not
be affected, but here it is anyway: (a) CPU utilization for the last
20 minutes on all processors is less than 1%, (b) no page faults, page
ins or outs, and (c) all hard drives are at 0% busy.

Anyone else experienced something like this?  Any suggestions?

TIA

Mahesh
