ADSM-L

Re: [ADSM-L] Client fails with "Server media mount" when no drives are available

2008-06-19 12:19:44
From: "Schneider, John" <John.Schneider AT MERCY DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 19 Jun 2008 11:18:26 -0500
Well,
        As it is, we have 128 virtual drives, which we thought would be
enough.  But sure, we could define and configure some more.
        But that actually leads us to another problem we are grappling
with.  When we get periods of high numbers of simultaneous tape mounts,
we sometimes get:

06/16/08 21:14:58     ANR8925W Drive CDL-LTO1-059 in library CDL-LIB1 has not
                      been confirmed for use by server EPCTLF01 for over 900
                      seconds. Drive will be reclaimed for use by others.
06/16/08 21:14:58     ANR8925W Drive CDL-LTO1-100 in library CDL-LIB1 has not
                      been confirmed for use by server MDCTSM03 for over 900
                      seconds. Drive will be reclaimed for use by others.
06/16/08 21:14:58     ANR8925W Drive CDL-LTO1-127 in library CDL-LIB1 has not
                      been confirmed for use by server MDCTSM04 for over 900
                      seconds. Drive will be reclaimed for use by others.
06/16/08 21:14:58     ANR8925W Drive DRIVE_F1_D04 in library SUN0092 has not
                      been confirmed for use by server MDCTSM03 for over 900
                      seconds. Drive will be reclaimed for use by others.
06/16/08 21:14:58     ANR8925W Drive LTO4_F1_D03 in library SUN2079 has not
                      been confirmed for use by server MDCTSM03 for over 900
                      seconds. Drive will be reclaimed for use by others.

We get them for both virtual and real tape drives.  Because the tape
drive that the library master gets the error for is actually still in
use, the TSM library master and the library client fight over the drive,
get errors, and take the paths offline.  It is a huge mess.  In
several cases the drives end up in a "RETRY DISMOUNT FAILURE" state, and
we end up having to restart the TSM instances to clear it.
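
To see which library clients and libraries are involved, the ANR8925W
messages can be tallied from a saved activity log extract.  The following
is only a minimal Python sketch, assuming the log text has been saved to a
string or file.  The function name tally_reclaims and the regular
expression are illustrative and not part of TSM.  Whitespace is collapsed
first, so wrapped log lines like the ones above still match.

```python
import re
from collections import Counter

# Pattern for the ANR8925W message text after whitespace is collapsed.
ANR8925W = re.compile(
    r"ANR8925W Drive (\S+) in library (\S+) has not been confirmed "
    r"for use by server (\S+) for over (\d+) seconds"
)

def tally_reclaims(log_text):
    """Count ANR8925W messages per library client and per library."""
    flat = " ".join(log_text.split())  # undo line wrapping and padding
    by_server = Counter()
    by_library = Counter()
    for _drive, library, server, _seconds in ANR8925W.findall(flat):
        by_server[server] += 1
        by_library[library] += 1
    return by_server, by_library

# Three of the messages quoted above, left wrapped on purpose.
SAMPLE = """
06/16/08 21:14:58     ANR8925W Drive CDL-LTO1-059 in library  CDL-LIB1
has not
                       been confirmed for use by server EPCTLF01 for
over 900
                       seconds. Drive will be reclaimed for use by
others.
06/16/08 21:14:58     ANR8925W Drive CDL-LTO1-100 in library  CDL-LIB1
has not
                       been confirmed for use by server MDCTSM03 for
over 900
                       seconds. Drive will be reclaimed for use by
others.
06/16/08 21:14:58     ANR8925W Drive DRIVE_F1_D04 in library  SUN0092
has not
                       been confirmed for use by server MDCTSM03 for
over 900
                       seconds. Drive will be reclaimed for use by
others.
"""

by_server, by_library = tally_reclaims(SAMPLE)
print(dict(by_server))
print(dict(by_library))
```

If one library client dominates the per-server count, that may be a
reasonable place to start looking.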

Looking at the IBM knowledge base, this seems like it might be
addressed by the fix for either of these APARs:

IC54647: ANR8925W - TSM LIBRARY MANAGER INVALIDLY ATTEMPTING TO RECLAIM
DRIVE(S) BEING USED BY LIBRARY CLIENT(S)
IC52528: LIBRARY MANAGER SERVER ISSUES ANR8925W DRIVE IN LIBRARY NOT
CONFIRMED DUE TO THREAD NOT STARTING ON LIBRARY CLIENT.

Has anybody with a lot of simultaneous tape mounts seen this same
problem, and tried these fixes?

Best Regards,

John D. Schneider 
Phone: 314-364-3150 
Cell: 314-486-2359 
Email:  John.Schneider AT Mercy DOT net 


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
David E Ehresman
Sent: Thursday, June 19, 2008 10:12 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Client fails with "Server media mount" when no
drives are available

I have no answer to your question.  But one of the beauties of a VTL is
that you can define additional drives as needed.  So why not define as
many drives as you have concurrent sessions?

David

>>> "Schneider, John" <John.Schneider AT MERCY DOT NET> 6/18/2008 6:18 PM >>>
Greetings,
    I am trying to understand the behavior I am seeing in a LAN-free
client.  This is a TSM 5.3.5.0 LAN-free client, with a TSM 5.3.4.0 AIX
client, connecting to a TSM 5.4.2.0 AIX server.  The TSM server, in turn,
is a library client, and gets its tape mounts from another TSM 5.4.2.0
server acting as the library master for a virtual tape library.
    The LAN-free client is configured to write its data to a virtual
tape device.  Sometimes, because of the number of simultaneous clients
running, all the virtual tape drives happen to be in use, and the TSM
server produces:
 
06/17/08 21:08:05     ANR8447E No drives are currently available in library
                      CDL-LIB1. (SESSION: 4856)
06/17/08 21:08:08     ANR9790W Request to mount volume *SCRATCH* for library
                      client EPCTLF01 failed. (SESSION: 4856)
 
The device classes for these tape libraries are set for a mount wait of
120 minutes.  Why does the mount fail right away when there aren't
enough tape drives?  Why doesn't it wait the 120 minutes, which would be
plenty of time?
 
On the client side, we get:
 
06/17/08   21:08:09 Normal File-->     7,340,032,000 /tsmprdsgf/epic/prdsgf03/er1ept01/CACHE.DAT  ** Unsuccessful **
06/17/08   21:08:09 ANS1228E Sending of object '/tsmprdsgf/epic/prdsgf03/er1ept01/CACHE.DAT' failed
06/17/08   21:08:09 ANS1312E Server media mount not possible

06/17/08   21:08:34 Normal File-->     1,048,576,000 /tsmprdsgf/epic/prdsgf03/er1fsd/CACHE.DAT  ** Unsuccessful **
06/17/08   21:08:34 ANS1114I Waiting for mount of offline media.
06/17/08   21:09:06 Retry # 1  Normal File-->     1,048,576,000 /tsmprdsgf/epic/prdsgf03/er1fsd/CACHE.DAT [Sent]
06/17/08   21:09:40 Normal File-->     1,048,576,000 /tsmprdsgf/epic/prdsgf03/er1inp/CACHE.DAT [Sent]

 
If you look at the log, a single file fails to back up because of a
"Server media mount not possible", which eventually causes the whole
backup to fail with an RC=12.  But notice that only about 30 seconds
later, another stream in the backup needs a virtual tape drive, and one
happens to be available, and the tape mount works and the rest of the
backup proceeds.
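
The gap between the failed mount and the next mount wait can be read
straight off the client log timestamps.  A minimal Python sketch, assuming
the log lines start with a date and time in the format shown above.  The
helper name first_time is illustrative only.

```python
from datetime import datetime

def first_time(log_text, code):
    """Return the timestamp of the first log line containing `code`."""
    for line in log_text.splitlines():
        if code in line:
            date_part, time_part = line.split()[:2]
            return datetime.strptime(
                f"{date_part} {time_part}", "%m/%d/%y %H:%M:%S"
            )
    return None  # message code not present in the log

# Two lines taken from the client log quoted above.
SAMPLE = """\
06/17/08   21:08:09 ANS1312E Server media mount not possible
06/17/08   21:08:34 ANS1114I Waiting for mount of offline media.
"""

failed = first_time(SAMPLE, "ANS1312E")
waiting = first_time(SAMPLE, "ANS1114I")
gap_seconds = (waiting - failed).total_seconds()
print(gap_seconds)
```

On these two lines the gap is 25 seconds, so the second stream went into a
mount wait well inside the mount wait period that the first stream never
got to use.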
 
Is something in my setup missing?  Why does a mount fail when there
aren't enough tape drives?  Why doesn't the client just go into
MediaWait status, and sit tight until a tape drive frees up?  Does the
LAN-free configuration have something to do with it?  There are lots of
times with our "real" IBM3584 library when there aren't enough tape
drives to go around, and TSM processes wait until one frees up.  What is
different here?
 

Best Regards,

John D. Schneider
Lead Systems Administrator - Storage
Sisters of Mercy Health Systems
3637 South Geyer Road
St. Louis, MO  63127
Phone: 314-364-3150
Cell: 314-486-2359
Email:  John.Schneider AT Mercy DOT net 

 
This e-mail contains information which (a) may be PROPRIETARY IN NATURE
OR OTHERWISE PROTECTED BY LAW FROM DISCLOSURE, and (b) is intended only
for the use of the addressee(s) named above. If you are not the
addressee, or the person responsible for delivering this to the
addressee(s), you are notified that reading, copying or distributing
this e-mail is prohibited. If you have received this e-mail in error,
please contact the sender immediately.