ADSM-L

Re: Idle system fails with Media mount not possible

2002-12-17 19:13:22
Subject: Re: Idle system fails with Media mount not possible
From: David Longo <David.Longo AT HEALTH-FIRST DOT ORG>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 17 Dec 2002 19:12:06 -0500
You didn't mention the Maint Level of AIX 4.3.3.  If you have less
than ML 10, then I would upgrade to that as one step.

David Longo

>>> Todd_Lundstedt AT VIA-CHRISTI DOT ORG 12/17/02 05:15PM >>>
How strange.. I just went through something similar.  Running on AIX
4.3.3, TSM 4.2.1.7.  What are you running?
The short answer was to reboot the AIX operating system, and everything
started working fine.. The long answer follows (well, not really an
answer, just my situation, and what I tried to resolve it).

Server
AIX 4.3.3
TSM 4.2.1.7

Nodes
W2K
Storage Agent 4.2.1.7
BA Client 4.2.1.32
TDP for SQL 2.2
SQL 2000
and

WinNT4
Storage Agent 4.2.1.7
BA Client 4.2.1.15
TDP for SQL 1.1
SQL 6.5

Relevant TSM server storage as follows...
diskpool_sql_meta (no next storage pool; intended only for the
*/.../meta/.../* info)
diskpool_sql (next storage pool is ltotape_sql; intended for smaller
databases)
ltotape_sql (collocation of FILESPACE since /stripes=2 backups are kept
here)

The SQL 2000 server had been having issues over the last few months
where backups to ltotape_sql with /stripes=2 of a 265GB database would
fail with a "server media mount not possible" error, but /stripes=1
differential backups would back up fine.  Oddly, increasing the Maximum
Mount Points (MMP) for the node by one would allow the /stripes=2 backup
to succeed, but the next time a /stripes=2 backup would run, it would
fail (until I increased the MMP again).  I had 5 drives, all free and
unused, and 7 MMP for the node when... this new wrinkle occurred.

The SQL 6.5 server started having problems backing up certain
databases, the smaller system databases (master, model, msdb, pubs,
tempdb), with an error message of "server media mount not possible".
All the DBs on this server have a destination of ltotape_sql.  Like
you, plenty of room in the storage pool, plenty of scratch.

Called support

Got level one.. told him a few things.. he didn't even want to try it..
and immediately escalated to level two.  While I waited for a call back
from level two, the following occurred.

I noticed that there were some databases in diskpool_sql that hadn't
migrated to ltotape_sql.  Kicking off a migration got a similar error
message, "media mount not possible", which, oddly, is the same message I
got from the storage agent when backing up to ltotape_sql.

I carefully detailed what it took to migrate those 3 files from
diskpool_sql to ltotape_sql, which is a whole other chapter by itself,
involving changing maxscratch up and down, moving data, and a few other
hoops, and I was unable to get some tapes to "move" with a move data
command (tapes that had only one master or msdb or tempdb type database
on them).

Level two calls back.  I go through the entire situation, including the
fact of the Max Mount Point having to change every time I did a
/stripes=2 backup (I wasn't sure if that was a related issue or not).
She is baffled, and wants to think it over and search databases etc to
see what she can come up with.  Within 30 mins, she calls back and asks
me to reboot the TSM server's OS (uptime reported a whopping 82 days),
just to see what would happen.  I do.  Migrations go.  Backups
/stripes=1 go.  Backups /stripes=2 go (even with MMP set back to 4 for
that node, instead of 7, with only 5 tape drives, remember).  This was
Friday.

Sunday night, the TSM server did something odd (haven't reported this to
TSM support yet).  It just stopped.  It showed link status on the fiber
cards and network cards, but you couldn't ping it, the server console
wouldn't wake up, nothing.  Even the display on the front was dark, but
the power light was on steady like it was operational, not flashing like
it would be if you did a proper shutdown.  I "reset" it Monday morning
when I found it that way, and then had to do a clean shutdown and power
on to get the fiber cards to see the library correctly.  Very weird.

So, I am taking Monday morning (yesterday) as the start time to see how
long it takes until I have to increase my MMP on the one node just to
get a /stripes=2 backup.

The saga continues...

"Conko, Steven" <sconko AT ADT DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
12/17/2002 03:19 PM
Please respond to "ADSM: Dist Stor Manager"
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Idle system fails with Media mount not possible

Strange one... and I've looked at everything I can think of.

In client dsmerror.log:

12/17/02   15:01:54 ANS1228E Sending of object '/tibco/logs/hawk/log/Hawk4.log' failed
12/17/02   15:01:54 ANS1312E Server media mount not possible

12/17/02   15:01:57 ANS1312E Server media mount not possible
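
To see how often each message recurs in a longer dsmerror.log, something
like the following Python sketch might help. It assumes every entry sits
on a single line shaped like the excerpt above (date, time, ANS code,
text), so the wrapped ANS1228E entry is joined in the sample; the
function name and regular expression are illustrative, not part of any
TSM tooling.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt: "MM/DD/YY   HH:MM:SS ANSnnnnX text"
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2})\s+(\d{2}:\d{2}:\d{2})\s+(ANS\d{4}[A-Z])\s+(.*)$"
)

def count_message_codes(log_text):
    """Count how many times each ANS message code appears in the text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group(3)] += 1
    return counts

# Sample copied from the dsmerror.log excerpt in this message.
sample = """\
12/17/02   15:01:54 ANS1228E Sending of object '/tibco/logs/hawk/log/Hawk4.log' failed
12/17/02   15:01:54 ANS1312E Server media mount not possible

12/17/02   15:01:57 ANS1312E Server media mount not possible
"""

counts = count_message_codes(sample)
for code, n in sorted(counts.items()):
    print(code, n)
```

Blank lines and anything that does not match the assumed shape are
simply skipped.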



In activity log:

ANR0535W Transaction failed for session 1356 for node
SY00113 (AIX) - insufficient mount points available to
satisfy the request.
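
To match server-side failures against client sessions, the session
number and node name can be pulled out of the ANR0535W text. The Python
sketch below assumes the wording shown in the excerpt above and that the
message may be wrapped across lines; the function name and pattern are
hypothetical, not a TSM interface.

```python
import re

# Pattern follows the wording of the ANR0535W excerpt; an assumption, not an official format.
ANR0535W_RE = re.compile(
    r"ANR0535W Transaction failed for session (\d+) for node (\S+) \((\w+)\)"
)

def parse_anr0535w(text):
    """Return session, node and platform for each ANR0535W message in the text."""
    # Collapse line wraps and repeated spaces so a wrapped message matches.
    flat = " ".join(text.split())
    return [
        {"session": int(session), "node": node, "platform": platform}
        for session, node, platform in ANR0535W_RE.findall(flat)
    ]

# Sample copied from the activity log excerpt in this message.
sample = """ANR0535W Transaction failed for session 1356 for node
SY00113 (AIX) - insufficient mount points available to
satisfy the request."""

hits = parse_anr0535w(sample)
print(hits)
```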


There is NOTHING else running on this TSM server. All 6 drives are
online. The backup is going to an 18GB diskpool that is 8% full, there
are plenty of scratch tapes, I set max mount points to 2, keep mount
point=yes. It starts backing up the system then just fails... always at
the same point. The file it's trying to back up does not exceed the max
size. All drives are empty, online. Diskpool is online. I see the
sessions start and then, after a minute or 2, just abort.

any ideas?