ADSM-L

Re: [ADSM-L] 6.1 experience so far

2009-12-11 23:56:06
Subject: Re: [ADSM-L] 6.1 experience so far
From: Zoltan Forray/AC/VCU <zforray AT VCU DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 11 Dec 2009 23:55:22 -0500
Good luck moving the 40 million objects. You have read about my
trials and tribulations trying to do the same.

Be sure to be on 6.1.2.1 level of the server, since it addresses problems
with "long running transactions" locking/killing the logs.

Also, as you have seen from my earlier post today (confirmed by another
poster), there will be times when the server just goes off on its own for
hours at a time when dealing with clients with lots and lots of objects.

You seem to have discovered the same problems/bugs we have.

There are still way too many issues to be resolved, especially when trying
to aggressively use 6.1, which is why we are staying in pseudo-production
until at least another service pack!



From:
Sam Sheppard <SHS AT SDDPC.SANNET DOT GOV>
To:
ADSM-L AT VM.MARIST DOT EDU
Date:
12/11/2009 07:44 PM
Subject:
[ADSM-L] 6.1 experience so far
Sent by:
"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>



After viewing the experiences of others on the list (particularly Mr.
Forray's) and fearing I would jinx myself, I hesitated to post this, but
decided to go ahead and post our adventures so far.

We had a visit from our Servergraph rep a couple of weeks ago and during
the conversation discovered that we seemed to be alone, at least among
their Southern California customers, in implementing TSM Version 6 in
production.  We began in September and started with Version 6.1.2.  We
are approaching completion of our project to migrate our existing TSM
5.5.3 servers, two on z/OS and one on Solaris, to TSM Version 6 on a new
AIX 6.1 P-520 server.

Our total database size for the three existing servers is about 120GB.
We are sharing a 3494 ATL with 8 TS1120 drives between the Solaris box
and the Version 6 server, with the Version 6 server acting as the
library manager. So we may be somewhat on the small end of the average
customer.

Since we started on a fresh box, it looks like we have avoided many of
the pitfalls associated with upgrading in place from version 5, but we
did experience what in hindsight look like fairly minor problems:

    IC62978 - active logs fill up due to DB2 table reorg processes. Fix
    was to specify the undocumented ALLOWREORGTABLE NO option.

    IC63373 - while running a large image backup (around 600GB) and
    several other clients, received messages ANS1316E and ANR0526W,
    indicating recovery log out of space, even though we have 30GB and
    it's not even close to full. Solution is to do the following to
    change a DB2 variable from its standard setting:

      1. Use the following db2 command to determine the number of log
      volumes used:
         db2 get db cfg for TSMDB1
      2. Multiply the value of the LOGPRIMARY parameter by 0.9 (90%).
      NUM_LOG_SPAN should be set to the result.

      3. Update NUM_LOG_SPAN by issuing the following db2 command:
         db2 update db cfg for TSMDB1 using NUM_LOG_SPAN <newValue>
      You may need to restart the TSM server, which will restart the
      db2 database as well.
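
The NUM_LOG_SPAN arithmetic in the steps above can be sketched as a small
shell function. This is only a sketch: the function name num_log_span_for
is hypothetical, and it assumes the "db2 get db cfg" output contains a
line of the form "... (LOGPRIMARY) = <n>". It rounds 90% of LOGPRIMARY
down to a whole number of log files.

```shell
#!/bin/sh
# Hypothetical helper: read `db2 get db cfg` output on stdin and print
# 90% of LOGPRIMARY, rounded down, as a candidate NUM_LOG_SPAN value.
# Assumes a line such as:
#   Number of primary log files                (LOGPRIMARY) = 60
num_log_span_for() {
    awk -F'=' '/\(LOGPRIMARY\)/ {
        gsub(/[ \t]/, "", $2)        # strip whitespace around the value
        print int($2 * 90 / 100)     # 90 percent, truncated to an integer
        exit
    }'
}

# Usage against a live instance (run as the DB2 instance owner):
#   new_value=$(db2 get db cfg for TSMDB1 | num_log_span_for)
#   db2 update db cfg for TSMDB1 using NUM_LOG_SPAN "$new_value"
```

The live commands are left as comments so the computed value can be
reviewed before anything is changed.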

    IC63637 - We have a large (30-40TB) amount of archived data to move
    from our existing server(s) to version 6. The good news is that the
    large archived image backups exported server-to-server very fast,
    around 60MB/sec. The bad news is that the Version 6 library manager
    function periodically reclaims a tape drive being used by the
    library client. In our case, this causes the large EXPORT/IMPORT
    process to fail and flags the file being exported at the time, so
    that a copy pool tape is requested if the process is restarted.
    The fix for this was to install version 6.1.2.1 and then replace
    the DSMSERV module with a fix version.

    Database backups suddenly failed for 5 days in a row, but then
    started working again when support requested various documentation.
    Looks like DB2 communicates with the TSM server with its own OPT
    file, specifying 'localhost' as the TCPSERVERADDRESS, which appeared
    to be failing even though all other functions in the TSM server were
    working fine. We are waiting for a recurrence.

    Export Node function apparently does not copy the MAXNUMMP setting.
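
One possible workaround for the missing MAXNUMMP setting is to reapply
it on the target server with UPDATE NODE. The sketch below only
generates the commands from a list of "NODE_NAME MAXNUMMP" pairs; the
function name, the input file nodes.txt, and the macro file name are
hypothetical, and how the list is collected from the old server is left
out.

```shell
#!/bin/sh
# Hypothetical helper: turn "NODE_NAME MAXNUMMP" pairs on stdin into
# UPDATE NODE administrative commands for the target server.
# Blank lines and lines starting with '#' are skipped.
maxnummp_commands() {
    while read -r node mp rest; do
        case "$node" in ''|'#'*) continue ;; esac
        printf 'update node %s maxnummp=%s\n' "$node" "$mp"
    done
}

# Usage: write the commands to a macro file, review it, then run it
# with the administrative client, for example:
#   maxnummp_commands < nodes.txt > fix_maxnummp.mac
#   dsmadmc -id=<admin> -password=<password> macro fix_maxnummp.mac
```

Generating a macro rather than running the updates directly leaves a
reviewable record of what was changed.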

    A (relatively) long list of quirks in the ISC, which we forced
    ourselves to use while our Servergraph license was updated. Some
    of these were only related to Firefox 3.5.4. The worst was a Java
    problem that 'unchecked' the 3 'enable sessions' boxes in the
    'Sessions' display of the Server Properties window when you left the
    display and then came back, causing all sessions to be disabled
    necessitating a server restart. Using IE, however, the ISC has
    become almost bearable and performs much better than previous
    versions.

    The Operational Reporter is not officially supported in Version 6,
    something we missed, but is easily modified to supply most of the
    info needed.

We have not seen the dreaded huge increase in database size and after
the setting of the ALLOWREORGTABLE option, we haven't had any log
problems either. We are currently running full database backups on
Monday, Wednesday, and Friday, with incrementals in between. Full DB
backup of the 45GB database takes about 6 minutes to a TS1120 drive.
As noted, the current size of our DB is around 45GB with about 2/3 of
our 350 clients having been moved. However, the largest of them, several
Windows file/print servers containing in the neighborhood of 40 million
files, are still to be moved. We begin testing next week on an NDMP
solution for these, or perhaps experiment with the new SnapDiff feature.
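
The database backup cadence described above (full on Monday, Wednesday,
and Friday, incremental on the other days) could be scripted roughly as
follows. This is a sketch, not our actual schedule definition: the
function name db_backup_type is hypothetical, and the device class name
is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: map a day-of-week number (1=Monday .. 7=Sunday,
# as printed by `date +%u`) to the database backup type.
db_backup_type() {
    case "$1" in
        1|3|5) echo full ;;          # Monday, Wednesday, Friday
        *)     echo incremental ;;   # all other days
    esac
}

# Usage (<DEVCLASS> stands for a local device class name):
#   dsmadmc -id=<admin> -password=<password> \
#       "backup db devclass=<DEVCLASS> type=$(db_backup_type "$(date +%u)")"
```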

Sam Sheppard
San Diego Data Processing Corp.
(858)-581-9668
