ADSM-L

Re: Copy Pool He&%$##

1997-02-28 19:58:40
Subject: Re: Copy Pool He&%$##
From: "Dwight E. Cook" <decook AT AMOCO DOT COM>
Date: Fri, 28 Feb 1997 18:58:40 -0600
Item Subject: Copy Pool He&%$##
     I'm gonna have fun here ;-) but I will provide useful info...
     1) you prefer a 12GA or 20GA shotgun NO NO no no no...OK really...
     ******read #5 for sure *******

     1) I'd run a script at the end of the backup stgpool process that
     does a "q vol acc=offsite" and then feeds that list to a step that
     does an "update vol xxxxxx acc=readw" for each volume; this should
     let them be used for future writes...
     2) HOW DID YOU GET THAT MUCH INFO INTO EXCEL without it blowing up?
        I was even wishing for a COBOL compiler the other day working
        with accounting records and EXCEL.
     3) Broke a BIG rule, can't remember the exact manual & section but
        there is something about everything downstream of a collocated pool
        must be yada yada yada.... I guess it says that 'cause no matter
        how you set it, you're gonna end up collocated. (so it seems)
     4) I'll bet if you look into your copypool you will see when a backup
        data set is active, there is a copy, when it rolls 1st level
        inactive - there is a copy, when it rolls 2nd level inactive -
        there is a copy...  NOT that I know for sure but guess-duh-mating
        how they might re-use routines and try to be slick... well, let's
        just say I've got so many bullet holes in my feet I learned my
        lesson...as soon as the thought "Hey, this is gonna be cool" enters
        my mind EXTREME PAIN rushes through my extremities.
     5) I bet the TAPE NEEDED is to somehow flag it to return it...
        I wonder, since you are talking 3480 are you MVS ?
        **** YOU EVER DIG DEEP INTO (oh god i've been away too long) THE
        DAMN DCB I think of a running process's DD, THIS IS WHY REFER BACKS
        (used to anyway) in JCL didn't work... They used a fixed length
        array of something like 4 or 5 so if you wrote more than 5 or 6
        tapes (file after file to a tape in following job steps) you would
        hit a step that would have NO IDEA WHAT TAPE TO MOUNT ! ! ! ! ! !
        That was under MVS/XA and took me about 1/2 a week to find %-(
        The only way that 6th tape could be found was at the end of
        the 5th.  Ahh, just caught the end of your message, yep MVS
        They probably have seen the error in their way and just said
         "Call for them all, one of'ems sure to be what we need"
        Right up there with "Oh multivolume dataset, mount all volumes at
        the same time, we'll need'em eventually"
     6) Boy I'm glad I've been holding off on this :-O

     But really, anywhere else and you would have to pay for this much
     entertainment!

     I think enough has been exposed already to make it function as needed.
     Try the "reset to readw" asap and I'll bet it will reuse those
     tapes...  I'd like to hear.  I'd try it on my test box but all I've
     got there is a 10 tape 8mm stacker, blah!
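
     A minimal sketch of the "reset to readw" idea, in Python.  It assumes
     the "q vol acc=offsite" output has been captured as text with the
     volume name in the first column.  The sample layout, the COPYPOOL
     name, the 1-6 character volume name pattern and the
     build_update_commands helper are all illustrative assumptions, not
     taken from a real server.  It only prints the commands, so the list
     can be reviewed before anything is fed back to the server.

```python
import re

# Assumption: volume names are 1-6 uppercase letters or digits, which
# also filters out header and separator lines in the captured output.
VOLSER = re.compile(r"^[A-Z0-9]{1,6}$")


def build_update_commands(query_output):
    """Turn captured 'q vol acc=offsite' text into update commands."""
    commands = []
    for line in query_output.splitlines():
        tokens = line.split()
        # Keep only lines whose first column looks like a volume name.
        if tokens and VOLSER.match(tokens[0]):
            commands.append(f"update vol {tokens[0]} acc=readw")
    return commands


# Hypothetical captured output; a real server's layout may differ.
sample = """\
Volume Name   Storage Pool   Pct Util   Status
-----------   ------------   --------   ------
A00123        COPYPOOL       0.4        Filling
A00124        COPYPOOL       12.0       Full
"""

for cmd in build_update_commands(sample):
    print(cmd)
```

     If the real output puts the volume name somewhere other than the
     first column, only the token index and the pattern need to change.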
     later
          Dwight


______________________________ Reply Separator _________________________________
Subject: Copy Pool He&%$##
Author:  ADSM-L at unix,sh/DD.RFC-822=ADSM-L AT VM.MARIST DOT EDU
Date:    2/28/97 1:47 PM


Date:     February 28, 1997            Time:    13:32
From:    Jerry Lawson
    The Hartford Insurance Group
    (860) 547-2960    jlawson AT itthartford DOT com
-----------------------------------------------------------------------------
Can anyone shed some light on what is going on - this has been an experience
to say the least - one I don't want to repeat.

We started implementing copypools in mid December, and have the activity
almost complete.  Then yesterday I found that we had more tapes in our
copypool than in our primary pool!  Since we haven't actually sent anything
off-site yet (although the tapes are marked as being off-site), we started to
become concerned.  Here is what I found.

1.  For a pool of servers, I now have 2600+ tapes in the copypool.  The
primary pool has considerably less.  The reason for this (of course) is that
we have the reclamation threshold for the copypool set at 100%.

2.  I did a Q vol on the copypool, and spooled the output to disk, then took
the output into an Excel spreadsheet.  Of the  2600+ tapes, over 1100 had less
than 10% active data, of these approximately 700 had less than 1% used data -
many showed 0%, and when I looked, they contained 1 file.
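
The same tally can be done without a spreadsheet. The following is only a sketch in Python, assuming the spooled Q vol output has already been reduced to (volume name, percent utilized) pairs; the summarize_utilization helper and the sample volumes are hypothetical, and the 10% and 1% cut-offs are the ones used above.

```python
def summarize_utilization(volumes):
    """Count volumes below the 10% and 1% utilization cut-offs.

    `volumes` is an iterable of (volume_name, pct_utilized) pairs.
    """
    volumes = list(volumes)
    under_10 = [name for name, pct in volumes if pct < 10.0]
    under_1 = [name for name, pct in volumes if pct < 1.0]
    return {
        "total": len(volumes),
        "under_10_pct": len(under_10),
        "under_1_pct": len(under_1),
    }


# Hypothetical data standing in for the spooled query output.
sample_volumes = [
    ("A00001", 0.0),
    ("A00002", 0.7),
    ("A00003", 8.5),
    ("A00004", 42.0),
    ("A00005", 97.3),
]

print(summarize_utilization(sample_volumes))
```

Every volume under 1% is also counted in the under-10% figure, matching the way the numbers are reported above.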

At this point I should probably add that we collocate the primary pool, but do
not collocate the copypool.

At the same time, we took an error on a primary pool tape - a damaged header,
so the 731 files on this tape need to be recovered from the copypool.  We did
a Restore Vol... Preview =yes, and found that we needed to bring back 22
volumes.  This has me thoroughly confused because the damaged volume is from a
server that backs up "Mass Quantities" of data every night - last night the
volume was 4.3G.  When we reclaim the DASD space, we have to be writing volume
after volume (of 3480 cartridges) to the tape pool.  If the copypool tape was
made from the DASD pool, or from the tape, I would not have expected it to be
spread over so many different volumes.  ---- I should add that the copypool
tapes have many different creation dates and times; it just doesn't make any
sense to me.

Anyway, back to my copypool.

3.  We decided that as a test, we would set the reclamation threshold to 99%, and
see what happened.  A reclamation process kicked off, but much to my surprise,
we received what seemed to be 700 messages indicating that the following
volumes were needed for the reclamation of a copypool tape -- but it didn't say
which tape!  We guessed at the first tape in the list, but were wrong.  We were
asked for mounts of primary pool tapes (of course), and the reclaim processed
for a while.  It seemed to finish normally in about 10 minutes, and after
about 3 mounts.  We of course got 700 more messages stating that we didn't need
the following tapes anymore.   :-)

4.  About a minute later, another reclaim kicked off, and we got the 700
messages again.  The operators must love us!  :-)

5.  We decided to be bold and let it run, and after about 2 hours, we found
that we had indeed reclaimed about 200 tapes - the number of "Empty" tapes had
risen by that amount (We return Empty tapes on a daily basis, so there are
almost always some tapes here, but not that many.  This functions OK, it
seems.)

*****  My first question  ****  Why do we get so many "Tape needed" messages
for reclaims, when apparently they are not needed?  We certainly haven't
mounted that many tapes.

6.  As we were poking through the logs, we stumbled on a series of ANR1163W
messages, indicating that we have data on tapes in the copypool that no longer
exists in the primary pool.  The message and its description seem clear
enough, but I got about 700 of these messages, too!  Now we have had an
occasional tape go bad on us (I think a total of 4) and we recovered from them, but
I don't think they were tapes that had been backed up to a copy pool - the one
I mentioned earlier was our first attempt at recovery.

*****  My second question ******  What is going on with the "not in primary
pool" messages?  It appears to me that the messages are being dumped on every
tape outstanding, and not just the bad one(s).  Are there any known problems
with this?  I can't think of any other reason (other than a damaged primary
tape) for data to reside only in a copy pool.

The good news is that since we have started this (about 3 hours ago) we have
added approximately 300 tapes to the status of "empty" - these will be
returned to scratch status this evening.

*****  My last question - how are other people doing with copypools - in
regards to tapes with not a lot of data on them?  ******

oh yes - I know that Inquiring minds want to know - my server is MVS, it is at
release 2.1.0.7  - yes I know that's old :-(
*****************************************************************************
Jerry Lawson
The Hartford Insurance Group
jlawson AT thehartford DOT com

Any idiot can face a crisis.  It's the day to day stuff that really wears you
down.

                        Anton Chekov