ADSM-L

Re: Backup of Tape Pool Still Running!

2002-12-14 08:58:54
Subject: Re: Backup of Tape Pool Still Running!
From: Richard Sims <rbs AT BU DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sat, 14 Dec 2002 08:58:04 -0500
>Has anyone come across this issue? We have a scheduled backup that starts
>at 6am and usually ends by 9. I just logged on to check on other processes
>and noticed that the schedule is still running but the byte count is not
>changing:
>Primary Pool UNIX_TAPEPOOL, Copy Pool DRM_TAPE_POOL, Files Backed Up: 2179,
>Bytes Backed Up: 3,930,293,749, Unreadable Files: 0, Unreadable Bytes: 0.
>Current Physical File (bytes): 3,122,565,063
>
>I issued a 'can pr' command and was told that the cancel is already
>pending!
>Thus, I can't cancel it. I hate to just bounce the TSM server without knowing
>what might have caused this error! Any help is appreciated!

The presence of a prior cancel points to another chef in the kitchen, as
indicated in another response.  (You should find out who, and what compelled
them to try to cancel a storage pool backup, of all things.  Check your server
Activity Log for indications of who, when.  Maybe signs of pre-emption?)

But the cancel did not "take", and that can be an indication of something
fouled.  Your query output does not reflect input-output volumes, which would
appear at the bottom of that query response:
 - If they were there but not included in the posting:
   The obvious first thing to do is a Query Volume F=D on the input and output
   volumes to determine when I/O last occurred.  Examine any operating system
   error log you have, which may reflect issues with media or hardware.  Have
   your operator check both drives to ensure they are active, with those volumes
   in them.  And, of course, a 3 GB file takes some time to copy, and TSM
   doesn't like to fulfill Cancels until such a unit of work is complete.
 - If there were no volumes in the response:
   This would indicate that TSM thinks it has performed the initial aspects
   of the cancel.  It may well be the 4.2 defect as indicated in the prior
   response.  Check also that TSM is not waiting for a problematic volume
   dismount (have the operator check the drive, or perform an 'mtlib' or like
   query, for the volumes as reflected in Activity Log still being mounted).
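
To judge whether the process is really stuck rather than just working through
a large file, it can help to capture the process status text twice, some
minutes apart, and compare the counters. The following is only a minimal
sketch in Python, assuming you have saved the status text by whatever means you
normally use. The function names and the SAMPLE constant are hypothetical, and
the labels are taken from the output quoted above. Unchanged counters are a
hint, not proof: as noted, a 3 GB physical file may simply still be copying.

```python
import re

# Status text as quoted in the original posting (hypothetical constant name).
SAMPLE = """Primary Pool UNIX_TAPEPOOL, Copy Pool DRM_TAPE_-
POOL, Files Backed Up: 2179, Bytes Backed Up:

3,930,293,749, Unreadable Files: 0, Unreadable

Bytes: 0. Current Physical File (bytes):

3,122,565,063"""

# Counter labels as they appear in the quoted output.
LABELS = ("Files Backed Up", "Bytes Backed Up", "Current Physical File (bytes)")


def parse_counters(status: str) -> dict:
    """Pull the numeric counters out of a wrapped process status string."""
    # The server wraps the status across lines, so collapse all whitespace first.
    text = re.sub(r"\s+", " ", status)
    counters = {}
    for label in LABELS:
        match = re.search(re.escape(label) + r":\s*(\d[\d,]*)", text)
        if match:
            # Drop the thousands separators before converting.
            counters[label] = int(match.group(1).replace(",", ""))
    return counters


def looks_stalled(earlier: str, later: str) -> bool:
    """True when both samples parsed and no counter moved between them."""
    before, after = parse_counters(earlier), parse_counters(later)
    return bool(before) and before == after
```

Calling looks_stalled() with two captures taken several minutes apart gives a
quick yes/no. If it reports no movement, the volume and drive checks described
above are still the way to find out why.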

 Richard Sims, BU
