ADSM-L

Re: TSM Scheduler shows backups completed successfully dsmerror.log shows backups failed ?

2002-12-30 17:57:30
Subject: Re: TSM Scheduler shows backups completed successfully dsmerror.log shows backups failed ?
From: Andrew Raibeck <storman AT US.IBM DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 30 Dec 2002 15:56:35 -0700
Hi Kathleen,

Most of the text in my response to Shekhar was geared toward
ACTION=COMMAND events because scripting often trips people up. If you are
just doing ACTION=INCREMENTAL, then the APAR I mentioned does not apply to
you.

The 5.1 client should, by design, report a more reliable completion status
for TSM-scheduled events. The completion status includes both a general
status of "Complete" or "Failed", plus a return code of 0, 4, 8, or 12
(use QUERY EVENT F=D to see the return code, reported as "result code" in
the output). A status of "Complete" can have an RC of 0, 4, or 8, and a
status of "Failed" has an RC of 12 (this is described in the client
documentation I referred to in my original response on this thread). RC 0
means everything went A-OK, RC 4 means everything went A-OK except for
skipped files, and RC 8 means everything went A-OK except for one or more
warning messages that were issued (ANSnnnnW). RC 8 can also encompass
skipped files (i.e. if one or more ANSnnnnW messages were issued and some
files were skipped). Status of "Complete" with RC 4 or 8 means that all
file systems were processed successfully, but there are some issues that
you might want to investigate.
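
The status and return code combinations described above can be sketched as a small lookup. This is only an illustration of the mapping in this message: the function and dictionary names are hypothetical, the status strings are assumed to arrive as plain text (for example, copied from QUERY EVENT F=D output), and nothing here is part of TSM itself.

```python
# Sketch only: the mapping comes from the description in this message.
# RC_MEANING and classify_event are hypothetical names, not TSM APIs.
RC_MEANING = {
    0: "no errors",
    4: "completed, but some files were skipped",
    8: "completed, but at least one warning (ANSnnnnW) was issued",
    12: "failed",
}


def classify_event(status, result_code):
    """Return (needs_attention, explanation) for one scheduled event."""
    status = status.strip().lower()
    if result_code not in RC_MEANING:
        return True, f"unexpected result code {result_code}"
    # Accept both "Complete" and "Completed" as the success status.
    if status.startswith("complete") and result_code == 0:
        return False, RC_MEANING[0]
    if status.startswith("complete") and result_code in (4, 8):
        return True, RC_MEANING[result_code]
    if status == "failed" and result_code == 12:
        return True, RC_MEANING[12]
    # Any other pairing contradicts the documented design.
    return True, f"status {status!r} does not match result code {result_code}"


if __name__ == "__main__":
    for status, rc in [("Completed", 0), ("Completed", 8), ("Failed", 12)]:
        print(status, rc, classify_event(status, rc))
```

Only "Completed" with RC 0 is treated as needing no follow-up; RC 4 and RC 8 are flagged because, as noted above, they point to issues you might want to investigate.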

This being the case, the problem where CANCEL SESSION causes the event
status to show as "Complete" is a bug, because the client was not able to
process all file systems; the status should be "Failed" with RC 12 (I
suppose a status of "Cancelled" would be even better!). This bug is
documented in APAR IC35292 (not yet fixed).

Regards,

Andy

Andy Raibeck
IBM Software Group
Tivoli Storage Manager Client Development
Internal Notes e-mail: Andrew Raibeck/Tucson/IBM@IBMUS
Internet e-mail: storman AT us.eyebm DOT com (change eye to i to reply)

The only dumb question is the one that goes unasked.
The command line is your friend.
"Good enough" is the enemy of excellence.




Kathleen Hallahan <Kathleen_Hallahan AT FREDDIEMAC DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
12/30/2002 15:00
Please respond to "ADSM: Dist Stor Manager"


        To:     ADSM-L AT VM.MARIST DOT EDU
        cc:
        Subject:        Re: TSM Scheduler shows backups completed successfully dsmerror.log shows backups failed ?



I did see your response, and I'm currently working on getting more detail
on the APAR you mentioned.  My case is a little different in that the
reporting inconsistencies are occurring during regular incremental
schedules, not where action=command; I neglected to mention that
previously.   We do not actually run any backups using that format.  I
don't know if that APAR will end up applying to our environment, as a
result.

So I'm not sure if my problem fits in with your response particularly, as
it was oriented towards the detail of Shekhar's problem.  However, it does
seem to me--and maybe this is my own frustration talking--to be yet one
more example of how problematic event reporting has become at the 5.1
level of TSM.  At 4.1, I could run a q ev command and be reasonably
comfortable that I understood what had succeeded or failed the night
before; that is no longer the case.  I have even seen recent cases where I
have cancelled an active backup session on the TSM server, only to have it
report to me as Completed.  These used to report as Failed every time.

I've had PMRs and DCRs opened in an attempt to make this more manageable
from the perspective of a large environment, and have been turned down;
the expectation, as I understand it, is that the client logs are the
definitive source for backup success/failure information.

As a side note, I agree with you that any event reporting as In Progress
is considered a problem until proven otherwise, and that is how we treat
them.  That is where the workaround comes in; because In Progress is not
treated as an exception by TSM, we either have to q ev and scan through
over 800 entries to find them, or run an AIX script to search for them.
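
A workaround of this kind could look roughly like the sketch below. It is not the AIX script mentioned above. It assumes the event list has already been exported as comma-delimited text with node name, schedule name, and status in the first three columns; that layout and the function name are assumptions made for illustration.

```python
import csv
import io


def find_events_with_status(csv_text, wanted="In Progress"):
    """Return (node, schedule, status) rows whose status equals `wanted`.

    Assumed input layout (hypothetical): node, schedule, status per line.
    """
    matches = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        node, schedule, status = (field.strip() for field in row[:3])
        if status == wanted:
            matches.append((node, schedule, status))
    return matches


if __name__ == "__main__":
    sample = (
        "NODE_A,DAILY_INCR,Completed\n"
        "NODE_B,DAILY_INCR,In Progress\n"
        "NODE_C,DAILY_INCR,Failed\n"
    )
    for node, schedule, status in find_events_with_status(sample):
        print(f"{node}\t{schedule}\t{status}")
```

Filtering an export keeps the check independent of how the event list is produced, so 800 entries reduce to the handful that need a closer look.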

I'd appreciate any further thoughts you have on the matter.

Thanks!

Kathleen




Andrew Raibeck <storman AT US.IBM DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
12/30/2002 04:13 PM
Please respond to "ADSM: Dist Stor Manager"


        To:     ADSM-L AT VM.MARIST DOT EDU
        cc:
        Subject:        Re: TSM Scheduler shows backups completed successfully dsmerror.log shows backups failed ?






Did you happen to see my response earlier today to the original response
to Shekhar? It covers a lot of this. I agree that the "In Progress"
indicator is very vague; in the meantime, any "In Progress" status should
be treated as suspicious. Otherwise, it would help to know how your
problems fit in with the context of my earlier response.

Regards,

Andy





Kathleen Hallahan <Kathleen_Hallahan AT FREDDIEMAC DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
12/30/2002 09:24
Please respond to "ADSM: Dist Stor Manager"


        To:     ADSM-L AT VM.MARIST DOT EDU
        cc:
        Subject:        Re: TSM Scheduler shows backups completed successfully dsmerror.log shows backups failed ?



I might try running this for a few days to see what kind of results I get.
However, we have filespaces added and removed pretty frequently, so I'm
not sure yet if my results would be valid.  I will see what it does,
though.

What I'd really like to see, of course, is the TSM server report the
actual status of backups more accurately, rather than having to devise
workarounds.  So far that functionality doesn't seem to be planned for the
product, however, at least based on my discussions with IBM/Tivoli.





"Nelson, Doug" <DNelson@CHITTENDEN.COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
12/30/2002 10:56 AM
Please respond to "ADSM: Dist Stor Manager"


        To:     ADSM-L AT VM.MARIST DOT EDU
        cc:
        Subject:        Re: TSM Scheduler shows backups completed successfully dsmerror.log shows backups failed ?






We have a script that runs as part of our morning report that shows file
spaces that have not been backed up in the last 24 hrs. It has been very
successful in catching the ones that say "completed" but didn't really do
it.

select node_name,filespace_name,backup_end from -
 filespaces where 1 < -
 days(current_timestamp) - days(backup_end) or -
 backup_end is null
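
For anyone who wants to post-process the rows outside the server, the same test can be sketched in Python. The function name and the row layout are assumptions for illustration. As I read the select, it compares calendar-day numbers rather than elapsed hours, so a filespace is flagged once its last backup ended at least two calendar days ago, or when it has never been backed up at all; the sketch mirrors that behavior.

```python
from datetime import datetime


def stale_filespaces(rows, now):
    """Return rows whose last backup is missing or too old.

    rows: iterable of (node_name, filespace_name, backup_end), where
    backup_end is a datetime or None (hypothetical layout that mirrors
    the three columns selected above).
    """
    stale = []
    for node, filespace, backup_end in rows:
        # Same test as the select: 1 < days(now) - days(backup_end),
        # or backup_end is null.
        if backup_end is None or (now.date() - backup_end.date()).days > 1:
            stale.append((node, filespace, backup_end))
    return stale


if __name__ == "__main__":
    now = datetime(2002, 12, 30, 8, 0)
    rows = [
        ("NODE_A", "/", datetime(2002, 12, 29, 23, 0)),
        ("NODE_B", "/home", datetime(2002, 12, 27, 23, 0)),
        ("NODE_C", "/var", None),
    ]
    for node, filespace, backup_end in stale_filespaces(rows, now):
        print(node, filespace, backup_end)
```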


Douglas C. Nelson
Distributed Computing Consultant
Alltel Information Services
Chittenden Data Center
2 Burlington Square
Burlington, Vt. 05401
802-660-2336



-----Original Message-----
From: Kathleen Hallahan [mailto:Kathleen_Hallahan AT FREDDIEMAC DOT COM]
Sent: Monday, December 30, 2002 10:36 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: TSM Scheduler shows backups completed successfully
dsmerror.log shows backups failed ?


This is an issue we have raised with Tivoli ourselves recently, even
requesting--and being turned down for--a Design Change Request.  Their
recommendation was to check the dsmsched.log files on the clients.  This is
impossible for us and presumably for anyone in a large environment; we
have over 800 clients and I know there are organizations out there with
many more.  As it is, we've had to write an AIX script to catch the ones
that report as "In Progress," considering that they are rarely, if ever,
actually in progress at that time.  I have no idea how to chase down the
ones reporting as "Completed" when in fact they have not, on a daily basis
across our entire environment.  If anyone has any workable solutions I'd
be very interested in hearing them; I'm at a bit of a loss myself.





Mark Stapleton <stapleto@BERBEE.COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
12/29/2002 09:58 PM
Please respond to "ADSM: Dist Stor Manager"


        To:     ADSM-L AT VM.MARIST DOT EDU
        cc:
        Subject:        Re: TSM Scheduler shows backups completed successfully dsmerror.log shows backups failed ?






On Fri, 2002-12-27 at 14:07, shekhar Dhotre wrote:
>  q eve * * shows all backups are completing successfully but  clients
> dsmerror.log  file shows  schedule command failed ,How I can verify that
> these backups are completing successfully other than restore test?
>
> 26-12-2002 20:45:08 ANS1909E The scheduled command failed.
> 26-12-2002 20:45:08 ANS1512E Scheduled event 'EMEAPROD_HOT_DAILY' failed.

Remember that the *schedule* itself completed successfully. You're far
better off looking for the message numbers in the summaries that appear
in the server activity log at the end of a successful backup. (I'm not
within reach of a TSM server tonight, so I can't look them up ATM.)

This is a much better indicator of successful backups than the event
list.

--
Mark Stapleton (stapleton AT berbee DOT com)