ADSM-L

Re: [ADSM-L] TSM Client Return Codes - Failed or partial failed

2011-08-31 15:33:21
Subject: Re: [ADSM-L] TSM Client Return Codes - Failed or partial failed
From: "Huebschman, George J." <gjhuebschman AT LEGGMASON DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 31 Aug 2011 15:25:10 -0400
* A client can have failed objects in the Schedule Summary report
without having a failed backup.
The Return Codes from the Client (RC 0, RC 4, RC 8, and RC 12) determine
whether or not a Scheduled backup fails.
A Scheduled backup can FAIL with NO failed files.  That used to drive me
crazy, but that is a short trip for me.
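
As a rough illustration that the return code, not the failed-object
count, decides the outcome, here is a minimal Python sketch.  The
meanings given for RC 0, 4, 8 and 12 are paraphrased from the client
documentation as I understand it, and the names classify_rc and
RC_MEANING are hypothetical, so check both against your own client level.

```python
# Hypothetical helper: map a TSM client return code to a coarse outcome.
# The descriptions paraphrase the documented client return codes; verify
# them against the documentation for your client level.
RC_MEANING = {
    0: "completed successfully",
    4: "completed, but some files were skipped",
    8: "completed with at least one warning",
    12: "completed with at least one error (other than skipped files)",
}


def classify_rc(rc: int) -> str:
    """Return 'OK', 'REVIEW', or 'FAILED' for a scheduled-backup return code."""
    if rc == 0:
        return "OK"
    if rc in (4, 8):
        return "REVIEW"  # backup ran; skipped files or warnings to look at
    return "FAILED"      # RC 12 (or anything unexpected) is treated as failed


for rc in (0, 4, 8, 12):
    print(rc, classify_rc(rc), "-", RC_MEANING[rc])
```

Treating anything other than 0, 4 and 8 as FAILED is a deliberately
cautious choice for a daily report; loosen it if your site uses other
return codes from pre/post schedule commands.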


* Messages like these in the Client's dsmerror log should NOT cause RC
12 return codes (FAIL):
<date_time> ANS4037E Object \\...\\ changed during processing. Object
skipped.
<date_time> ANS4005E Error processing '\\path\to\file.tmp': file not
found
<date_time> ANS1228E Sending of object
'\\path\to\file_important_Workbook.xlsx' failed
<date_time> ANS4987E Error processing
'\\path\to\file_important_Workbook.xlsx': the object is in use by
another process

        A "file... changed during processing" error (ANS4037E) happens
when a file changes between the inventory and the actual backup
attempt.  TSM skips it but does not fail the backup event.
        A file not found error (ANS4005E) occurs because the client
cannot find a file that was inventoried at the beginning of the backup
event.  The client starts the Incremental backup by checking to see what
new or changed objects exist. It makes a list and then backs them up.
If the file is deleted or moved between the creation of the list and the
backup of the object (as with the .tmp files above), you get a message,
but not a failed backup.
        A file in use error (ANS4987E) is regarded as a minor issue by
TSM as well. TSM feels it should tell you about it, but since the file
is not in a state where a good backup can be taken, skipping it is not a
fatal event.
        Your TSM Server CopyGroup serialization settings determine if
TSM will try to back it up again.
        The CHAngingretries option for the Client in the dsm.sys or
dsm.opt file determines how many times a client should retry the file.
        You may see Retry messages such as these for objects in that
situation:
08/30/2011 01:27:43 Retry # 1  Normal File-->         2,992,622
\\CIFS_Filer_name\Long\long\path\to\file_Report.pdf [Sent]
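
To separate this kind of skipped-object noise from everything else in a
dsmerror.log, something like the following Python sketch may help.  The
set of "skip" codes is just the four messages listed above, the helper
name split_messages is hypothetical, and the sample lines are the
sanitized examples from this post rather than real log output.

```python
import re
from collections import Counter

# Message codes this post lists as skipped-object noise, not hard failures.
SKIP_CODES = {"ANS4037E", "ANS4005E", "ANS1228E", "ANS4987E"}

# Matches a client message ID such as ANS4037E anywhere in a log line.
MSG_ID = re.compile(r"\b(ANS\d{4}[IWES])\b")


def split_messages(lines):
    """Count message IDs, keeping skipped-object codes apart from the rest."""
    skipped, other = Counter(), Counter()
    for line in lines:
        match = MSG_ID.search(line)
        if not match:
            continue  # continuation of a wrapped message, or no message ID
        code = match.group(1)
        bucket = skipped if code in SKIP_CODES else other
        bucket[code] += 1
    return skipped, other


# Sanitized example lines taken from this post.
sample = [
    r"<date_time> ANS4037E Object \\...\\ changed during processing. Object skipped.",
    r"<date_time> ANS4005E Error processing '\\path\to\file.tmp': file not found",
    r"<date_time> ANS4987E Error processing '\\path\to\file_important_Workbook.xlsx': "
    "the object is in use by another process",
    "08/22/2011 15:43:02 ANS1512E Scheduled event '6PM-DAILY-INCR' failed. Return code = 12",
]

skipped, other = split_messages(sample)
print("skipped-object messages:", dict(skipped))
print("other messages:", dict(other))
```

Anything that lands in the "other" bucket is what deserves a closer
look when a schedule reports a failure.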

* Actual FAILure of the scheduled backup event is often from an
inability to access a path or (Client) domain, from inability to perform
a specific option in an options file (such as an include statement, or a
pre-scheduled command), or from permissions errors:
08/22/2011 15:34:30 ANS4013E Error processing '\\Path\to\some_file:
invalid file handle

08/22/2011 15:43:02 ANS1512E Scheduled event '6PM-DAILY-INCR' failed.
Return code = 12

08/30/2011 01:27:43 ANS4007E Error processing
'\\rrstore11a\assetmgmt\Users\ADoyle\IEfavs\Links\Customize Links.url':
access to the object is denied
        These ANS4007E messages are a pain for me.  Often they indicate
that the Client is running from a profile with insufficient permissions
to access the file.  In my case these are files on a CIFS share.  The
filer, CIFS, VSCAN, and Virus software are not working together well.

On Windows clients if you are backing up SystemState and there are VSS
errors...welcome to the club.  They will fail your Scheduled backup
event.  Looking at one of mine that I see failed for VSS/SystemState
errors, there is no mention of the VSS error/failure in the TSM Server
Actlog.

* You can look for these message codes in the TSM Server ACTLOG to
determine Failed (and Missed) backups:
***MISSED/FAILED
FAILED: q ac begint=-24 msg=2579
MISSED: q ac begint=-24 msg=2578
        You can also do them as selects.
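
The select form might look something like the strings built below.
This is a minimal Python sketch, assuming the ACTLOG table exposes
DATE_TIME, MSGNO and MESSAGE columns and that your server level accepts
the "current_timestamp - N hours" form; actlog_select is a hypothetical
helper name, and the printed statements are meant to be pasted into an
administrative client session.

```python
def actlog_select(msgno: int, hours: int = 24) -> str:
    """Build a select against the activity log for one message number.

    Column names and the timestamp arithmetic are assumptions; check them
    against the SQL dialect of your TSM server level.
    """
    return (
        "select date_time, message from actlog "
        f"where msgno={msgno} "
        f"and date_time>current_timestamp - {hours} hours"
    )


# Message numbers from this post: 2579 for FAILED, 2578 for MISSED.
for label, msgno in (("FAILED", 2579), ("MISSED", 2578)):
    print(f"{label}: {actlog_select(msgno)}")
```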

Checking for this with a TSM Server "query actlog msg=4959" will NOT
tell you if the backup failed:
08/30/2011 20:58:23      ANE4959I (Session: 4580433, Node:
<Some_Client_Name>)  Total
                          number of objects failed:           4
(SESSION: 4580433)
For example, in the summary information reported to the TSM Server by
the Client (and logged in the Actlog) there are 15 objects failed, but
the backup was successful:
08/30/2011 18:27:02      ANE4952I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Total
                          number of objects inspected:   44,415
(SESSION: 4579578)
08/30/2011 18:27:02      ANE4954I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Total
                          number of objects backed up:    3,360
(SESSION: 4579578)
08/30/2011 18:27:02      ANE4958I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Total
                          number of objects updated:          0
(SESSION: 4579578)
08/30/2011 18:27:02      ANE4960I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Total
                          number of objects rebound:          0
(SESSION: 4579578)
08/30/2011 18:27:02      ANE4957I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Total
                          number of objects deleted:          0
(SESSION: 4579578)
08/30/2011 18:27:02      ANE4970I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Total
                          number of objects expired:          7
(SESSION: 4579578)

08/30/2011 18:27:02      ANE4959I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Total
                          number of objects failed:          15
(SESSION: 4579578)

08/30/2011 18:27:02      ANE4965I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Total
                          number of subfile objects:          0
(SESSION: 4579578)
08/30/2011 18:27:02      ANE4961I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Total
                          number of bytes transferred:  1.16 GB
(SESSION: 4579578)
08/30/2011 18:27:02      ANE4963I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Data
                          transfer time:                   35.93 sec
(SESSION:
                          4579578)
08/30/2011 18:27:02      ANE4966I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)
                          Network data transfer rate:        34,077.75
KB/sec
                          (SESSION: 4579578)
08/30/2011 18:27:02      ANE4967I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)
                          Aggregate data transfer rate:      2,410.23
KB/sec
                          (SESSION: 4579578)
08/30/2011 18:27:02      ANE4968I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)
                          Objects compressed by:                    0%
(SESSION:
                          4579578)
08/30/2011 18:27:02      ANE4969I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)
                          Subfile objects reduced by:               0%
(SESSION:
                          4579578)
08/30/2011 18:27:02      ANE4964I (Session: 4579578, Node:
<SOME_CLIENT_NAME>)
                          Elapsed processing time:            00:08:28
(SESSION:
                          4579578)
08/30/2011 18:27:02      ANR2507I Schedule 6PM-DAILY-INCR for domain
SUPT started at
                          08/30/11 18:00:00 for node <SOME_CLIENT_NAME>
completed
                          successfully at 08/30/11 18:27:02. (SESSION:
4579578)

You can check the TSM Actlog for messages like these to see if the
failed file is critical:
08/30/2011 18:26:33      ANE4987E (Session: 4579578, Node:
<SOME_CLIENT_NAME>)  Error
                          processing '\\<Some_Client_Name>\c$\Program
                          Files\SOMEAPP FND v99.2\Log\User
                          Changes.dat': the object is in use by another
process
                          (SESSION: 4579578)
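
To put the failed-object count and the actual schedule result side by
side per session, here is a minimal Python sketch.  It assumes the
activity log lines have already been joined back into one line per
message, and that the schedule-failed message (msg 2579) carries the
same trailing "(SESSION: n)" tag as ANR2507I does above; summarize is a
hypothetical helper name.

```python
import re

FAILED_COUNT = re.compile(r"Total number of objects failed:\s*([\d,]+)")
SESSION = re.compile(r"\(SESSION: (\d+)\)")          # trailing session tag
MSG = re.compile(r"\b(AN[ER]\d{4})[IWES]\b")         # e.g. ANE4959I, ANR2507I


def summarize(lines):
    """Return {session: {'objects_failed': int, 'schedule': str}}."""
    out = {}
    for line in lines:
        msg, sess = MSG.search(line), SESSION.search(line)
        if not (msg and sess):
            continue
        rec = out.setdefault(
            sess.group(1), {"objects_failed": 0, "schedule": "unknown"}
        )
        code = msg.group(1)
        if code == "ANE4959":
            count = FAILED_COUNT.search(line)
            if count:
                rec["objects_failed"] = int(count.group(1).replace(",", ""))
        elif code == "ANR2507":
            rec["schedule"] = "completed successfully"
        elif code == "ANR2579":
            rec["schedule"] = "failed"
    return out


# Two messages from the example above, joined back into single lines.
sample = [
    "08/30/2011 18:27:02 ANE4959I (Session: 4579578, Node: <SOME_CLIENT_NAME>) "
    "Total number of objects failed: 15 (SESSION: 4579578)",
    "08/30/2011 18:27:02 ANR2507I Schedule 6PM-DAILY-INCR for domain SUPT started at "
    "08/30/11 18:00:00 for node <SOME_CLIENT_NAME> completed successfully at "
    "08/30/11 18:27:02. (SESSION: 4579578)",
]

for session, rec in summarize(sample).items():
    print(session, rec)
```

A session with a non-zero objects_failed and a "completed successfully"
schedule is the partial case; a "failed" schedule is the one to chase.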

George Huebschman
Legg Mason, LMTS
"When you have a choice, spend your money where you would want to work
if it was your only choice."

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Botelho, Tiago (External)
Sent: Wednesday, August 31, 2011 1:36 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: [ADSM-L] TSM Client Return Codes - Failed or partial failed



Hello,

I have a 160-node (daily backup) environment and I need to produce a daily
report.

Several Client Nodes report Status: Failed  Result: Errors.

In those cases I have to log in to the TSM Node and check dsmsched.log
and dsmerror.log to see whether the backup failed completely or not.

In some situations, the client Backup fails completely. In others, the
errorlog.log / dsmsched.log report open files or files not found (but
in the report sent by the TSM Management console I cannot see the
difference).

I'm looking for a solution that can show the difference between a complete
failure and a partial failure (like open files or files not found) to
use on the TSM Management console (SQL statements).

Is there any SQL query that can show me the difference?

Is there any event that can be disabled on the TSM Server to prevent this?

Any ideas?

Thank you for your help

Cumprimentos / Best regards

Tiago Botelho
T_Systems at Volkswagen


