ADSM-L

Re: Backup failure

2006-11-02 00:58:32
Subject: Re: Backup failure
From: Roger Deschner <rogerd AT UIC DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 1 Nov 2006 23:55:36 -0600
Looks to me like it could be two different machines trying to use one
node name. The biggest clue that this is the problem is your message
ANR2576W. Reboot the client machine, in case it really did get two
schedulers running. But it's more likely two different machines trying
to share one nodename. At any rate, do

q actlog begindate=-2 search=GBTLSWIOA001

and just check the IP addresses it is connecting from. This is one way
to catch cheaters who use one node name on two machines. Here's what
you're looking for:


2006-11-01 23:15:25      ANR0406I Session 81867 started for node GBTLSWIOA001
                          (WinNT) (Tcp/Ip 111.222.111.16(2566)).
then a bit later...
2006-11-01 23:20:25      ANR0403I Session 81867 ended for node GBTLSWIOA001
                          (WinNT).

This is as it should be. But if you have

2006-11-01 23:15:25      ANR0406I Session 81867 started for node GBTLSWIOA001
                          (WinNT) (Tcp/Ip 111.222.111.16(2566)).
2006-11-01 23:25:25      ANR0406I Session 81895 started for node GBTLSWIOA001
                          (WinNT) (Tcp/Ip 111.222.111.50(2566)).

...Now you've got your perpetrator. They're running two machines on one
nodename, and you now have the IP addresses of both of them. Go gettum!

Q NODE GBTLSWIOA001 F=D shows the IP address this node last connected
from, which can be useful.

Beware that IP addresses can change if the client node uses DHCP (e.g. a
laptop), but even if the IP addresses change, you should see a start and
an end. If you see several starts from the same IP address, that is
normal too, especially if they have RESOURCEUTILIZATION set higher than
1. What you are looking for is several starts from different IP
addresses before you get the matching ends.
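
If you have a lot of activity log to wade through, the same check can be
scripted. Here is a rough Python sketch, not a finished tool. It assumes
you have saved the q actlog output to text and that the ANR0406I and
ANR0403I messages are worded like the samples above (other server levels
may differ). The function name find_shared_nodes is my own invention.

```python
import re

# Session start (ANR0406I) and session end (ANR0403I) messages.
# Patterns are based only on the sample lines in this post (assumption).
EVENT = re.compile(
    r"ANR0406I Session (?P<start>\d+) started for node (?P<snode>\S+)"
    r" \([^)]*\) \(Tcp/Ip (?P<ip>[\d.]+)\(\d+\)\)"
    r"|ANR0403I Session (?P<end>\d+) ended for node (?P<enode>\S+)"
)


def find_shared_nodes(actlog_text):
    """Return {node: sorted IPs} for nodes that had sessions open
    from two or more different IP addresses at the same time."""
    # Activity log messages wrap across lines, so collapse all whitespace.
    text = " ".join(actlog_text.split())
    open_sessions = {}  # session number -> (node, ip)
    suspects = {}       # node -> set of IPs seen open together
    for m in EVENT.finditer(text):
        if m.group("start"):
            node, ip = m.group("snode"), m.group("ip")
            open_sessions[m.group("start")] = (node, ip)
            ips = {i for n, i in open_sessions.values() if n == node}
            if len(ips) > 1:
                suspects.setdefault(node, set()).update(ips)
        else:
            # Matching end: this session is no longer open.
            open_sessions.pop(m.group("end"), None)
    return {node: sorted(ips) for node, ips in suspects.items()}
```

Several starts from the same IP address are not flagged, so a node with
RESOURCEUTILIZATION above 1 will not show up as a false positive. A
laptop that changes DHCP address after its sessions have ended is not
flagged either, because only sessions still open are compared.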

Also try

q filespace GBTLSWIOA001

and see if you can eyeball what looks like two different primary drives:

ROGERD.ADSM1       \\rogerd\c$       1    WinME
ROGERD.ADSM1       \\861077\c$       3    WinNT

Aha! This one has two different Windows C:\ drives. This is harder to
spot on unix-ish systems (including Mac OS X) because they all have the
same filespace names. You can eliminate false positives here by doing
Q FILESPACE F=D which shows the last backup start and end dates. It's
possible that one of those duplicate C:\ drives is from an old OS or
old machine that got upgraded, and backup dates will tell you that.
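
The eyeballing can be automated too. This is only a sketch under
assumptions: it expects lines with the node name in the first column and
a UNC filespace name (\\host\share) in the second, as in the two sample
lines above. Real q filespace output has more columns and may wrap, so
adjust to taste. The name duplicate_drives is made up for illustration.

```python
import re
from collections import defaultdict

# Node name, then a UNC filespace name such as \\rogerd\c$.
# Column layout is assumed from the two sample lines in this post.
UNC = re.compile(r"^(?P<node>\S+)\s+\\\\(?P<host>[^\\\s]+)\\(?P<share>\S+)")


def duplicate_drives(filespace_lines):
    """Return {(node, share): sorted hosts} where one node holds the
    same share name (e.g. c$) from more than one machine name."""
    hosts = defaultdict(set)
    for line in filespace_lines:
        m = UNC.match(line.strip())
        if m:
            key = (m.group("node"), m.group("share").lower())
            hosts[key].add(m.group("host").lower())
    return {key: sorted(h) for key, h in hosts.items() if len(h) > 1}
```

It only narrows the list. A hit can still be an old machine that was
replaced, so check the backup dates from Q FILESPACE F=D before
accusing anybody.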

The most frequent cause of cheating like this is not people who set
out to beat the system, but rather people who replace their computer
with a new one and give away their old computer to a lucky colleague.
It still has the old TSM client including the scheduler on it, and
like the Energizer Bunny, it keeps going, and going, and going. We
find that most people with these Energizer Bunny nodes don't even know
they're backing up. The key message to look for to spot this kind of
problem is the ANR2576W you show below.

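You might also want to set minimum throughput thresholds (the
THROUGHPUTDATATHRESHOLD and THROUGHPUTTIMETHRESHOLD server options) so
that stalled sessions get cancelled. See the TSM Admin Guide.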

Roger Deschner      University of Illinois at Chicago     rogerd AT uic DOT edu


On Wed, 1 Nov 2006, Gopinathan, Srinath wrote:

>Hi All,
>
>I have a backup that is failing regularly. No objects are shown as
>failing. However, the backup is failing with the following errors.
>
>Any help on this would be appreciated.
>
>Regards,
>Srinath G
>
>10/31/2006 15:51:51  ANR0482W Session 694902 for node GBTLSWIOA001 (WinNT)
>                      terminated - idle for more than 750 minutes.
>                      (SESSION: 694902)
>10/31/2006 15:51:51  ANR2579E Schedule TBO_SEV3_INCR_2 in domain SWITBOW for
>                      node GBTLSWIOA001 failed (return code 12).
>                      (SESSION: 694889)
>10/31/2006 15:51:51  ANR2576W An attempt was made to update an event record for
>                      a scheduled operation which has already been executed -
>                      multiple client schedulers may be active for node
>                      GBTLSWIOA001. (SESSION: 694889)
>10/31/2006 15:51:51  ANR0480W Session 694889 for node GBTLSWIOA001 (WinNT)
>                      terminated - connection with client severed.
>                      (SESSION: 694889)
