ADSM-L

Re: GetHostnameOrNumber error

2004-05-17 14:38:18
Subject: Re: GetHostnameOrNumber error
From: Richard Sims <rbs AT BU DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 17 May 2004 14:38:07 -0400
Just some thoughts on this.  Someone else may have had direct experience
with this error condition.

>We have a Windows XP client ( TSM version 5.1.0.1 ) backing up to
>our AIX TSM server ( TSM version 5.1.8.0 ).

As we so often stress on the List, staying at base level is a Bad Idea:
get that client level boosted to a higher, server-compatible level, to
avoid a lot of problems.

>It is initiated via the GUI supposedly to backup the whole C: drive but
>the 'Last Backup Start' and 'Last Backup Completion' times are not being
>updated on the server. It appears to be performing backups as seen in the
>activity log.

The timestamps not getting updated suggests that the backup did not really
complete, or perhaps was not an unqualified Incremental type.
The client backup log is more definitive: have the machine's owner pore
over that log for problem indications.

>The user whose machine this is has sent me the error log and it contains an
>error at the time the backup session is ending :
>
>2004.05.11 13:50:17 GetHostnameOrNumber(): gethostbyname(): errno = 11001.
>2004.05.11 13:50:17 TcpOpen: Could not resolve host name.
>2004.05.11 13:50:17 sessOpen: Failure in communications open call. rc: -53
>2004.05.11 13:50:17 ANS1029E Communications have been dropped.
>
>Could this be a consequence of being behind a Cisco PIX firewall ?

Perhaps.  There's obviously a DNS service problem here that needs to get
fixed.  If there is a private subnet address range being employed behind the
firewall, it makes things more interesting, but not unworkable.

>Why does it need to resolve a hostname in any case ?

Why not?  ;-)  The lookup may be incited by using network hostnames in
options and config files, rather than IP addresses.  Sometimes these files
never get updated, despite the environment changing: one of these files may
contain an obsolete network hostname.  You can use nslookup or similar aids
to track down DNS problems.
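
As a rough illustration of that check (a sketch only, not a TSM tool): errno 11001 is the Winsock code WSAHOST_NOT_FOUND, so the first thing to confirm is whether the server address coded in the client options file is a hostname at all, and whether it resolves from the client machine.  The Python below assumes the address is coded with the full option name TCPSERVERADDRESS and that comment lines start with an asterisk; abbreviated option names are not handled, and the function names are hypothetical.

```python
import ipaddress
import socket


def server_addresses(opt_text):
    """Return the values of TCPSERVERADDRESS lines found in options-file text."""
    found = []
    for line in opt_text.splitlines():
        parts = line.split()
        # Skip blank lines, value-less lines, and '*' comment lines (assumed syntax).
        if len(parts) < 2 or parts[0].startswith("*"):
            continue
        if parts[0].upper() == "TCPSERVERADDRESS":
            found.append(parts[1])
    return found


def is_ip_literal(value):
    """True if value is a literal IPv4/IPv6 address rather than a hostname."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def check(value):
    """Describe whether value needs a DNS lookup and, if so, whether it resolves."""
    if is_ip_literal(value):
        return f"{value}: IP literal, no DNS lookup needed"
    try:
        addr = socket.gethostbyname(value)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return f"{value}: lookup failed ({exc})"
    return f"{value}: resolves to {addr}"
```

Run on the client itself, feeding the text of its options file to server_addresses() and each result to check(); a "lookup failed" line there points at the same DNS problem the error log reports.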

>Our network team have monitored the firewall and there do appear to be
>some timeouts occurring after 30 mins ( the TSM server timeout is set to
>10 mins ).

The timeout numbers are not significant unless you can match them to one of the
multiple timeout values which can be coded in the server.
More important would be the IP addresses and port number(s) involved.
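
Once the address and port are known, a plain TCP connect from the client side shows whether the firewall passes the session at all.  A minimal Python sketch, assuming nothing about TSM beyond the address and port taken from the client options (1500 is the usual default TCPPORT); can_connect is a hypothetical helper name:

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        # create_connection resolves the host (if it is a name) and connects.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, can_connect("10.1.2.3", 1500) with the server's real address substituted for the made-up one.  A False result does not say which hop dropped the connection, only that it did not open.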

From the posting, we don't have perspective on whether this is a longstanding
problem or just started recently.  If the latter, look at environmental changes
which occurred around when it started.  Since the problem should be
reproducible, you can have the client owner perform escalating dsmc queries
and mini backups to help isolate the problem.  Don't overlook an errant
POSTSchedulecmd thingie as contributing to the problem.

If need be, perform a client trace to isolate the issue.  The TSM Problem
Determination Guide will help, as will
      "The TSM Client - Diagnostics":
       http://adsm-symposium.oucs.ox.ac.uk/2001/papers/Raibeck.Diagnostics.PDF

  Richard Sims   http://people.bu.edu/rbs/ADSM.QuickFacts
