ADSM-L

Re: TSM on AIX 4.3.3 goes to "sleep"

2002-03-15 20:27:08
Subject: Re: TSM on AIX 4.3.3 goes to "sleep"
From: asr AT UFL DOT EDU
Date: Fri, 15 Mar 2002 20:24:38 -0500
The two sessions you see starting up initially are probably the thread that
queries the server, and the thread that starts searching the filesystems.


=> On Sat, 16 Mar 2002 11:33:15 +1100, John Nawotka <john AT computerguy.com DOT au> said:

> We have one node that for some reason has started taking a lot longer to
> complete its backup than any of the other similar nodes.
> [...]
> 03/12/02   07:29:27 ANS1898I ***** Processed 1,180,000 files *****

While not extremely large, this is a decent number of files.  Depending on the
disk technology and the arrangement of the files, evaluating this many could
be a Big Deal.

I once had to deal with a directory structure of the form:

/some/path/[00-99]/[00-99]/[00-99]/

That's 100 directories, with 100 subdirectories, with 100 subdirectories.

In total, 100 + 10,000 + 1,000,000 = 1,010,100 directories.  And some files in
there too (usually fewer than the number of directories).
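
To make the arithmetic explicit, here is a minimal sketch that sums the directories at each level of such a tree. The function name and parameters are made up for illustration; it assumes every directory at every level has the same number of subdirectories.

```python
def count_directories(fanout: int, depth: int) -> int:
    """Count directories in a tree where each level has `fanout` subdirectories.

    Level 1 holds fanout directories, level 2 holds fanout**2, and so on,
    down to `depth` levels. The starting directory itself is not counted.
    """
    return sum(fanout ** level for level in range(1, depth + 1))


# The [00-99]/[00-99]/[00-99] layout: 100 + 10,000 + 1,000,000
print(f"{count_directories(100, 3):,}")  # 1,010,100
```

An incremental has to visit every one of those directories, even when almost nothing in them has changed.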

This was on rather slow disk tech (RAID-5 on 4.5 GB drives), and it could take
many hours to run an incremental, even though the change rate on these
filesystems was very low.

... Anyway, we had similar symptoms.  Once the client downloaded all its
information for the incremental, it chugged along for a -LONG- time without
having anything else to say to the server.  When it finally found something to
send, it had to reconnect.

So the hiatus is not surprising or unusual.  What you ought to do is take a
look in the scheduler log, and see if you can figure out what filesystem takes
the majority of the time.  Once you've done that, poke around.  There's a good
chance you'll find a single directory with 300,000 files, or some such
frightening thing.
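
For the "poke around" step, something along these lines might help. This is only a rough sketch, not a TSM tool: the function name is made up, it uses nothing but the Python standard library, and it assumes you can afford one full walk of the suspect filesystem. Note that os.walk will cross into other filesystems mounted below the starting point.

```python
import heapq
import os


def largest_directories(root, top_n=10):
    """Return up to top_n (file_count, path) pairs under root, largest first.

    Only non-directory entries are counted for each directory. Directories
    that cannot be read are silently skipped.
    """
    counts = []
    # os.walk does not follow symbolic links to directories by default.
    for dirpath, dirnames, filenames in os.walk(root, onerror=lambda err: None):
        counts.append((len(filenames), dirpath))
    return heapq.nlargest(top_n, counts)
```

Calling largest_directories("/some/path") and printing the result should put any 300,000-file directory at the top of the list. The walk itself will be slow on exactly the filesystems that are slow to back up, for the same reason.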

Once you've found it, find the person who's responsible, and give them a
talking to. ;)


Allen S. Rout