ADSM-L

Re: Linux client pathology

2003-12-09 12:36:19
Subject: Re: Linux client pathology
From: Remco Post <r.post AT SARA DOT NL>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 9 Dec 2003 18:35:37 +0100
On Tue, 9 Dec 2003 11:29:59 -0500
Thomas Denier <Thomas.Denier AT MAIL.TJU DOT EDU> wrote:

> We have been backing up a Linux client with about five million files on
> it. The system administrator has told me that he recently updated the
> include/exclude files to exclude a portion of /var that accounts for

5 million files in /var??? amazing :-)

> most of the files. I have not so far been able to verify this by
> inspecting the configuration files myself. Ever since the change, we
> have been seeing strange and very troublesome behavior. The client

Did the admin use exclude or exclude.dir? The latter may save you a lot of
directory traversal.
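
To illustrate why the distinction matters, here is a minimal Python sketch. It
is not TSM code and does not use TSM syntax; it only mimics the two behaviours
under the assumption that a file-level exclude still visits every entry, while
a directory-level exclude prunes the subtree before descending. The function
names and any paths passed in are hypothetical.

```python
import os

def walk_with_file_exclude(root, excluded_dir):
    """Visit every file, then drop those under excluded_dir (file-level exclude)."""
    visited = 0
    kept = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            visited += 1  # the entry is still read and examined
            path = os.path.join(dirpath, name)
            if not path.startswith(excluded_dir + os.sep):
                kept.append(path)
    return visited, kept

def walk_with_dir_prune(root, excluded_dir):
    """Never descend into excluded_dir (directory-level exclude)."""
    visited = 0
    kept = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to enter the pruned dir.
        dirnames[:] = [d for d in dirnames
                       if os.path.join(dirpath, d) != excluded_dir]
        for name in filenames:
            visited += 1
            kept.append(os.path.join(dirpath, name))
    return visited, kept
```

Both functions keep the same files; only the number of entries visited
differs, and that is the cost that grows with millions of files.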

> will send about eight gigabytes of data to the server over the course
> of about twelve hours. At that point data transfer will slow to a
> crawl and all other sessions and processes will perform poorly until
> the session is cancelled. We have been through three iterations so
> far, transferring about 23 gigabytes from client to server (I stopped
> the last one at about 7 gigabytes). About one gigabyte out of the 23
> can be accounted for by data written to storage pools. I usually don't
> get the various session statistics messages. In the one case where I
> did, they reported 0 objects deleted and only 55 objects expired.

Depending on the filesystem used, Linux can be very slow in listing large
dirs. The default listing requires in-memory sorting of the dir (TSM does
this as well), and with very large dirs holding lots of files you may suffer
from horrific performance on the client just from listing. Most likely, an ls
in that one dir will either crash or hang forever....
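
A small Python sketch of the difference between a sorted listing and a
streaming one. The function names are hypothetical, and this only shows the
general idea: a sorted listing has to hold every name in memory before it can
print anything, whereas an unsorted pass (comparable to `ls -f`) can process
entries one at a time.

```python
import os

def sorted_listing(path):
    # Reads every name into one list, then sorts it, as a default ls does.
    return sorted(os.listdir(path))

def count_entries_streaming(path):
    # os.scandir yields entries lazily in directory order: no sort and
    # no full in-memory list of names.
    count = 0
    with os.scandir(path) as entries:
        for _ in entries:
            count += 1
    return count
```

If the streaming count returns quickly on the suspect directory while a plain
ls does not, the sort and the memory it needs are the likely culprits.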

> I would expect a substantial amount of data movement from client to
> server after excluding large numbers of files as the client told the
> server to expire the files. However, this would presumably be
> comparable to the amount of file status information sent from server
> to client each night before the change. This was about a gigabyte.

I would expect that your server has never had a decent backup of that one
dir, that TSM is still listing that dir since the admin didn't use
exclude.dir, and that you have very few files to expire.

> Our entire TSM database is only about 20 gigabytes. The client code
> is at the 5.1.6.0 level. The Linux kernel is at the 2.4.20-24 level.
> The TSM server is at the 5.1.7.2 level and runs under OS/390.

5 million objects would eat about 1.5 GB (give or take half a GB) in the TSM
database; expiration of these files must be noticeable on your server.
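
The arithmetic behind that estimate, as a rough Python sketch. It assumes
decimal gigabytes and takes the 5 million objects, the 1.5 GB (plus or minus
0.5 GB) and the 20 GB database size from this thread; the per-object figures
are simply what those numbers imply, not documented TSM values.

```python
OBJECTS = 5_000_000
GB = 10**9  # assumption: decimal gigabytes

def bytes_per_object(total_gb):
    # Database space per object implied by a given total.
    return total_gb * GB / OBJECTS

low, mid, high = (bytes_per_object(g) for g in (1.0, 1.5, 2.0))
share_of_db = 1.5 / 20  # fraction of the 20 GB database

print(f"{low:.0f} to {high:.0f} bytes per object, about {mid:.0f} typical")
print(f"roughly {share_of_db:.1%} of the database")
```

So the estimate works out to a few hundred bytes per object, and a bit under
a tenth of the whole database.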

--
Met vriendelijke groeten,

Remco Post

SARA - Reken- en Netwerkdiensten                      http://www.sara.nl
High Performance Computing  Tel. +31 20 592 8008    Fax. +31 20 668 3167

"I really didn't foresee the Internet. But then, neither did the computer
industry. Not that that tells us very much of course - the computer industry
didn't even foresee that the century was going to end." -- Douglas Adams
