ADSM-L

Re: Help! AIX HSM problem

2000-03-21 03:27:09
Subject: Re: Help! AIX HSM problem
From: "Mauro M. TINELLI" <Mauro.TINELLI AT ST DOT COM>
Date: Tue, 21 Mar 2000 09:27:09 +0100
John,

What kind of filesystem ("FS") are you using? I couldn't find that
information in your messages. Is it possible that you have filled up
some internal table (for example, a full i-node table in the older
filesystems, or fragmentation, or something else specific to the
structure of the FS you're using), so that free space is not actually
the problem? What I mean is that your FS might allocate 4 MB
immediately and then look for further space that it cannot find. Do
you see anything interesting in the syslog or equivalent file?
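
One rough way to test the i-node idea is to compare block usage with
i-node usage for the mount point. The sketch below assumes Python is
available on the host; the names pct_used and fs_usage and the path
/hsm are only illustrative, and the figures come from the generic
statvfs call, so on an HSM-managed filesystem they may not reflect what
the FSM driver will actually allow.

```python
import os


def pct_used(total, free):
    """Return the percentage used, or 0.0 when the total is reported as 0."""
    if total == 0:
        return 0.0
    return 100.0 * (total - free) / total


def fs_usage(path):
    """Report block and i-node usage for the filesystem holding `path`."""
    st = os.statvfs(path)
    return {
        # f_blocks / f_bfree count filesystem blocks (total / free).
        "blocks_used_pct": pct_used(st.f_blocks, st.f_bfree),
        # f_files / f_ffree count i-nodes (total / free).
        "inodes_used_pct": pct_used(st.f_files, st.f_ffree),
    }


if __name__ == "__main__":
    # "/hsm" is a hypothetical mount point; substitute the real one.
    target = "/hsm" if os.path.exists("/hsm") else "/"
    usage = fs_usage(target)
    print(target, usage)
```

If the i-node percentage is near 100 while the block percentage is
around 72, the i-node table would be the more likely culprit.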

Ciao,

Mauro/STM

> On Fri, Mar 17, 2000 at 10:42:12AM -0500, Richard Sims wrote:
> > >Please check /etc/security/limits or use smitty users. You probably want
> > >to set fsize to -1 and fsize_hard to -1. This is in AIX not ADSM or HSM.
> >
> > The /etc/security/limits fsize value would affect the creation or
> > extension of all files operated upon by the designated user; but John
> > indicated that "I can create files bigger than 4MB on other filesystems
> > w/o problems", so it doesn't look like a limits problem.
>
> That's right; it's not a limit problem.
>
> In another email, Richard Sims continued:
> >        The most baffling thing about HSM, for its users and
> > administrator, is that the space remaining on the file system
> > as per 'df' and 'dsmdf' has NOTHING to do with how much more
> > data can be copied into the file system.  We often see our modest
> > HSM file systems some 85% full, saying they have like 80MB free,
> > and still we get File System Full when trying to copy in a file
> > substantially smaller than that.  This seems to be a manifestation
> > of the FSM device driver having true control over what happens,
>
> Yup.  The FSM driver basically wedges itself between the
> kernel/processes and the lower level disk drivers.  When the
> filesystem is full, rather than returning a full condition to the
> kernel/process, it blocks the process and tries to clear off
> space.  Once enough space is available, the process can then
> continue.  Likewise, when the kernel/process tries to read a migrated
> file, the driver blocks the process while it restores the data from
> the ADSM server.
>
> > and it thinking perhaps that it has to save space for the .Spaceman
> > db.
>
> Actually, I would LOVE it if it did this (reserved space for the
> .SpaceMan files), but it doesn't seem to, as in the past, we've
> managed to fill the system to the point where dsmreconcile doesn't
> have enough space to write out its candidates file or create temp
> files when migrating files, which would cause problems for HSM.
>
> That said, I have some more observations on this problem.  I've now
> determined that this problem shows up only when the filesystem gets to
> around 72% full.  Once it's at this point, if I try to "cp" a file to
> this filesystem, the cp hangs once it's copied 4MB.  However, if I
> leave the cp process running and delete another file off the
> filesystem, the cp succeeds.  It's just as if the filesystem were
> really 100% full, and the FSM driver is blocking the cp until space is
> freed.  If the filesystem is below 72% full, then I have no problems
> whatsoever.
>
> I'm still not 100% convinced it's HSM/the FSM driver and not the
> underlying filesystem/raid/disk drivers at fault, but the scale has
> now tipped towards HSM.
>
> I'm still open to suggestions on how to fix this... :)
>
> John
>
>
> -------------------------------------------------------------------------
> John Valdes                        Department of Astronomy & Astrophysics
> j-valdes AT uchicago DOT edu                        University of Chicago