ADSM-L

Re: Help! AIX HSM problem

2000-03-17 10:23:08
From: Miles Purdy <PURDYM AT NISA.GC DOT CA>
Date: Fri, 17 Mar 2000 09:23:08 -0600
This sounds like a problem with the users who are trying to create the files. UNIX 
has a per-user file size limit that prevents users from creating large files and 
filling up the filesystems.

Please check /etc/security/limits or use smitty users. You probably want to set 
fsize to -1 and fsize_hard to -1 (-1 means unlimited). This is an AIX setting, 
not an ADSM or HSM one.
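
As a rough illustration of what to look for in /etc/security/limits, here is a minimal Python sketch that reads stanza-style text and reports the effective fsize for a user in bytes. It assumes the usual AIX stanza layout (a "name:" line followed by "attr = value" lines, with "*" starting a comment) and that fsize is counted in 512-byte blocks. The user name "jvaldes" and the sample values are hypothetical, not taken from the system in question.

```python
BLOCK_BYTES = 512  # assumption: fsize is expressed in 512-byte blocks

# Hypothetical sample in the stanza style of /etc/security/limits.
SAMPLE = """
* lines starting with '*' are treated as comments in this sketch
default:
        fsize = 2097151
        core = 2097151

jvaldes:
        fsize = -1
        fsize_hard = -1
"""


def parse_limits(text):
    """Parse stanza-style text into {stanza_name: {attribute: value}}."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue
        if line.endswith(":") and "=" not in line:
            # A new stanza header such as "default:" or "jvaldes:".
            current = stanzas.setdefault(line[:-1].strip(), {})
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return stanzas


def fsize_bytes(stanzas, user, attr="fsize"):
    """Return the limit in bytes, or None if it is unlimited (-1).

    Falls back to the "default" stanza when the user has no entry.
    Raises LookupError if the attribute is not set anywhere.
    """
    value = stanzas.get(user, {}).get(attr)
    if value is None:
        value = stanzas.get("default", {}).get(attr)
    if value is None:
        raise LookupError(f"{attr} not set for {user} or default")
    blocks = int(value)
    return None if blocks == -1 else blocks * BLOCK_BYTES


if __name__ == "__main__":
    limits = parse_limits(SAMPLE)
    for name in ("jvaldes", "someone_else"):
        print(name, fsize_bytes(limits, name))
```

To check a real system, the text of /etc/security/limits would be passed to parse_limits() in place of SAMPLE. A result of None corresponds to the -1 (unlimited) setting suggested above.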

Miles
>>> John Valdes <j-valdes AT UCHICAGO DOT EDU> 16-Mar-00 6:53:33 PM >>>
All,

More HSM problems... (Actually, I'm not certain if it's an HSM problem
or something else; keep reading).

I'm running the ADSM HSM v.3.1.0.8 client on AIX 4.2.1.  This system
has just recently developed a very odd problem: I can't create files
larger than 4MB on its HSM filesystem.  If I ftp a file to the HSM
filesystem, the transfer hangs once 4MB has been written (*exactly*
4MB, 4194304 bytes).  Likewise, the same problem happens if I try to
'cp' a file from another filesystem to the HSM filesystem; the cp
hangs once 4MB has been written.  If I try to recall a migrated file,
the recall process hangs once 4MB have been recalled.  I can create
files smaller than 4MB w/o a problem.  Likewise, I can create files
bigger than 4MB on other filesystems w/o problems.
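
The symptom above could be reproduced in a controlled way with something like the following Python sketch. It is only an illustration under stated assumptions: the function names are made up for this example, and it relies on the standard resource and os modules being available on the system. It prints the per-process file size limit and then writes a scratch file slightly larger than 4MB, reporting how many bytes were accepted. A process that exceeds its file size limit normally gets an EFBIG error rather than blocking, so the sketch treats EFBIG as "limit reached"; if the write simply hangs, the probe will hang too.

```python
import errno
import os
import resource
import tempfile

FOUR_MIB = 4 * 1024 * 1024  # 4194304 bytes, the size at which writes stall


def fsize_limit():
    """Return the (soft, hard) RLIMIT_FSIZE of this process as strings."""
    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"

    soft, hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    return fmt(soft), fmt(hard)


def probe_write(directory, total=FOUR_MIB + 1024 * 1024, chunk=64 * 1024):
    """Write up to `total` bytes to a scratch file in `directory`.

    Returns the number of bytes actually written. Stops early if the
    file size limit is hit (EFBIG); other errors are re-raised.
    The scratch file is removed afterwards.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    written = 0
    try:
        while written < total:
            block = b"\0" * min(chunk, total - written)
            try:
                # os.write reports how many bytes were really accepted.
                written += os.write(fd, block)
            except OSError as exc:
                if exc.errno == errno.EFBIG:
                    break
                raise
    finally:
        os.close(fd)
        os.unlink(path)
    return written


if __name__ == "__main__":
    print("fsize limit (soft, hard):", fsize_limit())
    # Replace the directory with the filesystem under test, e.g. /hsm.
    print("bytes written:", probe_write(tempfile.gettempdir()))
```

Running it once against the HSM filesystem and once against another filesystem, as the same user, would show whether the stall follows the user or the filesystem.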

The filesystem *isn't* full; there's over 70GB free.  Also, the
filesystem hasn't yet reached its migration threshold (it's at 72%
full, while the threshold is at 80%).  We aren't using Unix quotas on
this filesystem.

Some numbers:

  prompt> df -k /hsm
  Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
  /dev/hsm        248512512  71509660   72%  1183882    16% /hsm
  /hsm            248512512  71509660   72%  1183882    16% /hsm

  prompt> ddf
  FSM             FS      Mgrtd       Pmgrtd    Mgrtd    Pmgrtd  Unused   Free
  Filesystem      State   KB          KB        Files    Files   Inodes   KB

  /hsm              a     2406853784  43155716  1117567  23993   6582134  71509660

  prompt> dsmmigfs query /hsm
  File System   High    Low     Premig  Age     Size    Quota   Stub    Server
  Name          Thrshld Thrshld Percent Factor  Factor          Size    Name

  /hsm          80      20      -       1       1       3000000 4095    ADSM

From the 'dsmmigfs query', you can see that we haven't exceeded the
HSM quota.  We haven't run out of space in the storage pool on the
ADSM server:

  adsm> q stg hsm_tape

  Storage      Device       Estimated    Pct    Pct  High  Low  Next
  Pool Name    Class Name    Capacity   Util   Migr   Mig  Mig  Storage
                                 (MB)                 Pct  Pct  Pool
  -----------  ----------  ----------  -----  -----  ----  ---  -----------
  HSM_TAPE     3590TAPE    4,712,064.8  70.9   78.8    90   70

There are no errors in dsmerror.log, and nothing has been logged to the
system console nor to the system error log.

The only other unique thing about this filesystem, other than using
HSM, is that it resides on an SSA RAID-5 array.  The underlying
filesystem is actually composed of two 128GB "physical" volumes, where
each physical volume is really an SSA RAID-5 array made up of 15 9GB
SSA disks.  The SSA arrays are attached to an "IBM SSA Enhanced RAID
Adapter (14104500)" *WITHOUT* the 4MB fast-write cache option:

  prompt> lscfg -v -l ssa0
  DEVICE            LOCATION          DESCRIPTION

  ssa0              04-02             IBM SSA Enhanced RAID Adapter
                                      (14104500)

        Part Number.................009L2060
        FRU Number..................009L2060
        Serial Number...............C9516003
        EC Level....................0000F23660
        Manufacturer................IBM053
        ROS Level and ID............6701
        Loadable Microcode Level....04
        Device Driver Level.........00
        Displayable Message.........SSA-ADAPTER
        Device Specific.(Z0)........DRAM=032
        Device Specific.(Z1)........CACHE=0
        Device Specific.(Z2)........00000020354e90de

The RAID arrays are in a good state:

  prompt> ssaraid -I -l ssa0 -n hdisk[34] -z
  hdisk3          24903737A5544CK good                    127.6GB RAID-5 array
  hdisk4          249037E6DFD84CK good                    127.6GB RAID-5 array

Except for occasional fits w/ dsmreconcile, this system has been working
perfectly for over a year.

Anyone have any ideas?

With this problem, we can't load any new data onto our system, nor can
we recall the over 2TB of data that we already have loaded into the
system.  Needless to say, any help would be greatly appreciated!

John

-------------------------------------------------------------------------
John Valdes                        Department of Astronomy & Astrophysics
j-valdes AT uchicago DOT edu                               University of Chicago

-------------------------------------------------------------------------
Miles Purdy, System Manager
purdym AT nisa.gc DOT ca
Net Income Stabilization Account
Winnipeg, MB, CA
(204) 984-1602