ADSM-L

Re: ADSM/HSM problem

1997-07-02 09:59:02
Subject: Re: ADSM/HSM problem
From: Henk ten Have <hthta AT SARA DOT NL>
Date: Wed, 2 Jul 1997 15:59:02 +0200
 Hi to all of you who listen and like to play with HSM, or are willing
 to do so in the future:

(I had problems with migration which did not run on one filesystem in spite of:
 1. used filespace was far above High Threshold for the filesystem;
 2. dsmmonitord was running;
 3. there was a candidate list (> 50.000 files);
 4. CHECKTHRESHOLDS in dsm.sys .eq. 2;
 5. Space Management Technique = Automatic;
 6. Backup Required Before Migration: Yes, and YES, all the candidates were
    backed up;
 7. and migration did run very well for more than 9 months;
 And there was no HSM activity, event or error log)

 WELL, THE ANSWER IS:

-dsmreconcile /ssa/home04:
  Reconciling '/ssa/home04' file system:
   ......
    Writing the new migration candidates list...
           Wrote 57216 entries
  ANS9250I File system '/ssa/home04' reconciliation completed.

-head -3 /ssa/home04/.Spa*/can*:
   410445 420296249 867785752 sscphenk/tmp/exportfs
   .man                                                          <------ ?
   410440 420290568 867785654 sscphenk/tmp/a45
   205220 210145284 867765554 sscphenk/tmp/a3

-dsmmigfs Q /ssa/home04:
   ADSTAR Distributed Storage Manager
   space management Interface - Version 2, Release 1, Level 0.6
   (C) Copyright IBM Corporation, 1990, 1996, All Rights Reserved.
   File System     High    Low     Premig  Age     Size    Quota   Stub File
   Name            Thrshld Thrshld Percent Factor  Factor          Size
   /ssa/home04     90      80      2       500     1       13000000        4095

-df -tk /ssa/home04:
   Filesystem    1024-blocks      Used      Free %Used Mounted on
   /dev/lvhome04    13189120  11039684   2149436   84% /ssa/home04
   /ssa/home04      13189120  11039684   2149436   84% /ssa/home04

-dsmmigfs upd -hth=82 -lth=80 -a=500 /ssa/home04
-dsmmigfs Q /ssa/home04
   ADSTAR Distributed Storage Manager
   space management Interface - Version 2, Release 1, Level 0.6
   (C) Copyright IBM Corporation, 1990, 1996, All Rights Reserved.
   File System     High    Low     Premig  Age     Size    Quota   Stub File
   Name            Thrshld Thrshld Percent Factor  Factor          Size
   /ssa/home04     82      80      2       500     1       13000000        4095

 AND THERE GOES HSM:

   ====> dsmautomig started, 1997/07/02 13:48:50
   ANS9126E dsmautomig.program: cannot get the state of space management for
   /ssa/home04/sscphenk/tmp/exportfs: No such file or directory.
   ADSTAR Distributed Storage Manager
   space management Interface - Version 2, Release 1, Level 0.6
   (C) Copyright IBM Corporation, 1990, 1996, All Rights Reserved.
   <==== dsmautomig ended, 1997/07/02 13:48:59

 AND THAT WAS THE END OF HSM, NOT MIGRATING A SINGLE BIT and
 the first candidate from the candidate list was gone:

-head -3 /ssa/home04/.Spa*/can*
   man                                                           <------ ?
   410440 420290568 867785654 sscphenk/tmp/a45
   205220 210145284 867765554 sscphenk/tmp/a3

 AND THEN AFTER 2 MINUTES THERE GOES HSM AGAIN:

   ====> dsmautomig started, 1997/07/02 13:50:50
   ADSTAR Distributed Storage Manager
   space management Interface - Version 2, Release 1, Level 0.6
   (C) Copyright IBM Corporation, 1990, 1996, All Rights Reserved.
   <==== dsmautomig ended, 1997/07/02 13:50:52

 AND AGAIN WITHOUT MIGRATING A SINGLE BIT (and there are still 57215
 candidates left).
 dsmautomig then started every two minutes without doing anything.

 What kind of file was this first candidate anyway?

-ls -l /ssa/home04/sscphenk/tmp/expor*
   -rw-r-----   1 sscphenk sscp     420296249 Jul  1 20:48 exportfs
   .man                                                          <------ ?

 Hmmm, this looks strange:

-ls -lb /ssa/home04/sscphenk/tmp/expor*
   -rw-r-----   1 sscphenk sscp     420296249 Jul  1 20:48 exportfs\012.man
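
 To hunt for other names of this kind before they reach the candidate list,
 something like the following may help. It is only a sketch: the helper name
 find_newline_names is made up here, and whether ls -b shows the newline as
 \012 or as \n depends on the ls in use.

```sh
#!/bin/sh
# Sketch: list every file below a directory whose name contains a newline.
# find_newline_names is a made-up helper name, not an ADSM command.

# A variable holding exactly one newline character.
nl='
'

find_newline_names() {
    # -name matches the last path component against a shell pattern;
    # ls -b prints non-printable characters as escapes, one file per line.
    find "$1" -name "*${nl}*" -exec ls -ldb {} \;
}

# Example (the file system from this post):
# find_newline_names /ssa/home04
```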

Summary:
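 When by accident you get such a "bad" filename (here one containing a
 newline, \012) on top of your candidate list, the entry is spread over two
 lines. The first line is tried and dropped, some silly string is left
 behind, and migration is stopped without any message.
 I think this is what we call a BUG.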
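
 For those who want to see the splitting without touching HSM, here is a
 small sketch. The numeric fields are copied from the head output above; the
 "three numbers, then path" layout of a candidate entry is my assumption
 from that output, and the way the list is written here is a guess at what
 happens, not the actual dsmreconcile code.

```sh
#!/bin/sh
# Sketch: one file name with an embedded newline becomes two lines in a
# line-oriented list. The list layout is assumed from the head output.
workdir=$(mktemp -d)
name='exportfs
.man'
: > "$workdir/$name"                      # such a name is perfectly legal

# Write the entry the way a naive list writer would: one line per file.
list="$workdir/candidates"
printf '%s %s %s %s\n' 410445 420296249 867785752 "sscphenk/tmp/$name" > "$list"

lines=$(wc -l < "$list" | tr -d ' ')      # physical lines in the list
second_line=$(sed -n '2p' "$list")        # the orphaned fragment
echo "entries written: 1, lines in list: $lines"
echo "second line: $second_line"
rm -rf "$workdir"
```

 One entry goes in, two lines come out, and the second line is the ".man"
 fragment that no line-by-line reader can interpret as a candidate.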

 And for those who like to replay this scenario in a very simple way:
 put the string "NO MIGRATION PLEASE!" on top of your candidate list
 and run dsmautomig /filename and see what happens.
 (Well I can tell you: the first N is gone and what's left is
  "O MIGRATION PLEASE!" and you will see no warning or error or whatever
   and in spite of this exclamation, no migration will take place ;-)
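
 Since dsmautomig says nothing about such a line, a check of the candidate
 list from the outside can at least show it. A sketch, assuming every valid
 entry looks like "<number> <number> <number> <path>" as in the head output
 above; check_candidates is a made-up name, not part of ADSM.

```sh
#!/bin/sh
# Sketch: flag lines in a candidate list that do not look like
# "<number> <number> <number> <path>". The layout is an assumption taken
# from the head output in this post.
check_candidates() {
    # -v selects the non-matching lines, -n prefixes them with line numbers.
    grep -n -v -E '^[[:space:]]*[0-9]+[[:space:]]+[0-9]+[[:space:]]+[0-9]+[[:space:]]+[^[:space:]]' "$@"
}

# Example:
# check_candidates /ssa/home04/.Spa*/can*
```

 Note that grep -v exits with status 0 when it found suspicious lines and
 with 1 when the list looks clean.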

 And special thanks to Richard Sims (HSM is "fun", isn't it?)
 who gave me the tip (and scripts) to put a shell wrapper around
 dsmautomig and dsmreconcile to capture the output in log files, so
 at least I could see what happened and what should have happened.
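
 The scripts themselves are not reproduced here. A wrapper of that kind
 might look roughly like the sketch below; run_logged and the paths in the
 example are made up for illustration, and only the "====>" / "<====" marker
 lines mimic the log excerpts shown above.

```sh
#!/bin/sh
# Sketch of a logging wrapper: run a command, append its stdout and stderr
# to a log file between start and end markers, and pass on its exit status.
# run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    rl_log=$1
    shift
    rl_prog=$(basename "$1")
    echo "====> $rl_prog started, $(date '+%Y/%m/%d %H:%M:%S')" >> "$rl_log"
    "$@" >> "$rl_log" 2>&1
    rl_status=$?
    echo "<==== $rl_prog ended, $(date '+%Y/%m/%d %H:%M:%S')" >> "$rl_log"
    return "$rl_status"
}

# Example (placeholder paths; the real program would be renamed and this
# script installed under its original name):
# run_logged /tmp/dsmautomig.log ./dsmautomig.program "$@"
```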

Regards,

Henk ten Have                   SARA, Academic Computing Services Amsterdam
Systems Programmer              Kruislaan 415
E-mail: hthta AT sara DOT nl           1098 SJ Amsterdam
Phone : +31205923000            The Netherlands

(running ADSM V2.1.0.12 on an SP2 (AIX 4.1.4.0) using a 3494 with 4 3590 units,
 backup/archive clients 2.1.0.6 on AIX 3.2, AIX 4.1.4, SunOS, Solaris,
 IRIX, Win95, WinNT, HSM clients 2.1.0.6 on AIX 4.1.4.0)