Networker

Re: [Networker] HDS experiences ?

2003-08-13 09:59:24
Subject: Re: [Networker] HDS experiences ?
From: "David E. Nelson" <david.nelson AT NI DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Wed, 13 Aug 2003 08:59:19 -0500
Hi Mic,

We've got a 9970 w/ ShadowImage (SI) w/ several Sun servers attached. I wrote
the Legato NW scripts to make it all work and so far it's working very well.

Unless things have changed recently, NW does not natively support the HDS APIs
except through NW's Oracle module (not sure which version).  Yes, we are an
Oracle shop, but we also use internal scripts to stop/start/hot-backup Oracle.

Anyways, here's the architecture:

- Sun/Oracle systems connect directly to the 9970 and have a shadow image
configured

- I attached one E-250 to the 9970 via the SAN.  I also zoned out 2x9840A's and
installed the NW storage node s/w on the E-250.  The E-250 is licensed as a
'SAN Storage Node'.

- All the above systems use 'savepnpc' scripts to make the magic happen.

- When I implemented NW here, my s/w architecture included a central Korn shell
savepnpc script (more of a library since it isn't called directly) that
contains all the necessary subroutines for handling the core functions.  This
script is accessible via NFS to all clients.

- In addition, each client has a savepnpc .res file in /nsr/res/.  The .res
file calls the client's private ksh savepnpc script as appropriate, which
sources the library script above.

- With the 9970 it got a little tricky in the beginning because you have two
systems involved and somehow need to coordinate the actions between the two.
For this, we use HDS's HORCM which is configured on all the Sun servers and the
E-250.  Our setup works as follows:

  - Each 9970 NW group has two clients - the Sun server and the E-250.

  - Backups start on both systems at the same time.

    - The E-250 immediately starts a resync and then waits for the SVOL to
      reach PSUS state using the 'pairevtwait' cmd.

    - The Sun server immediately starts shutting down Oracle or putting Oracle
      into hot backup mode and waits for the PVOL to reach the PAIR state.

  - When the PAIR state is reached, the Sun server flushes its I/O buffers
    using 'lockfs' and then splits the PVOL - this triggers the PSUS state
    which the E-250 is waiting for.  This completes the HDS actions on the
    server.

  - The E-250, now having detected that the SVOL has reached the PSUS state,
    performs a quick fsck of the volume.  If that fails, an intensive fsck is
    performed.

  - The E-250 then tunes the filesystem for I/O using 'maxcontig'.

  - The E-250 then mounts the filesystem read-only.  This has the added benefit
    of disabling 'atime' updates, which further improves I/O performance.

  - The E-250 then backs up the filesystem to 2x9840A's.

  - The E-250 then unmounts the filesystem in its post-process actions.
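
The sequence above can be sketched roughly as follows. This is not the actual
script, only a minimal illustration of the ordering. The HORCM group name,
device paths, mount points, maxcontig value, timeout, and return codes are all
hypothetical placeholders, and the choice of 'fsck -m' for the quick check and
'fsck -y' for the intensive one is an assumption. Oracle handling and the
backup itself are left out.

```shell
# Sketch only: every name and value below is a hypothetical placeholder.
HORCM_GROUP=${HORCM_GROUP:-ora_si}         # HORCM device group (assumed name)
SVOL_RAW=${SVOL_RAW:-/dev/rdsk/c2t0d0s6}   # SVOL raw device on the E-250
SVOL_BLK=${SVOL_BLK:-/dev/dsk/c2t0d0s6}    # SVOL block device on the E-250
SVOL_MNT=${SVOL_MNT:-/backup/oradata}      # mount point on the E-250
PVOL_MNT=${PVOL_MNT:-/oradata}             # filesystem on the Sun server
MAXCONTIG=${MAXCONTIG:-128}                # example tuning value only
WAIT_SECS=${WAIT_SECS:-7200}               # pairevtwait timeout in seconds

# Sun server pre-processing, run once Oracle is down or in hot backup mode.
server_pre() {
    pairevtwait -g "$HORCM_GROUP" -s pair -t "$WAIT_SECS" || return 11
    lockfs -f "$PVOL_MNT"                                 || return 12  # flush buffers
    pairsplit -g "$HORCM_GROUP"                           || return 13  # triggers PSUS
}

# E-250 pre-processing: resync, wait for the split, check, tune, mount.
node_pre() {
    pairresync -g "$HORCM_GROUP"                          || return 21
    pairevtwait -g "$HORCM_GROUP" -s psus -t "$WAIT_SECS" || return 22
    # Quick check first; only if it fails, run the full check.
    fsck -m "$SVOL_RAW" || fsck -y "$SVOL_RAW"            || return 23
    tunefs -a "$MAXCONTIG" "$SVOL_RAW"                    || return 24
    mount -o ro "$SVOL_BLK" "$SVOL_MNT"                   || return 25
}

# E-250 post-processing: release the SVOL filesystem after the backup.
node_post() {
    umount "$SVOL_MNT" || return 31
}
```

Each function would be called from the matching client's savepnpc pre or post
command. The distinct return values make it easy to tell from the exit status
which step failed.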

And that's about it.  I do make extensive use of error return codes to handle
problems.
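
As an illustration of the return-code idea only (the real library is not shown
in this post), a helper along these lines could wrap each step. The names
'run_step' and 'example_pre' and the code numbers are made up for the example.

```shell
# run_step CODE DESCRIPTION COMMAND [ARGS...]
# Runs COMMAND. On failure, logs which step failed and returns CODE so the
# caller can exit with a status that identifies the failing step.
run_step() {
    step_code=$1
    step_desc=$2
    shift 2
    step_rc=0
    "$@" || step_rc=$?
    if [ "$step_rc" -ne 0 ]; then
        echo "ERROR: ${step_desc} failed (rc=${step_rc})" >&2
        return "$step_code"
    fi
    return 0
}

# Hypothetical pre-processing made of three steps; 'true' and 'false' stand
# in for real commands. The second step fails, so the third never runs.
example_pre() {
    run_step 10 "first step"  true  || return $?
    run_step 20 "second step" false || return $?
    run_step 30 "third step"  true  || return $?
}
```

Here 'example_pre' returns 20, which points straight at the second step
without having to dig through the log.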

Feel free to ask more questions.

Regards,
        /\/elson

