
TSM and PowerHA (HACMP)

Discussion in 'Restore / Recovery Discussion' started by DanGiles, Aug 1, 2012.

  1. DanGiles

    DanGiles Senior Member

    Joined:
    Oct 25, 2002
    Messages:
    570
    Likes Received:
    10
    Occupation:
    Sr. Storage Admin
    Location:
    Toronto, Ont. Canada
    We will soon be implementing TSM on AIX with PowerHA. The environment will consist of two servers - one library manager and one client - plus a number of LAN-free clients. There is not much documentation out there, and I am mostly concerned with what happens to drives and tapes that are in use during a fail-over.

    I went through the redbook for TSM 5.5 & HACMP, and it looks like TSM should clean itself up on start-up. The problem is that I've seldom experienced this 'clean-up' in real life! There's usually a lot of manual resetting of drives and volumes after a non-graceful shut-down.

    So, anyone using this configuration in real life? Any caveats or recommendations?
     
  3. moon-buddy

    moon-buddy Moderator

    Joined:
    Aug 24, 2005
    Messages:
    6,258
    Likes Received:
    282
    Occupation:
    Electronics Engineer, Security Professional
    Location:
    Somewhere in the US
    Sorry Dan,

    I have no experience with HACMP deployment of TSM.

    But, if all that is advertised is true, device fail-over should be 'automatic' with AIX. I say this because IBM's DS8000 series SAN arrays use AIX boxes to control the disk arrays, and failover there works almost magically.

    If you apply the same logic to a HACMP-defined TSM environment, the device is really under the control of the 'virtual' cluster environment which TSM sees. Therefore, TSM does NOT care which node holds the devices. It just interfaces with the 'virtual' machine.
     
  4. DanGiles

    I have no concern over the actual transfer of resources between machines. My concerns refer to my other post: what happens on fail-over when drives are in use? Will TSM actually clean up the library (unload drives, move cartridges) when it restarts?
     
  5. moon-buddy

    I believe the answer is 'it will stay as is and keep on running'.

    In HACMP settings, it is mandatory to have two fiber paths for the devices - one path to each node. This then takes care of the failover. Likewise, the library should have two paths - one for each node.

    The switch over is handled by the virtual environment.

    It is also my understanding that the devices are presented to TSM using their virtual device names and not their real physical names.
     
    Last edited: Aug 1, 2012
  6. DanGiles

    As I said, transferring resources is not my concern: it's transferring resources that are not in their default state that's the concern!
     
  7. moon-buddy

    This is what I mean.

    If the cluster is true to its form, TSM would not care where the resource is: whether default or not, and whether it falls back to default or not. As far as TSM is concerned, nothing happened behind the scenes, since the virtual environment presents the devices consistently. Thus, things will keep on running.
     
  8. DanGiles

    Nyet. Forget about HA for now. A client is backing up to a tape drive on the TSM server. The server goes down ungracefully, leaving the tape in the drive. TSM comes back up and the client reconnects to finish its backup. What happens? Does TSM remember its state from when it went down and give the drive and cartridge back to the client, or does TSM go "okay, you'll want this tape. Wait, the tape isn't in its right element - I'll give you a new one and load it into this drive. Wait, there's a cartridge in there that shouldn't be - spit out an error!"

    According to the TSM/HA redbook, it does the former (more or less). According to experience, it does the latter.
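
The state described above (a cartridge left in a drive after an unclean shutdown) is the kind of thing an administrator would normally inspect by hand once the server is back. As a minimal sketch, the Python helper below only builds `dsmadmc` command lines for such a check; it does not run them. The server commands (`query drive`, `query path`, `query mount`, `audit library`) are standard TSM administrative commands, but the library name `LIB1`, the placeholder credentials, and the choice of which commands to run are assumptions for illustration, not a tested PowerHA procedure.

```python
import shlex

# Administrative commands an operator might run by hand after an unclean
# restart. Which of these are actually needed after a PowerHA fail-over is an
# assumption here, not something confirmed in this thread.
CHECK_COMMANDS = [
    "query drive format=detailed",   # drive state and any volume still loaded
    "query path format=detailed",    # whether drive/library paths are online
    "query mount",                   # volumes the server believes are mounted
    "audit library {library} checklabel=barcode",  # resync library inventory
]


def build_check_commands(library, admin_id="ADMIN_ID", password="PASSWORD"):
    """Return shell-quoted dsmadmc command lines for a post-restart check.

    admin_id and password are hypothetical placeholders.
    """
    lines = []
    for template in CHECK_COMMANDS:
        server_cmd = template.format(library=library)
        argv = [
            "dsmadmc",
            f"-id={admin_id}",
            f"-password={password}",
            "-dataonly=yes",
            server_cmd,
        ]
        lines.append(" ".join(shlex.quote(arg) for arg in argv))
    return lines


if __name__ == "__main__":
    for line in build_check_commands("LIB1"):
        print(line)
```

Generating the command lines rather than executing them keeps the sketch safe to run anywhere; comparing the `query drive` and `query mount` output before and after a test fail-over would show whether the server cleaned up on its own.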
     
  9. moon-buddy

    I see your point from the get-go.

    The latter statement is true for standalone systems, and I won't argue with you on that since that has always been my experience.

    However, you introduced the HACMP picture. In all my work with clusters - not with TSM in particular - the well set-up ones behave properly as advertised, and the app chugs along well. Thus, I am convinced that the former scenario will be true. TSM (as an application sitting on top of the HACMP environment) will pick up where it left off after a failover has occurred.

    A power loss is another issue :)
     
  10. DanGiles

    I guess this is one of those "I'll believe it when I see it" scenarios. I'll probably set up a string of tests, but if the first two actually work as advertised, I'll save myself a lot of time and trust them. ;p

    These servers are in major data centres, so we don't have to worry about power failures XO <tongue planted firmly in cheek>
     
