ADSM-L

[ADSM-L] Power failures

2017-04-30 19:08:38
Subject: [ADSM-L] Power failures
From: "Harris, Steven" <steven.harris AT BTFINANCIALGROUP DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sun, 30 Apr 2017 23:04:51 +0000

Hi Ricky

To further the conversation, I have been around a long time and the number of 
total outages of supposedly redundant data centers I have seen is astonishing.

1. Small bank.  Tested the generator every month, but no one checked the fuel 
level.  When we had a power failure the generator ran for only a few minutes.
2. Same bank, new building.  Had an unexpected power failure just as the switch 
that changed over the power from mains to generator was disassembled for 
maintenance.
3. State government computer centre.  Electrician dropped a spanner, which 
shorted out the inside path of the UPS.
4. Enterprise level data centre run by big IT outsourcer.  Big voltage 
fluctuations on external grid, so decision was taken to go to generator. 
Generator was running but supplied no power because the switch was in manual 
mode, not auto.  Ran out of air conditioning, which caused some shutdowns, then 
ran out of UPS just before the problem was addressed.
5. Different Enterprise level data centre run by big IT outsourcer.  Total 
power outage after fierce thunderstorm.

Now that’s five, since 1990, but that’s just what has happened to me. The 
moral is: always expect an unexpected shutdown.

Cheers

Steve.



-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Stefan Folkerts
Sent: Saturday, 29 April 2017 1:31 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] DR Rebuild please chime in.

Strange, first problem will only get worse without the TSM replication because 
you lose failover restores and future failover backup functions, and the second 
problem should be a reason for a generator and not a different DR solution I 
would think. :-)

Anyway, good luck with the project. :-)

On Fri, 28 Apr 2017 at 15:48, Plair, Ricky <rplair AT healthplan DOT com> wrote:

> There were a number of problems this year that caused management to 
> rethink the TSM solution.
>
> One, we have our TSM server on a Windows 2008 R2 Enterprise system 
> running TSM server version 6.3.4.0, and it uses Microsoft clustering.
> Somehow our clustering died and we lost the secondary TSM server, and 
> it took almost 2 days to get back the primary TSM server. Then about a 
> month later we had a power outage (complete power outage) and lost the 
> entire data center. This corrupted the data on the TSM server and 
> caused a lot of different problems, and the server basically had to be 
> rebuilt from scratch. Then when we got to the DR exercise approximately 
> 2 months later, a couple of the DB2 databases were corrupt and could not 
> be restored from TSM. Sooooo, that meant that TSM was a problem and we 
> needed to change our backup solution around.
>
> And below is what they want to do. Fantastic!
>
> Around here if there is a problem, blame TSM.
>
>
> Ricky M. Plair
> Storage Engineer
> HealthPlan Services
> Office: 813 289 1000 Ext 2273
> Mobile: 813 357 9673
>
>
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf 
> Of Stefan Folkerts
> Sent: Friday, April 28, 2017 7:52 AM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: [ADSM-L] DR Rebuild please chime in.
>
> That should work. I am wondering why you are stopping TSM 
> replication because, as mentioned by Matthew, moving away from the 
> application integration has its downsides.
> So can you share the reasons with us?
> On the plus side, you get something close to continuous replication, so 
> things like DB log backups are offsite the moment they are done 
> locally, something that TSM replication does not currently support.
>
> For me, the biggest thing you lose is the recovery of damaged data 
> on the local server from the replica, an automatic mechanism with 
> TSM replication.
> That would mean you have to have a copypool locally to protect the 
> data in the same way.
> This would be an important point for me: if you are running the 
> directory containerpool the impact on housekeeping might be limited, 
> but I find the impact of a large copypool with deduplicating file 
> device type storagepools to be a disaster.
>
>
>
>
> On Fri, Apr 28, 2017 at 1:02 AM, Harris, Steven < 
> steven.harris AT btfinancialgroup DOT com> wrote:
>
> > Ricky
> >
> > I have something similar at my current gig.
> >
> > Database and landing storage pools are on V840 flash and migrate to 
> > ProtecTIER VTL. The V840 data is remote copied and the VTL uses its 
> > own replication mechanism. Recovery is to bring up the instance on 
> > hot AIX LPARs using the replicated database at the remote site.  
> > This is used for multiple TSM Servers, in both directions.
> >
> > DR has been tested twice by others, and appears to work.  I'll find 
> > out for myself in a week or so, change control willing.
> >
> > Cheers
> >
> > Steve
> >
> > Steven Harris
> > TSM Admin/Consultant
> > Canberra Australia.
> >
> > -----Original Message-----
> > From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On 
> > Behalf Of Matthew McGeary
> > Sent: Friday, 28 April 2017 6:17 AM
> > To: ADSM-L AT VM.MARIST DOT EDU
> > Subject: Re: [ADSM-L] DR Rebuild please chime in.
> >
> > If I'm understanding correctly, your DR site will have a 
> > storage-level copy of all your TSM storage pools, database, logs, etc.
> >
> > In that case, yes, what is being proposed should work.  However, 
> > you're trading a replication that can be monitored and validated for 
> > a storage-level model that isn't application aware.
> >
> > AND, if you're not doing anything on the DB2 side during replication (i.e.
> > quiescing), then the server will do a crash-recovery startup at the DR
> > site.
> >
> > Crash-recovery has always worked for me in DB2, but it's not as 
> > fool-proof as DB2 HA/DR replication, recovering from a DB2 backup or 
> > using the TSM replication that you're ripping out.  There may come a 
> > time when you do a DR test or actual DR and your TSM database won't 
> > recover properly from that crash-level snapshot.  Then what do you do?
> >
> > Why in god's name is this change happening?
> > __________________________
> > Matthew McGeary
> > Senior Technical Specialist - Infrastructure Management Services 
> > PotashCorp
> > T: (306) 933-8921
> > www.potashcorp.com
> >
> >
> > -----Original Message-----
> > From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On 
> > Behalf Of Plair, Ricky
> > Sent: Thursday, April 27, 2017 1:27 PM
> > To: ADSM-L AT VM.MARIST DOT EDU
> > Subject: [ADSM-L] DR Rebuild please chime in.
> >
> > All,
> >
> > Our last DR was a disaster.
> >
> > Right now, we do the TSM server to TSM server replication and it 
> > works fairly well, but they have decided we need to fix something 
> > that is not broken.
> >
> > So, the idea is to upgrade to SP 8.1 and install on a zLinux machine.
> > Our storage is on an IBM V7000, and where we were performing  the 
> > TSM replication, we are trashing that and going to IBM V7000 
> > replicating to V7000.
> >
> > Now, the big twist in this is, we will not have a TSM server at 
> > our DR anymore. The entire primary TSM server will be backed up to 
> > the V7000 and replicated to our V7000 at the DR site.
> >
> > There is no TSM server at the DR site, so IBM will build us one when 
> > we have our DR exercise and then, according to our trusty DB2 guys, we 
> > should just be able to break the connection to the Primary TSM 
> > server, do a little DB2 magic and voilà, the TSM server will be up.
> >
> > This is my question: if the TSM server is built in DR and the 
> > primary TSM server's database is on the DR V7000, then that database 
> > will still have to be restored to the TSM server. You're not going to 
> > be able to just bring it up because it's DB2, point it at the TSM 
> > server, and have it work, right?
> >
> > Please let me know your thoughts. I know I have left a lot of 
> > details out but I'm just trying to get some views. If you need more 
> > information I will be happy to provide it.
> >
> > I appreciate your time.
> >
> >
> >
> >
> > Ricky M. Plair
> > Storage Engineer
> >
> >
> >
> > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> > _ _ _ _ CONFIDENTIALITY NOTICE: This email message, including any 
> > attachments, is for the sole use of the intended recipient(s) and 
> > may contain confidential and privileged information and/or Protected 
> > Health Information (PHI) subject to protection under the law, 
> > including the Health Insurance Portability and Accountability Act of 
> > 1996, as amended (HIPAA). If you are not the intended recipient or 
> > the person responsible for delivering the email to the intended 
> > recipient, be advised that you have received this email in error and 
> > that any use, disclosure, distribution, forwarding, printing, or 
> > copying of this email is strictly prohibited. If you have received 
> > this email in error, please notify the sender immediately and 
> > destroy all copies of the original message.
> >
> > This message and any attachment is confidential and may be 
> > privileged or otherwise protected from disclosure. You should 
> > immediately delete the message if you are not the intended 
> > recipient. If you have received this email by mistake please delete 
> > it from your system; you should not copy the message or disclose its 
> > content to anyone.
> >
> > This electronic communication may contain general financial product 
> > advice but should not be relied upon or construed as a 
> > recommendation of any financial product. The information has been 
> > prepared without taking into account your objectives, financial 
> > situation or needs. You should consider the Product Disclosure 
> > Statement relating to the financial product and consult your 
> > financial adviser before making a decision about whether to acquire, 
> > hold or dispose of a financial product.
> >
> > For further details on the financial product please go to 
> > http://www.bt.com.au
> >
> > Past performance is not a reliable indicator of future performance.
> >
>
>

