Veritas-bu

Re: [Veritas-bu] Designing SLAs

2007-07-12 11:21:11
Subject: Re: [Veritas-bu] Designing SLAs
From: "Jeff Lightner" <jlightner AT water DOT com>
To: "Curtis Preston" <cpreston AT glasshouse DOT com>, "Ellis, Jason" <Jason.Ellis AT imb DOT com>, "Angela Akridge" <angela.akridge AT gmail DOT com>, <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Thu, 12 Jul 2007 10:55:31 -0400
Well I allowed for the possibility of including it and accounting for
the full "recovery" period.   However, often SLAs are "internal" to an
organization.  Meaning the Backup Admins (often UNIX or Windows Admins)
are providing an SLA for THEIR services to the "internal" customer rather
than to a real paying "external" organization.   In such a scenario it
isn't unusual that DBAs have their own SLA to that "internal" customer.

Working at major companies you find this kind of demarcation into
"Business Units" all the time.  And it leads to all sorts of fun
discussions (e.g. Why did my BU get charged for 70% of the paper for the
2nd floor copier when 41% of the people there are not in my BU? [never
mind that my BU is the one that prints 5 copies of every invoice for
record keeping purposes and the other BU uses the copier once a month]).

In such scenarios it is EXTREMELY important to be sure your BU's SLA
includes only things your BU can control or you'll soon find you've
lost your "revenue" stream and aren't making a "profit".   The fact
that the revenue doesn't represent cash and the profit is all on paper
is beside the point in such organizational setups.   One might bemoan
whether this makes sense in the first place but changing such a
structure often is harder than getting a new bill through the U.S.
Congress.

I recall that at a Fortune 500 telecom where I once worked, the BU for
mail delivery "saved money" by creating mail stops.   This meant that highly
paid technical people like me (and there were hundreds of us) had to
each day look through the mail of others to find our own instead of
having a couple of lower paid clerks in that BU do the presorting for
everyone as they had previously.   When I asked about the cost benefit
analysis done for this lamebrain idea at an all employee meeting I got a
standing ovation letting me know I wasn't the only one who saw it that
way. 

-----Original Message-----
From: Curtis Preston [mailto:cpreston AT glasshouse DOT com] 
Sent: Thursday, July 12, 2007 1:01 AM
To: Ellis, Jason; Jeff Lightner; Angela Akridge;
veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Designing SLAs

As Paul stated, I think it's more important that you just agree on and
document something.  I think it's great that you're having the
conversation about RTO and RPO.  So many environments just make it up.

I disagree that the RTO shouldn't include roll-forward time, for a
couple of reasons.  First, the app ain't ready til it's ready.  And it
ain't ready til the log roll is done.  Who cares how long your restore
took.  (You do, of course, but the customer doesn't.  They only care
when their app is running again.)

Secondly, you can do things to affect the log roll time both negatively
and positively.  For example, you could do a full backup only once a
month and say "just roll the logs."  You could also do a full backup
weekly and an incremental backup four times a day to minimize the amount
of time the log roll will take.

I think what you're saying is that if the DBAs don't do their job, then
the restore will take longer.  Fine, then they get the blame for blowing
the RTO, not you.  But saying that an RTO is met even though the app
isn't up is semantics.  Perhaps for db apps you have a restore TO and a
log roll TO that add up to the "real" RTO.
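The split suggested above can be sketched in a few lines of Python. This is only an illustration: the function name check_rto, the field names, and the hour figures are hypothetical and do not come from any backup product. It treats the "real" RTO as the sum of a restore TO and a log roll TO, and reports which part was missed.

```python
from datetime import timedelta


def check_rto(restore_time, log_roll_time, restore_to, log_roll_to):
    """Compare actual durations against a restore TO and a log roll TO.

    The overall RTO is assumed to be the sum of the two objectives.
    """
    total_rto = restore_to + log_roll_to
    return {
        "restore_met": restore_time <= restore_to,
        "log_roll_met": log_roll_time <= log_roll_to,
        "rto_met": restore_time + log_roll_time <= total_rto,
    }


# Hypothetical figures: 4 hour restore TO, 2 hour log roll TO.
result = check_rto(
    restore_time=timedelta(hours=3),
    log_roll_time=timedelta(hours=4),
    restore_to=timedelta(hours=4),
    log_roll_to=timedelta(hours=2),
)
print(result)
```

With these made-up numbers the restore met its objective, the log roll did not, and the overall RTO was missed, which is the kind of attribution described above.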

---
W. Curtis Preston
Backup Blog @ www.backupcentral.com
VP Data Protection, GlassHouse Technologies 

-----Original Message-----
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Ellis,
Jason
Sent: Wednesday, July 11, 2007 8:49 AM
To: Jeff Lightner; Angela Akridge; veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Designing SLAs

Jeff,

Excellent points! The RTO should only be how quickly the data needs to
be recovered from backup, not the entire time for the recovery of the
system.

Jason Ellis

-----Original Message-----
From: Jeff Lightner [mailto:jlightner AT water DOT com] 
Sent: Wednesday, July 11, 2007 6:08 AM
To: Ellis, Jason; Angela Akridge; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Designing SLAs

Make sure the "RTO" part of the SLA deals only with the restore from backup.
For a large DB the full "recovery" time will include rolling logs and
other activities performed by the DBAs AFTER the restore from backup is
complete.  It is important you note your SLA doesn't cover activities
after the restore from backup to bring the app/db operational again (or,
if it does, that you've factored in these additional steps).

Nice information Jason.

-----Original Message-----
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Ellis,
Jason
Sent: Tuesday, July 10, 2007 5:00 PM
To: Angela Akridge; veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Designing SLAs

Angela,

A couple of the things that you want to define in a backup SLA are
Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO). 

Recovery Point Objective (RPO): How much data can you afford to lose and
still be able to survive and/or be compliant?

Understanding how much data loss is acceptable for a particular system
or application can have a major impact on how that system or application
is protected. The RPO of a system can help to set the importance of
backups and help to determine what an acceptable error rate is. For
example: If your RPO is 48 hours, then as long as you have a successful
backup within the 48-hour window you are in compliance with your SLA.
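As a rough illustration of the 48 hour example, an RPO check is just a comparison between the age of the last successful backup and the RPO. The sketch below is in Python; the function name rpo_compliant and the timestamps are made up for the example and are not tied to any backup product.

```python
from datetime import datetime, timedelta


def rpo_compliant(last_successful_backup, now, rpo=timedelta(hours=48)):
    """Return True if the newest successful backup is no older than the RPO."""
    return now - last_successful_backup <= rpo


# Hypothetical timestamps for illustration only.
now = datetime(2007, 7, 12, 12, 0)
recent = datetime(2007, 7, 11, 2, 0)   # 34 hours old
stale = datetime(2007, 7, 9, 2, 0)     # 82 hours old

print(rpo_compliant(recent, now))
print(rpo_compliant(stale, now))
```

The first backup falls inside the window and the second does not. Note that only successful backups count, so a failed job in between does not reset the clock.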

Recovery Time Objective (RTO): How quickly does a specific system or
application need to be recovered?

RTO is the period of time within which systems or applications must be
recovered after an outage. Knowing a system's or application's RTO can help
determine whether copies of the data need to be maintained on-site for
quick recoveries. An RTO can also help to set the importance of restore
requests.

Hope that helps!

Jason Ellis

-----Original Message-----
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Angela
Akridge
Sent: Tuesday, July 10, 2007 11:36 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] Designing SLAs

Hi!

Do you have any web sources that provide excellent best practices
about how to design an SLA for backup duration and restore duration?
What is a reasonable backup duration and restore duration? Storage
Magazine has plenty of articles, but not really anything that I can
"sink my teeth into."

(CISCO has detailed best practices, but the SLAs are specific to
networking:
http://www.cisco.com/en/US/tech/tk869/tk769/technologies_white_paper09186a008011e783.shtml.)

Thank you,

Angela
_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu


