Veritas-bu

Re: [Veritas-bu] How to implement a 24 hour RPO with a traditional backup system.

2010-07-17 03:45:42
Subject: Re: [Veritas-bu] How to implement a 24 hour RPO with a traditional backup system.
From: "JC Cheney" <joseph_cheney AT symantec DOT com>
To: "Dean" <dean.deano AT gmail DOT com>, <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Sat, 17 Jul 2010 08:45:04 +0100

A variation on my suggestion of using the snapshot client: can you set up OS-based snapshotting?

If so, you could have a job that runs at, say, 5:50pm to create the snapshot; you then kick off a backup of the snap at 6pm. Technically you're offering a 23hr 50min + snapshotting time (typically seconds to create a snap) service!

 

Don’t know what your customers are running, but on a system with VxFS you can create a snapshot using the “-o snapof=” option to the mount command. ZFS snapshots can be created with the “zfs snapshot” command; Windows has VSS, etc.

 

You could use your post-backup script to remove the snapshot after it’s been backed up…
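
A minimal sketch of that create-then-remove pairing, assuming ZFS and Python on the client. The dataset name tank/data, the snapshot label nbu_pre and the function names are made up for illustration; only the "zfs snapshot" and "zfs destroy" subcommands are real. How the two steps get hooked into the backup job depends on your setup, so by default this only prints the commands.

```python
import subprocess

def snapshot_name(dataset, label):
    """Return the full ZFS snapshot name, e.g. tank/data@nbu_pre."""
    return f"{dataset}@{label}"

def create_cmd(dataset, label):
    # Command the pre-backup job (say, at 5:50pm) would run.
    return ["zfs", "snapshot", snapshot_name(dataset, label)]

def destroy_cmd(dataset, label):
    # Command the post-backup script would run once the backup has finished.
    return ["zfs", "destroy", snapshot_name(dataset, label)]

def run(cmd, dry_run=True):
    """Print the command; only execute it when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run(create_cmd("tank/data", "nbu_pre"))   # before the 6pm backup
    run(destroy_cmd("tank/data", "nbu_pre"))  # after the backup completes
```

Using a fixed label keeps the backup selection stable from night to night, at the cost of the create step failing if last night's snapshot was never removed.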

 

JC

 

From: veritas-bu-bounces AT mailman.eng.auburn DOT edu [mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Dean
Sent: 17 July 2010 04:46
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] How to implement a 24 hour RPO with a traditional backup system.

 

Thank you all for the responses.

We do already offer a Bronze service class, which is basically best-effort. But most of the systems fall into this 24 hour Silver category.

Doing daily differential incrementals and then synthetic backups wouldn't really change the fact that, in the worst case, our RPO is still 24 hours plus the elapsed time of tonight's differential backup.

I know this is more of a contractual wording and understanding issue, but I did wonder if anyone had thought of a simple technical solution to "retrofit" the backup regime to this incorrectly worded SLA.

Thanks again,
Dean

On Fri, Jul 16, 2010 at 11:51 PM, Ed Wilts <ewilts AT ewilts DOT org> wrote:

On Thu, Jul 15, 2010 at 11:54 PM, Dean <dean.deano AT gmail DOT com> wrote:

Silver is 24 hours. The large majority of our backup clients fall into this category. Silver class is all based on tape backup/recovery. It's the traditional overnight backup to tape (or disk, VTL, whatever).... fulls on the weekend, incrementals on weeknights.

But one of our clients has questioned this worst-case 24 hour RPO, and their query is quite valid.

Here is an example:

There is a system with a 24 hour RPO that we back up every night at 6PM. The backup takes one hour. So, if a disaster occurs at 6:59 PM, before tonight's backup completes, we have to restore from the previous night's backup. But, really, that backup is only consistent as of 6PM the previous night, when the backup *started*. That means our worst case RPO is actually 25 hours.
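
The arithmetic in this example can be sketched in a few lines of Python. The calendar dates and function names are illustrative assumptions; the 6PM start, one-hour duration and 6:59 PM disaster come from the example above.

```python
from datetime import datetime, timedelta

def worst_case_rpo(interval, duration):
    """Worst-case data age: the disaster hits just before tonight's backup
    completes, so the newest usable copy is consistent as of the previous
    backup's start time."""
    return interval + duration

def restore_point(disaster, starts, duration):
    """Start time of the newest backup that had completed by `disaster`."""
    done = [s for s in starts if s + duration <= disaster]
    return max(done) if done else None

duration = timedelta(hours=1)
starts = [datetime(2010, 7, 15, 18), datetime(2010, 7, 16, 18)]  # 6PM, two nights
disaster = datetime(2010, 7, 16, 18, 59)                         # 6:59 PM

point = restore_point(disaster, starts, duration)
print(disaster - point)                               # 1 day, 0:59:00
print(worst_case_rpo(timedelta(hours=24), duration))  # 1 day, 1:00:00
```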

I know this can be fixed with disk mirroring, but I'm looking for ways around this using purely "traditional" tape based (or disk) backup. If we're going to mirror all these systems, we'd be effectively moving them all to the Platinum DR class, and the customer is not willing to pay for that.

To do it with a traditional daily backup regime, we'd have to ensure that each day's backup completed less than 24 hours after the previous day's backup started, which means the backup window would constantly rotate throughout the day. Obviously that's not realistic.
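
To see the rotation concretely, here is a rough sketch that computes the latest allowed start times under that constraint. The function name and the starting date are assumptions for illustration; the one-hour duration and 24 hour RPO are the figures from the example.

```python
from datetime import datetime, timedelta

def rotating_starts(first_start, duration, rpo, count):
    """Latest start times such that each backup completes no later than
    `rpo` after the previous backup started."""
    step = rpo - duration  # start-to-start spacing, 23 hours here
    return [first_start + i * step for i in range(count)]

starts = rotating_starts(
    datetime(2010, 7, 12, 18),  # first backup at 6PM
    timedelta(hours=1),         # backup duration
    timedelta(hours=24),        # contractual RPO
    4,
)
for s in starts:
    print(s.strftime("%Y-%m-%d %H:%M"))
```

With these figures the start time slips one hour earlier every day (18:00, 17:00, 16:00, 15:00), so within a few weeks the backup would be running in the middle of the working day.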

The easy solution is to adjust the SLA to say that the RPO is "24 hours, plus the elapsed time of your backup", but the customer will not accept that.

We could also do something like running two backups a day, but obviously that will double the resources we need for our backup infrastructure, and I don't think the customer would be happy with all their servers grinding to a halt when the backups kick off in the middle of the day.

 

It's really the customer's choice - they can't have it both ways.  What you can do is reduce the likelihood of it happening, but you obviously can't avoid it as you've already discovered.  In most cases, your recovery point will be less than 24 hours - in fact, on average, your recovery point is about 12 hours if you're doing backups every 24 hours.

I'd also guess that your backups for a particular client don't run at exactly the same time every day.  The backup window can open at 6pm for a lot of clients and some will run at 6 but some may not actually start until later in the window.  It could be 6pm one day and 10pm the next.

If your tape drive is in the same location as the data, your RPO is actually much worse since a single event could destroy the tape and the disk at the same time.  A tape that you write every day at the beginning of a 6pm backup window and that doesn't go offsite until 9am the next morning gives you a maximum RPO of 24+15 = 39 hours.
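
The 39-hour figure can be checked with a short sketch. The function name and calendar dates are illustrative assumptions; the 6pm write time and 9am offsite pickup are from the paragraph above.

```python
from datetime import datetime, timedelta

def offsite_worst_case(backup_start, offsite_time, interval):
    """Worst-case data age when a tape only counts once it has left the site:
    a disaster just before pickup destroys the newest tape too, so the last
    safe copy is one full interval older."""
    exposure = offsite_time - backup_start  # time the tape sits next to the data
    return interval + exposure

rpo = offsite_worst_case(
    datetime(2010, 7, 15, 18),  # tape written at 6pm
    datetime(2010, 7, 16, 9),   # goes offsite at 9am the next morning
    timedelta(hours=24),
)
print(rpo.total_seconds() / 3600)  # 39.0
```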

Yeah, it's ugly, and I'm guessing you will change the definition of Silver to simply say that you restore from the last successful tape backup and that backups are attempted approximately every 24 hours.  As you've said, customers do have the ability to bump themselves up to a better RPO.

   .../Ed

Ed Wilts, RHCE, BCFP, BCSD, SCSP, SCSE
ewilts AT ewilts DOT org

 

 
