Subject: [Veritas-bu] Re: start notify and multiple data streams
From: John_Wang AT enron DOT net (John_Wang AT enron DOT net)
Date: Thu, 21 Jun 2001 16:17:28 -0500
Hello Curtis

Wow, good post.

Regards,
John I Wang
Sr. Systems Engineer
Steverson Information Professionals

---
Enron Networks
Enron Building room 3427e
ph (713) 345-4888
cell (832) 493-1263
fax (713) 646-8462
pg pagejwang AT skytel DOT com or 1-877-390-4155





From:    curtis@backupcentral.com
Date:    06/21/01 03:49 PM
To:      John Wang/Contractor/Enron Communications@Enron Communications
cc:      morms AT es DOT com, veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Re: start notify and multiple data streams



I thought I'd continue this discussion for the mutual benefit of everyone
involved.  I think the points that you and I are both making are
interesting, and I hope others do as well.

At 02:51 PM 6/21/2001 -0500, John_Wang AT enron DOT net wrote:
>If they were truly critical, they would be worth the added cost of a decent
>filesystem.

Agreed.  In fact, most critical Unix servers that I work with do indeed use
Vxfs.  That doesn't mean, however, that they want the additional cost,
hassle, and risk of creating vxfs snapshots to back up.  Implementing the
"stop the database, snapshot the filesystem, start the database, then back
up the snapshot" method requires:
1. A raw device for each filesystem being snapshotted
2. A script to manage all of this (that will be written by you)
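
A minimal sketch of what such a script has to do, in Python.  The four
commands are placeholders supplied by the caller (hypothetical, not real
Oracle, vxfs, or NetBackup invocations); the only points illustrated are
the ordering of the steps and that the database is restarted even if the
snapshot step fails.

```python
import subprocess


def snapshot_backup(stop_db, take_snapshot, start_db, backup_snapshot,
                    run=None):
    """Stop the database, snapshot, restart, then back up the snapshot.

    Each of the first four arguments is a command line (list of strings)
    chosen by the caller; none is a real product command.  `run` executes
    one command and is expected to raise on failure.
    """
    if run is None:
        # Default runner: execute the command and raise if it fails.
        run = lambda cmd: subprocess.run(cmd, check=True)

    run(stop_db)
    try:
        run(take_snapshot)
    finally:
        # Bring the database back whether or not the snapshot worked.
        run(start_db)
    # Only reached if the snapshot succeeded; the database is already up.
    run(backup_snapshot)
```

The database is only down for the stop/snapshot/start steps; the backup of
the snapshot itself runs after the database is back.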

Therefore, I only use this type of setup when things are absolutely
critical.  However, I would submit that there are servers that many people
consider critical, and for which backups should finish as quickly as
possible, but that are not critical enough to warrant going outside of the
standard, supported backup methods that Veritas offers.

>A true hot backup state leaves the core of the database tables in a
>consistent state and journals continuing activity, allowing ongoing
>database activity with a modest performance hit that's probably more
>manageable than a mad scramble to back everything up.  I believe you are
>thinking of leaving the database in a quiescent state where the database is
>actually shut down.  Hey, I wasn't the one to start using the term "hot
>backup" in this thread and have no idea if there really is a "hot backup"
>mode in any known databases.  I suspect that the term "hot backup" is a
>misnomer from the original poster.

My definition of a hot backup is slightly less limiting.  A hot backup must
allow you to back up the database while it is running, without noticeably
impacting the user.  This is done in several databases today using products
like onbar, RMAN, Backup server, etc.  However, in the type of backup setup
that we are discussing, the only product that's going to work is
Oracle.  (That is, only Oracle allows you to easily put the database into a
state that allows a third-party utility to back up the datafiles without
significantly impacting the user.)

To do this with Oracle, you simply issue an "alter tablespace begin backup"
command for each tablespace that the instance has.  What happens then,
though?  Many people think that this halts all writes to the datafiles,
while the changes go to the redologs.  This misconception is so common that
it's even mentioned in a few books.  (Heck, my first article about Oracle
repeated this misconception.)

However, for a real explanation of how hot backups work in Oracle, check
out the following URL:

http://www.backupcentral.com/oracle-hot-backup.html
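
As a rough illustration of the per-tablespace command mentioned above, the
statements can be generated from a list of tablespace names.  This is only
a sketch: the tablespace names are examples, the helper name is made up,
and the matching "end backup" statement (standard Oracle syntax, though not
spelled out in this post) is included because each tablespace has to be
taken back out of backup mode once its datafiles are copied.

```python
def hot_backup_statements(tablespaces):
    """Return (begin, end) lists of SQL statements, one per tablespace."""
    begin = [f"ALTER TABLESPACE {name} BEGIN BACKUP;" for name in tablespaces]
    end = [f"ALTER TABLESPACE {name} END BACKUP;" for name in tablespaces]
    return begin, end


# Example tablespace names; a real script would query the instance for them.
begin, end = hot_backup_statements(["SYSTEM", "USERS"])
for stmt in begin + end:
    print(stmt)
```

In practice the datafile copy happens between the two sets of statements.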

>A decent filesystem provides that ability without application developers
>having to design it in.  Arguably a decent database would also have it built
>in and would simply implement it as part of their rollback algorithm by
>allowing for a suspension in the journal or log consolidation to consistent
>states.  I know that Techra had the ability to roll back to the last known
>consistent state and hence was actually journalling everything anyway.  Such
>functionality may be found for free if you're willing to bone up on being a
>DBA from a system administrator's perspective.  Tell your employer to have
>you trained as a DBA or buy the filesystem so that you can meet that
>responsibility ubiquitously.  My advocacy of a systematic filesystem
>solution is obvious proof that I am acknowledging that database servers
>cannot be quiescent indefinitely.  I simply believe that run optimization by
>multi-streaming can never produce truly acceptable and consistent results
>and therefore a systematic approach is required.

Where do I start?  I am trained as a DBA, and even wrote a book that
includes chapters on how to do backups for Oracle, Informix, and
Sybase.  Feel free to check out http://www.backupcentral.com/thebook.html .

Just because a filesystem journals doesn't mean that you can snapshot it
without putting the database into hot backup mode or shutting it
down.  Just thought I'd mention that.

I've had several large clients that are successfully backing up 100 TB of
data or more using the methods I'm describing, so I'm not sure what you
mean when you say that this can't produce consistent results.

>Even a mad scramble to minimize your backup window results in a variable
>quiescent downtime requirement and with Netbackup results in a variable time
>at which this window starts.  Critical servers tend to require that the
>quiescent periods are agreed upon beforehand and be of a regular period of
>time as well as being a minimum.  If the servers are truly critical, saying
>that it'll be shut down for an unspecified but minimal amount of time
>sometime during the backup window would not be good enough.  Ultimately, you
>need a decent filesystem.  Purchasing a filesystem product is a capital
>expenditure; over time the cost of capex amortizes to far less than the
>value of time spent by employees at water coolers.  It's good business
>practice to consider capex cheap.

Again, I don't shut down anything and my restores work just fine.  I design
things in HA environments where the databases CAN'T be shut down.

BTW, minimizing the time it takes to back up a single server also minimizes
the time it takes to restore it.  If you backed up your whole server as one
stream, then you can only RESTORE it with one stream.  HOWEVER, if you
backed it up with many streams (I've seen as many as 40 at a time with a
really big database server), then you can restore all of those streams
simultaneously as well.
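
The per-server argument can be made concrete with a toy scheduling model.
Everything in it is an assumption made for illustration: equal-sized
chunks, one chunk per stream per time unit, perfectly linear scaling, and
made-up server and stream counts.  It compares giving each server a single
stream (servers run side by side) with giving one server at a time every
available stream.

```python
import math


def per_server_windows(servers, chunks_per_server, stream_slots, multistream):
    """Return (per-server window, total elapsed), both in chunk-times.

    multistream=False: each server uses one stream; servers share the slots.
    multistream=True:  one server at a time uses all the slots.
    """
    if multistream:
        window = math.ceil(chunks_per_server / stream_slots)
        total = window * servers
    else:
        window = chunks_per_server
        total = window * math.ceil(servers / stream_slots)
    return window, total


print(per_server_windows(4, 4, 4, multistream=False))  # (4, 4)
print(per_server_windows(4, 4, 4, multistream=True))   # (1, 4)
```

Under these idealized assumptions the overall elapsed time is the same, but
each server spends a quarter as long being backed up (and, by the same
reasoning, being restored).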

>Multiple streams only allow performance gains by parallelism.  You may be
>able to get some more out of a multiple processor computer and a RAID, but
>if it's a single cpu server, then any performance gain would simply be due
>to poor tuning to begin with.  Look, I used to be Senior Support for TMC, I
>know parallelism, and TMC owns the patent on RAID and the SDA is still the
>only RAID that supports an indefinite number of spindles with only one
>parity drive and is still the only one where parity calculation costs
>nothing on the write.

I'm not talking about single-CPU servers here.  I'm talking about Solaris,
HP, Compaq, and other big multi-CPU servers.  And I would argue that
no matter how good your RAID disk is, you're still going to get more out of
it with multiple streams than with one stream.  I've seen this time and
time again.  1 stream = 10-20 MB/s, depending on how good the disk is.  20
streams = 100-200 MB/s, depending on how good the disk is.

And, BTW, not everybody puts everything in one big RAID device.  Many
people use several big RAID devices, making the argument for multistreaming
even more important.
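
To put those throughput figures in terms of elapsed time, here is a
back-of-the-envelope calculation.  The 1 TB database size is an assumption
chosen for illustration; the MB/s ranges are the ones quoted above.

```python
def hours_to_move(size_gb, rate_mb_per_s):
    """Hours needed to move size_gb at a sustained rate (1 GB = 1024 MB)."""
    return size_gb * 1024 / rate_mb_per_s / 3600


SIZE_GB = 1024  # hypothetical 1 TB database

for label, slow, fast in [("1 stream", 10, 20), ("20 streams", 100, 200)]:
    worst = hours_to_move(SIZE_GB, slow)
    best = hours_to_move(SIZE_GB, fast)
    print(f"{label}: {best:.1f} to {worst:.1f} hours")
```

For that assumed size, this works out to roughly 15 to 29 hours with one
stream versus about 1.5 to 3 hours with twenty.  Note that the quoted
figures are not linear scaling: twenty streams give about ten times the
throughput of one, not twenty times.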

>Really, it is irrelevant if you agree with me.  From the perspective of
>career advancement and business practices, one should tell management that
>you need DBA training to support consistent backups with DBA features or a
>commercial filesystem to provide that support from the system side.  From an
>egocentric public domain hacker viewpoint, you'll whip up a bunch of scripts
>and spend years supporting and arguing why yours is marginally faster than
>someone else's.  Sometimes it makes business sense to go with the latter if
>the staff that you have are fairly junior and underpaid, but even then only
>for interim periods, because capex is always cheaper than operating costs
>and IT staff rarely remain junior and underpaid for long.
>
>I've found that most creative automation solutions were only to solve a
>problem that existed due to misconfiguration in the first place.  How many
>sites used to have periodic NIS push scripts and warn users of password
>changes that would only be implemented periodically overnight when all they
>needed to do was correct the permissions and pathnames of the NIS source and
>database files...  Why, even one of the stated purposes of developing NIS+
>was to address that issue, though it was only an issue of poor
>education/documentation.
>
>Perhaps one of the reasons why we are differing on viewpoints here is just
>user demands.  My user is currently demanding the reduction of the quiescent
>window to the scale of a few seconds, so from my perspective, filesystem
>freezes of some kind are the only way to go.

I think I've already mentioned that I'm fully skilled as a DBA, and in the
backup methodologies that you are mentioning.  I would submit that there
are three levels of requirement:

1. Systems/users that don't care how long the backup takes
2. Systems/users that want the backup short, but won't spend the money (or
take the risk) to make it instantaneous
3. Systems/users that want the backup instantaneous, as far as the
application is concerned.

For #1, do whatever you want. For #2, I would suggest that multistreaming
is the least expensive, effective way to satisfy them.  For #3, you use
either vxfs snapshots or EMC BCVs.  However, those cost a lot (in terms of
extra disk required for the snapshot log disks or extra disks for the
BCVs), and imposing that extra cost on people that don't really want
instantaneous backups is unreasonable.
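
The three levels above amount to a small lookup.  The sketch below just
restates them in code; the enum and function names are made up for
illustration and are not part of any product.

```python
from enum import Enum


class Requirement(Enum):
    DONT_CARE = 1      # backup duration does not matter
    SHORT = 2          # want it short, without the extra spend or risk
    INSTANTANEOUS = 3  # the application must not notice the backup


def suggested_method(req):
    """Map a requirement level to the approach suggested in this post."""
    return {
        Requirement.DONT_CARE: "any method",
        Requirement.SHORT: "multistreaming",
        Requirement.INSTANTANEOUS: "vxfs snapshot or EMC BCV",
    }[req]


print(suggested_method(Requirement.SHORT))  # multistreaming
```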


>P.S.
>I get very few calls now (i.e., I've received one off-duty page in three
>months) but I used to be paged on a daily basis back when I was young and
>concentrated on irrelevant tasks.
>
>---
>Enron Networks
>Enron Building room 3427e
>ph (713) 345-4888
>cell (832) 493-1263
>fax (713) 646-8462
>pg pagejwang AT skytel DOT com or 1-877-390-4155
>
>
>
>
>
>From:    curtis@backupcentral.com
>Date:    06/21/01 12:21 PM
>To:      John Wang/Contractor/Enron Communications@Enron Communications
>cc:      morms AT es DOT com, veritas-bu AT mailman.eng.auburn DOT edu
>Subject: RE: [Veritas-bu] Re: start notify and multiple data streams
>
>
>
>Some thoughts below..
>
>At 11:53 AM 6/21/2001 -0500, John_Wang AT enron DOT net wrote:
>
>
> >Hello Curtis
> >
> >That may be irrelevant.  If the other streams are still in the queue when
> >all the streams that are running finish, switching the database out of hot
> >mode till the other streams come out of the queue may be acceptable,
> >depending on how the streams were configured.  Ultimately, to minimize time
> >in the "hot" mode and overhead, some sort of filesystem freeze should be
> >used, as is available from Veritas Filesystem or Sun's old Backup Copilot
> >product.
>
>It's definitely not irrelevant when you get calls from the DBAs complaining
>that you shut down and restarted their database five times last night.  (Or
>put into and took out of hot backup mode.)  And yes, you can use vxfs
>snapshot or EMC BCVs to take downtime to a minimum, but that costs a LOT
>more per box.
>
>
> >Besides, multi-streaming an individual server is a pointless exercise if
> >your site is large enough to have enough simultaneous sessions from
> >different servers to begin with.  It's the old parallel processing
> >performance problem: too fine a granularity will not actually amortize the
> >overhead required.  If you have more database servers to back up than the
> >number of streams that you are willing to allow to your storage units
> >concurrently, there's no point in multi-streaming.
>
>I definitely disagree here.  You assume that all of the database servers
>can be kept in backup mode indefinitely.  Doing things the way you suggest
>might result in the same amount of overall backup time, but it will
>definitely result in increased time per server.  I try to design things in
>such a way that each server is backed up as quickly as possible.
>
> >Also, if the database is distributed across a RAID, the RAID hardware will
> >hopefully already have squeezed out whatever performance gain could be had
> >by parallelism, making multiple streams a moot point.  If the database is
> >all on one volume, multi-streaming will slow it down, and with modern disks
> >going towards monolithic drives with very few platters, any kind of
> >parallelism will likely slow it down as more data is now on the same
> >physical device.
>
>Again, I definitely disagree.  I've used many very nice pieces of RAID
>hardware, and your assumption that one stream should be the same speed as
>many streams is definitely not the case.  In my latest configuration, we
>were pumping 200 MB/s out of one RAID array using multistreaming.  Try
>doing that with a single stream.
>
> >The only thing I use multi-streaming for is when I have an unreliable link.
> >That way, the backup is broken up into smaller backups that would not
> >invalidate each other when a problem does occur, i.e., if the link
> >temporarily goes out, only a small piece needs to get requeued rather than
> >the whole thing.  In this case, I would also constrain the class to a
> >limited number of concurrent jobs in order to limit the number of segments
> >disrupted by a network blip.  Other than for that use, I consider
> >multi-streaming to be just plain evil.  The only problem I have there is
> >that I have to increase the period of time jobs are allowed to remain in
> >the queue, and that can only be done globally rather than on a per-class
> >basis.  Wish Veritas would change that...
> >
> >Now, arguably, there may be cases where a specific server is so critical
> >that they want to be able to dedicate all resources to that server during
> >the backup period.  At that point, my view is that they need a more robust
> >filesystem like Veritas Filesystem and perhaps a local tape changer and
> >drives if the need for a short backup window is so critical.  Playing with
> >multi-streaming will just accommodate that server at the expense of overall
> >capacity.
>
>But what if you have a data center full of critical servers, each of which
>wants their backup to be completed as quickly as possible?  The best way to
>do it is multistreaming.
>
>
>---
>W. Curtis Preston
>Principal Consultant for Storage Designs, your storage experts
>Webmaster: http://www.backupcentral.com Phone: 760 631 7991

---
W. Curtis Preston
Principal Consultant for Storage Designs, your storage experts
Webmaster: http://www.backupcentral.com Phone: 760 631 7991

_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu




