ADSM-L

Re: progressive backup vs. full + incremental

2003-03-07 13:53:19
Subject: Re: progressive backup vs. full + incremental
From: "Ford, Phillip" <phillip.ford AT SPCORP DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 7 Mar 2003 13:52:22 -0500
I am going to stick my foot in my mouth, but here goes.  I do not feel that
restores from 3 months ago, 9 months ago, or 7 years ago are backups.  These
are archives.  We have the problem here that the higher-ups tend to think of
archives as backups.  We try to define backups as being for short-term
retrieval of data, as in "I accidentally deleted my home directory; could you
restore it?"  Sometimes short term may be up to a year or more; our
performance data is an example.  Auditing and other long-term purposes call
for archives, not backups.  Since TSM has both, use the proper one for the
proper purpose.  Also, TSM has backup sets, which could be used for either
purpose.

Sorry but that is my 2 cents worth.


--
Phillip Ford
Senior Software Specialist
Corporate Computer Center
Schering-Plough Corp.
(901) 320-4462
(901) 320-4856 FAX
phillip.ford AT spcorp DOT com





-----Original Message-----
From: William Rosette [mailto:Bill_Rosette AT PAPAJOHNS DOT COM]
Sent: Friday, March 07, 2003 12:31 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental


My question to you would be, could you do a restore from, say, 3 months ago
or 9 months ago?  How do you keep files longer than the versions in the
active database?  Our Month End fulls will keep us current for our Auditing
purposes.

Thank You,
Bill Rosette
Data Center/IS/Papa Johns International
WWJD



From: Roger Deschner <rogerd AT UIC DOT EDU>
Sent by: "ADSM: Dist Stor Manager"
To: ADSM-L AT VM.MARIST DOT EDU
cc:
Subject: Re: progressive backup vs. full + incremental
Date: 03/07/2003 01:21 PM
Please respond to "ADSM: Dist Stor Manager"






We had an actual disaster last spring. An 80 gigabyte Unix file system that
contained all of our email became corrupted. This node was being backed up
both by TSM (with collocation, as this is obviously a critical node), and by
old-fashioned full+incremental backups.

We had a restore race. TSM lost, but not by much. Close enough to call it a
tie, and we were very pleasantly surprised that it was this close. 3:45 for
TSM, versus 3:15 for the old-fashioned restore. If we could plan all our
disasters for the day after the full backup, like this one was, the
full+incremental method would always win. But by day 6 of the weekly cycle,
TSM would become considerably faster.

With a full+differential scheme, restore time might be faster, but
differential backups are so much more costly to make. By the end of the
cycle, a differential backup could practically be a full backup. Consider
all those unnecessary copies of a file that changed only once, on day 1 of
the cycle.
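
To put rough numbers on the differential cost described above, here is a
minimal sketch. The 80 GB size comes from the file system mentioned earlier;
the 4 GB/day change rate, the 6-day cycle, and the assumption that each day
changes a different set of files are all hypothetical, chosen only to show
the shape of the growth.

```python
def backup_volumes(daily_change_gb, days, total_gb):
    """Per-day backup sizes in GB for incremental and differential schemes.

    Assumes each day changes a distinct set of files (the worst case for
    differentials, since nothing overlaps between days).
    """
    # Incremental: only what changed since the previous backup of any kind.
    incremental = [daily_change_gb] * days
    # Differential: everything changed since the last full, so it grows daily
    # and can never exceed the size of the whole file system.
    differential = [min(total_gb, daily_change_gb * day)
                    for day in range(1, days + 1)]
    return incremental, differential


# Hypothetical figures: 4 GB/day of change on an 80 GB file system.
inc, diff = backup_volumes(daily_change_gb=4, days=6, total_gb=80)
print("incremental :", inc, "total", sum(inc), "GB")
print("differential:", diff, "total", sum(diff), "GB")
```

Under these assumptions a file that changed only on day 1 is written once by
the incremental scheme but again in every differential of the cycle.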

In theory, the worst case without collocation is that a node's files could
be scattered over all the tapes in your library. In practice, that is not
the case. I'm not sure why, but I think that migration and reclamation at
least make an effort not to make the scattering any worse. Larger disk
storage pools definitely help reduce the scattering.
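
The theoretical worst case above can be illustrated with a toy layout model.
This is only a sketch: the round-robin arrival order, the node and file
counts, and the idea of a fixed number of files per tape are all assumptions
made for illustration, and it ignores migration, reclamation and disk pools,
which in practice make the scattering less severe.

```python
import math


def tapes_for_node(node, num_nodes, files_per_node, files_per_tape, collocate):
    """Count the distinct tapes holding one node's files in a toy model."""
    if collocate:
        # Each node's files are packed onto tapes reserved for that node.
        return math.ceil(files_per_node / files_per_tape)
    # Without collocation, assume files from all nodes arrive interleaved
    # (round-robin) and are written to tape strictly in arrival order.
    tapes = set()
    for j in range(files_per_node):
        position = j * num_nodes + node      # arrival position of this file
        tapes.add(position // files_per_tape)
    return len(tapes)


# Hypothetical figures: 10 nodes, 1000 files each, 500 files fit on a tape.
scattered = tapes_for_node(0, 10, 1000, 500, collocate=False)
collocated = tapes_for_node(0, 10, 1000, 500, collocate=True)
print("tapes to mount without collocation:", scattered)
print("tapes to mount with collocation   :", collocated)
```

In this model the non-collocated node ends up on every tape in the pool,
which is the "scattered over all the tapes" case, while the collocated node
needs only as many tapes as its own data fills.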

I completely agree with Don France about taking the time to accurately
separate your nodes by criticality. We do this, using exactly his
three-tiered scheme.

The key to TSM's progressive backup scheme is that it is totally
database-driven. Don't try to circumvent it by imposing periodic full
backups on it - TSM _ALWAYS_ has a full backup of each node! Every day, not
just on the day after your full backup. It's in the database. This, however,
takes a slight "leap of faith" for us who were brought up in the
full+incremental cycles of mainframe days. It's easier if you forget about
individual tape volumes, and consider your entire tape library and its robot
to be one huge tape.
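
As a rough illustration of what "database-driven" means here, the sketch
below keeps a small catalog of file versions and derives a complete restore
plan for any day from it. It is not TSM's actual schema or algorithm: the
Catalog class, its method names, the use of a modification time to detect
change, and the tape labels are all hypothetical, and deleted files and
expiration are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy backup database: path -> list of (day, mtime, volume) versions."""
    versions: dict = field(default_factory=dict)

    def backup(self, day, files, volume):
        """Store only new or changed files; return the paths that were sent."""
        sent = []
        for path, mtime in files.items():
            history = self.versions.setdefault(path, [])
            # A file is sent if it was never seen or its mtime differs from
            # the most recent stored version.
            if not history or history[-1][1] != mtime:
                history.append((day, mtime, volume))
                sent.append(path)
        return sent

    def restore_plan(self, as_of_day):
        """Newest stored version of every file at or before the given day."""
        plan = {}
        for path, history in self.versions.items():
            candidates = [v for v in history if v[0] <= as_of_day]
            if candidates:
                plan[path] = candidates[-1]
        return plan


catalog = Catalog()
day1 = catalog.backup(1, {"/mail/a": 100, "/mail/b": 100}, "TAPE01")
day2 = catalog.backup(2, {"/mail/a": 100, "/mail/b": 200}, "TAPE02")
plan = catalog.restore_plan(as_of_day=2)
print("sent on day 1:", day1)
print("sent on day 2:", day2)
print("restore plan :", plan)
```

The first backup sends everything and later backups send only changes, yet
the catalog can always name one version of every file for a given day, which
is the sense in which a full backup of the node exists every day.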

Do away with those full backups, and your tape drive bandwidth and database
size problems should be significantly reduced. What is the purpose of a full
backup? To reduce restore time in a disaster. You're better off achieving
that goal with 1) collocation for the most critical nodes, and 2) TSM
Database disk tuning. That's how TSM earned a virtual tie in the restore
race in my actual disaster.

Roger Deschner      University of Illinois at Chicago     rogerd AT uic DOT edu
               Academic Computing & Communications Center


On Thu, 6 Mar 2003, William Rosette wrote:

>I agree with A).  On B), I'm not sure about the "consolidating the tape
>mounts"; it could be that my DIRMC for NT file servers is off.  As for
>(suppose C)), I am trying to implement it, but I am having trouble with
>Month Ends.  Currently we have 117 nodes and 31 maximum days in a month.
>I run 2 backupsets that use 4 tape drives, leaving 6 for daily and
>nightly operations.  Each backupset takes about 24 hours and is getting
>longer because of no collocation and reclamation spreading the
>information further out.  What I want to do is organize my storage pools
>by criticality (my own word, haha) and also do Monthly Absolute backups.
>The problem is that disk space will double on the database side.  A
>backupset has minimal impact on the DB, whereas a Monthly Absolute (same
>as a full) increases the DB considerably, especially at 117 nodes x 12
>months.  I am having a hard time convincing the upper powers that the
>criticality and Monthly Absolutes are the best way.
>
>Thank You,
>Bill Rosette
>Data Center/IS/Papa Johns International
>WWJD
>
>
>
>From: DFrance <DFrance-TSM@ATT.NET>
>Sent by: "ADSM: Dist Stor Manager"
>To: ADSM-L AT VM.MARIST DOT EDU
>cc:
>Subject: Re: progressive backup vs. full + incremental
>Date: 03/06/2003 03:09 PM
>Please respond to "ADSM: Dist Stor Manager"
>
>
>
>
>
>
>A)  The speed of new tape drives (eg, 9840 & 3590) with their mid-point
>load mostly mitigates the restore speed issue;  even IBM's LTO or STK's
>9940 seem to be sufficiently fast, they're more like the speed of disk
>of just a few years ago;
>
>B) TSM further mitigates the restore speed issue by consolidating the
>tape mounts for a given restore -- provided you properly implement DIRMC
>for your NT file servers!
>
>If you cannot tolerate collocation, consider smaller storage pools,
>organized by mission-critical vs. production vs. non-prod/desktops...
>personally, I've found that when a customer takes the time to properly
>identify his mission-critical servers (from a file-system recovery
>perspective), collocation is needed for less than 20% of the "farm"!
>And, for the business-critical servers (if that is different from
>"mission-critical" for file-system recovery), it's more appropriate to
>use an HACMP solution (possibly, in concert with collocation).
>
>Hope this helps!
>
>
>Don France
>Technical Architect -- Tivoli Certified Consultant
>Tivoli Storage Manager, WinNT/2K, AIX/Unix, OS/390
>San Jose, Ca
>(408) 257-3037
>mailto:don_france AT ayett DOT net (change aye to a for replies)
>
>Professional Association of Contract Employees
>(P.A.C.E. -- www.pacepros.com)
>
>
>
>-----Original Message-----
>From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU]On Behalf Of
>William Rosette
>Sent: Wednesday, March 05, 2003 7:22 AM
>To: ADSM-L AT VM.MARIST DOT EDU
>Subject: Re: progressive backup vs. full + incremental
>
>
>This progressive incremental confuses me.  This is what I thought was
>going on:
>
>1. Differential is all changes from last FULL backup.
>2. Incremental is all changes from last ANY backup
>3. Full is all files, whether or not they changed, on every backup.
>
>We used to do Differential with Weekend Fulls.  During a restore we
>would restore from the Full if a file had not changed, and from the last
>Differential for the files that had changed.  We never restored more
>than necessary.  It depended on the restore.  Restoring 1 file was the
>same on Differential as on Incremental (most current or before
>corruption).  The problem comes when you are restoring directories or a
>whole system, as in DR.  In the Differential Weekend Full world you
>would restore the Full, lay the last Differential on top, and you're
>done.  It was always 2 restores at most, and restores flew since the
>data was all together.
>
>Now the TSM world has its own database with its own reclamation,
>expiration, migration, collocation, and the works.  When restoring 1
>file, it is the same as above.  When restoring directories or whole
>systems, it will depend on where all the data is.  In our case, if the
>restore is not far back we have a quick restore, but the further back we
>go in date, the slower the restore, because without collocation the data
>is spread over hundreds of tapes.  This seems to be the same as an
>Incremental, with the TSM database keeping track of every file on every
>tape.
>
>Thus, the reasons we did not use Incrementals before were 1. restores
>were long, 2. there was no database to keep track of all Incremental
>tapes, 3. Differential & Full used fewer tapes, and 4. money.  I am
>still dealing with the progressive incremental that progressively eats
>resources/money.  My suggestion would be for reclamation to reclaim to
>a collocated status, or somehow keep the data together as it gets
>older.  Right now I am probably going to run FULLs just to keep my
>restores to a minimum, since the tape issue will hurt us if we
>collocate.
>
>If I am off, I would appreciate anyone that can straighten out my
>backups/restores.
>
>Thank You,
>Bill Rosette
>Data Center/IS/Papa Johns International
>WWJD
>
>
>
>From: Gianluca Mariani1 <gianluca_mariani@IT.IBM.COM>
>Sent by: "ADSM: Dist Stor Manager"
>To: ADSM-L AT VM.MARIST DOT EDU
>cc:
>Subject: Re: progressive backup vs. full + incremental
>Date: 03/05/2003 09:47 AM
>Please respond to "ADSM: Dist Stor Manager"
>
>
>
>
>
>
>Progressive incremental backs up only new or changed files.  During the
>initial backup the client backs up all eligible files, of course (a full
>backup).  Subsequently, files are backed up again only if they are new or
>have changed since the last backup.  In TSM's case, a pointer to each
>version of every file for every client is kept in the database, so there
>is no need for another full backup.  When you need to restore, you can
>choose the specific version of the file, or a point in time, to restore,
>and TSM will restore only that particular file or files.  The approach
>used for full + incremental backups (NetBackup) requires an initial full
>backup, followed by regular incremental or differential backups (usually
>once a day), with the complete cycle needing a full backup to be repeated
>on a regular (usually weekly) basis.  This backup method results in
>redundant weekly full backups of files that have not changed, wasting
>both network and media resources.  The multi-step restore process in this
>approach requires the software to restore the last full backup, then to
>restore incremental or differential backups on top of that, in order to
>recover the latest version of a file or an entire system.  This
>methodology not only involves restoring more data, it also means more
>tape mounts and tape positioning and consumes more network bandwidth,
>all of which amounts to longer restore times.
>
>
>
>Cordiali saluti
>Gianluca Mariani
>Tivoli TSM Global Response Team, Roma
>Via Sciangai 53, Roma
> phones : +39(0)659664598
>                   +393351270554 (mobile) gianluca_mariani AT it.ibm DOT com
>
>----------------------------------------------------------------------------
>
>
>
>The Hitch Hiker's Guide to the Galaxy says of the Sirius Cybernetics
>Corporation product that "it is very easy to be blinded to the
>essential uselessness of them by the sense of achievement you get from
>getting them to work at all. In other words - and this is the rock
>solid principle on which the whole of the Corporation's Galaxy-wide
>success is founded - their fundamental design flaws are completely
>hidden by their superficial design flaws"...
>
>
>
>From: Joni Moyer <joni.moyer@HIGHMARK.COM>
>Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
>To: ADSM-L AT VM.MARIST DOT EDU
>cc:
>bcc:
>Subject: progressive backup vs. full + incremental
>Date: 05/03/2003 15.21
>Please respond to "ADSM: Dist Stor Manager"
>
>
>
>
>
>
>Hello everyone!
>
>I was wondering why the full + incremental would result in a longer
>restore time than the progressive backup methodology?  From several
>co-workers' point of view, they thought that it would be quicker on the
>full + incremental because you wouldn't have to go back to the beginning
>backups of the file and restore all of the incrementals; you would just
>go back to the most recent full backup and apply the incrementals after
>that point.  When I went to explain the reasoning behind this, I had some
>problems understanding the concept myself, so I was hoping someone could
>explain both methods and why they differ in restore time and why
>progressive is better than the full + incremental.  Thank you so much
>for any help you can lend on this matter!
>
>
>
>Joni Moyer
>Systems Programmer
>joni.moyer AT highmark DOT com
>(717)975-8338
>

