ADSM-L

Re: progressive backup vs. full + incremental

2003-03-07 07:37:16
Subject: Re: progressive backup vs. full + incremental
From: GUILLAUMONT Etienne <eguillau AT RGB-TECHNOLOGIE DOT FR>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 7 Mar 2003 13:31:49 +0100
In fact, I have two tape pools: one for the small clients and one for the
big clients. In your case, that's what I would do: one tape pool with
something like 5 tapes for the 90 clients backing up less than 1 GB (I am
not sure about the number of tapes; it could be 3 or it could be 10) and
another tape pool for the 30 other clients. I think both tape pools could
be collocated, but that could hurt performance. The tape pool for the
small clients could be left non-collocated: since it would contain only 5
tapes, any one client could be restored with at most 5 mounts. Collocation,
though, could bring one client down to one or two tapes.

From my experience, reading one client's data from one tape on a DLT1
drive takes about 1 or 2 hours, depending on how the data is spread over
the tape.

The problem is with DR, because you have to put your tapes in a vault each
day. There are a few possibilities: either you want to handle as few tapes
as possible each day, and the data will be spread over a lot of tapes
(that's what I did), or you accept handling a lot of tapes, and then
collocation will be OK.

For example, with a primary storage pool on 10 tapes and half a tape being
written each day, you could leave 9 tapes of the copy pool in the vault and
move only one. If you want proper collocation, you would have to put all 10
tapes in your library each day. That is not very convenient, and it is not
certain that your library lets you swap 10 tapes easily.

So from my point of view, I accept that the day a client fails and I also
have a problem with my primary storage pool, I will have to wait hours
before restoring that client. A fire in my library is not a real problem in
this respect, because I would have to read my whole copy pool anyway! My
view is that after a major disaster like a fire in our office, the main
problem will be getting the computers restored at all, and the time it
takes will not be so important. I know there are cases this does not cover,
but I can't afford to cover them!

In my case, I have two primary tape pools but only one copy pool, so I have
only one tape to move each day.


Etienne GUILLAUMONT
e-mail : etienne AT rgb-technologie DOT fr

RGB Technologie
Parc d'Innovation, Bâtiment PYTHAGORE
11 Rue Jean SAPIDUS
67400 ILLKIRCH
Tél :  03 90 40 60 60
Fax : 03 90 40 60 61


                                                                                
                                                                   
From: William Rosette <Bill_Rosette AT PAPAJOHNS DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental
Date: 07/03/2003 12:49
Please respond to "ADSM: Dist Stor Manager"

So, do you have multiple tape pools, and multiple offsite (copy pool) tape
pools?  My main concern is the mixing up of data on the 40 GB tapes (with
compression ours end up being 100 GB tapes), so that the restores take
forever just like the incremental is forever, if ya know what I mean.  We
have 117 nodes, and probably 85 to 90 of them back up 1 GB or less.  What
worries me is that the smaller the backup, the more non-collocated these
tapes get, the more time it takes to restore, the more time it takes to do
Backupsets, and on it goes.  I did a "showvolumeusage xxxxxx" on my clients
in February of this year and found that on average each node is spread over
121 tapes (we call this the TSM scatter effect).  As I see it, a full
restore of an average client will take 121 tape mounts (minimum).  Now, we
are currently nocollocate on both Tapepool and Copypool (sorry about the
disk pool misunderstanding).  I am trying to solve this issue without using
an exorbitant number of tapes, but taking your example of the 20 computers,
how long would it take you to restore, say, 1 computer of the 20 in a DR
situation? And why, if you don't mind explaining?

Thank You,
Bill Rosette
Data Center/IS/Papa Johns International
WWJD



From: GUILLAUMONT Etienne <eguillau AT RGB-TECHNOLOGIE DOT FR>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental
Date: 03/07/2003 03:42 AM
Please respond to "ADSM: Dist Stor Manager"

It is right if you have 50 tapes in the tape pool. If you have only 20
tapes, TSM is fortunately not going to back up 20 clients and leave the
others unsaved; it will put several clients on each tape.
For example, I use TSM to back up several portable computers, each with
about 2 GB to save; I wouldn't like to have one 40 GB tape per client, each
holding only 2 GB.

So I don't see why collocation should use far more tapes than
nocollocation.

And as far as I know, it is the tape pool that is collocated, not the disk
pool.


Etienne GUILLAUMONT



From: William Rosette <Bill_Rosette AT PAPAJOHNS DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental
Date: 06/03/2003 19:34
Please respond to "ADSM: Dist Stor Manager"

I do not understand how you can get 50 clients on 1 disk pool to go to 20
tapes.   I would think that 50 clients on 1 (collocated) disk pool will
take 50 tapes.  Is that not right?

Thank You,
Bill Rosette
Data Center/IS/Papa Johns International
WWJD



From: GUILLAUMONT Etienne <eguillau AT RGB-TECHNOLOGIE DOT FR>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental
Date: 03/06/2003 12:49 PM
Please respond to "ADSM: Dist Stor Manager"

It has nothing to do with disk pools, only with collocation. When you set
collocation on for the tape storage pool, TSM automatically tries to put
each client's data on a different tape. So if 5 clients back up at the same
time to the disk pool, they will be migrated to 5 tapes instead of one. The
backup will therefore need more tape mounts, but the restore fewer.
But I must admit I didn't fully understand your question: what do you mean
by other disk pools messing them up?


Etienne GUILLAUMONT



From: William Rosette <Bill_Rosette AT PAPAJOHNS DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental
Date: 06/03/2003 18:11
Please respond to "ADSM: Dist Stor Manager"

How do you get a disk pool to go to a certain amount of tapes without the
other disk pools messing them up?

Thank You,
Bill Rosette
Data Center/IS/Papa Johns International
WWJD



From: GUILLAUMONT Etienne <eguillau AT RGB-TECHNOLOGIE DOT FR>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental
Date: 03/06/2003 12:00 PM
Please respond to "ADSM: Dist Stor Manager"

It depends on the size of your disk pool and on the size of the backup you
have to do each day. To be more precise, imagine you have 50 clients with
an average of 10 GB on each client and an LTO library. If you set collocate
to yes, each client will back up to the disk pool and TSM will put each
client's data on a separate tape. You just need enough disk to accumulate
the files while TSM changes tapes; it's not mandatory, but it keeps clients
from waiting for a tape mount before ending their schedule. When it reaches
the last tape, it will put the other clients on the tapes already in use.
For example: client1 on tape1, client2 on tape2 ... client20 on tape20, and
then client21 on tape1, client22 on tape2 ... and so on. But if you don't
set collocation, each day all the clients will back up to one tape, and
then each client will need 20 mounts to restore instead of one or two.
That's what I did on my site: I have some clients with little to back up,
one or two GB, and they are put on the same tapes as others. The only
drawback is that you have no empty tapes in your storage pool, but a lot of
filling tapes.
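
The placement pattern in this example can be sketched in a few lines of
Python. This only illustrates the round-robin picture given here (client1
on tape1 ... client21 on tape1); it is not TSM's actual volume selection
logic. The function names are hypothetical, and "one new tape per day" for
the non-collocated case is a simplifying assumption.

```python
from collections import defaultdict

def collocated_layout(num_clients, num_tapes):
    """Give each client its own tape, wrapping around when tapes run out."""
    layout = defaultdict(list)  # tape number -> clients stored on it
    for client in range(1, num_clients + 1):
        tape = (client - 1) % num_tapes + 1
        layout[tape].append(client)
    return dict(layout)

def noncollocated_mounts(days_of_backups, num_tapes):
    """Assumption: without collocation all clients write to the day's
    current tape, so a client ends up on every tape written so far."""
    return min(days_of_backups, num_tapes)

layout = collocated_layout(num_clients=50, num_tapes=20)
print(layout[1])                                   # [1, 21, 41]
print(max(len(c) for c in layout.values()))        # 3 clients at most per tape
print(noncollocated_mounts(days_of_backups=20, num_tapes=20))  # 20
```

With 50 clients and 20 tapes, no tape holds more than 3 clients, while the
non-collocated client is spread over all 20 tapes.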
But TSM is clear about that: you must choose between rapid backup and
rapid restore, that is, between nocollocation and collocation.
I agree that if you have a full backup with all the client's data
contiguous on the tape, it will be faster to restore. But most of the time
several clients back up at the same time, so the data is not contiguous,
even with a full backup.
I didn't write that one drive would be faster, but that it would be easier
for disaster recovery. I said easier because you don't have to worry:
everything is on the same tape. I even once made a system backup followed
by the data backup on AIX. Even if the computer burnt, you needed just a
single tape and no software to restore the server completely, booting from
the tape. Could you imagine anything easier? I wish every OS had the same
functionality. And I must admit I was thinking of smaller servers that can
be backed up completely on a single tape, not of servers with one terabyte,
for which I agree with you: you must have backup software if you want to
back up these servers, or wait for LTO 5 in a few years :-)

Regards


Etienne GUILLAUMONT



From: William Rosette <Bill_Rosette AT PAPAJOHNS DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental
Date: 06/03/2003 17:14
Please respond to "ADSM: Dist Stor Manager"

A collocate with 50 clients to 20 tapes sounds like nocollocate.  How do
you do this and keep separate schedules for each client?
On the Incremental/full, if a full is done weekly then the maximum possible
tapes for a restore will be 5 (4 for the M-Th nights, +1 for the full)
versus the possible maximum of 20 tapes above.
On the Differential/full, with a full done weekly, the maximum possible
tapes for a restore is 2 (last Differential + 1 for the full) versus 5 and
20 above.  Add a minimum of 3 minutes per tape (1 minute to load the tape,
1 minute to dismount, & 1 minute to locate) and I see a 2 tape restore take
30 minutes, a 5 tape restore take 39 minutes and the 20 tape restore take
1 hour 24 minutes.  This also depends on how the restore goes.  That 3
minutes per tape could be more if a tape is used more than once, which I
have seen happen depending on which file is being restored.
On the last item, I disagree that the fastest way is 1 drive per computer.
That means you can only restore at the speed of the tape drive, which might
not be good if you have 1 TB of disk to restore.  I think the TSM way is
the best, by splitting your resources.  We have 10 tape drives that average
50 GB an hour per drive.  If the Server and Client could handle it (which
seems to be where most of my bottlenecks are) I could go 500 GB per hour
and that 1 TB would restore in 2 hours.
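
The three restore times quoted here are consistent with a fixed 3 minutes
of handling per tape on top of about 24 minutes of data transfer. The
24-minute base is not stated in the message; it is inferred from the totals
(30, 39 and 84 minutes). A short Python sketch of that arithmetic, with
hypothetical function names and 1 TB taken as 1000 GB:

```python
def restore_minutes(tapes, transfer_minutes=24, per_tape_overhead=3):
    """Total restore time: data transfer plus load/locate/dismount per tape.

    transfer_minutes=24 is an assumption inferred from the quoted totals.
    """
    return transfer_minutes + tapes * per_tape_overhead

def parallel_restore_hours(data_gb, drives=10, gb_per_hour_per_drive=50):
    """Best case, assuming server and client can keep every drive busy."""
    return data_gb / (drives * gb_per_hour_per_drive)

for tapes in (2, 5, 20):
    print(tapes, "tapes:", restore_minutes(tapes), "minutes")  # 30, 39, 84
print(parallel_restore_hours(1000), "hours for 1 TB")          # 2.0
```

The per-tape overhead grows linearly with the number of tapes, which is why
a node spread over many more tapes would be dominated by mount time rather
than by data transfer.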

What I can't get over is how at TSM classes they try to beat it into my
head that Restores are #1 priority, why can't that be with TSM software?

Thank You,
Bill Rosette
Data Center/IS/Papa Johns International
WWJD



From: GUILLAUMONT Etienne <eguillau AT RGB-TECHNOLOGIE DOT FR>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental
Date: 03/06/2003 04:24 AM
Please respond to "ADSM: Dist Stor Manager"

I don't understand why collocation should use more tapes. Of course it
would probably be better to have one tape per client, but if you manage to
back up, for example, 50 small clients on 20 tapes, you could use
collocation without buying more tapes. The difference would be that instead
of backing up all the clients to the last tape each day, resulting in each
client having files on every tape, it will put 2 or 3 clients on each tape.
In fact, in the end, you will probably have something like each client
having files on 2 or 3 tapes.
With full+incremental, depending on how regular your full backups are, you
could have a client on far more tapes, typically 1 full per week + 4
incrementals.
With full+differential, you would have each client on 2 tapes, but the
amount of data backed up would be bigger, because a file modified just
after the last full backup would be on each subsequent differential tape.
And in fact, as far as I know, most software based on full+incremental is
incapable of collocation and uses more tapes than TSM.
But in all cases, the easiest way to restore a full disk or a full computer
is to attach a single drive to this computer and make a full backup of it
each day. Most of my clients do that, and in case of a disk failure
everything is simple: you mount the last backup and restore it completely.
But if you need a single file .... and if you have 20 computers to back
up ....



Etienne GUILLAUMONT



From: William Rosette <Bill_Rosette AT PAPAJOHNS DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental
Date: 05/03/2003 16:21
Please respond to "ADSM: Dist Stor Manager"

This progressive incremental confuses me.  This is what I thought was going
on:

1. Differential is all changes since the last FULL backup.
2. Incremental is all changes since the last backup of ANY kind.
3. Full is all files, whether changed or not.
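
These three definitions can be written down as a small Python sketch. It is
a toy model only: days are plain integers, a file "changed" if its last
modification day is later than the reference backup, and the function and
parameter names are hypothetical.

```python
def files_to_back_up(mtimes, scheme, last_full, last_backup):
    """Return the files a backup run would copy.

    mtimes      -- {filename: day number of last modification}
    last_full   -- day of the most recent full backup
    last_backup -- day of the most recent backup of any kind
    """
    if scheme == "full":
        return sorted(mtimes)                      # everything, changed or not
    if scheme == "differential":
        return sorted(f for f, t in mtimes.items() if t > last_full)
    if scheme == "incremental":
        return sorted(f for f, t in mtimes.items() if t > last_backup)
    raise ValueError(f"unknown scheme: {scheme}")

# Full on day 0, a backup every day since; today is day 3.
mtimes = {"a.txt": 0, "b.txt": 1, "c.txt": 3}
for scheme in ("full", "differential", "incremental"):
    print(scheme, files_to_back_up(mtimes, scheme, last_full=0, last_backup=2))
# full          ['a.txt', 'b.txt', 'c.txt']
# differential  ['b.txt', 'c.txt']
# incremental   ['c.txt']
```

Note that b.txt, changed on day 1, is copied again by every differential
until the next full, while an incremental copies it only once.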

We used to do Differential with Weekend Fulls.  During a restore we would
restore from the Full if the file had not changed, and from the last
Differential for the files that had changed.  We never restored more than
necessary.  It depended on the restore.  Restoring 1 file was the same on
Differential as on Incremental (most current or before corruption).  The
problem comes when you are restoring directories or a whole system as in
DR.  In the Differential Weekend Full world you would restore the Full and
lay the last Differential on top, and you're done.  It was always 2
restores and no more, and restores flew since the data was all together.

Now the TSM world has its own database with its own reclamation,
expiration, migration, collocation, and the works.  When it comes to
restoring 1 file it is the same as above.  When it comes to restoring
directories or whole systems it depends on where all the data is.  In our
case, if the restore does not go far back we have a quick restore, but the
further back we go in date the slower the restore, because of the
no-collocation and the data being spread over 100's of tapes.  This seems
to be the same as an Incremental, with the TSM database keeping track of
every file on every tape.

Thus, the reasons we did not use Incrementals before were 1. restores were
long, 2. no database to keep track of all the Incremental tapes, 3.
Differential & Full used fewer tapes, and 4. money.  I am still dealing
with the progressive incremental that progressively eats resources/money.
My suggestion would be for reclamation to reclaim to a collocated state, or
to somehow keep the data together as it gets older.  Right now I am
probably going to run FULLs just to keep my restores to a minimum, since
the tape issue will hurt us if we collocate.

If I am off, I would appreciate anyone that can straighten out my
backups/restores.

Thank You,
Bill Rosette
Data Center/IS/Papa Johns International
WWJD



From: Gianluca Mariani1 <gianluca_mariani AT IT.IBM DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: progressive backup vs. full + incremental
Date: 03/05/2003 09:47 AM
Please respond to "ADSM: Dist Stor Manager"

Progressive incremental backs up only new or changed files. During the
initial backup the client backs up all eligible files, of course (a full
backup). Subsequently, files are backed up again only if they are new or
have changed since the last backup. In TSM's case, a pointer to each
version of every file for every client is kept in the database, so there is
no need for another full backup.
When you need to restore, you can choose the specific version of the file
or the point in time to restore, and TSM will restore only that particular
file or files. The approach used for full + incremental backups (NetBackup)
requires an initial full backup, followed by regular incremental or
differential backups (usually once a day), with the complete cycle needing
a full backup to be repeated on a regular (usually weekly) basis. This
backup method results in redundant weekly full backups of files that have
not changed, wasting both network and media resources. The multi-step
restore process in this approach requires the software to restore the last
full backup, then to restore incremental or differential backups on top of
that in order to recover the latest version of a file or an entire system.
This methodology not only involves restoring more data, it also means more
tape mounts and tape positioning and consumes more network bandwidth, all
of which amounts to longer restore times.
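
The idea of restoring from a per-version catalog can be illustrated with a
small Python sketch. The dictionary below is a toy stand-in for the
database of file versions, not TSM's actual schema; the function name, tape
labels and day numbers are all hypothetical.

```python
def point_in_time_restore(catalog, as_of_day):
    """Pick, for each file, the newest stored version not later than as_of_day.

    catalog -- {filename: [(backup_day, tape_label), ...]} in any order
    Returns {filename: (backup_day, tape_label)}.
    """
    chosen = {}
    for name, versions in catalog.items():
        eligible = [v for v in versions if v[0] <= as_of_day]
        if eligible:
            chosen[name] = max(eligible)  # tuples compare by day first
    return chosen

catalog = {
    "a.txt": [(0, "T1")],                 # never changed after first backup
    "b.txt": [(0, "T1"), (2, "T2")],      # changed once, on day 2
    "c.txt": [(3, "T2")],                 # created on day 3
}
plan = point_in_time_restore(catalog, as_of_day=2)
print(plan)   # {'a.txt': (0, 'T1'), 'b.txt': (2, 'T2')}
print(sorted({tape for _, tape in plan.values()}))  # ['T1', 'T2']
```

Each file is read exactly once, in the version wanted, so nothing is
restored and then overwritten by a later incremental.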



Cordiali saluti
Gianluca Mariani
Tivoli TSM Global Response Team, Roma
Via Sciangai 53, Roma
 phones : +39(0)659664598
                   +393351270554 (mobile)
gianluca_mariani AT it.ibm DOT com
----------------------------------------------------------------------------------------------------











The Hitch Hiker's Guide to the Galaxy says of the Sirius Cybernetics
Corporation products that "it is very easy to be blinded to the essential
uselessness of them by the sense of achievement you get from getting them
to work at all. In other words - and this is the rock solid principle on
which the whole of the Corporation's Galaxy-wide success is founded - their
fundamental design flaws are completely hidden by their superficial design
flaws"...



From: Joni Moyer <joni.moyer AT HIGHMARK DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: progressive backup vs. full + incremental
Date: 05/03/2003 15.21
Please respond to "ADSM: Dist Stor Manager"

Hello everyone!

I was wondering why the full + incremental would result in a longer restore
time than the progressive backup methodology?  From several co-workers'
point of view, it would be quicker with full + incremental because you
wouldn't have to go back to the very first backups of the file and restore
all of the incrementals; you would just go back to the most recent full
backup and apply the incrementals after that point.  When I went to explain
the reasoning behind this, I had some problems understanding the concept
myself, so I was hoping someone could explain both methods, why they differ
in restore time, and why progressive is better than full + incremental.
Thank you so much for any help you can lend on this matter!



Joni Moyer
Systems Programmer
joni.moyer AT highmark DOT com
(717)975-8338