Amanda-Users

Re: multiple DLE with tar vs one big DLE with dump (million of files)

2008-10-01 19:54:08
Subject: Re: multiple DLE with tar vs one big DLE with dump (million of files)
From: Jon LaBadie <jon AT jgcomp DOT com>
To: amanda-users AT amanda DOT org
Date: Wed, 01 Oct 2008 18:48:02 -0400
On Wed, Oct 01, 2008 at 03:28:33PM -0400, FM wrote:
> 
> 
> Jon LaBadie wrote:
> > On Wed, Oct 01, 2008 at 10:34:01AM -0400, FM wrote:
> >   
> >> Hello,
> > >> We have a lot of performance problems with our Amanda setup.
> > >> One of our servers has a 760 GB file system that needs to be backed up.
> >> There are 9,286,637 files in this partition. This is the biggest
> >> partition and the most important.
> >>
> >> Our holding disk has a size of  886 GB.
> >> We are using Amanda 2.5.0p2
> >> We would have no prob upgrading to 2.6 IF there is a performance boost.
> >>
> >> The dumptype for this partition is :
> >>
> >> define dumptype tar-high-span-lan {
> >>     program "GNUTAR"
> >>     comment "Partitions dumped with tar"
> >>     compress none
> >>     estimate calcsize
> >>     index yes
> > >>     comment "For LAN Servers. High priority partitions dumped with tar with spanning of 40GB (10% of tape)"
> >>     tape_splitsize 40 Gb
> >>     priority high
> >> }
> >>
> >> OR
> >> define dumptype dump-high-lan {
> >>     program "DUMP"
> >>     comment "Partitions dumped with dump"
> >>     estimate calcsize
> >>     index yes
> >>     comment "For LAN Servers. High priority partitions dumped with dump"
> >>     priority high
> >> }
> >>    
> >>
> >>
> >> Is it better to :
> >> Use dump for the entire partition in one DLE ?
> > >> Or split the partition into several DLEs and use tar ?
> >>     
> >
> > Typically dump will perform better than tar for a file system DLE.
> > It uses system calls to go through the inode list rather than 
> > traversing the hierarchical file system using user level calls.
> >
> > However, and IMHO this is a big however, by splitting it up and
> > > using tar you will not have the entire 760GB backed up on the
> > same day.
> >
> > Suppose you have a dumpcycle of one week and do daily dumps.
> > Further, suppose you break it up into about 20 DLEs so that
> > amanda can balance things nicely.  You can then expect to
> > dump only about 100-200GB/night.  This would be approx 1/7th
> > of the full dumps (110GB average) plus incrementals for all
> > 20 DLEs.
> >
> > With proper settings, you may also get some benefit from the
> > > parallelism of more than one of the 20 DLEs dumping to the holding
> > disk at the same time.
> >
> > I presume the holding disk (or disks) is not the same physical
> > > disk as the 760 GB data disk.
> >
> > Of course the standard disclaimer applies,
> > this is all conjecture until you try it.
> >
> > jl
> >   
> 
> Interesting solution. But how can I force Amanda to do the full dumps on
> different days ?
> 

If you mean you want to select the day, you use "amadmin ... force"

If you mean how do you make amanda spread the full dumps around,
don't do anything.  That is what it was designed to do.
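
As a rough illustration of the arithmetic behind the 100-200GB/night
estimate quoted above, here is a minimal Python sketch.  The 760 GB
total, the 7-day dumpcycle and the 20 DLEs come from this thread; the
equal-sized DLEs are an assumption, and the planner's real balancing
is more involved than a simple division.

```python
# Back-of-the-envelope estimate only; not how the Amanda planner works.
total_gb = 760        # size of the file system, from the original post
dumpcycle_days = 7    # one-week dumpcycle with daily runs
dle_count = 20        # number of DLEs the file system is split into

# With full dumps spread evenly over the dumpcycle, each night carries
# about 1/7th of the data as full dumps.
avg_full_gb = total_gb / dumpcycle_days

# Assumption: the DLEs are roughly equal in size.
avg_dle_gb = total_gb / dle_count

# Average number of DLEs receiving a full dump on a given night.
fulls_per_night = dle_count / dumpcycle_days

print(f"full dumps per night: about {avg_full_gb:.1f} GB")
print(f"average DLE size:     about {avg_dle_gb:.1f} GB")
print(f"DLEs at level 0:      about {fulls_per_night:.1f} per night")
```

Incrementals for the remaining DLEs come on top of the nightly full
dump volume, which is why the estimate is given as a range.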

You may want to start things gracefully.  Add all of your DLEs to
the disklist as comments, then uncomment about 20% of them each day
during week 1.
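
The staged start could be planned with something like the sketch
below.  The host name, the directory names and the five-day rollout
are hypothetical; the dumptype name is the tar one from the original
post.  It only prints what the disklist might look like on a given
day, it does not touch any Amanda files.

```python
# Sketch of a staged disklist rollout; all names below are made up.

def staged_batches(dles, days):
    """Split the DLE list into one batch per rollout day."""
    batch = -(-len(dles) // days)  # ceiling division
    return [dles[i:i + batch] for i in range(0, len(dles), batch)]

def render_disklist(dles, active, host, dumptype):
    """Return disklist lines, commenting out DLEs not yet enabled."""
    lines = []
    for dle in dles:
        entry = f"{host} {dle} {dumptype}"
        lines.append(entry if dle in active else "#" + entry)
    return lines

host = "fileserver.example.com"                   # hypothetical host
dumptype = "tar-high-span-lan"                    # from the original post
dles = [f"/data/dir{n:02d}" for n in range(1, 21)]  # 20 hypothetical DLEs

batches = staged_batches(dles, days=5)  # about 20% of the DLEs per day

# Disklist as it would look on day 2: batches 1 and 2 are enabled.
active = set(batches[0]) | set(batches[1])
for line in render_disklist(dles, active, host, dumptype):
    print(line)
```

Each newly uncommented DLE gets its first full dump on the run after
it is enabled, so the fulls start out spread over the week.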

-- 
Jon H. LaBadie                  jon AT jgcomp DOT com
 JG Computing
 12027 Creekbend Drive          (703) 787-0884
 Reston, VA  20194              (703) 787-0922 (fax)
