Amanda-Users

Re: multiple DLE with tar vs one big DLE with dump (million of files)

2008-10-01 14:50:17
From: Jon LaBadie <jon AT jgcomp DOT com>
To: Mailing List Amanda User <amanda-users AT amanda DOT org>
Date: Wed, 01 Oct 2008 13:43:26 -0400
On Wed, Oct 01, 2008 at 10:34:01AM -0400, FM wrote:
> Hello,
> We have a lot of performance problems with our Amanda setup.
> One of our servers has a 760 GB file system that needs to be backed up.
> There are 9,286,637 files in this partition. This is the biggest
> partition and the most important.
> 
> Our holding disk has a size of 886 GB.
> We are using Amanda 2.5.0p2
> We would have no prob upgrading to 2.6 IF there is a performance boost.
> 
> The dumptype for this partition is :
> 
> define dumptype tar-high-span-lan {
>     program "GNUTAR"
>     comment "Partitions dumped with tar"
>     compress none
>     estimate calcsize
>     index yes
>     comment "For LAN Servers. High priority partitions dumped with tar with spanning of 40GB (10% of tape)"
>     tape_splitsize 40 Gb
>     priority high
> }
> 
> OR
> define dumptype dump-high-lan {
>     program "DUMP"
>     comment "Partitions dumped with dump"
>     estimate calcsize
>     index yes
>     comment "For LAN Servers. High priority partitions dumped with dump"
>     priority high
> }
>    
> 
> 
> Is it better to :
> Use dump for the entire partition in one DLE ?
> Or split the partition into several DLEs and use tar ?

Typically dump will perform better than tar for a file system DLE.
It uses system calls to go through the inode list rather than 
traversing the hierarchical file system using user level calls.

However, and IMHO this is a big however, by splitting it up and
using tar you will not have to back up the entire 760GB on the
same day.

Suppose you have a dumpcycle of one week and do daily dumps.
Further, suppose you break it up into about 20 DLEs so that
amanda can balance things nicely.  You can then expect to
dump only about 100-200GB/night.  This would be approx 1/7th
of the full dumps (110GB average) plus incrementals for all
20 DLEs.
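
As a rough sketch of that arithmetic: the 760 GB size and the one-week
dumpcycle come from the thread, but the daily change rates below are
assumed values for illustration, not measurements from your system. The
function names are made up for this example.

```python
# Rough nightly dump volume when full dumps are spread across the dumpcycle.
# Figures from the thread: 760 GB file system, 7-day dumpcycle.
# The daily change rates are assumed values for illustration only.

TOTAL_GB = 760
DUMPCYCLE_DAYS = 7


def fulls_per_night_gb(total_gb, dumpcycle_days):
    """Average size of the full dumps scheduled on one night."""
    return total_gb / dumpcycle_days


def nightly_estimate_gb(total_gb, dumpcycle_days, daily_change_fraction):
    """Fulls due tonight plus incrementals for the rest of the data.

    Simplification: every DLE not getting a full tonight produces an
    incremental equal to daily_change_fraction of its size.
    """
    fulls = fulls_per_night_gb(total_gb, dumpcycle_days)
    incrementals = (total_gb - fulls) * daily_change_fraction
    return fulls + incrementals


if __name__ == "__main__":
    print(f"fulls per night: {fulls_per_night_gb(TOTAL_GB, DUMPCYCLE_DAYS):.0f} GB")
    for change in (0.01, 0.05, 0.10):  # assumed daily change rates
        est = nightly_estimate_gb(TOTAL_GB, DUMPCYCLE_DAYS, change)
        print(f"{change:.0%} daily change: about {est:.0f} GB/night")
```

With change rates anywhere in that assumed range the nightly total stays
between 100 and 200 GB. Real incrementals grow between fulls, so treat
this as a lower-end picture.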

With proper settings (such as inparallel and maxdumps), you may
also get some benefit from the parallelism of more than one of the
20 DLEs dumping to the holding disk at the same time.

I presume the holding disk (or disks) is not the same physical
disk as the 760 GB data disk.

Of course the standard disclaimer applies,
this is all conjecture until you try it.

jl
-- 
Jon H. LaBadie                  jon AT jgcomp DOT com
 JG Computing
 12027 Creekbend Drive          (703) 787-0884
 Reston, VA  20194              (703) 787-0922 (fax)